[postgis-users] Neighborhood Function using PL/PGSQL waaay to slow?

collin collin at socrates.Berkeley.EDU
Wed Apr 21 08:53:39 PDT 2004


I'll try making a separate table and joining it with  original.  The 
NULL comment meant that most of the points had no values placed into the 
zmin  field.  The current test table a square 100m to a side, from Napa 
County, US.  The points are extremely dense: 0.25m to 1m apart.  So, 
there is no possibility of not picking up neighbors, even at the edges 
and corners.

If I pick a random point and perform a select using the select below, I 
always get points.   Could there be an issue with bounding boxes?

-- Collin Bode

Paul Ramsey wrote:

> I am not sure what you mean with your NULL comment. Could it be that 
> at  10m a large number of your points are not picking up any 
> neighbors  within 10m?
>
> Regarding a "batch update", you could do the update by joining your  
> original table and your new "inserted" table into a new table using a  
> standard SQL join. With the appropriate indexes this will be much 
> much  faster than the "death of a thousand updates" you had before.
>
> On Tuesday, April 20, 2004, at 05:36 PM, collin wrote:
>
>> Paul, Thanks for responding.  The 100m turned out to be absurdly  
>> large.  I scaled it back to 10m box, which returns roughly 300 
>> points  on average.  However, I found that only 3% of the points 
>> returned any  points at all! These ranged from 124 - 350 points 
>> returned. The rest  of the 10,000 points returned NULL.  I do not 
>> understand this result.
>>
>> I did combine the two statements into one, which now reads:
>>    EXECUTE ''UPDATE test1 SET zmin = '' ||
>>             '' (SELECT min(z) FROM test1 WHERE the_geom && '' ||
>>             '' SetSRID(
>>                  EXPAND(
>>                    GeometryFromText(''''POINT('' || thisrow.x || '' 
>> ''  || thisrow.y || '' '' || thisrow.z ||'')'''',32610)
>>                  ,10 )
>>                ,32610))
>>             WHERE x = '' || thisrow.x || '' AND y = '' || thisrow.y ;
>>
>> If you could give a suggestion/pseudo code on how to do a batch  
>> transfer, I would appreciate it.
>>
>> Collin Bode
>> GIS Informatics Researcher, UC Berkeley
>>
>> Paul Ramsey wrote:
>>
>>> First, how many results is your 100m box returning? If the database  
>>> has to repeatedly sequence scan a pile of values for each 
>>> iteration,  yes, it will be slow. Not much to do there, it is just a 
>>> consequence  of the fact that the "local minimum" problem is hard to 
>>> optimize.
>>> Second, you are doing an insert *and* an update for each iteration.  
>>> Why not either (a) save the update for a batch transfer at the end 
>>> or  (b) combine the SQL into one huge update.
>>>
>>>
>>> collin wrote:
>>>
>>>> Folks,
>>>>
>>>> I posted earlier regarding getting statistics for every point on  
>>>> their neighboring points.  Currently I am trying to find the 
>>>> lowest  alititude within a ~100 meter radius of each point.  To do 
>>>> so, I  wrote a PL/PGSQL function (see below) which iterates through 
>>>> every  row in a 10,000 row table and updates each row with the 
>>>> lowest  altitude within a bounding box of 100meters.
>>>> --------------------------------------------------------------------- 
>>>> --------------------------------
>>>> CREATE OR REPLACE FUNCTION collintest2() RETURNS integer AS '
>>>> DECLARE
>>>>  thisrow RECORD;
>>>>  zzmin double precision;
>>>>  ztot integer := 0;
>>>> BEGIN
>>>>  FOR thisrow IN SELECT * FROM test1 LOOP
>>>>     SELECT INTO zzmin min(z) FROM test1 WHERE
>>>>                                the_geom &&  
>>>> EXPAND(GeometryFromText(''POINT('' || thisrow.x || '' '' ||  
>>>> thisrow.y || '' '' || thisrow.z ||'')'',32610),100);
>>>>     EXECUTE ''UPDATE test1 SET zmin = '' || zzmin || '' WHERE x = 
>>>> ''  || thisrow.x || '' AND y = '' || thisrow.y ;
>>>>     ztot := ztot + 1;
>>>>  END LOOP;
>>>>  RETURN ztot;
>>>> END;
>>>> ' LANGUAGE 'plpgsql';
>>>> --------------------------------------------------------------------- 
>>>> -----------------------------------
>>>>
>>>> EXPLAIN ANALYZE SELECT collintest2();
>>>>            QUERY PLAN
>>>>      
>>>> --------------------------------------------------------------------- 
>>>> ---------------------------
>>>>     Result  (cost=0.00..0.01 rows=1 width=0) (actual  
>>>> time=3318131.024..3318131.027 rows=1 loops=1)
>>>>     Total runtime: 3318131.075 ms
>>>>
>>>> This runs painfully slow, as you can see (331ms per row). Is there  
>>>> another way to do this? I am new to pl/pgsql, though very familiar  
>>>> with sql.  How do I optimize? Use cursors?  I built a gist index 
>>>> on  the_geom, but it doesn't make much of a difference.
>>>>
>>>> Thanks in advance,
>>>>
>>>> Collin Bode
>>>> GIS Informatics Researcher, UC Berkeley
>>>>
>>>>
>>>> The here is the schema for the table being operated on:
>>>> CREATE TABLE test1 (
>>>>    the_geom geometry,
>>>>    zmin double precision,
>>>>    "time" double precision,
>>>>    x double precision,
>>>>    y double precision,
>>>>    z double precision,
>>>>    i double precision,
>>>>    xf double precision,
>>>>    yf double precision,
>>>>    zf double precision,
>>>>    if double precision
>>>>    CONSTRAINT "$1" CHECK ((srid(the_geom) = 32610)),
>>>>    CONSTRAINT "$2" CHECK (((geometrytype(the_geom) = 
>>>> 'POINT'::text)  OR (the_geom IS NULL)))
>>>> );
>>>>
>>>> _______________________________________________
>>>> postgis-users mailing list
>>>> postgis-users at postgis.refractions.net
>>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>
>>>
>>>
>>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>      Paul Ramsey
>      Refractions Research
>      Email: pramsey at refractions.net
>      Phone: (250) 885-0632
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>



More information about the postgis-users mailing list