[postgis-users] Neighborhood Function using PL/PGSQL waaay to slow?

Paul Ramsey pramsey at refractions.net
Tue Apr 20 19:12:00 PDT 2004


I am not sure what you mean with your NULL comment. Could it be that at  
10m a large number of your points are not picking up any neighbors  
within 10m?

Regarding a "batch update", you could do the update by joining your  
original table and your new "inserted" table into a new table using a  
standard SQL join. With the appropriate indexes this will be much much  
faster than the "death of a thousand updates" you had before.

On Tuesday, April 20, 2004, at 05:36 PM, collin wrote:

> Paul, Thanks for responding.  The 100m turned out to be absurdly  
> large.  I scaled it back to 10m box, which returns roughly 300 points  
> on average.  However, I found that only 3% of the points returned any  
> points at all! These ranged from 124 - 350 points returned. The rest  
> of the 10,000 points returned NULL.  I do not understand this result.
>
> I did combine the two statements into one, which now reads:
>    EXECUTE ''UPDATE test1 SET zmin = '' ||
>             '' (SELECT min(z) FROM test1 WHERE the_geom && '' ||
>             '' SetSRID(
>                  EXPAND(
>                    GeometryFromText(''''POINT('' || thisrow.x || '' ''  
> || thisrow.y || '' '' || thisrow.z ||'')'''',32610)
>                  ,10 )
>                ,32610))
>             WHERE x = '' || thisrow.x || '' AND y = '' || thisrow.y ;
>
> If you could give a suggestion/pseudo code on how to do a batch  
> transfer, I would appreciate it.
>
> Collin Bode
> GIS Informatics Researcher, UC Berkeley
>
> Paul Ramsey wrote:
>
>> First, how many results is your 100m box returning? If the database  
>> has to repeatedly sequence scan a pile of values for each iteration,  
>> yes, it will be slow. Not much to do there, it is just a consequence  
>> of the fact that the "local minimum" problem is hard to optimize.
>> Second, you are doing an insert *and* an update for each iteration.  
>> Why not either (a) save the update for a batch transfer at the end or  
>> (b) combine the SQL into one huge update.
>>
>>
>> collin wrote:
>>
>>> Folks,
>>>
>>> I posted earlier regarding getting statistics for every point on  
>>> their neighboring points.  Currently I am trying to find the lowest  
>>> alititude within a ~100 meter radius of each point.  To do so, I  
>>> wrote a PL/PGSQL function (see below) which iterates through every  
>>> row in a 10,000 row table and updates each row with the lowest  
>>> altitude within a bounding box of 100meters.
>>> --------------------------------------------------------------------- 
>>> --------------------------------
>>> CREATE OR REPLACE FUNCTION collintest2() RETURNS integer AS '
>>> DECLARE
>>>  thisrow RECORD;
>>>  zzmin double precision;
>>>  ztot integer := 0;
>>> BEGIN
>>>  FOR thisrow IN SELECT * FROM test1 LOOP
>>>     SELECT INTO zzmin min(z) FROM test1 WHERE
>>>                                the_geom &&  
>>> EXPAND(GeometryFromText(''POINT('' || thisrow.x || '' '' ||  
>>> thisrow.y || '' '' || thisrow.z ||'')'',32610),100);
>>>     EXECUTE ''UPDATE test1 SET zmin = '' || zzmin || '' WHERE x = ''  
>>> || thisrow.x || '' AND y = '' || thisrow.y ;
>>>     ztot := ztot + 1;
>>>  END LOOP;
>>>  RETURN ztot;
>>> END;
>>> ' LANGUAGE 'plpgsql';
>>> --------------------------------------------------------------------- 
>>> -----------------------------------
>>>
>>> EXPLAIN ANALYZE SELECT collintest2();
>>>            QUERY PLAN
>>>      
>>> --------------------------------------------------------------------- 
>>> ---------------------------
>>>     Result  (cost=0.00..0.01 rows=1 width=0) (actual  
>>> time=3318131.024..3318131.027 rows=1 loops=1)
>>>     Total runtime: 3318131.075 ms
>>>
>>> This runs painfully slow, as you can see (331ms per row). Is there  
>>> another way to do this? I am new to pl/pgsql, though very familiar  
>>> with sql.  How do I optimize? Use cursors?  I built a gist index on  
>>> the_geom, but it doesn't make much of a difference.
>>>
>>> Thanks in advance,
>>>
>>> Collin Bode
>>> GIS Informatics Researcher, UC Berkeley
>>>
>>>
>>> The here is the schema for the table being operated on:
>>> CREATE TABLE test1 (
>>>    the_geom geometry,
>>>    zmin double precision,
>>>    "time" double precision,
>>>    x double precision,
>>>    y double precision,
>>>    z double precision,
>>>    i double precision,
>>>    xf double precision,
>>>    yf double precision,
>>>    zf double precision,
>>>    if double precision
>>>    CONSTRAINT "$1" CHECK ((srid(the_geom) = 32610)),
>>>    CONSTRAINT "$2" CHECK (((geometrytype(the_geom) = 'POINT'::text)  
>>> OR (the_geom IS NULL)))
>>> );
>>>
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>>
>>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
      Paul Ramsey
      Refractions Research
      Email: pramsey at refractions.net
      Phone: (250) 885-0632




More information about the postgis-users mailing list