[postgis-users] Question on how to optimize linestring to points distance query

Paul Ramsey pramsey at refractions.net
Mon Jul 9 19:46:38 PDT 2007


No, I'm not suggesting that, but I am suggesting that on the non- 
indexed side it is possible to break up the test feature into  
multiple bounding boxes that can access the indexed side more  
efficiently.

P

On 9-Jul-07, at 6:06 PM, Carl Anderson wrote:

> Paul Ramsey wrote:
>>
>> Perhaps you could try some experiments for us :)
>>
>> Write a PL/PgSQL function that takes in a linestring and returns a  
>> tupleset of two (or four or eight) bounding boxes, that taken  
>> together cover the linestring.  By recursively dividing the master  
>> box into smaller boxes that still cover the line, you should be  
>> able to get bigger index selectivity.
>
> Paul are you suggesting that a geometry can have more than one GIST  
> node?  As I read them GIST compress and decompress as tied directly  
> to a single tuple having a geometry.
>
> Maybe there could be a concept of a multi-bbox  upon which some  
> some index selectivity could be built?
>
> A geometry's bounding box is broken into a grid and each element to  
> tested if it truly contains a part of the original geometry.
> Optimizations may exist.
>
> That would be way more expensive to index but might massively  
> improve query speeds for complex objects.  It would increase index  
> storage size a good deal as a downside.
> It would be a super elegant way to speed index queries for complex  
> entities and have little impact on simple objects.
> It would increase index storage size a good deal as a small  
> downside.  (The entire index works best when it all fits into  
> memory, that is a server tuning issue.)
>
> Can a single geometry datatype have alternate GIST index forms?  My  
> guess is no.  If so no half way rollout testing.
>
> C.
>
>
>>
>> If your tests work OK, this could be something of more general  
>> applicability.
>>
> I get 1 30x improvement in speed breaking Polygons having more than  
> 500 vertices into 1/100th chunks.
> Of course SQL environments are designed to wade through tons of  
> features.  So a 1 - > 100 expansion is no real problem.
>
> I have attached the plpgsql function to do the splitting.  I is  
> tuned around some GEOS issues I run into.
>
>
>> P
>>
>> Stephen Woodbridge wrote:
>>> Hi all,
>>>
>>> I have a linestring and a large number of points in a table. I  
>>> want to find all the points with distance X of the linestring.  
>>> This is a pretty straight forward query like:
>>>
>>> $LINE = "GeomFromText('LINESTRING(...)',4326)"
>>> $X    = <number in meters>
>>>
>>> select distance_sphere($LINE, the_geom) as distance, *
>>>   from mypoints
>>>  where
>>>    expand($LINE, metersToDegrees($X)) && the_geom and
>>>    distance_sphere($LINE, the_geom) < $X
>>>  order by distance;
>>>
>>> Discussion:
>>>
>>> Since the && operator only compares bbox it does not really  
>>> matter if you expand one or the other arguments, and you should  
>>> expand whichever has the fewest items for speed.
>>>
>>> In the case of a linestring that might be diagonal across your  
>>> extents of the_geom, you are likely to select all the points. It  
>>> seems like there should be an optimization or another operator,  
>>> that would allow you to expand the points and test their bbox  
>>> against the linestring. This could be done very quickly with the  
>>> first stage of an Cohen-Sutherland line clipping algorithm or the  
>>> even faster Liang-Barsky Line Clipping algorithm or the Nicholl- 
>>> Lee-Nicholl (NLN) line clipping algorithm. These have a quick  
>>> rejection algorithm between a bbox and a line segment that might  
>>> be crossing it.
>>>
>>> Doing something like this might be more expensive then the &&  
>>> operator but might be much faster overall.
>>>
>>> Questions:
>>>
>>> Is anything like this being done?
>>> Is there a way to compare a linestring to a large number of bbox  
>>> quickly?
>>> Is there a better way to do this than the sql above?
>>>
>>> Thanks,
>>>   -Steve
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>>
>
> -- Function: public.chopintogrid(text, text, text, text, numeric,  
> numeric)
>
> -- DROP FUNCTION public.chopintogrid(text, text, text, text,  
> numeric, numeric);
>
> CREATE OR REPLACE FUNCTION public.chopintogrid(text, text, text,  
> text, numeric, numeric)
>   RETURNS text AS
> $BODY$DECLARE
>  src_tab	ALIAS for $1;
>  dest_tab	ALIAS for $2;
>  shp_fld	ALIAS for $3;
>  tag_fld        ALIAS for $4;
>  offset		ALIAS for $5;
>  size		ALIAS for $6;
>  gx		NUMERIC;
>  gy		NUMERIC;
>  sml_x          NUMERIC;
>  sml_y          NUMERIC;
>  lrg_x          NUMERIC;
>  lrg_y          NUMERIC;
>  step           INTEGER;
>  gbox		GEOMETRY;
>  tag_item 	text;
>  rec            record;
>  shp 		GEOMETRY;
>  t_shp          GEOMETRY;
>  wrk_shp	GEOMETRY;
>  ssrid          INTEGER;
>
> BEGIN
>
> -- create dest tab
>
> -- walk existing records
> FOR rec in EXECUTE 'select '||shp_fld||' as r1,'||tag_fld||' as r2
>          from '||src_tab || '  '
>           LOOP
>   tag_item = rec.r2;
>   ssrid = srid(rec.r1);
> --  shp = buffer(rec.r1,0);
>   shp = rec.r1;
>   sml_x = xmin(shp);
>   sml_y = ymin(shp);
>   lrg_x = xmax(shp);
>   lrg_y = ymax(shp);
>
>   gx = (round((sml_x-offset)/size)*size)+offset - (size/2);
>
> WHILE ( gx <= (lrg_x + size) ) LOOP
>
>   gy = (round((sml_y-offset)/size)*size)+offset - (size/2);
>
>   WHILE ( gy <= (lrg_y + size) ) LOOP
>  -- six steps at one time
>   t_shp = (make_polygon(sml_x, gy - size, lrg_x, gy + ( 6 * size )  
> + (size/1.9999999), srid(shp)));
>   IF ( t_shp && shp AND intersects(t_shp,shp) ) THEN
>     wrk_shp = (filter(intersection(t_shp, shp),'MULTIPOLYGON'));
>   ELSE
>     wrk_shp = NULL;
>   END IF;
>
>   step = 0;
>   WHILE ( wrk_shp is not null and step < 6 ) LOOP
>
>     gbox = (expand(makepoint(gx,gy,srid(shp)),size/2));
>
>     IF ( gbox && wrk_shp AND intersects(gbox, wrk_shp)) THEN
>       gbox = intersection(gbox,wrk_shp);
> -- RAISE WARNING ' wrkshp %', asewkt(wrk_shp);
> -- RAISE WARNING ' isect %', asewkt(gbox);
>       gbox = filter(gbox,'MULTIPOLYGON');
>     ELSE
>       gbox = NULL;
>     END IF;
>
>     IF ( gbox is not NULL ) THEN
> --      RAISE INFO '% %, %',gx,gy,tag_item;
>
>       EXECUTE 'insert into '||dest_tab||' ('||shp_fld||','|| 
> tag_fld||') values (setsrid(geometry('||
>         quote_literal(replace(asewkt(gbox),'SRID=0;',''))||'),'|| 
> ssrid||'),'||quote_literal(tag_item)||')';
> --    ELSE
> --      RAISE INFO '% % ',gx,gy;
>     END IF;
>
>     step = step + 1;
>     gy = gy + size;
>   END LOOP;
> -- RAISE INFO '% of % and %', gy, wrk_shp, lrg_y;
>
>   END LOOP;
>   gx = gx + size;
> END LOOP;
>
> END LOOP;
> return 'ok';
>
> END;$BODY$
>   LANGUAGE 'plpgsql' VOLATILE;
> ALTER FUNCTION public.chopintogrid(text, text, text, text, numeric,  
> numeric) OWNER TO candrsn;
>
> -- drop table political.places_grid_big;
> truncate political.places_grid_big;
> create table political.places_grid_big as select * from  
> political.places limit 0
> ;
> create index places__shape__ind on political.places using gist(shape);
>
> select chopintogrid 
> ('political.places','political.places_grid_big','shape','gid'
> ,0,10000);
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users




More information about the postgis-users mailing list