[postgis-users] Question on how to optimize linestring to points distance query
Paul Ramsey
pramsey at refractions.net
Mon Jul 9 19:46:38 PDT 2007
No, I'm not suggesting that, but I am suggesting that on the non-
indexed side it is possible to break up the test feature into
multiple bounding boxes that can access the indexed side more
efficiently.
P
On 9-Jul-07, at 6:06 PM, Carl Anderson wrote:
> Paul Ramsey wrote:
>>
>> Perhaps you could try some experiments for us :)
>>
>> Write a PL/PgSQL function that takes in a linestring and returns a
>> tupleset of two (or four or eight) bounding boxes, that taken
>> together cover the linestring. By recursively dividing the master
>> box into smaller boxes that still cover the line, you should be
>> able to get bigger index selectivity.
>
> Paul are you suggesting that a geometry can have more than one GIST
> node? As I read them GIST compress and decompress as tied directly
> to a single tuple having a geometry.
>
> Maybe there could be a concept of a multi-bbox upon which some
> some index selectivity could be built?
>
> A geometry's bounding box is broken into a grid and each element to
> tested if it truly contains a part of the original geometry.
> Optimizations may exist.
>
> That would be way more expensive to index but might massively
> improve query speeds for complex objects. It would increase index
> storage size a good deal as a downside.
> It would be a super elegant way to speed index queries for complex
> entities and have little impact on simple objects.
> It would increase index storage size a good deal as a small
> downside. (The entire index works best when it all fits into
> memory, that is a server tuning issue.)
>
> Can a single geometry datatype have alternate GIST index forms? My
> guess is no. If so no half way rollout testing.
>
> C.
>
>
>>
>> If your tests work OK, this could be something of more general
>> applicability.
>>
> I get 1 30x improvement in speed breaking Polygons having more than
> 500 vertices into 1/100th chunks.
> Of course SQL environments are designed to wade through tons of
> features. So a 1 - > 100 expansion is no real problem.
>
> I have attached the plpgsql function to do the splitting. I is
> tuned around some GEOS issues I run into.
>
>
>> P
>>
>> Stephen Woodbridge wrote:
>>> Hi all,
>>>
>>> I have a linestring and a large number of points in a table. I
>>> want to find all the points with distance X of the linestring.
>>> This is a pretty straight forward query like:
>>>
>>> $LINE = "GeomFromText('LINESTRING(...)',4326)"
>>> $X = <number in meters>
>>>
>>> select distance_sphere($LINE, the_geom) as distance, *
>>> from mypoints
>>> where
>>> expand($LINE, metersToDegrees($X)) && the_geom and
>>> distance_sphere($LINE, the_geom) < $X
>>> order by distance;
>>>
>>> Discussion:
>>>
>>> Since the && operator only compares bbox it does not really
>>> matter if you expand one or the other arguments, and you should
>>> expand whichever has the fewest items for speed.
>>>
>>> In the case of a linestring that might be diagonal across your
>>> extents of the_geom, you are likely to select all the points. It
>>> seems like there should be an optimization or another operator,
>>> that would allow you to expand the points and test their bbox
>>> against the linestring. This could be done very quickly with the
>>> first stage of an Cohen-Sutherland line clipping algorithm or the
>>> even faster Liang-Barsky Line Clipping algorithm or the Nicholl-
>>> Lee-Nicholl (NLN) line clipping algorithm. These have a quick
>>> rejection algorithm between a bbox and a line segment that might
>>> be crossing it.
>>>
>>> Doing something like this might be more expensive then the &&
>>> operator but might be much faster overall.
>>>
>>> Questions:
>>>
>>> Is anything like this being done?
>>> Is there a way to compare a linestring to a large number of bbox
>>> quickly?
>>> Is there a better way to do this than the sql above?
>>>
>>> Thanks,
>>> -Steve
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>>
>
> -- Function: public.chopintogrid(text, text, text, text, numeric,
> numeric)
>
> -- DROP FUNCTION public.chopintogrid(text, text, text, text,
> numeric, numeric);
>
> CREATE OR REPLACE FUNCTION public.chopintogrid(text, text, text,
> text, numeric, numeric)
> RETURNS text AS
> $BODY$DECLARE
> src_tab ALIAS for $1;
> dest_tab ALIAS for $2;
> shp_fld ALIAS for $3;
> tag_fld ALIAS for $4;
> offset ALIAS for $5;
> size ALIAS for $6;
> gx NUMERIC;
> gy NUMERIC;
> sml_x NUMERIC;
> sml_y NUMERIC;
> lrg_x NUMERIC;
> lrg_y NUMERIC;
> step INTEGER;
> gbox GEOMETRY;
> tag_item text;
> rec record;
> shp GEOMETRY;
> t_shp GEOMETRY;
> wrk_shp GEOMETRY;
> ssrid INTEGER;
>
> BEGIN
>
> -- create dest tab
>
> -- walk existing records
> FOR rec in EXECUTE 'select '||shp_fld||' as r1,'||tag_fld||' as r2
> from '||src_tab || ' '
> LOOP
> tag_item = rec.r2;
> ssrid = srid(rec.r1);
> -- shp = buffer(rec.r1,0);
> shp = rec.r1;
> sml_x = xmin(shp);
> sml_y = ymin(shp);
> lrg_x = xmax(shp);
> lrg_y = ymax(shp);
>
> gx = (round((sml_x-offset)/size)*size)+offset - (size/2);
>
> WHILE ( gx <= (lrg_x + size) ) LOOP
>
> gy = (round((sml_y-offset)/size)*size)+offset - (size/2);
>
> WHILE ( gy <= (lrg_y + size) ) LOOP
> -- six steps at one time
> t_shp = (make_polygon(sml_x, gy - size, lrg_x, gy + ( 6 * size )
> + (size/1.9999999), srid(shp)));
> IF ( t_shp && shp AND intersects(t_shp,shp) ) THEN
> wrk_shp = (filter(intersection(t_shp, shp),'MULTIPOLYGON'));
> ELSE
> wrk_shp = NULL;
> END IF;
>
> step = 0;
> WHILE ( wrk_shp is not null and step < 6 ) LOOP
>
> gbox = (expand(makepoint(gx,gy,srid(shp)),size/2));
>
> IF ( gbox && wrk_shp AND intersects(gbox, wrk_shp)) THEN
> gbox = intersection(gbox,wrk_shp);
> -- RAISE WARNING ' wrkshp %', asewkt(wrk_shp);
> -- RAISE WARNING ' isect %', asewkt(gbox);
> gbox = filter(gbox,'MULTIPOLYGON');
> ELSE
> gbox = NULL;
> END IF;
>
> IF ( gbox is not NULL ) THEN
> -- RAISE INFO '% %, %',gx,gy,tag_item;
>
> EXECUTE 'insert into '||dest_tab||' ('||shp_fld||','||
> tag_fld||') values (setsrid(geometry('||
> quote_literal(replace(asewkt(gbox),'SRID=0;',''))||'),'||
> ssrid||'),'||quote_literal(tag_item)||')';
> -- ELSE
> -- RAISE INFO '% % ',gx,gy;
> END IF;
>
> step = step + 1;
> gy = gy + size;
> END LOOP;
> -- RAISE INFO '% of % and %', gy, wrk_shp, lrg_y;
>
> END LOOP;
> gx = gx + size;
> END LOOP;
>
> END LOOP;
> return 'ok';
>
> END;$BODY$
> LANGUAGE 'plpgsql' VOLATILE;
> ALTER FUNCTION public.chopintogrid(text, text, text, text, numeric,
> numeric) OWNER TO candrsn;
>
> -- drop table political.places_grid_big;
> truncate political.places_grid_big;
> create table political.places_grid_big as select * from
> political.places limit 0
> ;
> create index places__shape__ind on political.places using gist(shape);
>
> select chopintogrid
> ('political.places','political.places_grid_big','shape','gid'
> ,0,10000);
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
More information about the postgis-users
mailing list