[postgis-users] within_distance function

Obe, Regina robe.dnd at cityofboston.gov
Mon Feb 12 07:57:23 PST 2007


I think it would make life a lot easier if we did have it in the main
distro.  People always seem to forget the expand or do it the wrong way
as I have done ... and its hard to explain it to some people why that is
necessary and then it gets a little irritating to write it out if you
are doing it a lot.

Getting back to the explicitness - I guess the main thing that bothered
me initially about my solution I had proposed is the fact that the order
of the geometries matters in some cases.  Its more obvious to see when
you are looking at the full expand && .. why the order matters, but you
lose that with the blackboxing of the function.  Then I realized if I
expand both ways, the planner is smart enough to figure out which expand
is more efficient to do.  So my revised function looks like this

CREATE OR REPLACE FUNCTION DWithin(geom1 geometry, geom2 geometry, dist
double precision)
  RETURNS boolean AS
$BODY$
	SELECT ((expand($1, $3) && $2) AND (expand($2, $3) && $1) AND
distance($1, $2) <= $3) ;
$BODY$
  LANGUAGE 'sql' IMMUTABLE;

--Results are as follows
SELECT l.pid, l.the_geom
FROM landparcels l , landparcels l2
WHERE l2.pid = '1803570000' and DWithin(l2.the_geom, l.the_geom, 1000); 

(finished in 79ms and returns 679 rows)

SELECT l.pid, l.the_geom
FROM landparcels l , landparcels l2
WHERE l2.pid = '1803570000' and DWithin(l.the_geom, l2.the_geom, 1000) ;
(finishes between 69ms, 109ms, 78 ms and returns 679 rows)

Thanks,
Regina


-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Paul
Ramsey
Sent: Monday, February 12, 2007 10:31 AM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] within_distance function

Interesting. I guess the optimizer has an easier time bringing the  
SQL function into the plan.

Now, to be "standards compliant" you should rename your function  
"DWithin()". Should we include such a thing in the main distro? Seems  
a pretty common construction in spatial SQL.

I have often wondered about the wisdom of wrapping the spatial  
booleans in appropriate SQL to auto-generate the necessary index  
calls. On the one hand it makes life easier and more automagic for  
new users.  On the other hand, explicitness of intent is lost for  
others.

P


On 12-Feb-07, at 5:54 AM, Obe, Regina wrote:

> Great suggestion Paul!  My SQL function is running just as fast as the
> expand combination in my code.
>
> My sql function looks like this
> CREATE OR REPLACE FUNCTION within_distancesql(geom1 geometry, geom2
> geometry, dist double precision)
>   RETURNS boolean AS
> $BODY$
> 	SELECT (expand($1, $3) && $2 AND distance($1, $2) <= $3) ;
> $BODY$
>   LANGUAGE 'sql' IMMUTABLE;
>
> ---
> First I realized that I had my queries backward and I wasn't taking  
> full
> advantage of my indexes because when I switched the order of my query
> from
>
> SELECT l.pid, l.the_geom
> FROM landparcels l, landparcels l2
> WHERE l2.pid = '1803570000' and (expand(l.the_geom, 1000) &&  
> l2.the_geom
> AND distance(l.the_geom, l2.the_geom) <= 1000)
>
> to
>
> SELECT l.pid, l.the_geom
> FROM landparcels l, landparcels l2
> WHERE l2.pid = '1803570000' and (expand(l2.the_geom, 1000) &&  
> l.the_geom
> AND distance(l.the_geom, l2.the_geom) <= 1000)
>
> My time went from 609ms to 78ms.  The original with_distance didn't
> matter the order of the arguments since presumably it wasn't taking
> advantage of the gist index.
>
> When I changed my query to
> SELECT l.pid, l.the_geom
> FROM landparcels l , landparcels l2
> WHERE l2.pid = '1803570000' and within_distancesql(l2.the_geom,
> l.the_geom, 1000)
>
> My time when to 78ms.
>
> For the less optimized
> SELECT l.pid, l.the_geom
> FROM landparcels l , landparcels l2
> WHERE l2.pid = '1803570000' and within_distancesql(l.the_geom,
> l2.the_geom, 1000)
>
> time was 609ms
>
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of  
> Paul
> Ramsey
> Sent: Friday, February 09, 2007 6:49 PM
> To: PostGIS Users Discussion
> Subject: Re: [postgis-users] within_distance function
>
> Perhaps doing the function as a SQL function rather than a PL/PgSQL
> function. Shot in the dark.
>
> P
>
> Obe, Regina wrote:
>> On further inspection, I realized my within_distance is not using the
>> gist index and that the gain over the just distance check was because
>> of the efficiency of extent comparison instead of distance
>> comparison.
>>
>> I suppose there is no way to trick the optimizer into using an index
>> with the function call.  Something like a macro of some sort so it
>> macro replaces my shorter statement with the more efficient longer
>> statement?
>>
>> ________________________________
>>
>> From: postgis-users-bounces at postgis.refractions.net on behalf of Obe,
>> Regina Sent: Fri 2/9/2007 3:32 PM To: PostGIS Users Discussion
>> Subject: [postgis-users] within_distance function
>>
>>
>> I was hoping to simplify some of my distance queries by creating a
>> within_distance function which looks like
>>
>> CREATE OR REPLACE FUNCTION within_distance(geom1 geometry, geom2
>> geometry, dist double precision) RETURNS boolean AS $BODY$ BEGIN
>> return  (expand(geom1, dist) && geom2 AND distance(geom1, geom2) <=
>> dist) ; END;$BODY$ LANGUAGE 'plpgsql' IMMUTABLE;
>>
>> Upon testing this I noticed the following observations
>>
>> SELECT l.pid, l.the_geom FROM landparcels l , landparcels l2 WHERE
>> l2.pid = '1803570000' and within_distance(l.the_geom, l2.the_geom,
>> 1000) ;
>>
>> runs in 1610ms and returns 679 rows
>>
> ----------------------------------------------------------------------

> --
> ---
>>  SELECT l.pid, l.the_geom FROM landparcels l, landparcels l2 WHERE
>> l2.pid = '1803570000' and (expand(l.the_geom, 1000) && l2.the_geom
>> AND distance(l.the_geom, l2.the_geom) <= 1000)
>>
>> runs in 609 ms and returns 679 rows
>>
>>
> ----------------------------------------------------------------------

> --
> --
>>  SELECT l.pid, l.the_geom FROM landparcels l, landparcels l2 WHERE
>> l2.pid = '1803570000' and (distance(l.the_geom, l2.the_geom) <= 1000)
>>
>>
>> runs in 5437 ms and returns 679 rows
>>
>> ----- I ran the test a couple of times with similar results.  So
>> apparently as far as I can tell, my function is probably using my
>> gist indexes, but why is it 1/2 the speed of the regular expand call?
>>
>>
>>
>>
>> ________________________________
>>
>>
>>
>> The substance of this message, including any attachments, may be
>> confidential, legally privileged and/or exempt from disclosure
>> pursuant to Massachusetts law. It is intended solely for the
>> addressee. If you received this in error, please contact the sender
>> and delete the material from any computer.
>>
>>
>>
>>
>>
> ----------------------------------------------------------------------

> --
>>
>>
>> _______________________________________________ postgis-users mailing
>> list postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
> -- 
>
>    Paul Ramsey
>    Refractions Research
>    http://www.refractions.net
>    pramsey at refractions.net
>    Phone: 250-383-3022
>    Cell: 250-885-0632
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users



More information about the postgis-users mailing list