[postgis-devel] KNN Centroid Example

Paragon Corporation lr at pcorp.us
Mon Sep 26 21:23:59 PDT 2011


No that's fine - just wanted to document it correctly. 

> -----Original Message-----
> From: postgis-devel-bounces at postgis.refractions.net 
> [mailto:postgis-devel-bounces at postgis.refractions.net] On 
> Behalf Of Paul Ramsey
> Sent: Monday, September 26, 2011 11:57 PM
> To: PostGIS Development Discussion
> Cc: PostGIS Development Discussion
> Subject: Re: [postgis-devel] KNN Centroid Example
> 
> I am using the square of the centroid distance, since it 
> saves me a call into sqrt during the index traversals. If you 
> think it should be a simple distance that can be arranged, 
> but to the extent that it's just a number being used for 
> sorting, the square suffices fine.
> 
> P.
> 
> On 2011-09-26, at 7:51 PM, "Paragon Corporation" <lr at pcorp.us> wrote:
> 
> > What algorithm does it use for non-points.  Thought you 
> went with centroid.
> > IF so how come these don't match or is it centroid of the box?
> > 
> > 
> > SELECT ST_Distance(line,pt) as dist, line <-> pt as knn_dist,
> >    ST_Distance(ST_Centroid(line),pt) As dist_cent
> >    FROM (SELECT 'SRID=3005;LINESTRING(1011102 450541, 1011103 
> > 450641)'::geometry As line,
> >         'SRID=3005;POINT(1011102 450541)'::geometry As pt) As foo;
> > 
> > 
> > Gives me:
> > 
> > dist | knn_dist |    dist_cent
> > ------+----------+------------------
> >    0 |  2500.25 | 50.0024999375613
> > 
> > 
> > 
> >> -----Original Message-----
> >> From: postgis-devel-bounces at postgis.refractions.net
> >> [mailto:postgis-devel-bounces at postgis.refractions.net] On 
> Behalf Of 
> >> Paul Ramsey
> >> Sent: Monday, September 26, 2011 6:59 PM
> >> To: PostGIS Development Discussion
> >> Subject: [postgis-devel] KNN Centroid Example
> >> 
> >> So, I did a few KNN queries on my wee voting areas table 
> (8000 polys).
> >> 
> >> First the correct answer:
> >> 
> >> select st_distance(geom, 'SRID=3005;POINT(1011102 450541)') as 
> >> d,edabbr, vaabbr from va2005 order by d limit 10;
> >> 
> >>        d         | edabbr | vaabbr
> >> ------------------+--------+--------
> >>                0 | ALQ    | 128
> >> 5541.57712511724 | ALQ    | 129A
> >> 5579.67450712005 | ALQ    | 001
> >>  6083.4207708641 | ALQ    | 131
> >>  7691.2205404848 | ALQ    | 003
> >> 7900.75451037313 | ALQ    | 122
> >> 8694.20710669982 | ALQ    | 129B
> >> 9564.24289057111 | ALQ    | 130
> >>  12089.665931705 | ALQ    | 127
> >> 18472.5531479404 | ALQ    | 002
> >> (10 rows)
> >> 
> >> Then the KNN raw answer:
> >> 
> >> select st_distance(geom, 'SRID=3005;POINT(1011102 450541)') as 
> >> d,edabbr, vaabbr from va2005 order by geom <->
> >> 'SRID=3005;POINT(1011102 450541)' limit 10;
> >> 
> >>        d         | edabbr | vaabbr
> >> ------------------+--------+--------
> >>                0 | ALQ    | 128
> >> 5579.67450712005 | ALQ    | 001
> >> 5541.57712511724 | ALQ    | 129A
> >> 8694.20710669982 | ALQ    | 129B
> >> 9564.24289057111 | ALQ    | 130
> >>  6083.4207708641 | ALQ    | 131
> >>  12089.665931705 | ALQ    | 127
> >>  24795.264503022 | ALQ    | 124
> >> 24587.6584922302 | ALQ    | 123
> >> 26764.2555463114 | ALQ    | 125
> >> (10 rows)
> >> 
> >> Note the misordering in the actual distances and the different 
> >> entries that actually show up in the top 10.
> >> 
> >> Finally the hybrid:
> >> 
> >> with index_query as (
> >>  select st_distance(geom, 'SRID=3005;POINT(1011102 450541)') as 
> >> d,edabbr, vaabbr  from va2005  order by geom <-> 
> >> 'SRID=3005;POINT(1011102 450541)' limit
> >> 100) select * from index_query order by d limit 10;
> >> 
> >>        d         | edabbr | vaabbr
> >> ------------------+--------+--------
> >>                0 | ALQ    | 128
> >> 5541.57712511724 | ALQ    | 129A
> >> 5579.67450712005 | ALQ    | 001
> >>  6083.4207708641 | ALQ    | 131
> >>  7691.2205404848 | ALQ    | 003
> >> 7900.75451037313 | ALQ    | 122
> >> 8694.20710669982 | ALQ    | 129B
> >> 9564.24289057111 | ALQ    | 130
> >>  12089.665931705 | ALQ    | 127
> >> 18472.5531479404 | ALQ    | 002
> >> (10 rows)
> >> 
> >> And again we have the right entries and the right ordering. 
> >> The trouble is the magic number (100) in the subquery. Still, it's 
> >> fast and it uses the index, so, what the hey!
> >> 
> >> P.
> >> _______________________________________________
> >> postgis-devel mailing list
> >> postgis-devel at postgis.refractions.net
> >> http://postgis.refractions.net/mailman/listinfo/postgis-devel
> >> 
> > 
> > 
> > _______________________________________________
> > postgis-devel mailing list
> > postgis-devel at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-devel
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
> 





More information about the postgis-devel mailing list