[postgis-devel] KNN Centroid Example

Nicklas Avén nicklas.aven at jordogskog.no
Tue Sep 27 00:03:15 PDT 2011


Paul

I was playing with that too, when I worked on the distance functions. To
not call sqrt before finding the closest vertex points. Then just
calculate the distance before returning the result.
But I ran into very strange behaviors. If I recall right I found that I
was playing to close to the limits of double precision capacities. 10000
km in meters times 2 gives 14 zeroes. 

/Nicklas



On Mon, 2011-09-26 at 20:57 -0700, Paul Ramsey wrote:
> I am using the square of the centroid distance, since it saves me a call into sqrt during the index traversals. If you think it should be a simple distance that can be arranged, but to the extent that it's just a number being used for sorting, the square suffices fine.
> 
> P.
> 
> On 2011-09-26, at 7:51 PM, "Paragon Corporation" <lr at pcorp.us> wrote:
> 
> > What algorithm does it use for non-points.  Thought you went with centroid.
> > IF so how come these don't match or is it centroid of the box?
> > 
> > 
> > SELECT ST_Distance(line,pt) as dist, line <-> pt as knn_dist,
> >    ST_Distance(ST_Centroid(line),pt) As dist_cent
> >    FROM (SELECT 'SRID=3005;LINESTRING(1011102 450541, 1011103
> > 450641)'::geometry As line,
> >         'SRID=3005;POINT(1011102 450541)'::geometry As pt) As foo;
> > 
> > 
> > Gives me:
> > 
> > dist | knn_dist |    dist_cent
> > ------+----------+------------------
> >    0 |  2500.25 | 50.0024999375613
> > 
> > 
> > 
> >> -----Original Message-----
> >> From: postgis-devel-bounces at postgis.refractions.net 
> >> [mailto:postgis-devel-bounces at postgis.refractions.net] On 
> >> Behalf Of Paul Ramsey
> >> Sent: Monday, September 26, 2011 6:59 PM
> >> To: PostGIS Development Discussion
> >> Subject: [postgis-devel] KNN Centroid Example
> >> 
> >> So, I did a few KNN queries on my wee voting areas table (8000 polys).
> >> 
> >> First the correct answer:
> >> 
> >> select st_distance(geom, 'SRID=3005;POINT(1011102 450541)') 
> >> as d,edabbr, vaabbr from va2005 order by d limit 10;
> >> 
> >>        d         | edabbr | vaabbr
> >> ------------------+--------+--------
> >>                0 | ALQ    | 128
> >> 5541.57712511724 | ALQ    | 129A
> >> 5579.67450712005 | ALQ    | 001
> >>  6083.4207708641 | ALQ    | 131
> >>  7691.2205404848 | ALQ    | 003
> >> 7900.75451037313 | ALQ    | 122
> >> 8694.20710669982 | ALQ    | 129B
> >> 9564.24289057111 | ALQ    | 130
> >>  12089.665931705 | ALQ    | 127
> >> 18472.5531479404 | ALQ    | 002
> >> (10 rows)
> >> 
> >> Then the KNN raw answer:
> >> 
> >> select st_distance(geom, 'SRID=3005;POINT(1011102 450541)') 
> >> as d,edabbr, vaabbr from va2005 order by geom <-> 
> >> 'SRID=3005;POINT(1011102 450541)' limit 10;
> >> 
> >>        d         | edabbr | vaabbr
> >> ------------------+--------+--------
> >>                0 | ALQ    | 128
> >> 5579.67450712005 | ALQ    | 001
> >> 5541.57712511724 | ALQ    | 129A
> >> 8694.20710669982 | ALQ    | 129B
> >> 9564.24289057111 | ALQ    | 130
> >>  6083.4207708641 | ALQ    | 131
> >>  12089.665931705 | ALQ    | 127
> >>  24795.264503022 | ALQ    | 124
> >> 24587.6584922302 | ALQ    | 123
> >> 26764.2555463114 | ALQ    | 125
> >> (10 rows)
> >> 
> >> Note the misordering in the actual distances and the 
> >> different entries that actually show up in the top 10.
> >> 
> >> Finally the hybrid:
> >> 
> >> with index_query as (
> >>  select st_distance(geom, 'SRID=3005;POINT(1011102 450541)') 
> >> as d,edabbr, vaabbr
> >>  from va2005
> >>  order by geom <-> 'SRID=3005;POINT(1011102 450541)' limit 
> >> 100) select * from index_query order by d limit 10;
> >> 
> >>        d         | edabbr | vaabbr
> >> ------------------+--------+--------
> >>                0 | ALQ    | 128
> >> 5541.57712511724 | ALQ    | 129A
> >> 5579.67450712005 | ALQ    | 001
> >>  6083.4207708641 | ALQ    | 131
> >>  7691.2205404848 | ALQ    | 003
> >> 7900.75451037313 | ALQ    | 122
> >> 8694.20710669982 | ALQ    | 129B
> >> 9564.24289057111 | ALQ    | 130
> >>  12089.665931705 | ALQ    | 127
> >> 18472.5531479404 | ALQ    | 002
> >> (10 rows)
> >> 
> >> And again we have the right entries and the right ordering. 
> >> The trouble is the magic number (100) in the subquery. Still, 
> >> it's fast and it uses the index, so, what the hey!
> >> 
> >> P.
> >> _______________________________________________
> >> postgis-devel mailing list
> >> postgis-devel at postgis.refractions.net
> >> http://postgis.refractions.net/mailman/listinfo/postgis-devel
> >> 
> > 
> > 
> > _______________________________________________
> > postgis-devel mailing list
> > postgis-devel at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-devel
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
> 





More information about the postgis-devel mailing list