[postgis-devel] KNN Centroid Example

Paul Ramsey pramsey at opengeo.org
Mon Sep 26 15:59:14 PDT 2011


So, I did a few KNN queries on my wee voting areas table (8000 polys).

First the correct answer:

select st_distance(geom, 'SRID=3005;POINT(1011102 450541)') as d,edabbr, vaabbr
from va2005
order by d limit 10;

        d         | edabbr | vaabbr
------------------+--------+--------
                0 | ALQ    | 128
 5541.57712511724 | ALQ    | 129A
 5579.67450712005 | ALQ    | 001
  6083.4207708641 | ALQ    | 131
  7691.2205404848 | ALQ    | 003
 7900.75451037313 | ALQ    | 122
 8694.20710669982 | ALQ    | 129B
 9564.24289057111 | ALQ    | 130
  12089.665931705 | ALQ    | 127
 18472.5531479404 | ALQ    | 002
(10 rows)

Then the KNN raw answer:

select st_distance(geom, 'SRID=3005;POINT(1011102 450541)') as d,edabbr, vaabbr
from va2005
order by geom <-> 'SRID=3005;POINT(1011102 450541)' limit 10;

        d         | edabbr | vaabbr
------------------+--------+--------
                0 | ALQ    | 128
 5579.67450712005 | ALQ    | 001
 5541.57712511724 | ALQ    | 129A
 8694.20710669982 | ALQ    | 129B
 9564.24289057111 | ALQ    | 130
  6083.4207708641 | ALQ    | 131
  12089.665931705 | ALQ    | 127
  24795.264503022 | ALQ    | 124
 24587.6584922302 | ALQ    | 123
 26764.2555463114 | ALQ    | 125
(10 rows)

Note the misordering in the actual distances and the different entries
that actually show up in the top 10.

Finally the hybrid:

with index_query as (
  select st_distance(geom, 'SRID=3005;POINT(1011102 450541)') as
d,edabbr, vaabbr
  from va2005
  order by geom <-> 'SRID=3005;POINT(1011102 450541)' limit 100)
select *
from index_query
order by d limit 10;

        d         | edabbr | vaabbr
------------------+--------+--------
                0 | ALQ    | 128
 5541.57712511724 | ALQ    | 129A
 5579.67450712005 | ALQ    | 001
  6083.4207708641 | ALQ    | 131
  7691.2205404848 | ALQ    | 003
 7900.75451037313 | ALQ    | 122
 8694.20710669982 | ALQ    | 129B
 9564.24289057111 | ALQ    | 130
  12089.665931705 | ALQ    | 127
 18472.5531479404 | ALQ    | 002
(10 rows)

And again we have the right entries and the right ordering. The
trouble is the magic number (100) in the subquery. Still, it's fast
and it uses the index, so, what the hey!

P.



More information about the postgis-devel mailing list