[postgis-devel] KNN Centroid Example
Paul Ramsey
pramsey at cleverelephant.ca
Mon Sep 26 20:57:07 PDT 2011
I am using the square of the centroid distance, since it saves me a call into sqrt during the index traversals. If you think it should be a simple distance that can be arranged, but to the extent that it's just a number being used for sorting, the square suffices fine.
P.
On 2011-09-26, at 7:51 PM, "Paragon Corporation" <lr at pcorp.us> wrote:
> What algorithm does it use for non-points. Thought you went with centroid.
> IF so how come these don't match or is it centroid of the box?
>
>
> SELECT ST_Distance(line,pt) as dist, line <-> pt as knn_dist,
> ST_Distance(ST_Centroid(line),pt) As dist_cent
> FROM (SELECT 'SRID=3005;LINESTRING(1011102 450541, 1011103
> 450641)'::geometry As line,
> 'SRID=3005;POINT(1011102 450541)'::geometry As pt) As foo;
>
>
> Gives me:
>
> dist | knn_dist | dist_cent
> ------+----------+------------------
> 0 | 2500.25 | 50.0024999375613
>
>
>
>> -----Original Message-----
>> From: postgis-devel-bounces at postgis.refractions.net
>> [mailto:postgis-devel-bounces at postgis.refractions.net] On
>> Behalf Of Paul Ramsey
>> Sent: Monday, September 26, 2011 6:59 PM
>> To: PostGIS Development Discussion
>> Subject: [postgis-devel] KNN Centroid Example
>>
>> So, I did a few KNN queries on my wee voting areas table (8000 polys).
>>
>> First the correct answer:
>>
>> select st_distance(geom, 'SRID=3005;POINT(1011102 450541)')
>> as d,edabbr, vaabbr from va2005 order by d limit 10;
>>
>> d | edabbr | vaabbr
>> ------------------+--------+--------
>> 0 | ALQ | 128
>> 5541.57712511724 | ALQ | 129A
>> 5579.67450712005 | ALQ | 001
>> 6083.4207708641 | ALQ | 131
>> 7691.2205404848 | ALQ | 003
>> 7900.75451037313 | ALQ | 122
>> 8694.20710669982 | ALQ | 129B
>> 9564.24289057111 | ALQ | 130
>> 12089.665931705 | ALQ | 127
>> 18472.5531479404 | ALQ | 002
>> (10 rows)
>>
>> Then the KNN raw answer:
>>
>> select st_distance(geom, 'SRID=3005;POINT(1011102 450541)')
>> as d,edabbr, vaabbr from va2005 order by geom <->
>> 'SRID=3005;POINT(1011102 450541)' limit 10;
>>
>> d | edabbr | vaabbr
>> ------------------+--------+--------
>> 0 | ALQ | 128
>> 5579.67450712005 | ALQ | 001
>> 5541.57712511724 | ALQ | 129A
>> 8694.20710669982 | ALQ | 129B
>> 9564.24289057111 | ALQ | 130
>> 6083.4207708641 | ALQ | 131
>> 12089.665931705 | ALQ | 127
>> 24795.264503022 | ALQ | 124
>> 24587.6584922302 | ALQ | 123
>> 26764.2555463114 | ALQ | 125
>> (10 rows)
>>
>> Note the misordering in the actual distances and the
>> different entries that actually show up in the top 10.
>>
>> Finally the hybrid:
>>
>> with index_query as (
>> select st_distance(geom, 'SRID=3005;POINT(1011102 450541)')
>> as d,edabbr, vaabbr
>> from va2005
>> order by geom <-> 'SRID=3005;POINT(1011102 450541)' limit
>> 100) select * from index_query order by d limit 10;
>>
>> d | edabbr | vaabbr
>> ------------------+--------+--------
>> 0 | ALQ | 128
>> 5541.57712511724 | ALQ | 129A
>> 5579.67450712005 | ALQ | 001
>> 6083.4207708641 | ALQ | 131
>> 7691.2205404848 | ALQ | 003
>> 7900.75451037313 | ALQ | 122
>> 8694.20710669982 | ALQ | 129B
>> 9564.24289057111 | ALQ | 130
>> 12089.665931705 | ALQ | 127
>> 18472.5531479404 | ALQ | 002
>> (10 rows)
>>
>> And again we have the right entries and the right ordering.
>> The trouble is the magic number (100) in the subquery. Still,
>> it's fast and it uses the index, so, what the hey!
>>
>> P.
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>>
>
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
More information about the postgis-devel
mailing list