[postgis-devel] KNN Centroid Example

Paul Ramsey pramsey at opengeo.org
Tue Sep 27 09:55:17 PDT 2011


I don't think the manhattan distance will be properly sortable (things
with the same actual distances have different manhattan distances)
P.

On Tue, Sep 27, 2011 at 9:52 AM,  <maplabs at light42.com> wrote:
> you could use the manhatten distance (dX+dY) also, and its closer to the
> actual distance
>   -Brian
>
> On Mon, 26 Sep 2011 20:57:07 -0700, Paul Ramsey
>  wrote:
> I am using the square of the centroid distance, since it saves me a call
> into sqrt during the index traversals. If you think it should be a simple
> distance that can be arranged, but to the extent that it's just a number
> being used for sorting, the square suffices fine.
>>
>> P.
>> On 2011-09-26, at 7:51 PM, "Paragon Corporation" <lr at pcorp.us> wrote:
>>
>> > What algorithm does it use for non-points.  Thought you went with
>> > centroid. > IF so how come these don't match or is it centroid of the box?
>> > > > SELECT ST_Distance(line,pt) as dist, line <-> pt as knn_dist,
>> >    ST_Distance(ST_Centroid(line),pt) As dist_cent
>> >    FROM (SELECT 'SRID=3005;LINESTRING(1011102 450541, 1011103
>> > 450641)'::geometry As line,
>> >         'SRID=3005;POINT(1011102 450541)'::geometry As pt) As foo;
>> > > > Gives me:
>> > > dist | knn_dist |    dist_cent
>> > ------+----------+------------------
>> >    0 |  2500.25 | 50.0024999375613
>> > > > >> -----Original Message-----
>> >> From: postgis-devel-bounces at postgis.refractions.net >>
>> >> [mailto:postgis-devel-bounces at postgis.refractions.net] On >> Behalf Of Paul
>> >> Ramsey
>> >> Sent: Monday, September 26, 2011 6:59 PM
>> >> To: PostGIS Development Discussion
>> >> Subject: [postgis-devel] KNN Centroid Example
>> >> >> So, I did a few KNN queries on my wee voting areas table (8000
>> >> >> polys). >> >> First the correct answer:
>> >> >> select st_distance(geom, 'SRID=3005;POINT(1011102 450541)') >> as
>> >> >> d,edabbr, vaabbr from va2005 order by d limit 10;
>> >> >>        d         | edabbr | vaabbr
>> >> ------------------+--------+--------
>> >>                0 | ALQ    | 128
>> >> 5541.57712511724 | ALQ    | 129A
>> >> 5579.67450712005 | ALQ    | 001
>> >>  6083.4207708641 | ALQ    | 131
>> >>  7691.2205404848 | ALQ    | 003
>> >> 7900.75451037313 | ALQ    | 122
>> >> 8694.20710669982 | ALQ    | 129B
>> >> 9564.24289057111 | ALQ    | 130
>> >>  12089.665931705 | ALQ    | 127
>> >> 18472.5531479404 | ALQ    | 002
>> >> (10 rows)
>> >> >> Then the KNN raw answer:
>> >> >> select st_distance(geom, 'SRID=3005;POINT(1011102 450541)') >> as
>> >> >> d,edabbr, vaabbr from va2005 order by geom <-> >> 'SRID=3005;POINT(1011102
>> >> >> 450541)' limit 10;
>> >> >>        d         | edabbr | vaabbr
>> >> ------------------+--------+--------
>> >>                0 | ALQ    | 128
>> >> 5579.67450712005 | ALQ    | 001
>> >> 5541.57712511724 | ALQ    | 129A
>> >> 8694.20710669982 | ALQ    | 129B
>> >> 9564.24289057111 | ALQ    | 130
>> >>  6083.4207708641 | ALQ    | 131
>> >>  12089.665931705 | ALQ    | 127
>> >>  24795.264503022 | ALQ    | 124
>> >> 24587.6584922302 | ALQ    | 123
>> >> 26764.2555463114 | ALQ    | 125
>> >> (10 rows)
>> >> >> Note the misordering in the actual distances and the >> different
>> >> >> entries that actually show up in the top 10. >> >> Finally the hybrid:
>> >> >> with index_query as (
>> >>  select st_distance(geom, 'SRID=3005;POINT(1011102 450541)') >> as
>> >> d,edabbr, vaabbr
>> >>  from va2005
>> >>  order by geom <-> 'SRID=3005;POINT(1011102 450541)' limit >> 100)
>> >> select * from index_query order by d limit 10;
>> >> >>        d         | edabbr | vaabbr
>> >> ------------------+--------+--------
>> >>                0 | ALQ    | 128
>> >> 5541.57712511724 | ALQ    | 129A
>> >> 5579.67450712005 | ALQ    | 001
>> >>  6083.4207708641 | ALQ    | 131
>> >>  7691.2205404848 | ALQ    | 003
>> >> 7900.75451037313 | ALQ    | 122
>> >> 8694.20710669982 | ALQ    | 129B
>> >> 9564.24289057111 | ALQ    | 130
>> >>  12089.665931705 | ALQ    | 127
>> >> 18472.5531479404 | ALQ    | 002
>> >> (10 rows)
>> >> >> And again we have the right entries and the right ordering. >> The
>> >> >> trouble is the magic number (100) in the subquery. Still, >> it's fast and
>> >> >> it uses the index, so, what the hey!
>> >> >> P. >> _______________________________________________
>> >> postgis-devel mailing list
>> >> postgis-devel at postgis.refractions.net
>> >> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>> >> > > > _______________________________________________
>> > postgis-devel mailing list
>> > postgis-devel at postgis.refractions.net
>> > http://postgis.refractions.net/mailman/listinfo/postgis-devel
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>>
>>
>
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>



More information about the postgis-devel mailing list