[postgis-devel] KNN and Semantics

Paul Ramsey pramsey at opengeo.org
Mon Sep 26 14:40:56 PDT 2011


Trade-offs abound.
centroid-vs-centroid has the advantage of ease of understanding
box-vs-box has the advantage of being over-determined. you could at
least theoretically do an index-assisted knn pull wrapped as a
subquery inside a standard st_distance test. however, that goes back
to magic numbers again (how many items should one pull in the indexed
query) so...
There don't seem to be any great soln's here.
P.

On Mon, Sep 26, 2011 at 2:35 PM, Paragon Corporation <lr at pcorp.us> wrote:
> Okay I get the issue now forgot it was for order by stuff. I guess from my
> perspective bounding box would be better though that's not ideal either.
> for long linestrings which is what I'm usually dealing with centroid is
> useless and for small polygons the bounding box is a better approximation
> than centroid
> for thumbnail check.
>
> It's ST_Distance BTW -- get with the program Paul.  Your beloved distance is
> gone. No more. history vamush. :)
>
> Thanks,
> Regina
>
>
>> -----Original Message-----
>> From: postgis-devel-bounces at postgis.refractions.net
>> [mailto:postgis-devel-bounces at postgis.refractions.net] On
>> Behalf Of Paul Ramsey
>> Sent: Monday, September 26, 2011 4:25 PM
>> To: PostGIS Development Discussion
>> Subject: Re: [postgis-devel] KNN and Semantics
>>
>> KNNGist walks the index tree to provide ordered results.
>> Great.
>> But it's walking the tree, so the results have to be based on boxes.
>> In the case of points, the box == the point, so the ambiguity
>> collapses.
>> In the case of everything else, it does not.
>>
>> So, when someone does
>>
>> select * from mytable order by geom <-> 'polygon()'::geometry;
>>
>> what should they get back to provide the minimum surprise?
>> Because they *will* get back results that differ from
>>
>> select * from mytable order by distance(geom, 'polygon()'::geometry);
>>
>> sometimes substantially.
>>
>> P.
>>
>> On Mon, Sep 26, 2011 at 1:19 PM, Paragon Corporation
>> <lr at pcorp.us> wrote:
>> >
>> >> So, here's the deal, the KNN search works exclusively against the
>> >> index. So, it only has boxes available to make decisions (not
>> >> entirely true, it actually has the full geometry of the query key,
>> >> but not of the index keys). That means it can return an
>> exact answer
>> >> for point-on-point queries, but for everything else it'll be a box
>> >> approximation. So the n-nearest-boxes.
>> >>
>> >> There are lots of ways to attack the problem... we can do pure
>> >> nearest-boxes. We could also convert all the boxes to
>> points, and do
>> >> nearest-centroids. This might be easiest to explain,
>> potentially. The
>> >> trouble is, we're going to be returning an approximation for
>> >> everything except points, so the question is (in my mind) which
>> >> approximation is easiest to visualize and work with?
>> >>
>> >
>> > Not sure I understand your question Paul?   I don't see how nearest
>> > centroids helps much when you are talking about largish
>> polygons.  I
>> > was thinking this would just make the ST_Expand like stuff
>> faster? Oh
>> > perhaps I misunderstood that.
>> >
>> > Thanks,
>> > Regina
>> >
>> >
>> >
>> > _______________________________________________
>> > postgis-devel mailing list
>> > postgis-devel at postgis.refractions.net
>> > http://postgis.refractions.net/mailman/listinfo/postgis-devel
>> >
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>>
>
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>



More information about the postgis-devel mailing list