[postgis-tickets] [PostGIS] #2703: KNN gist with recheck for 9.5?
PostGIS
trac at osgeo.org
Sun May 24 21:15:37 PDT 2015
#2703: KNN gist with recheck for 9.5?
--------------------------+---------------------------
Reporter: robe | Owner: pramsey
Type: enhancement | Status: new
Priority: high | Milestone: PostGIS 2.2.0
Component: postgis | Version: trunk
Resolution: | Keywords:
--------------------------+---------------------------
Comment (by robe):
I'm having trouble getting the built in geometry types to bulk the same
way, but then again theirs is harder to test, because they don't have a
distance funciton, just the <-> distance operator (so hard to do a
compare). In running the tests though, I realized the problem may not be
the lateral, but that the lateral allows for testing multiple records at a
time.
For example the one in the regress I put in with the index in place:
{{{
SELECT a.gid, b.gid As match, RANK() OVER(PARTITION BY a.gid ORDER BY
ST_Distance(a.geom, b.geom) ) As true_rn, b.rn As knn_rn
FROM knn_recheck_geom As a
LEFT JOIN
LATERAL ( SELECT gid, geom, RANK() OVER(ORDER BY a.geom
<-> g.geom) As rn
FROM knn_recheck_geom As g WHERE a.gid <> g.gid
ORDER BY a.geom <-> g.geom LIMIT 5) As b ON true
WHERE a.gid IN(50000,50001,70000,61000)
ORDER BY a.gid, b.rn;
-- returns
gid | match | true_rn | knn_rn
------+--------+---------+--------
50000 | 48969 | 1 | 1
50000 | 51031 | 2 | 2
50000 | 47938 | 3 | 3
50000 | 52062 | 4 | 4
50000 | 46907 | 5 | 5
50001 | 53093 | 5 | 1
50001 | 48970 | 1 | 2
50001 | 51032 | 2 | 3
50001 | 47939 | 3 | 4
50001 | 52063 | 4 | 5
61000 | 62031 | 1 | 1
61000 | 59969 | 1 | 1
61000 | 63062 | 3 | 3
61000 | 58938 | 4 | 4
61000 | 46908 | 5 | 5
70000 | 64093 | 4 | 1
70000 | 57907 | 5 | 2
70000 | 600002 | 1 | 3
70000 | 600001 | 1 | 3
70000 | 600004 | 1 | 3
}}}
Note the problem children are 50001, 70000 (the true distance sort and knn
sort should match but they don't. Now if I reduce to just testing one
problem child:
{{{
SELECT a.gid, b.gid As match, RANK() OVER(PARTITION BY a.gid ORDER BY
ST_Distance(a.geom, b.geom) ) As true_rn, b.rn As knn_rn -- ,
ST_Distance(a.geom, b.geom) , ST_GeometryType(a.geom)
FROM knn_recheck_geom As a
LEFT JOIN
LATERAL ( SELECT gid, geom, RANK() OVER(ORDER BY a.geom
<-> g.geom) As rn
FROM knn_recheck_geom As g WHERE a.gid <> g.gid
ORDER BY a.geom <-> g.geom LIMIT 5) As b ON true
WHERE a.gid IN(70000)
ORDER BY a.gid, b.rn;
-- the order agrees --
gid | match | true_rn | knn_rn
-------+--------+---------+--------
70000 | 600001 | 1 | 1
70000 | 600002 | 1 | 1
70000 | 600004 | 1 | 1
70000 | 71031 | 4 | 4
70000 | 68969 | 5 | 5
(5 rows)
}}}
If I do both problem children, the first one is right the second one is
screwed up.
--
Ticket URL: <http://trac.osgeo.org/postgis/ticket/2703#comment:19>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.
More information about the postgis-tickets
mailing list