[postgis-tickets] [PostGIS] #2703: KNN gist with recheck for 9.5?

PostGIS trac at osgeo.org
Sun May 24 21:15:37 PDT 2015


#2703: KNN gist with recheck for 9.5?
--------------------------+---------------------------
  Reporter:  robe         |      Owner:  pramsey
      Type:  enhancement  |     Status:  new
  Priority:  high         |  Milestone:  PostGIS 2.2.0
 Component:  postgis      |    Version:  trunk
Resolution:               |   Keywords:
--------------------------+---------------------------

Comment (by robe):

 I'm having trouble getting the built in geometry types to bulk the same
 way, but then again theirs is harder to test, because they don't have a
 distance funciton, just the <-> distance operator (so hard to do a
 compare).  In running the tests though, I realized the problem may not be
 the lateral, but that the lateral allows for testing multiple records at a
 time.

 For example the one in the regress I put in with the index in place:

 {{{
 SELECT a.gid, b.gid As match, RANK() OVER(PARTITION BY a.gid ORDER BY
 ST_Distance(a.geom, b.geom) ) As true_rn, b.rn  As knn_rn
 FROM knn_recheck_geom As a
         LEFT JOIN
                 LATERAL ( SELECT  gid, geom, RANK() OVER(ORDER BY a.geom
 <-> g.geom) As rn
                         FROM knn_recheck_geom As g WHERE a.gid <> g.gid
 ORDER BY a.geom <-> g.geom LIMIT 5) As b ON true
         WHERE a.gid IN(50000,50001,70000,61000)
 ORDER BY a.gid, b.rn;

 -- returns
  gid  | match  | true_rn | knn_rn
 ------+--------+---------+--------
 50000 |  48969 |       1 |      1
 50000 |  51031 |       2 |      2
 50000 |  47938 |       3 |      3
 50000 |  52062 |       4 |      4
 50000 |  46907 |       5 |      5
 50001 |  53093 |       5 |      1
 50001 |  48970 |       1 |      2
 50001 |  51032 |       2 |      3
 50001 |  47939 |       3 |      4
 50001 |  52063 |       4 |      5
 61000 |  62031 |       1 |      1
 61000 |  59969 |       1 |      1
 61000 |  63062 |       3 |      3
 61000 |  58938 |       4 |      4
 61000 |  46908 |       5 |      5
 70000 |  64093 |       4 |      1
 70000 |  57907 |       5 |      2
 70000 | 600002 |       1 |      3
 70000 | 600001 |       1 |      3
 70000 | 600004 |       1 |      3

 }}}

 Note the problem children are 50001, 70000 (the true distance sort and knn
 sort should match but they don't.  Now if I reduce to just testing one
 problem child:


 {{{
 SELECT a.gid, b.gid As match, RANK() OVER(PARTITION BY a.gid ORDER BY
 ST_Distance(a.geom, b.geom) ) As true_rn, b.rn  As knn_rn  -- ,
 ST_Distance(a.geom, b.geom) , ST_GeometryType(a.geom)
 FROM knn_recheck_geom As a
         LEFT JOIN
                 LATERAL ( SELECT  gid, geom, RANK() OVER(ORDER BY a.geom
 <-> g.geom) As rn
                         FROM knn_recheck_geom As g WHERE a.gid <> g.gid
 ORDER BY a.geom <-> g.geom LIMIT 5) As b ON true
         WHERE a.gid IN(70000)
 ORDER BY a.gid, b.rn;

 -- the order agrees --
   gid  | match  | true_rn | knn_rn
 -------+--------+---------+--------
  70000 | 600001 |       1 |      1
  70000 | 600002 |       1 |      1
  70000 | 600004 |       1 |      1
  70000 |  71031 |       4 |      4
  70000 |  68969 |       5 |      5
 (5 rows)

 }}}

 If I do both problem children, the first one is right the second one is
 screwed up.

--
Ticket URL: <http://trac.osgeo.org/postgis/ticket/2703#comment:19>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.


More information about the postgis-tickets mailing list