[postgis-devel] Wrong answer from new knn with recheck

Nicklas Avén nicklas.aven at jordogskog.no
Sun May 24 01:29:36 PDT 2015


The same thing happens for more houses in the table. So it is not that
particular house it is about.

This might be a PostgreSQL problem. I think that I can prove that what
happens is this (I will try to explain it):

Sometimes, not every time, the first unused item that is rechecked
appears as the closest item on the next rechecked comparison.

Look at the two examples below. At the first run I return only the
closest road to each house. The problem is both at row 2 (t_gid=2) and
row 6 (t_gid = 6). note the r_gid values on the problematic rows

Then at the second run I ask for the 2 closest roads to each house.
Then the r_gid values from the problematic rows in run1 shows as the
second closest road in run 2 for the previous house. 

RUN 1: 

SELECT t.navn,t.gid t_gid, r.gid r_gid, st_distance(r.geom, t.geom) 
FROM (select * from turisthytte) t CROSS JOIN LATERAL
(SELECT r.gid, r.geom FROM roads r ORDER BY t.geom <->r.geom LIMIT 1) r
limit 10;

               navn               | t_gid | r_gid  |   st_distance    
----------------------------------+-------+--------+------------------
 Gramstad                         |     1 | 988075 | 90.2225986190226
 Sidhusgohpi                      |     2 | 986169 |  1379886.0666003
 Gården Li                        |     3 | 942867 | 968.288199312332
 Haukenestårnet                   |     4 |  76588 | 404.810783875252
 Hannevoldhytta                   |     5 | 759879 | 246.152262737727
 Deatnomuotki                     |     6 | 759880 | 1221715.24537982
 Fogdehytta                       |     7 |  76347 | 136082.340278048
 Mule Varde                       |     8 |  76348 | 135425.235637218
 Fjellvassbu                      |     9 | 941387 | 305599.118262984
 Krokbua                          |    10 | 423774 | 198.915851325086


RUN 2:

SELECT t.navn,t.gid t_gid, r.gid r_gid, st_distance(r.geom, t.geom) 
FROM (select * from turisthytte) t CROSS JOIN LATERAL
(SELECT r.gid, r.geom FROM roads r ORDER BY t.geom <->r.geom LIMIT 2) r
limit 12;


               navn               | t_gid |  r_gid  |   st_distance    
----------------------------------+-------+---------+------------------
 Gramstad                         |     1 |  988075 | 90.2225986190226
 Gramstad                         |     1 |  986169 | 128.356279805393
 Sidhusgohpi                      |     2 |  982379 | 1379790.96016902
 Sidhusgohpi                      |     2 | 1724598 |  27839.424840718
 Gården Li                        |     3 |  942867 | 968.288199312332
 Gården Li                        |     3 |  941387 | 1083.96363188574
 Haukenestårnet                   |     4 |   76588 | 404.810783875252
 Haukenestårnet                   |     4 |   76347 | 541.305515942029
 Hannevoldhytta                   |     5 |  759879 | 246.152262737727
 Hannevoldhytta                   |     5 |  759880 | 292.257309134795
 Deatnomuotki                     |     6 |  759871 | 1221757.34981649
 Deatnomuotki                     |     6 |  760212 | 1222588.13809164


Is PostgreSQL to blame?

/Nicklas




On Sun, 2015-05-24 at 07:02 +0200, Nicklas Avén wrote:
> Then i gives the right answer.
> 
> /Nicklas
> 
> 
> Sent from my Cat® phone.
> 
> Den 24 maj 2015 01:57 skrev Paul Ramsey <pramsey at cleverelephant.ca>:
> >
> > What happens when you run the query using the suspect house as a literal instead of doing the fun lateral query? 
> >
> > P.  
> >
> >
> > --  
> > http://postgis.net  
> > http://cleverelephant.ca 
> >
> >
> > On May 23, 2015 at 1:36:55 PM, Nicklas Avén (nicklas.aven at jordogskog.no) wrote: 
> > > Since I cannot reach trac I post it here. 
> > >  
> > > I have been playing some with latest trunk and latest postgresql. 
> > >  
> > > Since I cannot reach trac I cannot check that it is absolute latest 
> > > trunk but I have r13544 
> > >  
> > > PostgreSQL I pulled from git yesterday. 
> > >  
> > > I have 2 data sets. one with roads and one with houses. 
> > >  
> > > Attached is data that reproduces the problem. 
> > >  
> > > table r is the roads and table t is the houses. 
> > >  
> > >  
> > > If I then run this query: 
> > >  
> > > SELECT t.navn, r.gid, st_distance(r.geom, t.geom) 
> > > FROM t CROSS JOIN LATERAL 
> > > (SELECT r.gid, r.geom FROM r ORDER BY t.geom <->r.geom LIMIT 2) r; 
> > >  
> > > I get this result: 
> > > navn | gid | st_distance 
> > > ----------------------------------+---------+------------------ 
> > > Gramstad | 988075 | 90.2225986190226 
> > > Gramstad | 986169 | 128.356279805393 
> > > Sidhusgohpi | 982379 | 1379790.96016902 
> > > Sidhusgohpi | 1724598 | 27839.424840718 
> > >  
> > >  
> > > What is totally wrong here is the third row. That road is 1379 km away. 
> > > The last row gives the right answer. The closest road to that house is 
> > > 27 km away. 
> > >  
> > > With the full data sets I can play with the limit and see how roads from 
> > > the first house in some way is following to the second house. road nr 
> > > 982379 is just 215 meters away from house nr 1. So something is not 
> > > letting go. 
> > >  
> > >  
> > > Thanks 
> > >  
> > > Nicklas Avén 
> > >  
> > >  
> > >  
> > > _______________________________________________ 
> > > postgis-devel mailing list 
> > > postgis-devel at lists.osgeo.org 
> > > http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel 
> >
> >
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel
> 





More information about the postgis-devel mailing list