[postgis-devel] [PostGIS] #3949: Infinite (?) loop in distance calculation

Fri Dec 22 05:46:29 PST 2017

Sounds very good.

And I think it is worth to sacrify 10 % speed in some occations for the
predictability.

And caching the tree must be a very good thing. 

When you say 100x faster, do you then mean in cases like the one in the
ticket?

If so, it isn't primary the amount of vertes points that is the bottle
neck for todays code but other things like distance between the
geometries and if there is a lot of vertex points in the part of the
border that is hard for the algorithm to filter on (the part facing the
other geometry, specially concave cases)

But I think it doesn't matter if todays code performs on pair or even
better in some cases. Predictability is more important.

I will give it a test run if the code is ready for that.

/Nicklas

On Fri, 2017-12-22 at 05:35 -0800, Paul Ramsey wrote:
> So, I now have tree distance code that seems to work, thanks to this
> little bug, which caused me to find the error in my tree code. I'd
> given up on it, figured the tree approach was a dead end.
> 
> https://github.com/pramsey/postgis/tree/svn-trunk-distance
> 
> In very limited testing, I see it running quite a bit faster (100x)
> than the existing code on large objects, but slightly slower on small
> objects (10%). I was thinking on merging it to trunk but as a
> completely independent function for now (ST_DistanceRectTree()) so
> that it can be tested and the functional limits of it determined.
> Then
> we can discuss how we want to integrate it into generic
> ST_Distance(),
> maybe with a object size test, or maybe as a cached approach
> (skipping
> the tree built step in cases with repeated inputs would probably be a
> win).
> 
> Thoughts?
> 
> P
> 
> 
> On Fri, Dec 22, 2017 at 4:15 AM, PostGIS <trac at osgeo.org> wrote:
> > 
> > Comment (by nicklas):
> > 
> >  I agree.
> > 
> >  But worst case is different for different methods.
> >  The example Paul provided here is among the worst cases for the
> > algorithm
> >  we use now (when bboxes doesn't overlap, then it falls back to
> > brute
> >  force)
> > 
> >  If the shoreline of both geometries would have been convex it
> > would have
> >  been quite a lot faster, but now one is concave and all points in
> > the
> >  concave part needs to be tested more or less.
> > 
> >  So for this case that is a pain for that algorithm it loses the
> > battle
> >  with about 1.08 (79 sec vs 85 sec)
> > 
> >  But for the case with the same geometries further away it wins by
> > using
> >  approx 0.04 of the time brute force method uses. And with the case
> > when we
> >  turn the other side of the geometries against each other it wins
> > by using
> >  approx 0.0005 of the brute force method.
> > 
> >  So in this case I think the 8 % worse result in some cases is ok.
> > BUT, the
> >  problem is that it is unpredictable. Users will like Paul think
> > that
> >  something is wrong.
> > 
> >  But we have lived with this code now since 2010 and I don't think
> > there
> >  have been many posts on the list about that issue. But I have hit
> > that
> >  wall a few times myself making queries much slower than I expect.
> > And it
> >  can be very frustrating when testing a sample to get an idea of
> > the query
> >  time and then it takes 100 times longer.
> > 
> > --
> > Ticket URL: <https://trac.osgeo.org/postgis/ticket/3949#comment:9>
> > PostGIS <http://trac.osgeo.org/postgis/>
> > The PostGIS Trac is used for bug, enhancement & task tracking, a
> > user and developer wiki, and a view into the subversion code
> > repository of PostGIS project.
> 
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel