[geos-devel] Possible speed improvement for overlay operations

Martin Davis mtnclimb at gmail.com
Wed Dec 5 10:35:53 PST 2018

On Wed, Dec 5, 2018 at 10:02 AM Paul van der Linden <
paul.doskabouter at gmail.com> wrote:

> > Thanks for the test data.  In my Java environment intersection computes
> > in 3.8 s  :)    That is definitely one of the gnarlier polygon geometries
> > I've seen - even worse than muskeg lakes :)
> >
> Wow, 3.8 sec. Is that the old version or the one with the envelope check?

Yes.  It turned out that JTS already had the envelope check in place (at
least for polygon rings - I just added a check at the top-level too, but it
doesn't make any difference, since there are only a few polygons in this

> > Also, the covers predicate computes in 1.3 s, so adding that test into
> > the path would definitely be beneficial - for this particular case.  This
> > is pretty data-specific, though - it depends how many cases in the overall
> > computation allow short-circuiting via that test.  Did you try the
> > PostGIS query using the combination of predicates and intersection?
> >
> Certainly wouldn't add that if it takes about 30% of the time, and I'm not
> sure if I have the full picture of all the edge cases.
> I did try quite a few combinations of predicates and intersections.
> Predicates took about 40s, intersection about 60s.

Fair enough, if adding predicates doesn't improve the overall computation

> In the meantime, I learnt from the code that in case of overlap there
> will be expensive computations done, so I rewrote (actually: planning to :)
> ) my queries from st_intersection(geo,"world-country") to
> st_difference(geo,"country"). As the "country" is much smaller, a lot of
> stuff can be calculated just based on envelope.

Hmmm - is this because st_difference includes an envelope check?  In
PostGIS or GEOS?
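
For reference, the kind of envelope short-circuit being asked about could be
sketched as below. This is only an illustration in Python, not the PostGIS or
GEOS code: the Envelope type, the function names and the full_difference
callback are all hypothetical. It relies on one fact: if the envelopes of A
and B do not intersect, then A - B is simply A, so no overlay is needed.

```python
from typing import Callable, NamedTuple, TypeVar

G = TypeVar("G")  # hypothetical geometry type; any object will do here


class Envelope(NamedTuple):
    """Axis-aligned bounding box of a geometry."""
    minx: float
    miny: float
    maxx: float
    maxy: float


def envelopes_disjoint(a: Envelope, b: Envelope) -> bool:
    """True when the two boxes share no point at all."""
    return (a.maxx < b.minx or b.maxx < a.minx
            or a.maxy < b.miny or b.maxy < a.miny)


def difference_with_envelope_check(
    geom_a: G, env_a: Envelope,
    geom_b: G, env_b: Envelope,
    full_difference: Callable[[G, G], G],
) -> G:
    """Return A - B, skipping the overlay when the envelopes are disjoint."""
    if envelopes_disjoint(env_a, env_b):
        # B cannot remove anything from A, so A is the answer unchanged.
        return geom_a
    # Assumption: full_difference is the expensive overlay operation.
    return full_difference(geom_a, geom_b)
```

With a small B (a single country) most rows would take the first branch,
which would be consistent with the speedup reported above.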

> Did a test, and one of the queries that took over 30 hours was now done in
> 18 hours, so that's quite an improvement. Only drawback is that having some
> queries with st_difference and some with st_intersection doesn't help
> the clarity of the whole workflow... Added benefit is that with
> st_difference, I don't get line and point geometries that are on the
> boundary.

Returning lines and points from area intersections is a design decision
made in the very first release of JTS.  I always wonder if that was the
best decision, or whether the default should have been to return only
geometry of the largest dimension, and provide another method if all are
required.  Seems like it would be useful to add such a function to PostGIS,
at least.  (Actually, it would be nice to be able to specify which
dimension(s) were of interest, to support all use cases.)
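
A rough sketch of what such a dimension filter might do, written in Python
purely for illustration. The (type name, coordinates) tuples are a
hypothetical stand-in for real geometry objects, and filter_by_dimension is
not an existing JTS, GEOS or PostGIS function. By default it keeps only the
components of the highest dimension present; passing dims selects specific
dimensions instead.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical stand-in for a geometry: (type name, coordinates).
Geometry = Tuple[str, tuple]

# Topological dimension of each simple geometry type.
DIMENSION = {
    "Point": 0, "MultiPoint": 0,
    "LineString": 1, "MultiLineString": 1,
    "Polygon": 2, "MultiPolygon": 2,
}


def filter_by_dimension(
    components: List[Geometry],
    dims: Optional[Set[int]] = None,
) -> List[Geometry]:
    """Keep only the components whose dimension is in dims.

    When dims is None, keep the components of the highest dimension present.
    """
    if not components:
        return []
    if dims is None:
        dims = {max(DIMENSION[kind] for kind, _ in components)}
    return [g for g in components if DIMENSION[g[0]] in dims]
```

For an area intersection that also touches along a boundary, the default
would return just the polygons, while dims={0, 1} would return only the
boundary lines and points.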

> What would help in this case is probably an idea I posted here:
> https://lists.osgeo.org/pipermail/postgis-users/2018-December/043029.html

I am following that thread.   I'm hoping that it would be possible to
implement a smarter, caching ST_IntersectionPrepared (or some such name)
which would allow this very simple query:

select ST_IntersectionPrepared(geomA, geomB) R
from A join B on A.geom && B.geom
where R != empty/null

This would include the contains check internally, for maximum performance
and reuse of intermediate results.
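
To make the idea concrete, here is a minimal Python sketch of the control
flow, under heavy assumptions: axis-aligned boxes stand in for real
geometries, and PreparedIntersection, covers and overlay_intersection are
hypothetical names, not existing API. A real implementation would presumably
also cache an index over A; this sketch only holds on to A and shows the
order of the cheap tests before the overlay.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned rectangle, standing in for a real geometry."""
    minx: float
    miny: float
    maxx: float
    maxy: float


def covers(a: Box, b: Box) -> bool:
    """True when every point of b lies in a."""
    return (a.minx <= b.minx and a.miny <= b.miny
            and a.maxx >= b.maxx and a.maxy >= b.maxy)


def overlay_intersection(a: Box, b: Box) -> Optional[Box]:
    """Stand-in for the expensive general overlay computation."""
    minx, miny = max(a.minx, b.minx), max(a.miny, b.miny)
    maxx, maxy = min(a.maxx, b.maxx), min(a.maxy, b.maxy)
    if minx > maxx or miny > maxy:
        return None
    return Box(minx, miny, maxx, maxy)


class PreparedIntersection:
    """Holds geometry A so it can be intersected against many B's."""

    def __init__(self, geom_a: Box) -> None:
        self.geom_a = geom_a
        self.overlay_calls = 0  # how often the expensive path was taken

    def intersection(self, geom_b: Box) -> Optional[Box]:
        a = self.geom_a
        # 1. Envelope check: disjoint boxes cannot intersect.
        if (a.maxx < geom_b.minx or geom_b.maxx < a.minx
                or a.maxy < geom_b.miny or geom_b.maxy < a.miny):
            return None
        # 2. Covers check: the covered geometry is the result as-is.
        if covers(a, geom_b):
            return geom_b
        if covers(geom_b, a):
            return a
        # 3. Only partial overlaps reach the full overlay.
        self.overlay_calls += 1
        return overlay_intersection(a, geom_b)
```

How much this helps depends on the data, as noted earlier in the thread: the
gain comes only from the rows that exit at step 1 or 2.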