[geos-devel] Possible speed improvement for overlay operations

Paul van der Linden paul.doskabouter at gmail.com
Wed Dec 5 10:02:46 PST 2018


> Thanks for the test data.  In my Java environment intersection computes in
> 3.8 s  :)    That is definitely one of the gnarlier polygon geometries I've
> seen - even worse than muskeg lakes :)
>

Wow, 3.8 s. Is that the old version or the one with the envelope check?
It's still puzzling why Postgres at work, my dev PC at home and your system
differ by that much...
I thought CPUs were more or less comparable these days...

> Also, the covers predicate computes in 1.3 s, so adding that test into the
> path would definitely be beneficial - for this particular case.  This is
> pretty data-specific, though - it depends how many cases in the overall
> computation allow short-circuiting via that test.  Did you try the PostGIS
> query using the combination of predicates and intersection?
>

I certainly wouldn't add that check if it takes about 30% of the intersection
time, and I'm not sure I have the full picture of all the edge cases.
I did try quite a few combinations of predicates and intersections: the
predicates took about 40 s, the intersection about 60 s.
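
For reference, the short-circuit being discussed can be sketched roughly as
below. This is only an illustration under simplifying assumptions:
axis-aligned boxes stand in for real polygons, and Box, covers,
full_intersection and intersection_with_covers_check are made-up names, not
GEOS, JTS or PostGIS API. It pays off only when the covers test is cheap
compared to the overlay and succeeds often enough.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned box used as a stand-in for a real polygon (assumption)."""
    minx: float
    miny: float
    maxx: float
    maxy: float


def covers(a: Box, b: Box) -> bool:
    """True if every point of b lies in a (the cheap predicate)."""
    return (a.minx <= b.minx and a.miny <= b.miny
            and a.maxx >= b.maxx and a.maxy >= b.maxy)


def full_intersection(a: Box, b: Box) -> Optional[Box]:
    """Stand-in for the expensive overlay computation."""
    minx, miny = max(a.minx, b.minx), max(a.miny, b.miny)
    maxx, maxy = min(a.maxx, b.maxx), min(a.maxy, b.maxy)
    if minx > maxx or miny > maxy:
        return None  # no common points
    return Box(minx, miny, maxx, maxy)


def intersection_with_covers_check(a: Box, b: Box) -> Optional[Box]:
    """If one input covers the other, the covered input is the result."""
    if covers(a, b):
        return b
    if covers(b, a):
        return a
    return full_intersection(a, b)
```
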


> So maybe a pre-evaluation covers check is best left as an option to the
> Overlay code, so that intermediate structures can be reused.  In other
> words, provide an optional flag on the Overlay operation which does a
> covers check before computing the full result, utilizing the internal
> segment indexing to perform the check quickly.
>
> And as you say, if this can be prepared it should make an even bigger
> difference.  This requires knowing which side of the operation is the one
> that should be prepared - is this obvious in your data context?

Yes, that's the one covering the world.
In the meantime, I learnt from the code that expensive computations are done
whenever the inputs overlap, so I rewrote (actually: am planning to :) ) my
queries from st_intersection(geo,"world-country") to
st_difference(geo,"country"). As the "country" is much smaller, a lot can be
calculated based on the envelope alone.
I did a test, and one of the queries that took over 30 hours now finished in
18 hours, so that's quite an improvement. The only drawback is that having
some queries with st_difference and some with st_intersection doesn't help
the clarity of the whole workflow... An added benefit is that with
st_difference I don't get the line and point geometries that lie on the
boundary.
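
To illustrate why the smaller "country" helps, here is a rough sketch of an
envelope short-circuit for a difference. It is not the actual GEOS code path,
just the general idea: when the bounding boxes of the two inputs are disjoint,
the difference is the first input unchanged and the expensive overlay can be
skipped. All names (Envelope, envelopes_disjoint,
difference_with_envelope_check, full_difference) are hypothetical.

```python
from typing import Any, Callable, Tuple

# (minx, miny, maxx, maxy); the geometries themselves are opaque here.
Envelope = Tuple[float, float, float, float]


def envelopes_disjoint(a: Envelope, b: Envelope) -> bool:
    """True if the two bounding boxes share no point."""
    return a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1]


def difference_with_envelope_check(
    geo: Any,
    geo_env: Envelope,
    other: Any,
    other_env: Envelope,
    full_difference: Callable[[Any, Any], Any],
) -> Any:
    """Return geo minus other, skipping the overlay when envelopes are disjoint."""
    if envelopes_disjoint(geo_env, other_env):
        return geo  # nothing of 'other' can touch 'geo'
    return full_difference(geo, other)  # expensive path
```

With a small "country" envelope most inputs take the cheap branch, whereas an
envelope covering the whole world overlaps every input and never does.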

What would help in this case is probably an idea I posted here:
https://lists.osgeo.org/pipermail/postgis-users/2018-December/043029.html