[geos-devel] Possible speed improvement for overlay operations
Paul van der Linden
paul.doskabouter at gmail.com
Wed Dec 5 10:02:46 PST 2018
> Thanks for the test data. In my Java environment intersection computes in
> 3.8 s :) That is definitely one of the gnarlier polygon geometries
> seen - even worse than muskeg lakes :)
Wow, 3.8 sec! Is that the old version or the one with the envelope check?
Still puzzling why Postgres at work, my dev PC at home and your system differ by
I thought CPUs were more or less comparable these days...
> Also, the covers predicate computes in 1.3 s, so adding that test into the
> path would definitely be beneficial - for this particular case. This is
> pretty data-specific, though - it depends how many cases in the overall
> computation allow short-circuiting via that test. Did you try the PostGIS
> query using the combination of predicates and intersection?
I certainly wouldn't add that if it takes about 30% of the time, and I'm not
sure I have the full picture of all the edge cases.
I did try quite a few combinations of predicates and intersections: the
predicates took about 40s, the intersection about 60
> So maybe a pre-evaluation covers check is best left as an option to the
> Overlay code, so that intermediate structures can be reused. In other
> words, provide an optional flag on the Overlay operation which does a
> covers check before computing the full result, utilizing the internal
> segment indexing to perform the check quickly.
> And as you say, if this can be prepared it should make an even bigger
> difference. This requires knowing which side of the operation is the one
> that should be prepared - is this obvious in your data context?
Yes, that's the one covering the world.
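
To make the suggested pre-evaluation concrete, here is a minimal sketch of a
covers check in front of an intersection. It is not GEOS or JTS code: Box,
covers, full_intersection and intersection_with_precheck are hypothetical
names, and axis-aligned rectangles stand in for real polygons. The only fact
it relies on is that if the big geometry covers the small one, the
intersection is the small one unchanged.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned rectangle, a toy stand-in for a real polygon."""
    minx: float
    miny: float
    maxx: float
    maxy: float


# Counts how often the (pretend) expensive overlay is actually run.
calls = {"full_overlay": 0}


def covers(a: Box, b: Box) -> bool:
    """True if every point of b lies in a (boundaries included)."""
    return (a.minx <= b.minx and a.miny <= b.miny
            and a.maxx >= b.maxx and a.maxy >= b.maxy)


def full_intersection(a: Box, b: Box) -> Optional[Box]:
    """Stand-in for the expensive overlay computation."""
    calls["full_overlay"] += 1
    minx, miny = max(a.minx, b.minx), max(a.miny, b.miny)
    maxx, maxy = min(a.maxx, b.maxx), min(a.maxy, b.maxy)
    if minx > maxx or miny > maxy:
        return None  # no overlap at all
    return Box(minx, miny, maxx, maxy)


def intersection_with_precheck(geo: Box, big: Box) -> Optional[Box]:
    """Return geo untouched when big covers it; otherwise do the full overlay."""
    if covers(big, geo):
        return geo
    return full_intersection(geo, big)
```

Whether this pays off depends, as noted above, on how many inputs are fully
covered: each one that is not pays for the check and the overlay both.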
In the meantime, I learnt from the code that in case of overlap expensive
computations are done, so I rewrote (actually: am planning to :) ) my
queries from st_intersection(geo,"world-country") to
st_difference(geo,"country"). As the "country" is much smaller, a lot of
stuff can be calculated just based on the envelope.
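
As an illustration of the envelope argument, here is a minimal sketch under
stated assumptions. It is not how GEOS or PostGIS implement it: the names
envelope, envelopes_disjoint and difference_with_envelope_check are
hypothetical, and a polygon is just a list of vertices. It only uses the fact
that when the two envelopes do not touch, the difference is the first
geometry unchanged, so the expensive overlay can be skipped.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Ring = List[Point]                       # toy polygon: a list of vertices
Envelope = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def envelope(ring: Ring) -> Envelope:
    """Bounding box of a vertex list."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def envelopes_disjoint(a: Envelope, b: Envelope) -> bool:
    """True if the two bounding boxes share no point at all."""
    return a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1]


def difference_with_envelope_check(
    geo: Ring,
    country: Ring,
    full_difference: Callable[[Ring, Ring], Ring],
) -> Ring:
    """Skip the overlay when the envelopes cannot interact."""
    if envelopes_disjoint(envelope(geo), envelope(country)):
        return geo  # nothing to subtract, geo is the answer as-is
    # Envelopes touch or overlap: fall back to the real (expensive) overlay.
    return full_difference(geo, country)
```

The smaller the "country" envelope, the more inputs fall into the cheap
branch, which matches the reasoning above for preferring the smaller operand.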
Did a test, and one of the queries that took over 30 hours now finished in
18 hours, so that's quite an improvement. The only drawback is that having
some queries with st_difference and some with st_intersection doesn't help
the clarity of the whole workflow... An added benefit is that with
st_difference, I don't get line and point geometries that are on the
What would help in this case is probably an idea I posted here: