pramsey at cleverelephant.ca
Mon Oct 2 14:53:07 PDT 2017
So, in pursuit of spare CPU cycles for the good folks at Carto, I've been
putting various workloads through a profiler. You've seen some of the
results in recently committed tweaks.
One thing I found as a general rule is that as the number of simple
objects goes up, the overhead of things like memory allocation /
deallocation becomes more noticeable. So the overhead of just constructing
an LWPOINT on a point, when you're slamming through 1M of them, is not
nothing.
For some workloads, like generating MVT (there it is) we call numerous
geometry processing functions in a row. For functions that expect const
LWGEOM inputs, that can result in a lot of cloning of objects, and hence
memory management. For other functions, which expect to modify things in
place, the coordinates are just altered right there.
This has always been a little ugliness in our API, and for a while we
settled on defaulting to making copies, so the API was at least getting
cleaner. But also getting slower.
I built a prototype for generating MVT that does all the coordinate
processing in place, and for bulk simple objects it ran about 30% faster
than the existing code. There's a benefit to be had in working in place.
I propose that we... cut the baby in half.
* All liblwgeom functions that do in-place coordinate modification should
be renamed to *_in_place(), and a bare version that takes a const LWGEOM
and returns a copy as necessary should be added.
* All liblwgeom functions that already take const LWGEOM be supplemented
with *_in_place() variants so that it's possible to build processing chains
that are capable of doing processing with minimal allocation/deallocation.
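
To make the naming convention concrete, here is a rough sketch of the two
flavours side by side. Everything in it is made up for illustration:
toy_geom and the toy_* functions are hypothetical stand-ins, not the real
LWGEOM struct or any existing liblwgeom call. The point is only the shape:
the *_in_place() variant mutates its argument, and the bare variant takes a
const input, clones once, and delegates to the in-place one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a geometry; NOT the real LWGEOM. */
typedef struct {
    size_t npoints;
    double *xy; /* npoints * 2 doubles: x0, y0, x1, y1, ... */
} toy_geom;

toy_geom *toy_geom_new(size_t npoints)
{
    toy_geom *g = malloc(sizeof(*g));
    if (!g) return NULL;
    g->npoints = npoints;
    /* Allocate at least one slot so a NULL return always means failure. */
    g->xy = calloc(npoints > 0 ? npoints * 2 : 1, sizeof(double));
    if (!g->xy) { free(g); return NULL; }
    return g;
}

void toy_geom_free(toy_geom *g)
{
    if (g) { free(g->xy); free(g); }
}

/* Deep copy: this is the allocation cost the in-place path avoids. */
toy_geom *toy_geom_clone(const toy_geom *g)
{
    toy_geom *out;
    assert(g != NULL);
    out = toy_geom_new(g->npoints);
    if (!out) return NULL;
    if (g->npoints > 0)
        memcpy(out->xy, g->xy, g->npoints * 2 * sizeof(double));
    return out;
}

/* In-place variant: alters the coordinates of g directly, no allocation. */
void toy_scale_in_place(toy_geom *g, double factor)
{
    size_t i;
    assert(g != NULL);
    for (i = 0; i < g->npoints * 2; i++)
        g->xy[i] *= factor;
}

/* In-place variant: shifts every coordinate pair by (dx, dy). */
void toy_translate_in_place(toy_geom *g, double dx, double dy)
{
    size_t i;
    assert(g != NULL);
    for (i = 0; i < g->npoints; i++) {
        g->xy[2 * i] += dx;
        g->xy[2 * i + 1] += dy;
    }
}

/* Bare variant: const input, returns a modified copy (NULL on failure). */
toy_geom *toy_scale(const toy_geom *g, double factor)
{
    toy_geom *out = toy_geom_clone(g);
    if (out) toy_scale_in_place(out, factor);
    return out;
}

/* A processing chain: clone once up front, then every step is in place. */
toy_geom *toy_process_chain(const toy_geom *g)
{
    toy_geom *work = toy_geom_clone(g);
    if (!work) return NULL;
    toy_scale_in_place(work, 2.0);
    toy_translate_in_place(work, 1.0, -1.0);
    return work;
}
```

With only bare variants, a chain of N steps would clone N times; with the
in-place variants the caller decides whether to clone at all, and at most
once.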
Functions on my list immediately:
I will, however, do a full review and attempt to ensure all non-GEOS
functions have in_place variants available in liblwgeom.