[postgis-devel] lwgeom_*_in_place()

Paul Ramsey pramsey at cleverelephant.ca
Mon Oct 2 17:51:09 PDT 2017


Well, I'm leaning on the tooling I have in OSX, which includes a sampling
profiler in the Instruments.app that hides within the Xcode bundle. I just
get the pg_backend_pid() of the backend I'm using to generate load, point
the profiler at that one process, and put the load on it, generally for a
minute or so to give the sampler ample time to find all the points of pain.
For the Linux crew I'm not sure what the equivalent would be. Certainly
compared to old skool things like gprof, a sampling profiler is massively
better, in terms of ease of use and depth of analysis.

P

On Mon, Oct 2, 2017 at 5:19 PM, Darafei "Komяpa" Praliaskouski <
me at komzpa.net> wrote:

> Hello,
>
> Can you please share how you are using a profiler on PostGIS?
> I'd like to profile sorting by geometry. So far all I've found for myself is
>
> objdump -S --disassemble /usr/lib/liblwgeom-2.5.so.0 |less
>
> ... but that doesn't scale well enough. It still lets me see that the
> geometry/geography ifs, which read bits of the header through something
> hidden three files away under a #define, aren't a simple thing.
>
> For the _in_place suggestion, I think it's enough to make sure it's done for
> the critical path of MVT and codified somewhere as a guideline.
>
> On Tue, 3 Oct 2017 at 0:53, Paul Ramsey <pramsey at cleverelephant.ca> wrote:
>
>> Hey devs,
>> So, in pursuit of spare CPU cycles for the good folks at Carto, I've been
>> putting various workloads through a profiler. You've seen some of the
>> results in recently committed tweaks.
>>
>> One thing I found as a general rule was that the more simple objects
>> there are, the more noticeable the overhead of things like memory
>> allocation / deallocation becomes. So the overhead of just constructing an
>> LWPOINT on a point, when you're slamming through 1M of them, is not nothing.
>>
>> For some workloads, like generating MVT (there it is) we call numerous
>> geometry processing functions in a row. For functions that expect const
>> LWGEOM inputs, that can result in a lot of cloning of objects, and hence
>> memory management. For other functions, which expect to modify things in
>> place, the coordinates are just altered right there.
>>
>> This has always been a little ugliness in our API, and for a while we
>> settled on defaulting to making copies, so the API was at least getting
>> cleaner. But also getting slower.
>>
>> I built a prototype for generating MVT that does all the coordinate
>> processing in place, and for bulk simple objects it ran about 30% faster
>> than the existing code. There's a benefit to be had in working in place.
>>
>> I propose that we... cut the baby in half.
>>
>> * All liblwgeom functions that do in place coordinate modification should
>> be re-named to *_in_place() and a bare version that takes a const LWGEOM
>> and returns a copy as necessary be added.
>> * All liblwgeom functions that already take const LWGEOM should be
>> supplemented with *_in_place() variants, so that it's possible to build
>> processing chains with a minimal allocation/deallocation footprint.
>>
>> Functions on my list immediately:
>>
>>   * remove_repeated_points
>>   * simplify
>>   * affine
>>   * grid
>>   * transform
>>
>> I will, however, do a full review and attempt to ensure all non-GEOS
>> functions have in_place variants available in liblwgeom.
>>
>> Thoughts?
>>
>> P.
>>
>>
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>
>
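
The proposal quoted above pairs each in-place function with a bare version that takes a const input and returns a copy. A minimal sketch of that pattern in C follows. Everything in it is a hypothetical stand-in made up for illustration (the PointArray type, the pa_* functions, the alloc_count counter, and the two chain_* helpers); none of it is the liblwgeom API, and the affine and grid steps are simplified to a scale-and-shift and a round-to-cell.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a coordinate array; NOT liblwgeom's POINTARRAY. */
typedef struct {
    size_t npoints;
    double *xy;               /* npoints * 2 doubles: x0, y0, x1, y1, ... */
} PointArray;

/* Counts arrays created, so the two calling styles can be compared. */
size_t alloc_count = 0;

PointArray *pa_new(size_t npoints)
{
    PointArray *pa = malloc(sizeof *pa);
    if (!pa) return NULL;
    pa->xy = malloc(npoints * 2 * sizeof *pa->xy);
    if (!pa->xy) { free(pa); return NULL; }
    pa->npoints = npoints;
    alloc_count++;
    return pa;
}

void pa_free(PointArray *pa)
{
    if (pa) { free(pa->xy); free(pa); }
}

PointArray *pa_clone(const PointArray *src)
{
    assert(src != NULL);
    PointArray *dst = pa_new(src->npoints);
    if (dst) memcpy(dst->xy, src->xy, src->npoints * 2 * sizeof *src->xy);
    return dst;
}

/* In-place variant: the coordinates are altered right where they live. */
void pa_affine_in_place(PointArray *pa, double scale, double dx, double dy)
{
    assert(pa != NULL);
    for (size_t i = 0; i < pa->npoints; i++) {
        pa->xy[2 * i]     = pa->xy[2 * i]     * scale + dx;
        pa->xy[2 * i + 1] = pa->xy[2 * i + 1] * scale + dy;
    }
}

/* In-place variant: snap every ordinate to the nearest multiple of cell. */
void pa_grid_in_place(PointArray *pa, double cell)
{
    assert(pa != NULL && cell > 0.0);
    for (size_t i = 0; i < pa->npoints * 2; i++) {
        double v = pa->xy[i] / cell;
        /* Round half away from zero with a cast, to avoid needing math.h. */
        pa->xy[i] = (double)(long long)(v + (v >= 0.0 ? 0.5 : -0.5)) * cell;
    }
}

/* Bare variants: const input, fresh copy out, built on the in-place code. */
PointArray *pa_affine(const PointArray *pa, double scale, double dx, double dy)
{
    PointArray *out = pa_clone(pa);
    if (out) pa_affine_in_place(out, scale, dx, dy);
    return out;
}

PointArray *pa_grid(const PointArray *pa, double cell)
{
    PointArray *out = pa_clone(pa);
    if (out) pa_grid_in_place(out, cell);
    return out;
}

/* Copying chain: every step clones, and the intermediate must be freed. */
PointArray *chain_copying(const PointArray *in)
{
    PointArray *a = pa_affine(in, 2.0, 1.0, 1.0);
    if (!a) return NULL;
    PointArray *g = pa_grid(a, 1.0);
    pa_free(a);
    return g;
}

/* In-place chain: one clone up front protects the input, then mutate it. */
PointArray *chain_in_place(const PointArray *in)
{
    PointArray *work = pa_clone(in);
    if (!work) return NULL;
    pa_affine_in_place(work, 2.0, 1.0, 1.0);
    pa_grid_in_place(work, 1.0);
    return work;
}
```

With this layout, chain_copying() creates two arrays per call and chain_in_place() creates one, while both return the same coordinates and leave the input untouched. The gap grows by one allocation for every extra step added to the chain.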