[postgis-devel] Much slower processing on GEOS 3.9.0 versus 3.8.0 for geodesic area calculation - Upgrade to PROJ 7.2.1 causes issue

Marco Boeringa marco at boeringa.demon.nl
Fri Apr 30 01:05:08 PDT 2021


You meant to write 100x instead of 10x (22ms vs. 240us)? At least 100x 
is definitely also the performance hit I am facing in the processing...
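
As a quick sanity check on that factor, here is a minimal sketch that divides the two per-iteration timings quoted further down in this thread. The numbers are copied from Paul's test output; the variable names are only illustrative.

```python
# Per-iteration timings (microseconds) as reported by Paul's test program.
full_setup_us = 22031.7   # epsg:4326 -> epsg:26910
null_setup_us = 249.991   # epsg:4326 -> epsg:4326

# Ratio between a full projection setup and the null-transformation setup.
ratio = full_setup_us / null_setup_us
print(f"full setup is about {ratio:.0f}x slower than the null setup")
```

The ratio comes out at roughly 88x, so the difference is much closer to 100x than to 10x.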

I am just curious, since I have no experience with C/C++ and compiler 
settings. Mike Tavis also wrote:

"A possibility for the slowdown is that one of the components was
compiled without optimization. E.g. with PROJ, doing "cmake .."
without specifying "-DCMAKE_BUILD_TYPE=Release" would build an
un-optimized PROJ."

Is it *really* realistic that a missed compiler optimization could cause 
a 100x performance regression? That just seems like a lot.

I would also be interested to see some insights from Raúl, as he worked on 
the non-merged "transaction level" caching feature 
(https://github.com/postgis/postgis/commit/3ced96fc1a79831a26c33311a17d1f32e3c5c732). 
I could see that being of real benefit in the kind of use cases I am 
dealing with, where multiple SQL statements are batched inside a single 
transaction (hundreds / thousands per transaction). Would it still be an 
option to implement something like that? It just seems a waste to 
re-initialize the transformation object at each SQL statement, if it is 
a quite common use case to have multiple SQL statements batched in a 
single transaction.
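
To illustrate what I mean, here is a rough sketch of the idea in Python. It is not how PostGIS or Raúl's commit actually implements caching; `TransformCache`, `run_transaction` and the stand-in factory are hypothetical names, and the "transform" is just a tuple standing in for the expensive PROJ object.

```python
class TransformCache:
    """Hypothetical cache of transform objects keyed by (from_srid, to_srid)."""

    def __init__(self, factory):
        self._factory = factory
        self._entries = {}

    def get(self, from_srid, to_srid):
        key = (from_srid, to_srid)
        if key not in self._entries:
            # Cache miss: pay the (expensive) initialization cost once.
            self._entries[key] = self._factory(from_srid, to_srid)
        return self._entries[key]


def run_transaction(statements, per_transaction):
    """Return how many times the transform had to be built."""
    builds = 0

    def factory(from_srid, to_srid):
        # Stand-in for the costly transform setup; only counts invocations.
        nonlocal builds
        builds += 1
        return (from_srid, to_srid)

    shared = TransformCache(factory)
    for _ in range(statements):
        # Statement-level caching starts from an empty cache every statement;
        # transaction-level caching reuses one cache for the whole batch.
        cache = shared if per_transaction else TransformCache(factory)
        cache.get(4326, 4326)
    return builds


per_statement_builds = run_transaction(1000, per_transaction=False)
per_transaction_builds = run_transaction(1000, per_transaction=True)
print(per_statement_builds, per_transaction_builds)
```

With 1000 statements in one transaction, the statement-level variant builds the transform 1000 times, while the transaction-level variant builds it once.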

Lastly, I am still a bit worried about the fact that 'SELECT 
postgis_full_version()' returns the old '6.3.1' PROJ version number 
after upgrading PROJ to '7.2.1'. Is there any logical explanation for 
this, e.g. 'postgis_full_version' only ever returning the version number 
of the PROJ it was originally compiled against, so never displaying the 
upgrade?

Marco

On 28-4-2021 at 20:44, Paul Ramsey wrote:
> I would not read a great deal into the 10x difference, the proj machinery seems to recognize the null-transformation doesn't need any special look-ups. I remain interested in the measurements of inter-version differences that have yet to be done. As you say, you're observing a 6->7 difference (maybe) which I'd like to see isolated by testing with this utility program. This could *still* be little more interesting than a packaging difference on your system (did the packager forget to add optimizations?).
>
> All the caching, etc, already was put in place during the 5->6 transition, and the code on the postgis side only differentiates between 6 (new fancy) and <6 (old proj4 style).
>
> P
>
>> On Apr 28, 2021, at 11:40 AM, Marco Boeringa <marco at boeringa.demon.nl> wrote:
>>
>> There are a couple of interesting points raised by these latest observations:
>>
>> - These timing differences (22ms vs. 240us) seem to correspond very well with the 100x performance regression I observed, assuming the bulk of the cost of the total call is in fact on the transform initialization, and not the actual area calculation.
>>
>> - Another unanswered question this all raises, is why this is happening at the 6.x --> 7.x transition of PROJ? All other information Raúl posted and linked to, seemed to indicate the penalty should already have been paid at the 5.x --> 6.x transition, yet 6.3.1 seems to be running quite fine. So was/is there some lesser known caching / optimization behavior in PostgreSQL itself, that was still functional in 6.x avoiding the cost, while changes in PROJ 7.x now finally reveal the true cost of the non-cached transform object?
>>
>> Marco
>>
>> On 28-4-2021 at 18:59, Paul Ramsey wrote:
>>> Just to try and further remove variables from this investigation, I've put a little program here, which can be built and run against any Proj >= 6
>>>
>>> https://gist.github.com/pramsey/493b2490a8736fd8c00e30efa62e4ec3
>>>
>>> It just runs the proj_create_crs_to_crs() function a number of times and figures out the average invocation cost. You can change the from/to path by altering the commandline values (see the comment at top for build instructions and run instructions).
>>>
>>> Testing against proj8, I find that setting up a full projection is quite costly (22ms)
>>>
>>> Proj version '8.0.0'
>>> Using 'epsg:4326' as from-srid
>>> Using 'epsg:26910' as to-srid
>>> Ran 1000 iterations, 22031.7 us per iteration
>>>
>>> While setting up a null-transformation is quite cheap (240us)
>>>
>>> Proj version '8.0.0'
>>> Using 'epsg:4326' as from-srid
>>> Using 'epsg:4326' as to-srid
>>> Ran 1000 iterations, 249.991 us per iteration
>>>
>>> The odd thing (unfortunately for trying to understand this issue) is that it is precisely this null transformation which is the setup penalty when doing the geometry::geography cast. It just instantiates an epsg:4326->epsg:4326 transform so it can interrogate the objects and see if they are geodetic.
>>>
>>> Now, maybe the null case got optimized in version 8? I dunno.
>>>
>>> I now leave running this little test against multiple proj versions from 6 to 8 as an exercise for someone to try out. Maybe initialization costs leapt up in one version or another.
>>>
>>> P.
>>>

