[postgis-devel] Much slower processing on GEOS 3.9.0 versus 3.8.0 for geodesic area calculation

Marco Boeringa marco at boeringa.demon.nl
Tue Apr 27 02:17:31 PDT 2021


Yes, my time is also finite, and I agree things must be as efficient and 
clear as possible.

However, that also requires a bit of introductory back-and-forth 
questioning. Everybody knows the story of the help desk employee that 
spends several hours attempting to "fix" an issue with a non-functional 
printer, only to discover the client on the phone forgot to put the plug 
in the socket... Lets first determine we've put the 
"plug-in-the-socket", which requires the kind of targeted questioning 
Raúl did.

And I already cut the problem down to two very clear things:

- Issue started after upgrading to GEOS 3.9.0 from 3.8.0, but I need to 
verify if any (dependent) packages were updated as well in the process 
per Raúl's question. Job to be done by me for which I will report back, 
although I think it likely that GEOS maintainers should be able to 
answer this question at least partly themselves: were any dependencies 
changed to new versions to support the GEOS 3.9.0 release?

- ST_Area(), as a very basic function, shows the issue.

This combination may make little sense, but I have now reproduced it 
twice. I wish I could turn water in wine, but this is what it is, so 
this is at least the starting point of a possible solution to the 
question of what is wrong.

One question that still is on my mind, is if we can exclude an issue 
caused by the PostgreSQL / PostGIS 3.1.1 installation. In a previous 
issue I posted to the PostGIS mailing list recently, it became clear the 
official 3.1.1 PostGIS install of 
http://apt.postgresql.org/pub/repos/apt/ 
<http://apt.postgresql.org/pub/repos/apt/> has *not* been compiled with 
GEOS 3.9.0 support. As Paul wrote:

"It's curious, but the implication is that the postgis on your machine 
was built against an earlier GEOS. The version number that postgis 
reports for GEOS is generated at run-time, while the test of whether to 
include support for various functions is determined at compile-time. 
Hence you can see a "GEOS 3.9" version in your postgis, but still not 
have access to functions that are 3.9+ only."

This was a response to a message by me where I received an error message 
attempting to use a new PostGIS 3.1.1 specific function 
("ST_ReducePrecision"), that indicated this lack of GEOS 3.9.0 
compilation. Is that a potential source of more problems than just not 
being able to access a new function definition? Are there any other 
potential knock-on effects of this, or can we exclude this as being of 
any relevance to this new performance regression issue?

Marco

Op 26-4-2021 om 17:41 schreef Paul Ramsey:
> So, to be clearer about how to interface with -devel with your issues:
>
> - data, in a dump file, as small as possible to demonstrate the issue, 
> preferably just one table, generated with pg_dump -d database -t table 
> --no-owner, ideally small enough to attach to a trac ticket
> - the ideal size of data is "no data at all", many bugs can be reduced 
> to one geometry that can be part of the SQL call that exercises the 
> condition
> - SQL, in a sql file, again as small as possible, one statement 
> ideally, with as few functions as possible
>
> The goal here is to allow a dev to copy your SQL and drop it into a 
> console in one cut-paste step, because given our finite time base, 
> spending a lot of time trying to replicate a situation and then 
> getting a "null result" where everything still works fine is hugely 
> annoying. I'm not going to attempt to replicate your setup as 
> described because there are way too many degrees of freedom, I'm 
> almost guaranteed to get a non replicable situation, and have wasted 
> whatever time I invest in it.
>
> People who invest the time to reduce their problem to a cut'paste 
> example get responses and fixes surprisingly fast. Those who don't 
> wonder why OSS developers are such standoffish bastards. There's a 
> continuum of quality between "it's broken" and "here's a one line SQL 
> statement that does exactly this thing which is not what is supposed 
> to happen", but the closer you get to the far end, the faster and 
> better things will go. I recognize that regressions and multi-version 
> issues are the hardest, but here too time reducing the problem to the 
> minimum degrees of freedom will result in attention.
>
> In answer to your question: GEOS has no direct linkage to proj, and it 
> makes little sense that ST_Area(geometry::geography) would be 
> sensitive to GEOS version, as neither (the function nor the cast) 
> enters the GEOS code line at all. (They do, however, both enter the 
> proj code line.)
>
> P.
>
>
> On Mon, Apr 26, 2021 at 8:11 AM Marco Boeringa 
> <marco at boeringa.demon.nl <mailto:marco at boeringa.demon.nl>> wrote:
>
>     One more thing to note related to this issue is that the UPDATE
>     taking place is in a high multi-threaded application, with
>     individual UPDATE statements for each record, however batched in
>     hundreds or thousands of records inside a single transaction.
>
>     There is a known issue with PROJ 6+ where creating a
>     transformation object is considerably more costly than in PROJ <=5
>     (https://trac.osgeo.org/postgis/ticket/4372
>     <https://trac.osgeo.org/postgis/ticket/4372>), but both
>     installations I tested for GEOS 3.8.0 and GEOS 3.9.0 are running
>     PROJ 6.3.1, as the 'SELECT postgis_full_version()' showed,
>     *unless* the answer to my last question:
>
>     * Is GEOS dependency on PROJ dynamic like PostGIS, or is it baked
>     in? What I mean is, if I see 'SELECT postgis_full_version' return
>     'PROJ="6.3.1"' as it does in my installation, is GEOS / libgeos
>     then using the exact same version 6.3.1, or might it be using an
>     older version, e.g. PROJ 5.x?
>
>     is that GEOS *does* have its own PROJ version, and my non-updated
>     original installation of GEOS 3.8.0 is using a lower than PROJ 6
>     version, while the GEOS 3.9.0 is on 6 or higher. Anyone can answer
>     the above?
>
>     Marco
>
>     Op 26-4-2021 om 15:52 schreef Marco Boeringa:
>>
>>     Hi Raúl,
>>
>>     "Does the slowdown also appears if you only do the casting to
>>     geography (no ST_Area)?"
>>
>>     I was in the process of restoring a VM and restarting a process.
>>     Answering this question will require me to stop that process and
>>     restore the VM again from backup, which is quite time consuming.
>>     It will therefor take some time to answer this.
>>
>>     "When you update GEOS, are you updating any other package at the
>>     same time?"
>>
>>     I only selected 'libgeos-3.9.0' from the Synaptic Package
>>     Manager. I can't remember if it automatically selected other
>>     dependencies of that package to update as well, but I certainly
>>     didn't select anything else myself at the same time. You probably
>>     have a better idea of what dependencies might have been updated
>>     in other to support 'libgeos-3.9.0'.
>>
>>     I do have one question for you as well:
>>
>>     * Is GEOS dependency on PROJ dynamic like PostGIS, or is it baked
>>     in? What I mean is, if I see 'SELECT postgis_full_version' return
>>     'PROJ="6.3.1"' as it does in my installation, is GEOS / libgeos
>>     then using the exact same version 6.3.1, or might it be using an
>>     older version, e.g. PROJ 5.x?
>>
>>     Regards,
>>
>>     Marco
>>
>>     Op 26-4-2021 om 14:42 schreef Raúl Marín:
>>>     Hi Marco,
>>>
>>>     Two questions:
>>>     * Does the slowdown also appears if you only do the casting to
>>>     geography (no ST_Area)?
>>>     * When you update GEOS, are you updating any other package at
>>>     the same time?
>>>
>>>     Regards,
>>>     Raúl.
>>>
>>>
>>>     On 2021-04-26 12:03, Marco Boeringa wrote:
>>>>
>>>>     Hi all,
>>>>
>>>>     I am seeing a very significant performance regression (104x)
>>>>     when attempting to calculate geodesic area using PostGIS after
>>>>     upgrading GEOS from 3.8.0 to 3.9.0:
>>>>
>>>>     POSTGIS 3.1.1 / GEOS 3.8.0: 1m52s
>>>>     POSTGIS 3.1.1 / GEOS 3.9.0: 3h(!)14m45s
>>>>
>>>>     This makes it impossible for me to process planet level
>>>>     OpenStreetMap data, and I had to restore a backup to get back
>>>>     to a usable situation.
>>>>
>>>>     The actual code run is nothing more than what the PostGIS
>>>>     manual shows:
>>>>
>>>>     "float ST_Area(geography geog, boolean use_spheroid=true);"
>>>>
>>>>     Note that the OpenStreetMap data, as imported by osm2pgsql, is
>>>>     stored in PostGIS in WGS1984 / SRID 4326 as 'geometry', so in
>>>>     order to use it in 'ST_Area', the data is simply cast to
>>>>     'geography':
>>>>
>>>>     "ST_Area(way::geography,true)"
>>>>
>>>>     Nothing else happens in this part of the processing when the
>>>>     performance regression is visible. In fact, other more
>>>>     complicated parts of the code that call more advanced functions
>>>>     like 'ST_SimplifyVW' / 'ST_Transform' / 'ST_ChaikinSmoothing'
>>>>     run at expected / normal speed in GEOS 3.9.0 compared to 3.8.0,
>>>>     so it is not an overall slower processing that is happening.
>>>>
>>>>     The postgis_full_version results for these setups are:
>>>>
>>>>     - POSTGRESQL 13.2 / POSTGIS 3.1.1 / GEOS 3.8.0:
>>>>
>>>>     PostGIS version: POSTGIS="3.1.1 aaf4c79" [EXTENSION]
>>>>     PGSQL="130" GEOS="3.8.0-CAPI-1.13.1 " PROJ="6.3.1"
>>>>     LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3"
>>>>     WAGYU="0.5.0 (Internal)"
>>>>
>>>>      - POSTGRESQL 13.2 / POSTGIS 3.1.1 / GEOS 3.9.0:
>>>>
>>>>     PostGIS version: POSTGIS="3.1.1 aaf4c79" [EXTENSION]
>>>>     PGSQL="130" GEOS="3.9.0-CAPI-1.16.2" PROJ="6.3.1"
>>>>     LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3"
>>>>     WAGYU="0.5.0 (Internal)"
>>>>
>>>>     I am not entirely sure why 'postgis_full_version' is showing
>>>>     'PGSQL="130"' instead of 'PGSQL="132"', as I am definitely on
>>>>     13.2, 'SELECT version()' returns:
>>>>
>>>>     PostgreSQL 13.2 (Ubuntu 13.2-1.pgdg20.04+1) on
>>>>     x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
>>>>     9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
>>>>
>>>>     Note that the installation of PostgreSQL / PostGIS is based on
>>>>     the plain vanilla one as available from the official apt
>>>>     repository for PostgreSQL:
>>>>
>>>>     http://apt.postgresql.org/pub/repos/apt/
>>>>     <http://apt.postgresql.org/pub/repos/apt/>
>>>>
>>>>     while GEOS was updated to 3.9.0 from the UbuntuGIS repository:
>>>>
>>>>     https://launchpad.net/~ubuntugis/+archive/ppa/
>>>>     <https://launchpad.net/~ubuntugis/+archive/ppa/>
>>>>
>>>>     *** STEPS to reproduce ***
>>>>
>>>>     - Load some OpenStreetMap as WGS1984 / SRID 4326 using osm2pgsql
>>>>
>>>>     - Ensure you have the default POSTGRESQL 13.2 / POSTGIS 3.1.1 /
>>>>     GEOS 3.8.0 version installed as from the official repository
>>>>
>>>>     - Create a column to store the geodesic area data
>>>>
>>>>     - UPDATE the column using "ST_Area(way::geography,true)" and
>>>>     note timing
>>>>
>>>>     - Now update GEOS to 3.9.0 from the UbuntuGIS repository
>>>>
>>>>     - Ensure GEOS is properly at 3.9.0 by calling 'SELECT
>>>>     postgis_full_version()'
>>>>
>>>>     - Run UPDATE a second time with "ST_Area(way::geography,true)"
>>>>     and again note timing
>>>>
>>>>     This should show a major difference.
>>>>
>>>>     Marco
>>>>
>>>>
>>>>
>>>>
>>>>     _______________________________________________
>>>>     postgis-devel mailing list
>>>>     postgis-devel at lists.osgeo.org  <mailto:postgis-devel at lists.osgeo.org>
>>>>     https://lists.osgeo.org/mailman/listinfo/postgis-devel  <https://lists.osgeo.org/mailman/listinfo/postgis-devel>
>>>
>>>
>>>
>>>     _______________________________________________
>>>     postgis-devel mailing list
>>>     postgis-devel at lists.osgeo.org  <mailto:postgis-devel at lists.osgeo.org>
>>>     https://lists.osgeo.org/mailman/listinfo/postgis-devel  <https://lists.osgeo.org/mailman/listinfo/postgis-devel>
>>
>>     _______________________________________________
>>     postgis-devel mailing list
>>     postgis-devel at lists.osgeo.org  <mailto:postgis-devel at lists.osgeo.org>
>>     https://lists.osgeo.org/mailman/listinfo/postgis-devel  <https://lists.osgeo.org/mailman/listinfo/postgis-devel>
>     _______________________________________________
>     postgis-devel mailing list
>     postgis-devel at lists.osgeo.org <mailto:postgis-devel at lists.osgeo.org>
>     https://lists.osgeo.org/mailman/listinfo/postgis-devel
>     <https://lists.osgeo.org/mailman/listinfo/postgis-devel>
>
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20210427/e79c2a34/attachment-0001.html>


More information about the postgis-devel mailing list