[postgis-devel] Much slower processing on GEOS 3.9.0 versus 3.8.0 for geodesic area calculation
Marco Boeringa
marco at boeringa.demon.nl
Tue Apr 27 02:17:31 PDT 2021
Yes, my time is also finite, and I agree things must be as efficient and
clear as possible.
However, that also requires a bit of introductory back-and-forth
questioning. Everybody knows the story of the help desk employee that
spends several hours attempting to "fix" an issue with a non-functional
printer, only to discover the client on the phone forgot to put the plug
in the socket... Lets first determine we've put the
"plug-in-the-socket", which requires the kind of targeted questioning
Raúl did.
And I already cut the problem down to two very clear things:
- Issue started after upgrading to GEOS 3.9.0 from 3.8.0, but I need to
verify if any (dependent) packages were updated as well in the process
per Raúl's question. Job to be done by me for which I will report back,
although I think it likely that GEOS maintainers should be able to
answer this question at least partly themselves: were any dependencies
changed to new versions to support the GEOS 3.9.0 release?
- ST_Area(), as a very basic function, shows the issue.
This combination may make little sense, but I have now reproduced it
twice. I wish I could turn water in wine, but this is what it is, so
this is at least the starting point of a possible solution to the
question of what is wrong.
One question that still is on my mind, is if we can exclude an issue
caused by the PostgreSQL / PostGIS 3.1.1 installation. In a previous
issue I posted to the PostGIS mailing list recently, it became clear the
official 3.1.1 PostGIS install of
http://apt.postgresql.org/pub/repos/apt/
<http://apt.postgresql.org/pub/repos/apt/> has *not* been compiled with
GEOS 3.9.0 support. As Paul wrote:
"It's curious, but the implication is that the postgis on your machine
was built against an earlier GEOS. The version number that postgis
reports for GEOS is generated at run-time, while the test of whether to
include support for various functions is determined at compile-time.
Hence you can see a "GEOS 3.9" version in your postgis, but still not
have access to functions that are 3.9+ only."
This was a response to a message by me where I received an error message
attempting to use a new PostGIS 3.1.1 specific function
("ST_ReducePrecision"), that indicated this lack of GEOS 3.9.0
compilation. Is that a potential source of more problems than just not
being able to access a new function definition? Are there any other
potential knock-on effects of this, or can we exclude this as being of
any relevance to this new performance regression issue?
Marco
Op 26-4-2021 om 17:41 schreef Paul Ramsey:
> So, to be clearer about how to interface with -devel with your issues:
>
> - data, in a dump file, as small as possible to demonstrate the issue,
> preferably just one table, generated with pg_dump -d database -t table
> --no-owner, ideally small enough to attach to a trac ticket
> - the ideal size of data is "no data at all", many bugs can be reduced
> to one geometry that can be part of the SQL call that exercises the
> condition
> - SQL, in a sql file, again as small as possible, one statement
> ideally, with as few functions as possible
>
> The goal here is to allow a dev to copy your SQL and drop it into a
> console in one cut-paste step, because given our finite time base,
> spending a lot of time trying to replicate a situation and then
> getting a "null result" where everything still works fine is hugely
> annoying. I'm not going to attempt to replicate your setup as
> described because there are way too many degrees of freedom, I'm
> almost guaranteed to get a non replicable situation, and have wasted
> whatever time I invest in it.
>
> People who invest the time to reduce their problem to a cut'paste
> example get responses and fixes surprisingly fast. Those who don't
> wonder why OSS developers are such standoffish bastards. There's a
> continuum of quality between "it's broken" and "here's a one line SQL
> statement that does exactly this thing which is not what is supposed
> to happen", but the closer you get to the far end, the faster and
> better things will go. I recognize that regressions and multi-version
> issues are the hardest, but here too time reducing the problem to the
> minimum degrees of freedom will result in attention.
>
> In answer to your question: GEOS has no direct linkage to proj, and it
> makes little sense that ST_Area(geometry::geography) would be
> sensitive to GEOS version, as neither (the function nor the cast)
> enters the GEOS code line at all. (They do, however, both enter the
> proj code line.)
>
> P.
>
>
> On Mon, Apr 26, 2021 at 8:11 AM Marco Boeringa
> <marco at boeringa.demon.nl <mailto:marco at boeringa.demon.nl>> wrote:
>
> One more thing to note related to this issue is that the UPDATE
> taking place is in a high multi-threaded application, with
> individual UPDATE statements for each record, however batched in
> hundreds or thousands of records inside a single transaction.
>
> There is a known issue with PROJ 6+ where creating a
> transformation object is considerably more costly than in PROJ <=5
> (https://trac.osgeo.org/postgis/ticket/4372
> <https://trac.osgeo.org/postgis/ticket/4372>), but both
> installations I tested for GEOS 3.8.0 and GEOS 3.9.0 are running
> PROJ 6.3.1, as the 'SELECT postgis_full_version()' showed,
> *unless* the answer to my last question:
>
> * Is GEOS dependency on PROJ dynamic like PostGIS, or is it baked
> in? What I mean is, if I see 'SELECT postgis_full_version' return
> 'PROJ="6.3.1"' as it does in my installation, is GEOS / libgeos
> then using the exact same version 6.3.1, or might it be using an
> older version, e.g. PROJ 5.x?
>
> is that GEOS *does* have its own PROJ version, and my non-updated
> original installation of GEOS 3.8.0 is using a lower than PROJ 6
> version, while the GEOS 3.9.0 is on 6 or higher. Anyone can answer
> the above?
>
> Marco
>
> Op 26-4-2021 om 15:52 schreef Marco Boeringa:
>>
>> Hi Raúl,
>>
>> "Does the slowdown also appears if you only do the casting to
>> geography (no ST_Area)?"
>>
>> I was in the process of restoring a VM and restarting a process.
>> Answering this question will require me to stop that process and
>> restore the VM again from backup, which is quite time consuming.
>> It will therefor take some time to answer this.
>>
>> "When you update GEOS, are you updating any other package at the
>> same time?"
>>
>> I only selected 'libgeos-3.9.0' from the Synaptic Package
>> Manager. I can't remember if it automatically selected other
>> dependencies of that package to update as well, but I certainly
>> didn't select anything else myself at the same time. You probably
>> have a better idea of what dependencies might have been updated
>> in other to support 'libgeos-3.9.0'.
>>
>> I do have one question for you as well:
>>
>> * Is GEOS dependency on PROJ dynamic like PostGIS, or is it baked
>> in? What I mean is, if I see 'SELECT postgis_full_version' return
>> 'PROJ="6.3.1"' as it does in my installation, is GEOS / libgeos
>> then using the exact same version 6.3.1, or might it be using an
>> older version, e.g. PROJ 5.x?
>>
>> Regards,
>>
>> Marco
>>
>> Op 26-4-2021 om 14:42 schreef Raúl Marín:
>>> Hi Marco,
>>>
>>> Two questions:
>>> * Does the slowdown also appears if you only do the casting to
>>> geography (no ST_Area)?
>>> * When you update GEOS, are you updating any other package at
>>> the same time?
>>>
>>> Regards,
>>> Raúl.
>>>
>>>
>>> On 2021-04-26 12:03, Marco Boeringa wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I am seeing a very significant performance regression (104x)
>>>> when attempting to calculate geodesic area using PostGIS after
>>>> upgrading GEOS from 3.8.0 to 3.9.0:
>>>>
>>>> POSTGIS 3.1.1 / GEOS 3.8.0: 1m52s
>>>> POSTGIS 3.1.1 / GEOS 3.9.0: 3h(!)14m45s
>>>>
>>>> This makes it impossible for me to process planet level
>>>> OpenStreetMap data, and I had to restore a backup to get back
>>>> to a usable situation.
>>>>
>>>> The actual code run is nothing more than what the PostGIS
>>>> manual shows:
>>>>
>>>> "float ST_Area(geography geog, boolean use_spheroid=true);"
>>>>
>>>> Note that the OpenStreetMap data, as imported by osm2pgsql, is
>>>> stored in PostGIS in WGS1984 / SRID 4326 as 'geometry', so in
>>>> order to use it in 'ST_Area', the data is simply cast to
>>>> 'geography':
>>>>
>>>> "ST_Area(way::geography,true)"
>>>>
>>>> Nothing else happens in this part of the processing when the
>>>> performance regression is visible. In fact, other more
>>>> complicated parts of the code that call more advanced functions
>>>> like 'ST_SimplifyVW' / 'ST_Transform' / 'ST_ChaikinSmoothing'
>>>> run at expected / normal speed in GEOS 3.9.0 compared to 3.8.0,
>>>> so it is not an overall slower processing that is happening.
>>>>
>>>> The postgis_full_version results for these setups are:
>>>>
>>>> - POSTGRESQL 13.2 / POSTGIS 3.1.1 / GEOS 3.8.0:
>>>>
>>>> PostGIS version: POSTGIS="3.1.1 aaf4c79" [EXTENSION]
>>>> PGSQL="130" GEOS="3.8.0-CAPI-1.13.1 " PROJ="6.3.1"
>>>> LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3"
>>>> WAGYU="0.5.0 (Internal)"
>>>>
>>>> - POSTGRESQL 13.2 / POSTGIS 3.1.1 / GEOS 3.9.0:
>>>>
>>>> PostGIS version: POSTGIS="3.1.1 aaf4c79" [EXTENSION]
>>>> PGSQL="130" GEOS="3.9.0-CAPI-1.16.2" PROJ="6.3.1"
>>>> LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3"
>>>> WAGYU="0.5.0 (Internal)"
>>>>
>>>> I am not entirely sure why 'postgis_full_version' is showing
>>>> 'PGSQL="130"' instead of 'PGSQL="132"', as I am definitely on
>>>> 13.2, 'SELECT version()' returns:
>>>>
>>>> PostgreSQL 13.2 (Ubuntu 13.2-1.pgdg20.04+1) on
>>>> x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
>>>> 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
>>>>
>>>> Note that the installation of PostgreSQL / PostGIS is based on
>>>> the plain vanilla one as available from the official apt
>>>> repository for PostgreSQL:
>>>>
>>>> http://apt.postgresql.org/pub/repos/apt/
>>>> <http://apt.postgresql.org/pub/repos/apt/>
>>>>
>>>> while GEOS was updated to 3.9.0 from the UbuntuGIS repository:
>>>>
>>>> https://launchpad.net/~ubuntugis/+archive/ppa/
>>>> <https://launchpad.net/~ubuntugis/+archive/ppa/>
>>>>
>>>> *** STEPS to reproduce ***
>>>>
>>>> - Load some OpenStreetMap as WGS1984 / SRID 4326 using osm2pgsql
>>>>
>>>> - Ensure you have the default POSTGRESQL 13.2 / POSTGIS 3.1.1 /
>>>> GEOS 3.8.0 version installed as from the official repository
>>>>
>>>> - Create a column to store the geodesic area data
>>>>
>>>> - UPDATE the column using "ST_Area(way::geography,true)" and
>>>> note timing
>>>>
>>>> - Now update GEOS to 3.9.0 from the UbuntuGIS repository
>>>>
>>>> - Ensure GEOS is properly at 3.9.0 by calling 'SELECT
>>>> postgis_full_version()'
>>>>
>>>> - Run UPDATE a second time with "ST_Area(way::geography,true)"
>>>> and again note timing
>>>>
>>>> This should show a major difference.
>>>>
>>>> Marco
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> postgis-devel mailing list
>>>> postgis-devel at lists.osgeo.org <mailto:postgis-devel at lists.osgeo.org>
>>>> https://lists.osgeo.org/mailman/listinfo/postgis-devel <https://lists.osgeo.org/mailman/listinfo/postgis-devel>
>>>
>>>
>>>
>>> _______________________________________________
>>> postgis-devel mailing list
>>> postgis-devel at lists.osgeo.org <mailto:postgis-devel at lists.osgeo.org>
>>> https://lists.osgeo.org/mailman/listinfo/postgis-devel <https://lists.osgeo.org/mailman/listinfo/postgis-devel>
>>
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at lists.osgeo.org <mailto:postgis-devel at lists.osgeo.org>
>> https://lists.osgeo.org/mailman/listinfo/postgis-devel <https://lists.osgeo.org/mailman/listinfo/postgis-devel>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org <mailto:postgis-devel at lists.osgeo.org>
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
> <https://lists.osgeo.org/mailman/listinfo/postgis-devel>
>
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20210427/e79c2a34/attachment-0001.html>
More information about the postgis-devel
mailing list