[geos-devel] [GEOS] #997: ST_Union performance degradation from GEOS 3.6.x -> 3.7.x
GEOS
geos-trac at osgeo.org
Thu Oct 10 08:16:45 PDT 2019
#997: ST_Union performance degradation from GEOS 3.6.x -> 3.7.x
-------------------------+--------------------------
Reporter: greenlaw | Owner: geos-devel@…
Type: defect | Status: new
Priority: major | Milestone:
Component: Default | Version: 3.7.0
Severity: Significant | Keywords: union
-------------------------+--------------------------
I [https://lists.osgeo.org/pipermail/postgis-
users/2019-October/043688.html posted] about this earlier this week on the
postgis-users mailing list.
I've encountered a severe performance issue with GEOS >=3.7.0 when calling
`ST_Union(ST_Buffer(...))` on some large polygons (EPSG:3857) with lots of
vertices. I've been testing primarily on PostgreSQL 9.5 but have
reproduced the issue with multiple versions of PostGIS (2.2 to 2.5).
This appears to be a separate issue from #867. That ticket references a
recent JTS performance enhancement which was ported to GEOS and merged to
master, but I tested with the latest version (which contains that fix),
3.8.0rc3, and it didn't make a difference (actually performance was
worse).
With PostgreSQL 9.5, PostGIS 2.3.7, and GEOS 3.6.4, my query takes ~100
sec to run, but after upgrading GEOS to 3.7.0, the query duration balloons
to ~70 minutes. With 3.8.0rc3, duration again increased (to ~147 minutes).
Exact timings for different version combinations is pasted below.
I will attach a pgdump of the table (21 rows, 23mb uncompressed).
Steps to reproduce:
1. Have PostgreSQL 9.5 instance with PostGIS 2.2 - 2.5 and GEOS >= 3.7.0
2. Import the pgdump
{{{
psql -h hostname -p port -U postgres -d dbname -f
pgdump_nws_haz_hang_201910.sql
}}}
3. Execute the query
{{{
EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
100.0))::geometry(Geometry,
3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
public.nws_haz_hang_201910 GROUP BY
category,vteccode,prod_type,validtime,starttime,endtime;
}}}
The query will likely take between 1 - 3 hours to complete on most
systems.
Timings for specific version combinations are pasted below.
3.6.1 = ~85 sec:
{{{
POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.6.1-CAPI-1.10.1 r0" PROJ="Rel.
4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
LIBXML="2.9.1" LIBJSON="0.11" RASTER
EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
100.0))::geometry(Geometry,
3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
public.nws_haz_hang_201910 GROUP BY
category,vteccode,prod_type,validtime,starttime,endtime;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual
time=35922.055..85310.959 rows=5 loops=1)
Group Key: category, vteccode, prod_type, validtime, starttime, endtime
-> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10
width=706) (actual time=0.014..0.053 rows=21 loops=1)
Planning time: 0.460 ms
Execution time: 85314.516 ms
}}}
3.6.2 = ~90 sec:
{{{
POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.6.2-CAPI-1.10.2 4d2925d6"
PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
LIBXML="2.9.1" LIBJSON="0.11" RASTER
EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
100.0))::geometry(Geometry,
3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
public.nws_haz_hang_201910 GROUP BY
category,vteccode,prod_type,validtime,starttime,endtime;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual
time=34092.861..90393.216 rows=5 loops=1)
Group Key: category, vteccode, prod_type, validtime, starttime, endtime
-> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10
width=706) (actual time=0.012..0.047 rows=21 loops=1)
Planning time: 0.315 ms
Execution time: 90393.369 ms
}}}
3.6.3 = ~89 sec:
{{{
POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.6.3-CAPI-1.10.3 80c13047"
PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
LIBXML="2.9.1" LIBJSON="0.11" RASTER
EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
100.0))::geometry(Geometry,
3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
public.nws_haz_hang_201910 GROUP BY
category,vteccode,prod_type,validtime,starttime,endtime;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual
time=33833.006..88916.065 rows=5 loops=1)
Group Key: category, vteccode, prod_type, validtime, starttime, endtime
-> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10
width=706) (actual time=0.011..0.047 rows=21 loops=1)
Planning time: 0.306 ms
Execution time: 88916.182 ms
}}}
3.6.4 = ~100 sec:
{{{
POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.6.4-CAPI-1.10.4 ba90ca5"
PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
LIBXML="2.9.1" LIBJSON="0.11" RASTER
EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
100.0))::geometry(Geometry,
3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
public.nws_haz_hang_201910 GROUP BY
category,vteccode,prod_type,validtime,starttime,endtime;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual
time=35376.572..99728.566 rows=5 loops=1)
Group Key: category, vteccode, prod_type, validtime, starttime, endtime
-> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10
width=706) (actual time=0.024..0.052 rows=21 loops=1)
Planning time: 0.408 ms
Execution time: 99728.803 ms
}}}
3.7.0 = ~70 min:
{{{
POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.7.0-CAPI-1.11.0 673b9939"
PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
LIBXML="2.9.1" LIBJSON="0.11" RASTER
EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
100.0))::geometry(Geometry,
3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
public.nws_haz_hang_201910 GROUP BY
category,vteccode,prod_type,validtime,starttime,endtime;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual
time=30353.481..4204487.862 rows=5 loops=1)
Group Key: category, vteccode, prod_type, validtime, starttime, endtime
-> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10
width=706) (actual time=0.017..0.053 rows=21 loops=1)
Planning time: 0.362 ms
Execution time: 4204488.069 ms
}}}
3.7.1 = ~72 min:
{{{
POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.7.1-CAPI-1.11.1 27a5e771"
PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
LIBXML="2.9.1" LIBJSON="0.11" RASTER
EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
100.0))::geometry(Geometry,
3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
public.nws_haz_hang_201910 GROUP BY
category,vteccode,prod_type,validtime,starttime,endtime;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual
time=31342.188..4323753.205 rows=5 loops=1)
Group Key: category, vteccode, prod_type, validtime, starttime, endtime
-> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10
width=706) (actual time=0.021..0.055 rows=21 loops=1)
Planning time: 0.446 ms
Execution time: 4323755.076 ms
}}}
3.7.2 = ~70 min:
{{{
POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.7.2-CAPI-1.11.2 b55d2125"
PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
LIBXML="2.9.1" LIBJSON="0.11" RASTER
EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
100.0))::geometry(Geometry,
3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
public.nws_haz_hang_201910 GROUP BY
category,vteccode,prod_type,validtime,starttime,endtime;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual
time=33643.593..4195304.854 rows=5 loops=1)
Group Key: category, vteccode, prod_type, validtime, starttime, endtime
-> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10
width=706) (actual time=0.019..0.052 rows=21 loops=1)
Planning time: 0.374 ms
Execution time: 4195305.021 ms
}}}
3.8.0rc3 = ~147 min:
{{{
POSTGIS="2.5.3 r17699" [EXTENSION] PGSQL="95" GEOS="3.8.0rc3-CAPI-1.13.1 "
PROJ="Rel. 6.2.0, September 1st, 2019" GDAL="GDAL 3.0.1, released
2019/06/28" LIBXML="2.9.1" LIBJSON="0.11" LIBPROTOBUF="1.0.2" RASTER
EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
100.0))::geometry(Geometry,
3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
public.nws_haz_hang_201910 GROUP BY
category,vteccode,prod_type,validtime,starttime,endtime;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual
time=42941.634..8834235.340 rows=5 loops=1)
Group Key: category, vteccode, prod_type, validtime, starttime, endtime
-> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10
width=706) (actual time=0.016..0.048 rows=21 loops=1)
Planning time: 0.198 ms
Execution time: 8834235.446 ms
}}}
--
Ticket URL: <https://trac.osgeo.org/geos/ticket/997>
GEOS <http://trac.osgeo.org/geos>
GEOS (Geometry Engine - Open Source) is a C++ port of the Java Topology Suite (JTS).
More information about the geos-devel
mailing list