[geos-devel] [GEOS] #997: ST_Union performance degradation from GEOS 3.6.x -> 3.7.x

GEOS geos-trac at osgeo.org
Thu Oct 10 08:16:45 PDT 2019


#997: ST_Union performance degradation from GEOS 3.6.x -> 3.7.x
-------------------------+--------------------------
 Reporter:  greenlaw     |      Owner:  geos-devel@…
     Type:  defect       |     Status:  new
 Priority:  major        |  Milestone:
Component:  Default      |    Version:  3.7.0
 Severity:  Significant  |   Keywords:  union
-------------------------+--------------------------
 I [https://lists.osgeo.org/pipermail/postgis-
 users/2019-October/043688.html posted] about this earlier this week on the
 postgis-users mailing list.

 I've encountered a severe performance issue with GEOS >=3.7.0 when calling
 `ST_Union(ST_Buffer(...))` on some large polygons (EPSG:3857) with lots of
 vertices.  I've been testing primarily on PostgreSQL 9.5 but have
 reproduced the issue with multiple versions of PostGIS (2.2 to 2.5).

 This appears to be a separate issue from #867. That ticket references a
 recent JTS performance enhancement which was ported to GEOS and merged to
 master, but I tested with the latest version (which contains that fix),
 3.8.0rc3, and it didn't make a difference (actually performance was
 worse).

 With PostgreSQL 9.5, PostGIS 2.3.7, and GEOS 3.6.4, my query takes ~100
 sec to run, but after upgrading GEOS to 3.7.0, the query duration balloons
 to ~70 minutes. With 3.8.0rc3, duration again increased (to ~147 minutes).
 Exact timings for different version combinations is pasted below.

 I will attach a pgdump of the table (21 rows, 23mb uncompressed).

 Steps to reproduce:

 1. Have PostgreSQL 9.5 instance with PostGIS 2.2 - 2.5 and GEOS >= 3.7.0

 2. Import the pgdump

 {{{
 psql -h hostname -p port -U postgres -d dbname -f
 pgdump_nws_haz_hang_201910.sql
 }}}

 3. Execute the query

 {{{
 EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
 100.0))::geometry(Geometry,
 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
 public.nws_haz_hang_201910 GROUP BY
 category,vteccode,prod_type,validtime,starttime,endtime;
 }}}

 The query will likely take between 1 - 3 hours to complete on most
 systems.


 Timings for specific version combinations are pasted below.

 3.6.1 = ~85 sec:
 {{{
 POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.6.1-CAPI-1.10.1 r0" PROJ="Rel.
 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
 LIBXML="2.9.1" LIBJSON="0.11" RASTER

 EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
 100.0))::geometry(Geometry,
 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
 public.nws_haz_hang_201910 GROUP BY
 category,vteccode,prod_type,validtime,starttime,endtime;
                                                        QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=10.30..10.45 rows=10 width=706) (actual
 time=35922.055..85310.959 rows=5 loops=1)
    Group Key: category, vteccode, prod_type, validtime, starttime, endtime
    ->  Seq Scan on nws_haz_hang_201910  (cost=0.00..10.10 rows=10
 width=706) (actual time=0.014..0.053 rows=21 loops=1)
  Planning time: 0.460 ms
  Execution time: 85314.516 ms
 }}}

 3.6.2 = ~90 sec:
 {{{
 POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.6.2-CAPI-1.10.2 4d2925d6"
 PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
 LIBXML="2.9.1" LIBJSON="0.11" RASTER

 EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
 100.0))::geometry(Geometry,
 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
 public.nws_haz_hang_201910 GROUP BY
 category,vteccode,prod_type,validtime,starttime,endtime;
                                                        QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=10.30..10.45 rows=10 width=706) (actual
 time=34092.861..90393.216 rows=5 loops=1)
    Group Key: category, vteccode, prod_type, validtime, starttime, endtime
    ->  Seq Scan on nws_haz_hang_201910  (cost=0.00..10.10 rows=10
 width=706) (actual time=0.012..0.047 rows=21 loops=1)
  Planning time: 0.315 ms
  Execution time: 90393.369 ms
 }}}

 3.6.3 = ~89 sec:
 {{{
 POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.6.3-CAPI-1.10.3 80c13047"
 PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
 LIBXML="2.9.1" LIBJSON="0.11" RASTER

 EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
 100.0))::geometry(Geometry,
 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
 public.nws_haz_hang_201910 GROUP BY
 category,vteccode,prod_type,validtime,starttime,endtime;
                                                        QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=10.30..10.45 rows=10 width=706) (actual
 time=33833.006..88916.065 rows=5 loops=1)
    Group Key: category, vteccode, prod_type, validtime, starttime, endtime
    ->  Seq Scan on nws_haz_hang_201910  (cost=0.00..10.10 rows=10
 width=706) (actual time=0.011..0.047 rows=21 loops=1)
  Planning time: 0.306 ms
  Execution time: 88916.182 ms
 }}}

 3.6.4 = ~100 sec:
 {{{
 POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.6.4-CAPI-1.10.4 ba90ca5"
 PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
 LIBXML="2.9.1" LIBJSON="0.11" RASTER

 EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
 100.0))::geometry(Geometry,
 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
 public.nws_haz_hang_201910 GROUP BY
 category,vteccode,prod_type,validtime,starttime,endtime;
                                                        QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=10.30..10.45 rows=10 width=706) (actual
 time=35376.572..99728.566 rows=5 loops=1)
    Group Key: category, vteccode, prod_type, validtime, starttime, endtime
    ->  Seq Scan on nws_haz_hang_201910  (cost=0.00..10.10 rows=10
 width=706) (actual time=0.024..0.052 rows=21 loops=1)
  Planning time: 0.408 ms
  Execution time: 99728.803 ms
 }}}

 3.7.0 = ~70 min:
 {{{
 POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.7.0-CAPI-1.11.0 673b9939"
 PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
 LIBXML="2.9.1" LIBJSON="0.11" RASTER

 EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
 100.0))::geometry(Geometry,
 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
 public.nws_haz_hang_201910 GROUP BY
 category,vteccode,prod_type,validtime,starttime,endtime;
                                                         QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=10.30..10.45 rows=10 width=706) (actual
 time=30353.481..4204487.862 rows=5 loops=1)
    Group Key: category, vteccode, prod_type, validtime, starttime, endtime
    ->  Seq Scan on nws_haz_hang_201910  (cost=0.00..10.10 rows=10
 width=706) (actual time=0.017..0.053 rows=21 loops=1)
  Planning time: 0.362 ms
  Execution time: 4204488.069 ms
 }}}

 3.7.1 = ~72 min:
 {{{
 POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.7.1-CAPI-1.11.1 27a5e771"
 PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
 LIBXML="2.9.1" LIBJSON="0.11" RASTER

 EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
 100.0))::geometry(Geometry,
 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
 public.nws_haz_hang_201910 GROUP BY
 category,vteccode,prod_type,validtime,starttime,endtime;
                                                        QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=10.30..10.45 rows=10 width=706) (actual
 time=31342.188..4323753.205 rows=5 loops=1)
    Group Key: category, vteccode, prod_type, validtime, starttime, endtime
    ->  Seq Scan on nws_haz_hang_201910  (cost=0.00..10.10 rows=10
 width=706) (actual time=0.021..0.055 rows=21 loops=1)
  Planning time: 0.446 ms
  Execution time: 4323755.076 ms
 }}}

 3.7.2 = ~70 min:
 {{{
 POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.7.2-CAPI-1.11.2 b55d2125"
 PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25"
 LIBXML="2.9.1" LIBJSON="0.11" RASTER

 EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
 100.0))::geometry(Geometry,
 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
 public.nws_haz_hang_201910 GROUP BY
 category,vteccode,prod_type,validtime,starttime,endtime;
                                                        QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=10.30..10.45 rows=10 width=706) (actual
 time=33643.593..4195304.854 rows=5 loops=1)
    Group Key: category, vteccode, prod_type, validtime, starttime, endtime
    ->  Seq Scan on nws_haz_hang_201910  (cost=0.00..10.10 rows=10
 width=706) (actual time=0.019..0.052 rows=21 loops=1)
  Planning time: 0.374 ms
  Execution time: 4195305.021 ms
 }}}

 3.8.0rc3 = ~147 min:
 {{{
 POSTGIS="2.5.3 r17699" [EXTENSION] PGSQL="95" GEOS="3.8.0rc3-CAPI-1.13.1 "
 PROJ="Rel. 6.2.0, September 1st, 2019" GDAL="GDAL 3.0.1, released
 2019/06/28" LIBXML="2.9.1" LIBJSON="0.11" LIBPROTOBUF="1.0.2" RASTER

 EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
 100.0))::geometry(Geometry,
 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
 public.nws_haz_hang_201910 GROUP BY
 category,vteccode,prod_type,validtime,starttime,endtime;
                                                        QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=10.30..10.45 rows=10 width=706) (actual
 time=42941.634..8834235.340 rows=5 loops=1)
    Group Key: category, vteccode, prod_type, validtime, starttime, endtime
    ->  Seq Scan on nws_haz_hang_201910  (cost=0.00..10.10 rows=10
 width=706) (actual time=0.016..0.048 rows=21 loops=1)
  Planning time: 0.198 ms
  Execution time: 8834235.446 ms
 }}}

-- 
Ticket URL: <https://trac.osgeo.org/geos/ticket/997>
GEOS <http://trac.osgeo.org/geos>
GEOS (Geometry Engine - Open Source) is a C++ port of the Java Topology Suite (JTS).


More information about the geos-devel mailing list