<div dir="ltr">I've run into an odd performance issue calling ST_Union with some large polygons on RHEL7/CentOS7 systems, but I'm not able to reproduce it using the <a href="https://hub.docker.com/r/mdillon/postgis/">mdillon/postgis</a> docker images. I'm using PostgreSQL 9.5 with PostGIS 2.5.3 (however problem also seems to occur with 2.2, 2.3, and 2.4 on RHEL/CentOS 7).<div><br></div><div>The problematic polygons represent detailed shoreline, so contain a lot of vertices. In trying to dissolve/simplify the geometry, I'm doing a preliminary ST_Union(ST_Buffer(...)) (before calling ST_Simplify). The issue appears to be with the ST_Union operation, as the buffer on its own completes in 1-2 minutes. The WKB geometry is stored in Web Mercator (EPSG:3857) projection.</div><div><br></div><div>A zipped pgdump of the table (21 rows) can be downloaded <a href="https://drive.google.com/open?id=1dmZigm0sJwE_QviUzXrWfkJBP9L3vypN">here</a> (~11mb compressed, ~23mb uncompressed). <br></div><div><br></div><div>Steps to reproduce:</div><div><br></div><div>1) Have a CentOS7 or RHEL7 PostgreSQL 9.5 instance with PostGIS 2.2, 2.3, 2.4, or 2.5</div><div> - I've tested with VMs built from scratch and with the <a href="https://github.com/CrunchyData/crunchy-containers">Crunchy Containers suite's</a> CentOS 7 images</div><div>2) Download & unzip the pgdump</div><div>3) Import the table</div><div> psql -h hostname -p port -U postgres -d dbname -f pgdump_nws_haz_hang_201910.sql</div><div>4) Run the query</div><div> psql
-h hostname -p port -U postgres -d dbname </div><div> dbname=> EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry, 100.0))::geometry(Geometry, 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM public.nws_haz_hang_201910 GROUP BY category,vteccode,prod_type,validtime,starttime,endtime;<br><div><br></div><div><br></div><div>Using the mdillon/postgis docker container (tested with PostGIS 2.5 on PG 9.5, 9.6, 10, 11) based on Debian, the command completes for me in about 3 minutes. Here's the EXPLAIN ANALYZE output from one of those successful docker runs with PG 11 & PostGIS 2.5.2 (also includes a WHERE clause that can be ignored as it matches all rows):</div><div><br></div></div><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><div>postgres=# EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry, 100.0))::geometry(Geometry, 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM public.nws_haz_hang_201910 WHERE vteccode='GLW' GROUP BY category,vteccode,prod_type,validtime,starttime,endtime;</div></div><div><div> QUERY PLAN </div></div><div><div>-----------------------------------------------------------------------------------------------------------------------------</div></div><div><div> GroupAggregate (cost=10.13..10.18 rows=1 width=706) (actual time=0.823..125929.139 rows=5 loops=1)</div></div><div><div> Group Key: category, vteccode, prod_type, validtime, starttime, endtime</div></div><div><div> -> Sort (cost=10.13..10.14 rows=1 width=706) (actual time=0.069..0.095 rows=21 loops=1)</div></div><div><div> Sort Key: category, prod_type, validtime, starttime, endtime</div></div><div><div> Sort Method: quicksort Memory: 30kB</div></div><div><div> -> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.12 rows=1 width=706) (actual time=0.009..0.020 rows=21 loops=1)</div></div><div><div> Filter: ((vteccode)::text = 'GLW'::text)</div></div><div><div> Planning Time: 0.176 ms</div></div><div><div> Execution Time: 125929.277 ms</div></div></blockquote><div><div><br></div><div>On my original RHEL7 system, the first time I tried to reproduce it, the command eventually completed for me in just under 3 hours, but I have since tried it multiple times on different CentOS7 and RHEL7 systems where it does not complete even after 2 days.</div><div><br></div><div>Version info from my latest reproduction test:</div><div><br></div><div> POSTGIS="2.5.3 r17699" [EXTENSION] PGSQL="95" GEOS="3.7.2-CAPI-1.11.2 b55d2125" PROJ="Rel. 6.2.0, September 1st, 2019" GDAL="GDAL 3.0.1, released 2019/06/28" LIBXML="2.9.1" LIBJSON="0.11" LIBPROTOBUF="1.0.2" RASTER</div><div><br></div><div>A few anecdotal stack traces seems to indicate it's spending most of the time in the following libgeos calls:</div><div><br></div></div><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><div>geos::operation::overlay::snap::LineStringSnapper::snapTo</div><div> -> geos::operation::overlay::snap::LineStringSnapper::snapVertices</div><div> -> geos::operation::overlay::snap::LineStringSnapper::findVertexToSnap</div><div> -> geos::operation::overlay::snap::LineStringSnapper::snapSegments</div><div> -> geos::operation::overlay::snap::LineStringSnapper::findSegmentToSnap</div></div><div> -> geos::algorithm::CGAlgorithms::distancePointLine</div></blockquote><div><div><br></div></div><div>A single full stack trace is also pasted below.</div><div><br></div><div>If anyone could provide any insight, it would be greatly appreciated!</div><div><br></div><div>Thanks,</div><div>Jason Greenlaw</div><div><br></div><div><br></div><div><br></div><div>-------------------------------</div><div>#0 0x00007fe977134773 in geos::operation::overlay::snap::LineStringSnapper::findVertexToSnap(geos::geom::Coordinate const&, std::_List_iterator<geos::geom::Coordinate>, std::_List_iterator<geos::geom::Coordinate>) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#1 0x00007fe97713485e in geos::operation::overlay::snap::LineStringSnapper::snapVertices(geos::geom::CoordinateList&, std::vector<geos::geom::Coordinate const*, std::allocator<geos::geom::Coordinate const*> > const&) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#2 0x00007fe9771352cf in geos::operation::overlay::snap::LineStringSnapper::snapTo(std::vector<geos::geom::Coordinate const*, std::allocator<geos::geom::Coordinate const*> > const&) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#3 0x00007fe9771338cd in geos::operation::overlay::snap::SnapTransformer::transformCoordinates(geos::geom::CoordinateSequence const*, geos::geom::Geometry const*) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#4 0x00007fe9770d8ab1 in geos::geom::util::GeometryTransformer::transformLinearRing(geos::geom::LinearRing const*, geos::geom::Geometry const*) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#5 0x00007fe9770d951a in geos::geom::util::GeometryTransformer::transformPolygon(geos::geom::Polygon const*, geos::geom::Geometry const*) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#6 0x00007fe9770d9288 in geos::geom::util::GeometryTransformer::transformMultiPolygon(geos::geom::MultiPolygon const*, geos::geom::Geometry const*) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#7 0x00007fe9770d8d8a in geos::geom::util::GeometryTransformer::transform(geos::geom::Geometry const*) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#8 0x00007fe9771334d6 in geos::operation::overlay::snap::GeometrySnapper::snapTo(geos::geom::Geometry const&, double) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#9 0x00007fe9770c7784 in std::unique_ptr<geos::geom::Geometry, std::default_delete<geos::geom::Geometry> > geos::geom::SnapOp<geos::operation::overlay::overlayOp>(geos::geom::Geometry const*, geos::geom::Geometry const*, geos::operation::overlay::overlayOp) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#10 0x00007fe9770c827f in std::unique_ptr<geos::geom::Geometry, std::default_delete<geos::geom::Geometry> > geos::geom::BinaryOp<geos::operation::overlay::overlayOp>(geos::geom::Geometry const*, geos::geom::Geometry const*, geos::operation::overlay::overlayOp) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#11 0x00007fe9770c6684 in geos::geom::Geometry::Union(geos::geom::Geometry const*) const () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#12 0x00007fe97714195a in geos::operation::geounion::CascadedPolygonUnion::unionActual(geos::geom::Geometry*, geos::geom::Geometry*) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#13 0x00007fe977141df5 in geos::operation::geounion::CascadedPolygonUnion::unionUsingEnvelopeIntersection(geos::geom::Geometry*, geos::geom::Geometry*, geos::geom::Envelope const&) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#14 0x00007fe9771420fa in geos::operation::geounion::CascadedPolygonUnion::unionOptimized(geos::geom::Geometry*, geos::geom::Geometry*) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#15 0x00007fe977142201 in geos::operation::geounion::CascadedPolygonUnion::binaryUnion(geos::operation::geounion::GeometryListHolder*, unsigned long, unsigned long) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#16 0x00007fe9771424b0 in geos::operation::geounion::CascadedPolygonUnion::unionTree(geos::index::strtree::ItemsList*) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#17 0x00007fe9771425f8 in geos::operation::geounion::CascadedPolygonUnion::Union() () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#18 0x00007fe977142829 in geos::operation::geounion::CascadedPolygonUnion::Union(std::vector<geos::geom::Polygon*, std::allocator<geos::geom::Polygon*> >*) () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#19 0x00007fe977143cc0 in geos::operation::geounion::UnaryUnionOp::Union() () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#20 0x00007fe9770c61ee in geos::geom::Geometry::Union() const () from /usr/geos37/lib64/<a href="http://libgeos-3.7.2.so">libgeos-3.7.2.so</a><br>#21 0x00007fe97872f84b in GEOSUnaryUnion_r () from /usr/geos37/lib64/libgeos_c.so.1<br>#22 0x00007fe978971cd7 in pgis_union_geometry_array () from /usr/pgsql-9.5/lib/<a href="http://postgis-2.5.so">postgis-2.5.so</a><br>#23 0x00007fe978963c10 in PGISDirectFunctionCall1 () from /usr/pgsql-9.5/lib/<a href="http://postgis-2.5.so">postgis-2.5.so</a><br>#24 0x00007fe978963c7d in pgis_geometry_union_finalfn () from /usr/pgsql-9.5/lib/<a href="http://postgis-2.5.so">postgis-2.5.so</a><br>#25 0x00000000005bede4 in finalize_aggregate ()<br>#26 0x00000000005bf089 in finalize_aggregates ()<br>#27 0x00000000005bf7dc in ExecAgg ()<br>#28 0x00000000005b35e8 in ExecProcNode ()<br>#29 0x00000000005b0670 in standard_ExecutorRun ()<br>#30 0x0000000000568853 in ExplainOnePlan ()<br>#31 0x0000000000568b45 in ExplainOneQuery ()<br>#32 0x0000000000568f4d in ExplainQuery ()<br>#33 0x00000000006b5109 in standard_ProcessUtility ()<br>#34 0x00000000006b2791 in PortalRunUtility ()<br>#35 0x00000000006b351d in FillPortalStore ()<br>#36 0x00000000006b3f6f in PortalRun ()<br>#37 0x00000000006b1bb3 in PostgresMain ()<br>#38 0x00000000004695f9 in ServerLoop ()<br>#39 0x0000000000659d67 in PostmasterMain ()<br>#40 0x000000000046a21e in main ()<br></div></div>