[postgis-users] ST_Union(ST_Buffer(...)) hangs for large polygons on RHEL7/CentOS7 / PG 9.5

Jason Greenlaw - NOAA Affiliate jason.greenlaw at noaa.gov
Tue Oct 8 09:04:43 PDT 2019


I've run into an odd performance issue calling ST_Union with some large
polygons on RHEL7/CentOS7 systems, but I'm not able to reproduce it using
the mdillon/postgis <https://hub.docker.com/r/mdillon/postgis/> docker
images.  I'm using PostgreSQL 9.5 with PostGIS 2.5.3  (however problem also
seems to occur with 2.2, 2.3, and 2.4 on RHEL/CentOS 7).

The problematic polygons represent detailed shoreline, so contain a lot of
vertices.  In trying to dissolve/simplify the geometry, I'm doing a
preliminary ST_Union(ST_Buffer(...))  (before calling ST_Simplify).  The
issue appears to be with the ST_Union operation, as the buffer on its own
completes in 1-2 minutes.  The WKB geometry is stored in Web Mercator
(EPSG:3857) projection.

A zipped pgdump of the table (21 rows) can be downloaded here
<https://drive.google.com/open?id=1dmZigm0sJwE_QviUzXrWfkJBP9L3vypN> (~11mb
compressed, ~23mb uncompressed).

Steps to reproduce:

1) Have a CentOS7 or RHEL7 PostgreSQL 9.5 instance with PostGIS 2.2, 2.3,
2.4, or 2.5
    - I've tested with VMs built from scratch and with the Crunchy
Containers suite's <https://github.com/CrunchyData/crunchy-containers> CentOS
7 images
2) Download & unzip the pgdump
3) Import the table
        psql -h hostname -p port -U postgres -d dbname -f
pgdump_nws_haz_hang_201910.sql
4) Run the query
        psql  -h hostname -p port -U postgres -d dbname
        dbname=> EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
100.0))::geometry(Geometry,
3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
public.nws_haz_hang_201910 GROUP BY
 category,vteccode,prod_type,validtime,starttime,endtime;


Using the mdillon/postgis docker container (tested with PostGIS 2.5 on PG
9.5, 9.6, 10, 11) based on Debian, the command completes for me in about 3
minutes.  Here's the EXPLAIN ANALYZE output from one of those
successful docker runs with PG 11 & PostGIS 2.5.2 (also includes a WHERE
clause that can be ignored as it matches all rows):

postgres=# EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry,
100.0))::geometry(Geometry,
3857),category,vteccode,prod_type,validtime,starttime,endtime FROM
public.nws_haz_hang_201910 WHERE vteccode='GLW' GROUP BY
 category,vteccode,prod_type,validtime,starttime,endtime;
                                                         QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=10.13..10.18 rows=1 width=706) (actual
time=0.823..125929.139 rows=5 loops=1)
   Group Key: category, vteccode, prod_type, validtime, starttime, endtime
   ->  Sort  (cost=10.13..10.14 rows=1 width=706) (actual time=0.069..0.095
rows=21 loops=1)
         Sort Key: category, prod_type, validtime, starttime, endtime
         Sort Method: quicksort  Memory: 30kB
         ->  Seq Scan on nws_haz_hang_201910  (cost=0.00..10.12 rows=1
width=706) (actual time=0.009..0.020 rows=21 loops=1)
               Filter: ((vteccode)::text = 'GLW'::text)
 Planning Time: 0.176 ms
 Execution Time: 125929.277 ms


On my original RHEL7 system, the first time I tried to reproduce it, the
command eventually completed for me in just under 3 hours, but I have since
tried it multiple times on different CentOS7 and RHEL7 systems where it
does not complete even after 2 days.

Version info from my latest reproduction test:

    POSTGIS="2.5.3 r17699" [EXTENSION] PGSQL="95" GEOS="3.7.2-CAPI-1.11.2
b55d2125" PROJ="Rel. 6.2.0, September 1st, 2019" GDAL="GDAL 3.0.1, released
2019/06/28" LIBXML="2.9.1" LIBJSON="0.11" LIBPROTOBUF="1.0.2" RASTER

A few anecdotal stack traces seems to indicate it's spending most of the
time in the following libgeos calls:

geos::operation::overlay::snap::LineStringSnapper::snapTo
    -> geos::operation::overlay::snap::LineStringSnapper::snapVertices
        ->
geos::operation::overlay::snap::LineStringSnapper::findVertexToSnap
    -> geos::operation::overlay::snap::LineStringSnapper::snapSegments
        ->
geos::operation::overlay::snap::LineStringSnapper::findSegmentToSnap
            -> geos::algorithm::CGAlgorithms::distancePointLine


A single full stack trace is also pasted below.

If anyone could provide any insight, it would be greatly appreciated!

Thanks,
Jason Greenlaw



-------------------------------
#0  0x00007fe977134773 in
geos::operation::overlay::snap::LineStringSnapper::findVertexToSnap(geos::geom::Coordinate
const&, std::_List_iterator<geos::geom::Coordinate>,
std::_List_iterator<geos::geom::Coordinate>) () from /usr/geos37/lib64/
libgeos-3.7.2.so
#1  0x00007fe97713485e in
geos::operation::overlay::snap::LineStringSnapper::snapVertices(geos::geom::CoordinateList&,
std::vector<geos::geom::Coordinate const*,
std::allocator<geos::geom::Coordinate const*> > const&) () from
/usr/geos37/lib64/libgeos-3.7.2.so
#2  0x00007fe9771352cf in
geos::operation::overlay::snap::LineStringSnapper::snapTo(std::vector<geos::geom::Coordinate
const*, std::allocator<geos::geom::Coordinate const*> > const&) () from
/usr/geos37/lib64/libgeos-3.7.2.so
#3  0x00007fe9771338cd in
geos::operation::overlay::snap::SnapTransformer::transformCoordinates(geos::geom::CoordinateSequence
const*, geos::geom::Geometry const*) () from /usr/geos37/lib64/
libgeos-3.7.2.so
#4  0x00007fe9770d8ab1 in
geos::geom::util::GeometryTransformer::transformLinearRing(geos::geom::LinearRing
const*, geos::geom::Geometry const*) () from /usr/geos37/lib64/
libgeos-3.7.2.so
#5  0x00007fe9770d951a in
geos::geom::util::GeometryTransformer::transformPolygon(geos::geom::Polygon
const*, geos::geom::Geometry const*) () from /usr/geos37/lib64/
libgeos-3.7.2.so
#6  0x00007fe9770d9288 in
geos::geom::util::GeometryTransformer::transformMultiPolygon(geos::geom::MultiPolygon
const*, geos::geom::Geometry const*) () from /usr/geos37/lib64/
libgeos-3.7.2.so
#7  0x00007fe9770d8d8a in
geos::geom::util::GeometryTransformer::transform(geos::geom::Geometry
const*) () from /usr/geos37/lib64/libgeos-3.7.2.so
#8  0x00007fe9771334d6 in
geos::operation::overlay::snap::GeometrySnapper::snapTo(geos::geom::Geometry
const&, double) () from /usr/geos37/lib64/libgeos-3.7.2.so
#9  0x00007fe9770c7784 in std::unique_ptr<geos::geom::Geometry,
std::default_delete<geos::geom::Geometry> >
geos::geom::SnapOp<geos::operation::overlay::overlayOp>(geos::geom::Geometry
const*, geos::geom::Geometry const*, geos::operation::overlay::overlayOp)
() from /usr/geos37/lib64/libgeos-3.7.2.so
#10 0x00007fe9770c827f in std::unique_ptr<geos::geom::Geometry,
std::default_delete<geos::geom::Geometry> >
geos::geom::BinaryOp<geos::operation::overlay::overlayOp>(geos::geom::Geometry
const*, geos::geom::Geometry const*, geos::operation::overlay::overlayOp)
() from /usr/geos37/lib64/libgeos-3.7.2.so
#11 0x00007fe9770c6684 in geos::geom::Geometry::Union(geos::geom::Geometry
const*) const () from /usr/geos37/lib64/libgeos-3.7.2.so
#12 0x00007fe97714195a in
geos::operation::geounion::CascadedPolygonUnion::unionActual(geos::geom::Geometry*,
geos::geom::Geometry*) () from /usr/geos37/lib64/libgeos-3.7.2.so
#13 0x00007fe977141df5 in
geos::operation::geounion::CascadedPolygonUnion::unionUsingEnvelopeIntersection(geos::geom::Geometry*,
geos::geom::Geometry*, geos::geom::Envelope const&) () from
/usr/geos37/lib64/libgeos-3.7.2.so
#14 0x00007fe9771420fa in
geos::operation::geounion::CascadedPolygonUnion::unionOptimized(geos::geom::Geometry*,
geos::geom::Geometry*) () from /usr/geos37/lib64/libgeos-3.7.2.so
#15 0x00007fe977142201 in
geos::operation::geounion::CascadedPolygonUnion::binaryUnion(geos::operation::geounion::GeometryListHolder*,
unsigned long, unsigned long) () from /usr/geos37/lib64/libgeos-3.7.2.so
#16 0x00007fe9771424b0 in
geos::operation::geounion::CascadedPolygonUnion::unionTree(geos::index::strtree::ItemsList*)
() from /usr/geos37/lib64/libgeos-3.7.2.so
#17 0x00007fe9771425f8 in
geos::operation::geounion::CascadedPolygonUnion::Union() () from
/usr/geos37/lib64/libgeos-3.7.2.so
#18 0x00007fe977142829 in
geos::operation::geounion::CascadedPolygonUnion::Union(std::vector<geos::geom::Polygon*,
std::allocator<geos::geom::Polygon*> >*) () from /usr/geos37/lib64/
libgeos-3.7.2.so
#19 0x00007fe977143cc0 in geos::operation::geounion::UnaryUnionOp::Union()
() from /usr/geos37/lib64/libgeos-3.7.2.so
#20 0x00007fe9770c61ee in geos::geom::Geometry::Union() const () from
/usr/geos37/lib64/libgeos-3.7.2.so
#21 0x00007fe97872f84b in GEOSUnaryUnion_r () from
/usr/geos37/lib64/libgeos_c.so.1
#22 0x00007fe978971cd7 in pgis_union_geometry_array () from
/usr/pgsql-9.5/lib/postgis-2.5.so
#23 0x00007fe978963c10 in PGISDirectFunctionCall1 () from
/usr/pgsql-9.5/lib/postgis-2.5.so
#24 0x00007fe978963c7d in pgis_geometry_union_finalfn () from
/usr/pgsql-9.5/lib/postgis-2.5.so
#25 0x00000000005bede4 in finalize_aggregate ()
#26 0x00000000005bf089 in finalize_aggregates ()
#27 0x00000000005bf7dc in ExecAgg ()
#28 0x00000000005b35e8 in ExecProcNode ()
#29 0x00000000005b0670 in standard_ExecutorRun ()
#30 0x0000000000568853 in ExplainOnePlan ()
#31 0x0000000000568b45 in ExplainOneQuery ()
#32 0x0000000000568f4d in ExplainQuery ()
#33 0x00000000006b5109 in standard_ProcessUtility ()
#34 0x00000000006b2791 in PortalRunUtility ()
#35 0x00000000006b351d in FillPortalStore ()
#36 0x00000000006b3f6f in PortalRun ()
#37 0x00000000006b1bb3 in PostgresMain ()
#38 0x00000000004695f9 in ServerLoop ()
#39 0x0000000000659d67 in PostmasterMain ()
#40 0x000000000046a21e in main ()
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20191008/f9737e76/attachment.html>


More information about the postgis-users mailing list