[PostGIS] #5737: ST_SimplifyPreserveTopology non-deterministic behavior?

PostGIS trac at osgeo.org
Tue Jun 4 07:39:51 PDT 2024


#5737: ST_SimplifyPreserveTopology non-deterministic behavior?
-------------------------------+---------------------------
 Reporter:  Alessandro Donati  |      Owner:  pramsey
     Type:  defect             |     Status:  new
 Priority:  critical           |  Milestone:  PostGIS 3.4.3
Component:  postgis            |    Version:  3.4.x
 Keywords:                     |
-------------------------------+---------------------------
 Hi,

 I came across what seems to be a bug in ST_SimplifyPreserveTopology.
 Any advice/help is much appreciated.

 In my workflow it is crucial to have a deterministic behavior (same output
 from same input data), but ST_SimplifyPreserveTopology outputs different
 geometries across consecutive runs.

 I made up a test case to explain the issue.

 test_simplify_fun.sql contains the function test_simplify().

 postgis333.log and postgis341.log are the outputs of running the script
 with psql against postgres 14/postgis 3.3.3 and postgres 15/postgis 3.4.1
 respectively.

 For details on the various library versions (geos, ...), the log files
 also include the output of select version(), postgis_full_version() on
 the two systems.

 Quick workflow explanation:

 - poly.the_geom is the test geometry: a valid polygon, srid 3857, 4613
 points
 - ST_SimplifyPreserveTopology(poly.the_geom, 200) is called in subqueries
 s1 and s2.
 - ST_Simplify(poly.the_geom, 200) would produce an invalid geometry, so
 this is a case where "preserve topology" comes into play.
 - A few reports are run on the two simplified geometries:
         - wkt_difference: difference, as text
         - simp_equals: result of st_equals
         - simp_ordering_equals: result of st_orderingequals
         - eq: result of bare equality
         - wkb_eq: result of equality between wkbs
         - n1simp: how many points in first simplified geometry
         - n2simp: how many points in second simplified geometry
         - n1rem: how many points after removing repeated points from first
 simplified geometry
         - n2rem: how many points after removing repeated points from
 second simplified geometry
 - test_simplify() is called 10 times in the same database session
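
 For readers without the attachment, the reports listed above could be
 produced by a query along these lines. This is only a sketch: the actual
 test_simplify() lives in test_simplify_fun.sql, and the single-row table
 poly, the alias g, and the use of ST_AsBinary for the WKB comparison are
 assumptions made for illustration.

```sql
-- Hypothetical reconstruction of the comparison; the real function is in
-- the attached test_simplify_fun.sql. Assumes a table "poly" holding one
-- row with a geometry column "the_geom".
WITH s1 AS (
    SELECT ST_SimplifyPreserveTopology(the_geom, 200) AS g FROM poly
),
s2 AS (
    SELECT ST_SimplifyPreserveTopology(the_geom, 200) AS g FROM poly
)
SELECT
    ST_AsText(ST_Difference(s1.g, s2.g))      AS wkt_difference,
    ST_Equals(s1.g, s2.g)                     AS simp_equals,
    ST_OrderingEquals(s1.g, s2.g)             AS simp_ordering_equals,
    s1.g = s2.g                               AS eq,
    ST_AsBinary(s1.g) = ST_AsBinary(s2.g)     AS wkb_eq,      -- bytea comparison
    ST_NPoints(s1.g)                          AS n1simp,
    ST_NPoints(s2.g)                          AS n2simp,
    ST_NPoints(ST_RemoveRepeatedPoints(s1.g)) AS n1rem,
    ST_NPoints(ST_RemoveRepeatedPoints(s2.g)) AS n2rem
FROM s1, s2;
```

 With deterministic simplification one would expect wkt_difference to be
 'POLYGON EMPTY', all four equality columns to be true, and the point
 counts to match pairwise.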

 The logs have been generated like this:
 psql -d <db_uri> -f test_simplify_fun.sql > some.log

 As you can see from the logs, the results are quite odd across multiple runs:

 - the two simplified geometries have different point count (n1simp, n2simp
 change)
 - the difference between the two simplified geometries changes
 (wkt_difference changes)
 - simp_equals=t, wkt_difference <> 'POLYGON EMPTY'. st_equals is true but
 the difference is not empty.
 - simp_equals=f, simp_ordering_equals=t. The second should be more
 restrictive.
 - n1rem > n1simp (or n2rem > n2simp). ST_RemoveRepeatedPoints adds points.
 - simp_equals=f, wkb_eq=t. The WKB is the same but st_equals reports the
 geometries as different.
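
 One way to narrow this down to the simplification step alone might be to
 fingerprint the output of repeated calls and compare the hashes. The
 sketch below again assumes the single-row table poly; the names runs, run,
 simp, npoints and wkb_md5 are made up for illustration. Whether it
 reproduces the problem on a given system is not something the logs confirm.

```sql
-- Simplify the same input 10 times and fingerprint each result.
-- MATERIALIZED (PostgreSQL 12+) keeps the planner from inlining the CTE,
-- so npoints and wkb_md5 are computed from the same simplified geometry.
WITH runs AS MATERIALIZED (
    SELECT run,
           ST_SimplifyPreserveTopology(p.the_geom, 200) AS simp
    FROM poly AS p
    CROSS JOIN generate_series(1, 10) AS run
)
SELECT run,
       ST_NPoints(simp)       AS npoints,
       md5(ST_AsBinary(simp)) AS wkb_md5
FROM runs
ORDER BY run;
```

 If the simplification were deterministic, every row would show the same
 npoints and wkb_md5; differing values would point at the simplify call
 itself rather than at the comparison functions.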

 I also ran a test calling an old geos (3.5) directly from c++ and the
 results are the same, at least in that simplify preserve topology does not
 return the same geometry across multiple calls.

 Could it be a bug in geos? Uninitialized variables? Some geos internal
 state that doesn't reset?

 Any suggestions on how to solve the non-determinism problem?

 Thanks in advance
 Alessandro
-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/5737>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.

