<div dir="ltr"><div><div><div><div><div><div><div><div>I'm trying to create a polygon
overlay. The basic process is relatively simple: 1) Get the boundaries
2) Union the boundaries to node the linestrings 3) Polygonize the noded
outlines 4) Filter out holes using a contains or intersects test. The
problem I'm running into is that I'm getting "GEOSUnaryUnion:
TopologyException: found non-noded intersection between LINESTRING"
errors from ST_Union.<br><br></div>I've taken the liberty of creating a
script that will set up a database to reproduce what I'm seeing. I've
gzipped it in the interest of size, since it's got a fairly sizable set
of polygons. Let me know if this delivery method is a problem; I'll be
happy to use other means. You can find it here: <a href="https://drive.google.com/file/d/0B_6I7kRgE8teVUpha2Q4ZlNDMWs/view?usp=sharing" target="_blank">https://drive.google.com/file/d/0B_6I7kRgE8teVUpha2Q4ZlNDMWs/view?usp=sharing</a>. (I originally tried to attach it, but the PostGIS mailing list rejected it.) The contents are a plain SQL file. It won't
create a database for you, but it will create the PostGIS extension IF
NOT EXISTS. Specifically for this problem, it creates and populates this
table:<br><br><span style="font-family:monospace,monospace">CREATE TABLE error_generating_polygons (<br> geom_set_id SERIAL PRIMARY KEY,<br> outer_boundary geometry NOT NULL,<br> polygons geometry[] NOT NULL,<br> error_code text NOT NULL,<br> error_message text NOT NULL<br>);</span><br><br></div>It's
a little weird, so I think I should provide some explanation. The
primary key is just a surrogate. The outer_boundary can be ignored for
the purposes of this error; it's there for me for that filtering out
holes and chaff after I get the raw overlay back. The polygons column is
the most interesting; it contains a set of polygons that will reproduce
this error. (I'll get into why I use an array in a minute.) The
error_code and error_message are the values of SQLSTATE and SQLERRM I
get when noding the boundaries; I've included them so that anyone can
compare if they get different results.<br><br></div>The reason I have an
array of polygons is that this was the simplest method of providing a
large group of polygons. The data you're seeing here is not sample data
I've made up. This is real world data provided to me by a client, who
has produced them primarily through a mixture of GPS, manual processing,
and probably some more or less "automated" processing. In short, it's
pretty messy, and these are the actual polygons I need to overlay. As
for why I'm giving you these big groups instead of breaking things down
into smaller chunks, I did try that, at least somewhat. I found that if I
broke this up into pairs of polygons, I could only reproduce the error
with a <i>single</i> pair out of thousands of combinations. So sorry for
the shear number of polygons here, but apparently, the errors come at
least partly from some cumulative effect.<br><br>I have verified that every polygon in there is a valid one:<br><br><span style="font-family:monospace,monospace">SELECT geom_set_id, ST_IsValidDetail(geom)<br>FROM (SELECT geom_set_id, UNNEST(polygons) geom<br> FROM error_generating_polygons<br> ) x<br>WHERE NOT ST_IsValid(geom);</span><br><br>That query gives me an empty set, plus I do plenty of checking before letting a geometry get into the system.<br><br></div>Now, to get down to business. Here's the query you can run to see errors:<br><br><span style="font-family:monospace,monospace">DO $$<br>DECLARE problem_row error_generating_polygons%ROWTYPE;<br>BEGIN<br> FOR problem_row IN (SELECT * FROM error_generating_polygons) LOOP<br> BEGIN<br> PERFORM ST_Union(ST_Boundary(geom))<br> FROM UNNEST(problem_row.polygons) p (geom);<br> RAISE NOTICE 'geom_set_id % succeeded', problem_row.geom_set_id;<br> EXCEPTION<br> WHEN OTHERS THEN<br> RAISE NOTICE 'Error for geom_set_id % (Code %): %', problem_row.geom_set_id, SQLSTATE, SQLERRM;<br> END;<br> END LOOP;<br>END<br>$$;</span><br><br></div><div>I also tried a variant of this: I replaced <span style="font-family:monospace,monospace">ST_Bounrary(geom)<font face="arial,helvetica,sans-serif"> with </font>ST_Boundary(ST_SnapToGrid(geom, 0.01))<font face="arial,helvetica,sans-serif">.<br></font></span></div><div><br></div><div>I
tested on two separate installations of PostgreSQL/PostGIS. Believe it
or not, I'm getting different results from these. Here are the specs and
results for each one:<br><br></div><div>1) Runs on CentOS, installed from CentOS repositories (I believe. I need to double check with IT to be 100% sure). Version Info:<br><br><span style="font-family:monospace,monospace">PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4), 64-bit<br>POSTGIS="2.1.4
r12966" GEOS="3.4.2-CAPI-1.8.2 r3921" PROJ="Rel. 4.8.0, 6 March 2012"
GDAL="GDAL 1.9.2, released 2012/10/08" LIBXML="2.7.6" LIBJSON="UNKNOWN"
RASTER</span><br><br></div><div>Every row throws an error on this one. This is where the errors in the table come from. Adding <span style="font-family:monospace,monospace">ST_SnapToGrid</span> makes it error out on only one row.<br><br></div><div>This one is more similar to my production machine at present.<br><br>2) Runs on Debian, installed from PostgreSQL maintained repository. Version info:<br><br><span style="font-family:monospace,monospace">PostgreSQL 9.4.0 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.7.2-5) 4.7.2, 64-bit<br>POSTGIS="2.1.5
r13152" GEOS="3.3.3-CAPI-1.7.4" PROJ="Rel. 4.7.1, 23 September 2009"
GDAL="GDAL 1.9.0, released 2011/12/29" LIBXML="2.8.0" LIBJSON="UNKNOWN"
RASTER</span><br><br></div><div>This one errored out less; only about 1/3 of the rows error out. Adding <span style="font-family:monospace,monospace">ST_SnapToGrid</span> also makes it error out on only one row.<br></div><div><br><br></div><div>An
oddity to note: the Debian install has a more up to date PostGIS
(2.1.5) but a lagging GEOS (3.3.3), while the CentOS install has a
slightly older PostGIS (2.1.4) and an up to date GEOS (3.4.2). This
makes it difficult for me to determine where the improvement comes from:
it could be from the older version of GEOS, or it could be from the
newer version of PostGIS. It might also not have anything to do with the
versions. As I said, it seems to be some sort of cumulative effect, so
it might even be order dependent with the different installs choosing
different evaluation orders.<br><br></div><div>I would also like to make
it a point that even a single row erroring out is a problem for me. I
need to run this as part of a long running automatic process, and once
the overlays are generated, I will need to further generate summary
statistics using them. Also, while snapping to grid greatly improved my
results, it also caused a few errors on groups that were working without
it, if I recall correctly. (Those sets aren't included here.)<br></div></div></div><br>I
considered going straight to filing a bug, but I decided I'd rather
bring it up on the mailing list first, especially given that I'm seeing
different behavior on different installs. I wanted to see if any further
work could be done to narrow this down before filing a report, and I
wanted to confirm that filing it on the GEOS library would be the right
place. Feel free to ask for any clarification or for me to try anything
with this data. (Although, please be aware that my skill set may be
lacking, especially if you want me to do something with native code. =/)<br><br>Any
potential work-around suggestions would be appreciated as well. Right
now, my work-around is to dump out all the individual line segments to a
table and use ST_Snap and ST_Equals to eliminate very close line
segments. It gets rid of these errors, but I think it may be the source
of some problems I'm having when I polygonize the resulting noded line
string. With that process of noding everything, ST_Polygonize is
spitting out polygons that cover areas that should be separate polygons
or even just leaving out entire areas altogether. (I have taken a look
at those dumped-rebuilt linstring using QGIS. The lines appear to be
correct, but there could be some microscopic gap or something weird like
that for all I know.) I'm not really interested in addressing that
problem right now; I'd rather see what dealing with the noding problems
does to fix it first.<br><br></div>Thanks for anyone's help looking into this.</div>