[geos-devel] Building Multipolygons from Rings

Frederik Ramm frederik at remote.org
Tue Jun 21 05:13:51 EDT 2011

```Hi,

I have one broad and one specific question regarding the
construction of polygons from linework. They arise from an algorithm
that I have written to generate proper multipolygons from OpenStreetMap
data.

Suppose you have a number of linestrings which, if all is well,
constitute the linework for a multipolygon, but there might also be
errors like a gap in a ring, and you'll frequently have touching inner
rings. You don't know in advance how many rings there are and which are
the inner and outer bits. Even if there are geometry errors you want to
salvage as much as possible because your only other option is to discard
the whole thing. Does somebody have a relatively robust way to make a
proper multipolygon from that? I cannot use the standard polygonizer
because it is not resilient enough; it will often produce invalid
geometries.

My current algorithm is along the lines of (1) try to assemble linework
into rings, discard dangling bits; (2) fix trivial self-intersections
and small gaps in each ring individually; (3) build a "hierarchy" of
rings by checking which ring contains which other ring; (4) merge
touching rings; (5) build polygons from outer/inner ring groups; (6)
build multipolygon.

This works ok even though the auto-fix bits in (2) leave much to be
desired at the moment (I have not yet found a good way to fix rings with
more than one self-intersection sensibly).

The specific question is:

My above algorithm needs to create a matrix that determines which ring
is inside which other ring. I have recently encountered a polygon in
OpenStreetMap that had a very complex (100,000 node) outer ring and very
many (about 10,000), relatively simple, inner rings. (A forest area in
Japan, if you want to know.) My algorithm took several hours to build
the "what contains what" matrix. The matrix is rather large but the
small polygons were usually not the problem; the problem were the
roughly 10,000 "does this 100k node polygon contains this little
polygon" tests. Even when I reduced this from polygon.contains(polygon)
to polygon.contains(random single vertex of polygon) it hardly went any
faster.

Is there a way how I can somehow pre-process my 100k node polygon to
make the "contains" queries go faster? I have the impression that
"contains" must do some preprocessing internally, and that maybe the
same preprocessing is repeated unnecessarily if I do 10,000 "contains"
checks for the same polygon?

Bye
Frederik
```