[geos-devel] Building Multipolygons from Rings

Frederik Ramm frederik at remote.org
Tue Jun 21 05:13:51 EDT 2011


    I have one broad and one specific question regarding the 
construction of polygons from linework. They arise from an algorithm 
that I have written to generate proper multipolygons from OpenStreetMap 

The broad question is:

Suppose you have a number of linestrings which, if all is well, 
constitute the linework for a multipolygon, but there might also be 
errors like a gap in a ring, and you'll frequently have touching inner 
rings. You don't know in advance how many rings there are and which are 
the inner and outer bits. Even if there are geometry errors you want to 
salvage as much as possible because your only other option is to discard 
the whole thing. Does somebody have a relatively robust way to make a 
proper multipolygon from that? I cannot use the standard polygonizer 
because it is not resilient enough; it will often produce invalid 

My current algorithm is along the lines of (1) try to assemble linework 
into rings, discard dangling bits; (2) fix trivial self-intersections 
and small gaps in each ring individually; (3) build a "hierarchy" of 
rings by checking which ring contains which other ring; (4) merge 
touching rings; (5) build polygons from outer/inner ring groups; (6) 
build multipolygon.

This works ok even though the auto-fix bits in (2) leave much to be 
desired at the moment (I have not yet found a good way to fix rings with 
more than one self-intersection sensibly).

The specific question is:

My above algorithm needs to create a matrix that determines which ring 
is inside which other ring. I have recently encountered a polygon in 
OpenStreetMap that had a very complex (100,000 node) outer ring and very 
many (about 10,000), relatively simple, inner rings. (A forest area in 
Japan, if you want to know.) My algorithm took several hours to build 
the "what contains what" matrix. The matrix is rather large but the 
small polygons were usually not the problem; the problem were the 
roughly 10,000 "does this 100k node polygon contains this little 
polygon" tests. Even when I reduced this from polygon.contains(polygon) 
to polygon.contains(random single vertex of polygon) it hardly went any 

Is there a way how I can somehow pre-process my 100k node polygon to 
make the "contains" queries go faster? I have the impression that 
"contains" must do some preprocessing internally, and that maybe the 
same preprocessing is repeated unnecessarily if I do 10,000 "contains" 
checks for the same polygon?


More information about the geos-devel mailing list