[GRASS-dev] Some doubts about GRASS topology

Thu Sep 24 03:04:33 EDT 2009

Benjamin Ducke wrote:
> Dear all,
>
> in an attempt to better understand the GRASS vector and topology
> model, I imported a set of 3 polygons from an ESRI Shapefile (see
> attachment). The polygon in the upper left has 4 holes (called
> islands for some reason by GRASS), the lower one consists of 3
> parts (QGIS calls this a polygon with islands -- good to know we 
> understand each other in the GIS world!). 
Yes, the way the term "islands" is used in GRASS is perhaps a bit 
misleading. In terms of the simple feature specifications, GRASS islands 
are more or less equivalent to holes (I'm not sure the match is 100%).
> The third is a simple,
> convex shape.
>
> Displaying the imported map shows all geometries exactly as it
> should. So far so good.
>
> Now, when I run v.info on the imported map, I get:
>
> Number of lines: 0
> Number of boundaries: 9
> Number of centroids: 5
> Number of areas: 9
> Number of islands: 9
>
> This completely baffles me!
>
> The GRASS documentation consistently states that an area
> is a boundary + a centroid + any number of "islands".
>   
That's an error in the documentation. An area is a closed ring of 
boundaries (possibly just one boundary) + any number of "islands" (holes) 
within + *optionally* an attached centroid. An area without a centroid 
cannot have a category, but as far as topology is concerned, it's a valid 
area.
> Now, assuming that the lines around the four "islands" count
> as boundaries, I understand why there are 9 boundaries
> altogether. 5 centroids also check out, given that there
> is no 1:1 equivalent for a shapefile multipart polygon in GRASS.
>   
> But how in the (GRASS) world can there be 9 areas if there
> are only 5 centroids? 
See above, an area in GRASS topology does not need to have a centroid 
attached.
> And why 9 islands? 
Every area is also an island if no boundary is shared with another area. 
If a boundary is shared with another area, these two areas together form 
one island. In your example, the area in the upper left with the four 
islands: the four islands are also areas, but without centroid attached. 
When attaching islands during topology building, the internal IDs of all 
islands falling inside the outer area are added to the topology 
information of that outer area. If one of these four islands shared a 
separate boundary with each of two other islands, and only one island 
were completely isolated, that thing in the upper left would still 
consist of five areas (four inside, one outer), but of only three 
islands: one consisting of three connected areas, one for the remaining 
isolated inside area, and one for the outer area.
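
The island counting described above can be sketched in a few lines of 
Python. This is only a simplified model, not GRASS library code: it 
assumes that areas are given as plain integer IDs and that shared 
boundaries are given as pairs of area IDs, and it treats an island as a 
group of areas connected through shared boundaries. The function name 
count_islands and the area numbering are made up for illustration.

```python
def count_islands(area_ids, shared_boundaries):
    """Count groups of areas that are connected through shared boundaries.

    area_ids: iterable of hypothetical area IDs.
    shared_boundaries: pairs (a, b) of area IDs that share a boundary.
    """
    parent = {a: a for a in area_ids}

    def find(a):
        # Follow parent links up to the representative of a's group.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in shared_boundaries:
        # Areas sharing a boundary end up in the same island.
        parent[find(a)] = find(b)

    return len({find(a) for a in parent})


# Imported Shapefile: 9 areas, no boundary shared between any two of them.
imported = count_islands(range(1, 10), [])

# Hypothetical case from the text: outer area 1, inner areas 2 to 5,
# area 2 shares a boundary with area 3 and with area 4, area 5 is isolated.
hypothetical = count_islands([1, 2, 3, 4, 5], [(2, 3), (2, 4)])

print(imported, hypothetical)
```

With these inputs the imported map gives 9 islands for 9 areas, and the 
hypothetical case gives 3 islands for 5 areas, matching the counts above.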

When building topology, areas and islands are constructed first; at this 
stage islands are not yet attached to areas. Only in the next step are 
islands attached to areas, i.e. areas get their holes. In the last step, 
centroids are attached to areas, or more precisely: for each area, a not 
yet attached centroid is searched for. If one is found, it is attached; 
if the area already has a centroid attached, the new one is a duplicate 
centroid. Several centroids may also fall inside the current area and 
only inside this area; the extra ones also become duplicate centroids.
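
As a rough illustration of the last step, here is a minimal Python 
sketch of centroid attachment. It is not how the GRASS library is 
written: it assumes the point-in-area test has already been done and is 
passed in as a dict from centroid ID to the ID of the area the centroid 
falls in (or None), and it loops over centroids rather than over areas, 
which should give the same result. The name attach_centroids and all IDs 
are hypothetical.

```python
def attach_centroids(area_ids, centroid_in_area):
    """Attach at most one centroid per area; the rest become duplicates.

    area_ids: iterable of hypothetical area IDs.
    centroid_in_area: dict mapping centroid ID -> area ID it falls in,
        or None if the centroid is outside every area.
    """
    attached = {a: None for a in area_ids}  # area ID -> centroid ID or None
    duplicates = []

    for centroid, area in sorted(centroid_in_area.items()):
        if area is None:
            continue  # centroid lies in no area, nothing to attach
        if attached[area] is None:
            attached[area] = centroid  # first centroid found for this area
        else:
            duplicates.append(centroid)  # area already has a centroid

    return attached, duplicates


# Imported Shapefile: areas 1 to 9, where 2 to 5 are the four holes.
# Five centroids: outer polygon (1), three parts (6, 7, 8), convex shape (9).
attached, duplicates = attach_centroids(
    range(1, 10), {1: 1, 2: 6, 3: 7, 4: 8, 5: 9}
)
without_centroid = sorted(a for a, c in attached.items() if c is None)
print(without_centroid, duplicates)

# Adding a sixth centroid inside area 9 makes it a duplicate.
_, duplicates_extra = attach_centroids(
    range(1, 10), {1: 1, 2: 6, 3: 7, 4: 8, 5: 9, 6: 9}
)
print(duplicates_extra)
```

In this model the four hole areas (2 to 5) stay without a centroid and 
are still counted as areas, which is how 9 areas can go with 5 centroids.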

AFAICT, GRASS vector topology is very much based on the simple feature 
specifications, but not strictly: it deviates here and there in its 
usage of terms and in the methods used to build topology. The methods 
are not a problem; they are consistent even though they do not follow 
the simple feature specifications 100%. The usage of terms can be 
confusing, though, particularly with misleading documentation and 
sometimes different meanings in closely related applications (QGIS).

Markus M