[GRASS-dev] Naming conventions and axiomatics [RFC]

Sat Aug 6 05:26:03 EDT 2011

[I have just subscribed to the list for this, so no need to put me in
copy if someone replies. I won't participate in other "internal"
threads.]

Hello,

This message is just to share some thoughts about what GRASS siblings
still do share: some principles about the data.

Since GRASS GPL is---to my knowledge---the last still active open
source version (the BSD open source KerGIS version is still
downloadable, but as far as I'm concerned, having been the only
one to contribute, it is dead; if it is to be restarted some day,
that will be without me; so I'm not trying to attract/distract
people...), I do think that some of the following may feed some
people's thoughts. On the theoretical/mathematical side, finding the
questions is more important than knowing the answers.

If you find that this exchange has no place on the list, just
say so. Silence will mean that I'm definitely the only one interested
in this and will be taken as a kind of answer too ;)

Since there could be a lot of things, I will start with one chunk, and
follow up on some things said previously.

When I say "GRASS" in what follows, this means CERL GRASS (the
historical origin). It may have changed in your implementation; I have
not looked (and will not).

1. Vectorial topology (again).

Topology is really powerful, since you can make non-topological
manipulations with topologically organized data, but the reverse is not
true: "advanced" non-GRASS GIS generally provide extensions that add
topology to their native non-topological format.

One problem faced by users is that what is rendered as a "line" is not
only called a "line" but also an "arc", etc., and that "lines" or "arcs"
define not only "lines" but also "areas", etc.

The problem is that we are trying to name different things with the same
word; and that the layman definition of the word does not bear the
correct meaning for the purpose. The solution is to split, "explicate"
(unfold) the different standpoints and to name differently every single
notion.

Linearity is a fundamental basis of modern Mathematics (and it is no
surprise when one thinks a little).

A kind of "linear" feature is the fundamental basis of the GRASS vectorial
format: a topeon (in French: "topéon", coined after the pseudo-Greek
topostoïcheon: a linear element of locus).

What is stored in the "geometry" file is just a series of topeons. A
topeon is not an arc, since a topeon is neither connected, nor
_oriented_ (this is fundamental as we will see later). Typically, if the
same "line" (as seen by the user) is entered in one direction and then
duplicated in the other, normalizing tools will erase one of the
duplicates.
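
As a minimal sketch of this point (not the historical GRASS code), a
topeon can be modelled as a plain list of vertices, and the
normalization as a deduplication that treats a vertex series and its
reverse as the same element. The names `canonical` and `normalize` are
hypothetical, and exact coordinate equality is assumed.

```python
def canonical(topeon):
    """Return a direction-independent key for a vertex series."""
    forward = tuple(topeon)
    backward = tuple(reversed(topeon))
    # The same topeon digitized in either direction yields the same key.
    return min(forward, backward)


def normalize(topeons):
    """Drop duplicates, treating a topeon and its reverse as identical."""
    seen = set()
    kept = []
    for t in topeons:
        key = canonical(t)
        if key not in seen:
            seen.add(key)
            kept.append(list(t))
    return kept


a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
b = list(reversed(a))            # the same "line", entered the other way
c = [(5.0, 5.0), (6.0, 5.0)]
result = normalize([a, b, c])    # b is erased as a duplicate of a
```

Since a topeon carries no orientation, which of the two duplicates
survives is irrelevant at this level.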

It shall be noted that this element is linear, but in the sense of
Euclid's definitions, which are _broader_ than the definitions and
postulates for one possible geometry, the one developed further in
Euclid's work: the Euclidean one. This means that it could be either
rectilinear or curvilinear, that is, the function describing the
course from the vertices or control points need not be reduced to
polylines ("série de segments" in French) as it is in GRASS. [This
is a whole subject by itself; I will not go deeper, at least today.]

This level I will call: level0 [the value 0 does not exist in GRASS].

In a second step, starting from the topeons, the topology is built
bringing, fundamentally, connectivity: the nodes are created and deduced
from the topeons. This brings linear features that start from a node N1
to a node N2 (the 2 nodes can be the very same for a closed feature).
This means that now the linear feature is connected and oriented.
It is based on a topeon but with connection and orientation; hence
the new name: an arc.
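
The step above can be sketched as follows, assuming no snapping at all
(two extremities share a node only if their coordinates are exactly
equal). The function `build_arcs` and the `(topeon index, N1, N2)`
layout are illustrative choices, not the GRASS file format.

```python
def build_arcs(topeons):
    """Deduce nodes from topeon extremities and return (nodes, arcs)."""
    nodes = {}   # coordinate -> node id
    arcs = []    # (topeon index, N1, N2): connected and oriented
    for i, t in enumerate(topeons):
        n1 = nodes.setdefault(t[0], len(nodes))
        n2 = nodes.setdefault(t[-1], len(nodes))
        arcs.append((i, n1, n2))
    return nodes, arcs


topeons = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(1.0, 0.0), (1.0, 1.0), (0.0, 0.0)],
    [(2.0, 2.0), (3.0, 2.0), (2.0, 2.0)],   # closed: N1 and N2 coincide
]
nodes, arcs = build_arcs(topeons)
```

The topeons themselves are left untouched; the arcs only add the
connection and the direction from N1 to N2 on top of them.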

	Note: when generating the nodes, snapping can be done. In
	[historical] GRASS, this has an impact on the topeons too, since
	they are rewritten with their extremities matching the nodes created.
	This need not be the case: the geometry could be left alone. The
	important point for implementation is that this snapping makes the
	topological correctness not "predicative" in Poincaré's definition:
	"a classification is predicative if it is not upset by the addition
	of new elements". For example, when processing "spaghetti", if the
	chunks are split and connected with snapping, the topeon or the
	arc deduced has changed its course, so it may intersect topeons
	or arcs that it did not intersect before. Predicativity is
	essential since it brings atomicity for parallelization of code.
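
To make the note concrete, here is a small sketch under stated
assumptions: a greedy snapping rule (each extremity joins the first
existing node within the threshold) and topeons rewritten to match the
nodes. Neither is claimed to be the rule used by any GRASS version;
`snap_endpoints`, `orient` and `properly_intersect` are hypothetical
helpers.

```python
import math


def snap_endpoints(topeons, threshold):
    """Snap each extremity to the first node within threshold."""
    nodes = []
    snapped = []
    for t in topeons:
        t = list(t)
        for k in (0, -1):
            for n in nodes:
                if math.dist(t[k], n) <= threshold:
                    t[k] = n          # the topeon itself is rewritten
                    break
            else:
                nodes.append(t[k])    # no node close enough: create one
        snapped.append(t)
    return nodes, snapped


def orient(p, q, r):
    """Sign of the turn p -> q -> r (cross product)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def properly_intersect(a, b, c, d):
    """True if segments ab and cd cross at a point interior to both."""
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)


t1 = [(10.0, 1.0), (20.0, 1.0)]
t2 = [(0.0, 0.0), (10.0, 0.0)]   # ends 1.0 away from a node of t1
t3 = [(5.0, 0.2), (5.0, 2.0)]    # just above t2: no intersection

before = properly_intersect(t2[0], t2[1], t3[0], t3[1])
nodes, (s1, s2, s3) = snap_endpoints([t1, t2, t3], threshold=1.5)
after = properly_intersect(s2[0], s2[1], s3[0], s3[1])
```

Here `before` is false and `after` is true: snapping one extremity of
`t2` changed its course enough to create an intersection with `t3`,
which is exactly the loss of predicativity described above.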

If an arc is oriented by definition, that is "only" in order to know
how to follow its course from N1 to N2. It is a one-dimensional direction.

There is another orientation, which leads to the distinction
between one-dimensional linear features and... two-dimensional _linear_
features. All in all, a "LINE" arc [and in GRASS, the name of this
property is partly incorrect for a topological property] is a
one-dimensional linear element, while an "AREA" arc [same remark] is a
two-dimensional linear feature, with a left and a right: another
orientation. An "area" [I now use "face"] is a closed linear feature
with a supplementary orientation that discriminates "left" from
"right", i.e. that splits the plane in two separate plane chunks.

Hence, starting from the connectivity, it is also during this step that
"areas" are constructed, still from linear elements, provided that the
arcs have been defined as defining left/right ("AREA" topological
type), and that a closed series of interconnected arcs, with no
_connected_ arc inside (GRASS "closes" an area even if there are
unconnected stray arcs inside), can be found.

There is here a problem.

GRASS makes a distinction between "areas" and "isles". The distinction
is made by a heuristic means: the sign of the area. But this works only
for the "cartographic" case, that is, an open Euclidean plane. If the
contour is imagined on an ellipsoid, what is the "inside" and what is
the "outside"? In the days of "nec plus ultra", when there was nothing
further beyond the Mediterranean and the earth was flat, this could do.
In geographic coordinates, this cannot do as is...
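
For reference, a sign-of-area heuristic of this kind can be sketched
with the usual shoelace formula. This assumes planar coordinates with x
to the right and y upwards; which sign is taken to mean "area" and
which "isle" is a convention not fixed here, and `signed_area` is a
hypothetical name.

```python
def signed_area(ring):
    """Shoelace formula: positive for a counter-clockwise ring."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around at the end.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
ccw = signed_area(square)                   # 4.0
cw = signed_area(list(reversed(square)))    # -4.0
```

The sign only has a meaning because the plane is open: the contour
bounds a finite region on one side and an infinite one on the other. On
a closed surface such as an ellipsoid both sides are finite, so the
formula gives no intrinsic "inside".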

Since the "areas" are built when the conditions above are met, all
topologically correct areas are built. But since a contour can be the
limit of what is exactly described and there is nothing more after ("nec
plus ultra"), there must be a means to "validate" an "area": this is
done with attributes (aka categories in GRASS). But it must be kept
in mind that whether the "area" is valid (a non zero category) or
not, the geometrical object exists.

The step of the connectivity I will call: level1. The most important
thing to keep in mind is that for this step everything can be rebuilt
from the topeon file _and_ the same threshold. Hence, the crucial data
is the topeon file; and if it is decided that the snapping threshold
shall change the topeon file [the case in GRASS], providing a threshold
<= the previous threshold can bring the same result. If the topeon
file is not changed by the threshold, the threshold information has to
be kept somewhere.

The step of attributing "categories" I will call: level2. [This
distinction is not made in GRASS, level0 being called V1, and level1 and
level2 being merged.]

The "insertion point" of the attribute is just a _means_ for the _static_
attribution of a category. When processing the data, the process could
(and generally shall) keep the attribute linked to the geometrical
feature, but without resorting to the "insertion point". This is crucial
for example with "cookie cutter" since, by the very nature of the
operation, a "centroid" will fall in one subface, and the other chunks
will no longer have any category.

Hence the coordinates of the insertion points are for _external_ (i.e.
not in the topeon file) _static_ _storage_ of the attribution. This is
so as to be able to rebuild from level0 to level2 with only two minimal
files: the topeons one, and the attributes (categories) one.
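
A minimal sketch of this static attribution, assuming the faces have
already been built at level1 and are given as simple rings, and using
an ordinary even-odd ray casting test. The names `contains` and
`attribute`, and the `(point, category)` layout of the attributes
file, are assumptions for illustration only.

```python
def contains(ring, p):
    """Even-odd ray casting; boundary points get no special handling."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def attribute(faces, insertion_points):
    """Return one category per face; 0 means "not validated"."""
    cats = [0] * len(faces)
    for pt, cat in insertion_points:
        for i, ring in enumerate(faces):
            if contains(ring, pt):
                cats[i] = cat
                break
    return cats


# A face cut in two by a "cookie cutter", with its single insertion point.
left = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 2.0)]
right = [(1.0, 0.0), (2.0, 0.0), (2.0, 2.0), (1.0, 2.0)]
cats = attribute([left, right], [((0.5, 1.0), 7)])   # [7, 0]
```

The result shows the problem described earlier: only the subface that
happens to contain the insertion point keeps category 7, so a process
that relies on the point alone, rather than on the link between
attribute and feature, loses the category of the other chunk.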

2. Extending to 3D.

From what has been explained above, a question arises for extending the
process to 3D.

An arc can be one-dimensional or two-dimensional. But an "area" (face)
could be two-dimensional (not defining a solid) or three-dimensional (one
more orientation: "above" and "below"). So an area is not necessarily an
edge of a solid.

I won't speak about whether a GIS has to be full 3D or not. But it can
be noted that cartography is about surfaces, and there is a distinction
to be made between dimensions and degree: a "2D" element can have
various degrees, the surface described being more or less complex.

3. About the crucial orientation/direction.

When wondering and wandering about/around the axiomatics, it appeared
that the orientation is a crucial property linked to the dimension. I
have found that I was not the first on these tracks; there are old 19th
century mathematical works about this [Paul Serret; Edmond Laguerre].

I hope that some people will find the internals of GRASS
interesting, and that, from a theoretical point of view leading to
implementation questions, this can be a very fruitful area of research.

Cheers,
-- 
        Thierry Laronde <tlaronde +AT+ polynum +dot+ com>
                      http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C