[GRASS-dev] Re: [GRASS-user] RE: Problem querying layers other than '1' in gis.m

Wed Sep 27 01:19:31 EDT 2006

On Tue, 26 Sep 2006 20:27:11 +0200 (CEST)
Moritz Lennert wrote:

> On Tue, September 26, 2006 15:37, Trevor Wiens wrote:
> > On Tue, 26 Sep 2006 11:32:38 +0200
> > Moritz Lennert wrote:
> >
> >> Michael has already said most of what I wanted to say, but some small
> >> additions.
> >>
> >> Michael Barton wrote:
> >> >> What seems much more natural to me is
> >> >> leaving the attribute management to the database where more
> >> >> elegant tools exist. Thus GRASS modules, instead of having a
> >> >> layer option, need an input key and possibly an output key
> >> >> option depending on the module. If no key field is specified
> >> >> (which would be an attribute in the table linked to the
> >> >> vector file), then all objects in the vector file are
> >> >> processed. However, if a key is given, it is used to query the
> >> >> vector file for a list of objects for processing. In the case where the
> >> >> same vector object has two cats, the vector attribute tables
> >> >> will have to have a one to many relationship from the vector
> >> >> file to the attribute table. Now modules could also allow a
> >> >> query specification to allow for complex querying across
> >> >> multiple keys and attributes, but output would probably have
> >> >> to be limited to a series of key fields (most likely only
> >> >> integers)
> >> >
> >> > Really, this is exactly what "layers" are now. AFAICT, the biggest
> >> problem
> >> > is in the terminology. Each "layer" is an integer key field in
> >> database
> >> > terminology. Multiple layers simply means multiple key fields, each of
> >> which
> >> > can be linked with an attribute table using v.db.connect.
> >>
> >
> > No, not really. What I describe is functionality identical to current
> > layers, but the critical distinction is that all the attribute
> > information can be managed in the database independent of GRASS or
> > without GRASS even running. Right now the cat values for a vector
> > object can only be accessed through GRASS. If GRASS built a simple
> > table for each vector object with a cat (or perhaps more clearly
> > named an objectid) and a user defined key as well as allowing users to
> > add other keys to that table, then there would be no need to run GRASS
> > to update attribute information for those objects. For example, let's say
> > you have a series of weather sites which would have an incremental
> > objectid and a user-defined key such as a stationid. If you want to
> > be able to occasionally interpolate precipitation surfaces from these
> > sites, since all the attribute information is accessible independent
> > of GRASS (I envision using PostgreSQL in this case), you can write your
> > application in whatever environment you like and access the database
> > outside of GRASS and add and edit new time data as needed. Then when it
> > is time to create your new surface you fire up GRASS, do your
> > multi-table query without any call to v.db.connect because it is no
> > longer needed, and get the result. Done.
> 
> I don't really understand this argument. Why can't you do exactly this
> with GRASS today ? 

You can, and if arbitrary queries were permitted, layers wouldn't
be necessary. I'm simply trying to point out that layers are confusing
and keep attribute management partially hidden within GRASS because
there is no way of knowing outside the system if a vector object has
more than one cat value. If the layers feature were removed and
replaced with immutable singular keys, then all attribute management
would be forced into the realm of attribute tables. I like this idea
because it is cleaner. 
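
To make the idea concrete, here is a minimal sketch of the kind of layout I
have in mind. Everything in it is an assumption made for illustration: the
table names (station_objects, precipitation), the column names, and the sample
values are hypothetical, and Python's built-in sqlite3 stands in for
PostgreSQL so the example is self-contained. It is not how GRASS stores
anything today.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical key table and one attribute table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- One row per vector object: a singular, immutable objectid
        -- plus a human-friendly user-defined key (hypothetical schema).
        CREATE TABLE station_objects (
            objectid  INTEGER PRIMARY KEY,
            stationid TEXT NOT NULL UNIQUE
        );
        -- Attribute data maintained entirely outside the GIS.
        CREATE TABLE precipitation (
            stationid TEXT NOT NULL REFERENCES station_objects(stationid),
            obs_date  TEXT NOT NULL,
            precip_mm REAL NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO station_objects VALUES (?, ?)",
        [(1, "STN-A"), (2, "STN-B"), (3, "STN-C")],
    )
    conn.executemany(
        "INSERT INTO precipitation VALUES (?, ?, ?)",
        [
            ("STN-A", "2006-09-25", 4.2),
            ("STN-B", "2006-09-25", 0.0),
            ("STN-A", "2006-09-26", 1.5),
            ("STN-C", "2006-09-26", 7.8),
        ],
    )
    return conn


def objects_for_date(conn, obs_date):
    """Return (objectid, precip_mm) pairs for one date via an ordinary join."""
    cursor = conn.execute(
        """
        SELECT o.objectid, p.precip_mm
        FROM station_objects AS o
        JOIN precipitation AS p ON p.stationid = o.stationid
        WHERE p.obs_date = ?
        ORDER BY o.objectid
        """,
        (obs_date,),
    )
    return cursor.fetchall()
```

The point of the sketch is that the list of objectids to process falls out of
a plain multi-table query, and the precipitation table can be edited by any
application without GRASS running.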

> First of all why do you need a separate objectid and
> stationid if each station is represented by one object (let's say a point)
> ? 

You don't, but users are often confused by integer keys, so I added this
as a human-friendly key in addition to the system key.

---snip---
> Well GIS as such makes no sense if it is not understood as the link
> between geometries and data, so you always have to mix the two in one way
> or another. The question is more on how to do this in a way which is most
> efficiently _and_ offers the most functionality.

Yes. Absolutely. IMO, layers are not the most efficient solution. There
have been discussions before, and many people still seem confused, which
suggests that layers were not the optimal solution. Now I am
biased, but I assume that most people can readily grasp the concept of
multiple attribute tables linking to the main table to select and
classify vector objects. If that is the case then my proposed method
would be much simpler to understand. 

The second part is functionality. Again, arbitrary queries would provide
all the same functionality in a clear fashion. It might be easier to
perform certain actions with layers, as they provide a simple shortcut
to different views (not in the literal SQL sense) of the data, but they
are more limiting for one-time uses.

While writing this, it occurs to me that if we were to support arbitrary
queries, we might also want to provide view support (in the SQL sense) to
allow users to easily store the classification schemes they want and
reference them quickly, just as layers can be referenced now.
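
As a rough illustration of a stored classification scheme, the sketch below
defines an SQL view and then reads object keys from it by name. The parcels
table, the large_forest view, and the helper functions are all hypothetical,
and sqlite3 is again used only to keep the example self-contained; no GRASS
module currently takes a view name this way.

```python
import sqlite3


def build_view_db():
    """Create a hypothetical attribute table and a view that stores one classification."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE parcels (
            objectid INTEGER PRIMARY KEY,
            landuse  TEXT NOT NULL,
            area_ha  REAL NOT NULL
        );
        -- The view plays the role a layer plays now: a named, reusable selection.
        CREATE VIEW large_forest AS
            SELECT objectid FROM parcels
            WHERE landuse = 'forest' AND area_ha >= 10.0;
        """
    )
    conn.executemany(
        "INSERT INTO parcels VALUES (?, ?, ?)",
        [(1, "forest", 25.0), (2, "forest", 3.5), (3, "crop", 40.0), (4, "forest", 12.0)],
    )
    return conn


def objectids_in_view(conn, view_name):
    """Return the objectids selected by a named view, checking that the view exists."""
    # Identifiers cannot be bound as query parameters, so validate the name
    # against the views actually defined in the database before using it.
    known = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'view'")}
    if view_name not in known:
        raise ValueError(f"unknown view: {view_name}")
    query = f'SELECT objectid FROM "{view_name}" ORDER BY objectid'
    return [row[0] for row in conn.execute(query)]
```

A user could then refer to large_forest wherever a layer number would be given
today, and redefine the view in the database without touching the vector file.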

> >> v.buffer is a very special case, and I don't know how you would solve
> >> the question in your system: the attribute information is lost since
> >> v.buffer merges overlapping buffers into one single buffer. As
> >> mentioned on the man page, there is no automatic way to know which cat
> >> (or key value) to give to this single merged buffer.
> >
> > My solution would work in the sense that new areas created would be
> > given a new objectid whereas areas (technically centroids associated
> > with areas) up to the point of overlap would retain their original
> > objectid and thus would have direct access to any associated attribute
> > information through a simple query.
> 
> Well, it should be no problem to reprogram v.buffer to do just that. Its
> current implementation works with the assumption that as you could
> potentially have overlaps, you treat all buffers as if they were overlaps,
> but you could obviously include some test in the code which treats buffers
> differentially. Again, I don't see how this is a problem of the model
> rather than of the implementation of a particular module.

I used this as an example because one of my frustrations when I
first started using vectors in GRASS was keeping attributes attached.
In the end, I abandoned GRASS for PostGIS for a large part of my vector
work because attribute management is simple and obvious. I still only
use GRASS vectors when I need functionality that isn't available or is
still flaky in PostGIS because all my data is already in PostgreSQL and
I'm comfortable with SQL.

> > It is important to note that my objectid terminology only makes sense
> > if this value is singular and immutable.
> 
> This might actually be the fundamental point in the argument. Currently
> GRASS doesn't enforce this as a rule (well actually, IIUC, each object has
> its line number as a unique identifier, this is just not visible to the
> user). The question is whether it should enforce something like this, or
> whether the current model doesn't allow more flexibility by allowing to
> limit yourself to unique id's for each object, but also allowing the use
> of non-unique id's, or multiple id's.
> 
To reiterate, arbitrary queries provide the same functionality without
obscure naming conventions and partially obscured attribute management.

T
-- 
Trevor Wiens 
twiens at interbaun.com

The significant problems that we face cannot be solved at the same 
level of thinking we were at when we created them. 
(Albert Einstein)