[GRASS-dev] Re: [GRASS-user] RE: Problem querying layers other than '1' in gis.m

Tue Sep 26 09:37:14 EDT 2006

On Tue, 26 Sep 2006 11:32:38 +0200
Moritz Lennert wrote:

> Michael has already said most of what I wanted to say, but some small 
> additions.
> 
> Michael Barton wrote:
> >> What seems much more natural to me is
> >> leaving the attribute management to the database where more
> >> elegant tools exist. Thus GRASS modules, instead of having a
> >> layer option, need an input key and possibly an output key
> >> option, depending on the module. If no key field is specified
> >> (which would be an attribute in the table linked to the
> >> vector file), then all objects in the vector file are
> >> processed. However, if a key is specified, it is used to query
> >> the vector file for a list of objects to process. In the case where the
> >> same vector object has two cats, the vector attribute tables
> >> will have to have a one to many relationship from the vector
> >> file to the attribute table. Now modules could also allow a
> >> query specification to allow for complex querying across
> >> multiple keys and attributes, but output would probably have
> >> to be limited to a series of key fields (most likely only
> >> integers)
> > 
> > Really, this is exactly what "layers" are now. AFAICT, the biggest problem
> > is in the terminology. Each "layer" is an integer key field in database
> > terminology. Multiple layers simply means multiple key fields, each of which
> > can be linked with an attribute table using v.db.connect.
> 

No, not really. What I describe is functionally identical to current
layers, but the critical distinction is that all the attribute
information can be managed in the database independently of GRASS,
even without GRASS running. Right now the cat values for a vector
object can only be accessed through GRASS. If GRASS built a simple
table for each vector object with a cat (or, perhaps more clearly
named, an objectid) and a user-defined key, and allowed users to
add other keys to that table, then there would be no need to run GRASS
to update attribute information for those objects.

For example, let's say you have a series of weather sites, each with
an incremental objectid and a user-defined key such as a stationid,
and you want to be able to occasionally interpolate precipitation
surfaces from these sites. Since all the attribute information is
accessible independently of GRASS (I envision using PostgreSQL in this
case), you can write your application in whatever environment you like,
access the database outside of GRASS, and add and edit new time data
as needed. Then, when it is time to create your new surface, you fire
up GRASS and do your multi-table query without any call to
v.db.connect, because it is no longer needed, and get the result. Done.
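
To make the weather-site example concrete, here is a minimal sketch of
the kind of layout and query I have in mind. It uses Python's built-in
sqlite3 module only so that it runs anywhere; I would expect PostgreSQL
in practice. All table names, column names and data values below are
made up for illustration and are not anything GRASS creates today.

```python
import sqlite3

# Hypothetical schema: names and values are illustrative assumptions only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (
        objectid  INTEGER PRIMARY KEY,   -- key tied to the vector object
        stationid TEXT UNIQUE NOT NULL   -- user-defined key
    );
    CREATE TABLE precipitation (
        stationid TEXT NOT NULL REFERENCES sites(stationid),
        obs_date  TEXT NOT NULL,
        precip_mm REAL NOT NULL
    );
""")

# Attribute data is added and edited entirely outside of GRASS.
conn.executemany("INSERT INTO sites VALUES (?, ?)",
                 [(1, "STN-A"), (2, "STN-B"), (3, "STN-C")])
conn.executemany("INSERT INTO precipitation VALUES (?, ?, ?)", [
    ("STN-A", "2006-09-01", 12.5),
    ("STN-B", "2006-09-01", 0.0),
    ("STN-C", "2006-09-01", 4.2),
    ("STN-A", "2006-09-02", 3.1),
])


def objects_with_rain(conn, obs_date):
    """Return (objectid, precip_mm) for sites with rain on obs_date."""
    rows = conn.execute("""
        SELECT s.objectid, p.precip_mm
        FROM sites AS s
        JOIN precipitation AS p ON p.stationid = s.stationid
        WHERE p.obs_date = ? AND p.precip_mm > 0
        ORDER BY s.objectid
    """, (obs_date,))
    return rows.fetchall()


# The multi-table query yields the object keys a module would work on.
result = objects_with_rain(conn, "2006-09-01")
print(result)  # [(1, 12.5), (3, 4.2)]
```

The point of the sketch is only that the list of objectids comes
straight out of a join in the database; how a GRASS module would
accept such a list is left open here.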

> What is missing as far as I can tell (but I don't know if Trevor's 
> solution would solve that) is that you cannot query across multiple 
> tables in the current system (i.e. you cannot query for those points in 
> a map which have value X in table Y where Y.key = maptable.key). You 
> have to first create a new table or view combining Y and maptable and 
> then link the map to that new table or view.
> 
> >> Now it could be argued that this is not essentially
> >> different than the 'layer' feature now, but what is
> >> fundamentally different is access to and control of the
> >> attribute tables. In this case, all attribute management
> >> stays in the database where it belongs.
> > 
> > Again, except for the key fields confusingly labeled as "layers" and one
> > other legacy feature from GRASS 5, all attribute management does stay in the
> > database. 
> 
> Unless you use it in the way I suggested in an earlier mail, i.e. cat 1 
> = coniferous, cat 2 = broadleaved, cat 10 = pine, etc.
> 
> The way it 'should' be used to stick with Trevor's suggestions is
> 
> cat 1 = tree number 1
> cat 2 = tree number 2
> 
> etc.
> 
> And then have a table with columns
> 
> treenumber, species, etc,
> 
> with possibly another table with
> 
> species, type
> 
> where type = coniferous or broadleaved
> 
> And then, if you need a map of coniferous and broadleaved trees, you 
> create a view:
> 
> CREATE VIEW v1 AS SELECT treenumber, type FROM trees, types WHERE 
> trees.species=types.species
> 

Why bother with v.db.connect at all? Just allow a query to be used to
select the vector object keys (cats) and let the module in question
work with that list. I realize that many people not familiar with SQL
will find this difficult, but surely we could consider adding a simple
query builder as part of upgrading the GUI front end.

A view would be convenient for ongoing use, but shouldn't be necessary
for one-off uses.
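
As a rough sketch of the difference, using the trees/types tables from
the example quoted above: the first query selects the keys ad hoc, and
the view is created only for repeated use. The sample rows and the
select_keys helper are hypothetical, and sqlite3 is used here purely so
the sketch is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trees (treenumber INTEGER PRIMARY KEY, species TEXT NOT NULL);
    CREATE TABLE types (species TEXT PRIMARY KEY, type TEXT NOT NULL);
    INSERT INTO trees VALUES (1, 'pine'), (2, 'oak'), (3, 'spruce'), (4, 'birch');
    INSERT INTO types VALUES ('pine', 'coniferous'), ('spruce', 'coniferous'),
                             ('oak', 'broadleaved'), ('birch', 'broadleaved');
""")


def select_keys(conn, tree_type):
    """Hypothetical helper: return the object keys matching a one-off query."""
    rows = conn.execute("""
        SELECT trees.treenumber
        FROM trees
        JOIN types ON trees.species = types.species
        WHERE types.type = ?
        ORDER BY trees.treenumber
    """, (tree_type,))
    return [row[0] for row in rows]


# One-off use: no view and no separate linking step, just a list of keys.
coniferous = select_keys(conn, "coniferous")
print(coniferous)  # [1, 3]

# Ongoing use: the same join stored as a view, as in the quoted example.
conn.execute("""
    CREATE VIEW v1 AS
    SELECT treenumber, type FROM trees, types
    WHERE trees.species = types.species
""")
from_view = [row[0] for row in conn.execute(
    "SELECT treenumber FROM v1 WHERE type = 'coniferous' ORDER BY treenumber")]
print(from_view)  # [1, 3]
```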

> And then link to v1 with
> 
> v.db.connect map table=v1 key=treenumber
> 
> > The "cat" value in the key (aka "layer") connects a vector object
> > with its corresponding record(s) in the attribute table. The GRASS 5 legacy
> > is that you can have a single text field that accompanies each key. This is
> > a leftover from when the only vector database easily available consisted of a
> > single integer field (cat) and single text field (label) for each vector
> > file. This mirrored the attribute structure of raster files. We could
> > probably drop the "label" field from the vector database structure with
> > little loss and some gain in understandability. We could then rename
> > "layers" and "cat" to "key fields" or "keys". This brings up another
> > terminological confusion. We commonly refer to a "cat" value. This is simply
> > the integer value within each key.
> > 
> > I think it would help a lot if we simply dropped "cat" and "layer" (keeping
> > a reference to these terms in the documentation for legacy data) and used
> > some version of key and value. For example:
> 
> I'm really not sure that the layer, cat terminology is really the 
> problem, here...
> 

Well, changing the terminology would certainly help, but the fundamental
problem was clearly defined by Moritz when he suggested that the
problem is the mixing of database concepts with GIS concepts. Thus my
suggestion to keep database functions in the database.

> > 
> >> The benefits of this system can be seen in a simple example.
> >> Right now if you have a point file and you want to create a
> >> buffer around those points all the attribute information is
> >> lost and must be patched back in.
> > 
> > This is only lost if the key (aka "layer"/"cat") is lost. If it is
> > maintained, one only has to run v.db.connect to re-establish the link
> > between the vector key and the attribute table.
> >
> >> By keeping attributes in
> >> the database it would be much simpler to have an attribute
> >> in the new vector linking back to the old key and thus
> >> effectively keeping all the old attribute information linked
> >> to the derived layer.
> >>
> > 
> > As per above. If the layer/cat is being lost, it needs to be fixed.
> 
> 
> v.buffer is a very special case, and I don't know how you would solve 
> the question in your system: the attribute information is lost since 
> v.buffer merges overlapping buffers into one single buffer. As 
> mentioned on the man page, there is no automatic way to know which cat 
> (or keyvalue) to give to this single merged buffer.

My solution would work in the sense that new areas created would be
given a new objectid whereas areas (technically centroids associated
with areas) up to the point of overlap would retain their original
objectid and thus would have direct access to any associated attribute
information through a simple query.
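
A simplified sketch of one possible reading of this follows. It
assumes each buffer is identified by the objectid of its source point,
that the overlapping groups are already known, and that each merged
area gets a fresh objectid while a lineage mapping records which
original objectids it came from. The function name and data layout are
hypothetical, and no geometry is involved.

```python
def assign_objectids(buffers, overlap_groups):
    """Assign objectids to buffer output areas.

    buffers: source objectids, one buffer per source point.
    overlap_groups: sets of source objectids whose buffers were merged.
    Returns (areas, lineage): the objectids of the output areas, and a
    mapping from each new objectid to the source objectids behind it.
    """
    merged = set().union(*overlap_groups)
    next_id = max(buffers, default=0) + 1

    # Buffers that overlap nothing keep their original objectid.
    areas = [oid for oid in buffers if oid not in merged]

    # Each merged area gets a new objectid; the lineage keeps the link
    # back to the original objects and hence to their attributes.
    lineage = {}
    for group in overlap_groups:
        lineage[next_id] = sorted(group)
        areas.append(next_id)
        next_id += 1
    return areas, lineage


areas, lineage = assign_objectids([1, 2, 3, 4], [{2, 3}])
print(areas)    # [1, 4, 5]
print(lineage)  # {5: [2, 3]}
```

Stored as a two-column table (new objectid, source objectid), the
lineage would let a simple join reach the attributes of the original
points from the derived areas.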

It is important to note that my objectid terminology only makes sense
if this value is singular and immutable.

T
-- 
Trevor Wiens 
twiens at interbaun.com

The significant problems that we face cannot be solved at the same 
level of thinking we were at when we created them. 
(Albert Einstein)