[GRASS-dev] Re: [GRASS-user] RE: Problem querying layers other than '1' in gis.m

Tue Sep 26 22:06:10 EDT 2006

I agree.

Michael
__________________________________________
Michael Barton, Professor of Anthropology
School of Human Evolution & Social Change
Center for Social Dynamics & Complexity
Arizona State University

phone: 480-965-6213
fax: 480-965-7671
www: http://www.public.asu.edu/~cmbarton

> From: Moritz Lennert <mlennert at club.worldonline.be>
> Date: Tue, 26 Sep 2006 20:27:11 +0200 (CEST)
> To: Trevor Wiens <twiens at interbaun.com>
> Cc: Michael Barton <michael.barton at asu.edu>, "''grassuser at grass.itc.it' '"
> <grassuser at grass.itc.it>, GRASS-DEV <grass-dev at grass.itc.it>
> Subject: Re: [GRASS-dev] Re: [GRASS-user] RE: Problem querying layers other
> than '1' in gis.m
> 
> On Tue, September 26, 2006 15:37, Trevor Wiens wrote:
>> On Tue, 26 Sep 2006 11:32:38 +0200
>> Moritz Lennert wrote:
>> 
>>> Michael has already said most of what I wanted to say, but some small
>>> additions.
>>> 
>>> Michael Barton wrote:
>>>>> What seems much more natural to me is
>>>>> leaving the attribute management to the database where more
>>>>> elegant tools exist. Thus grass modules, instead of having a
>>>>> layer option, need an input key and possibly an output key
>>>>> option depending on the module. If no key field is specified
>>>>> (which would be an attribute in the table linked to the
>>>>> vector file), then all objects in the vector file are
>>>>> processed. However, if a key is specified, it is used to query the
>>>>> vector file for a list of objects for processing. In the case where the
>>>>> same vector object has two cats, the vector attribute tables
>>>>> will have to have a one to many relationship from the vector
>>>>> file to the attribute table. Now modules could also allow a
>>>>> query specification to allow for complex querying across
>>>>> multiple keys and attributes, but output would probably have
>>>>> to be limited to a series of key fields (most likely only
>>>>> integers)
>>>> 
>>>> Really, this is exactly what "layers" are now. AFAICT, the biggest problem
>>>> is in the terminology. Each "layer" is an integer key field in database
>>>> terminology. Multiple layers simply means multiple key fields, each of which
>>>> can be linked with an attribute table using v.db.connect.
>>> 
>> 
>> No, not really. What I describe is functionality identical to current
>> layers, but the critical distinction is that all the attribute
>> information can be managed in the database independent of GRASS or
>> without GRASS even running. Right now the cat values for a vector
>> object can only be accessed through GRASS. If GRASS built a simple
>> table for each vector object with a cat (or perhaps more clearly
>> named an objectid) and a user defined key as well as allowing users to
>> add other keys to that table, then there would be no need to run GRASS
>> to update attribute information for those objects. For example, let's say
>> you have a series of weather sites which would have an incremental
>> objectid and a user defined key such as a stationid. If you want to
>> be able to occasionally interpolate precipitation surfaces from these
>> sites, then, since all the attribute information is accessible independent
>> of GRASS (I envision using PostgreSQL in this case), you can write your
>> application in whatever environment you like and access the database
>> outside of GRASS and add and edit new time data as needed. Then when it
>> is time to create your new surface you fire up GRASS, do your
>> multi-table query without any call to v.db.connect because it is no
>> longer needed, and get the result. Done.
> 
> I don't really understand this argument. Why can't you do exactly this
> with GRASS today ? First of all why do you need a separate objectid and
> stationid if each station is represented by one object (let's say a point)
> ? You could just use the cat value of each object. You would then have a
> table in PostgreSQL in which you have these cat values (possibly in a
> column you could call stationid) and all other attributes you would like in this same
> table. If you get new data for the stations you can add it to the table
> without having to go through GRASS. Then when you enter grass, this new
> information is available as long as your map remains linked to that table.
> 
> The only thing you cannot do currently (if I'm not mistaken) is use
> aggregate queries on that table if you have more than one row for each
> station. But I don't think that this is due to the general data model of
> GRASS, but rather to the fact that it is not implemented.
> 
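
To make the limitation concrete, here is a minimal sketch of the kind of aggregate query meant above: several rows per station collapsed into one value per cat. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table name precip and its columns (cat, obs_date, mm) are made up for illustration and are not part of GRASS.

```python
import sqlite3

# In-memory SQLite stands in for the PostgreSQL database discussed above.
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE precip (cat INTEGER, obs_date TEXT, mm REAL)")
conn.executemany(
    "INSERT INTO precip VALUES (?, ?, ?)",
    [
        (1, "2006-09-01", 4.0),
        (1, "2006-09-02", 6.0),
        (2, "2006-09-01", 1.5),
        (2, "2006-09-02", 2.5),
    ],
)

# Collapse the many rows per station into a single row per cat.
station_totals = conn.execute(
    "SELECT cat, SUM(mm) FROM precip GROUP BY cat ORDER BY cat"
).fetchall()
print(station_totals)
```

The result has exactly one row per cat, which is the shape a module would need in order to attach a value to each station.
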
> 
> [...]
> 
>>>> Again, except for the key fields confusingly labeled as "layers" and one
>>>> other legacy feature from GRASS 5, all attribute management does stay in the
>>>> database.
>>> 
>>> Unless you use it in the way I suggested in an earlier mail, i.e. cat 1
>>> = coniferous, cat 2 = broadleaved, cat 10 = pine, etc.
>>> 
>>> The way it 'should' be used to stick with Trevor's suggestions is
>>> 
>>> cat 1 = tree number 1
>>> cat 2 = tree number 2
>>> 
>>> etc.
>>> 
>>> And then have a table with columns
>>> 
>>> treenumber, species, etc,
>>> 
>>> with possibly another table with
>>> 
>>> species, type
>>> 
>>> where type = coniferous or broadleaved
>>> 
>>> And then, if you need a map of coniferous and broadleaved trees, you
>>> create a view:
>>> 
>>> CREATE VIEW v1 AS SELECT treenumber, type FROM trees, types WHERE
>>> trees.species=types.species
>>> 
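
The two-table layout and the view above can be tried out end to end. The following is only a sketch, run against an in-memory SQLite database through Python's sqlite3 module rather than a real GRASS-linked database. The sample rows (pine, oak, spruce) are invented for illustration. The table names, column names and view definition follow the mail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One row per tree; treenumber plays the role of the cat value.
    CREATE TABLE trees (treenumber INTEGER PRIMARY KEY, species TEXT);
    -- Lookup table mapping each species to its type.
    CREATE TABLE types (species TEXT PRIMARY KEY, type TEXT);

    INSERT INTO trees VALUES (1, 'pine');
    INSERT INTO trees VALUES (2, 'oak');
    INSERT INTO trees VALUES (3, 'spruce');
    INSERT INTO types VALUES ('pine', 'coniferous');
    INSERT INTO types VALUES ('spruce', 'coniferous');
    INSERT INTO types VALUES ('oak', 'broadleaved');

    -- The view from the mail: one (treenumber, type) row per tree.
    CREATE VIEW v1 AS
        SELECT treenumber, type FROM trees, types
        WHERE trees.species = types.species;
    """
)

rows = conn.execute(
    "SELECT treenumber, type FROM v1 ORDER BY treenumber"
).fetchall()
print(rows)
```

Because the view exposes treenumber alongside type, it can be used like any other attribute table keyed on the cat value.
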
>> 
>> Why bother with v.db.connect at all? Just allow a query to be used to
>> select the vector object keys (cats) and let the module in question
>> work with that list.
> 
> I think this is potentially possible with the current model, just not
> implemented. The database drivers allow any kind of query you want, and it
> should not be too complicated to rewrite modules in a way to allow more
> arbitrary queries than just with the current 'where' option.
> 
> Any query which returns cat values allows you to then work with these cat
> values (I am currently reworking d.vect.chart to do just that) and
> it should, therefore, not be too difficult to imagine modules which allow
> you to define an arbitrary query and then fulfill their task on the
> basis of this query.
> 
> Depending on the use you make of it you obviously can have a problem if
> you have more than one object with the same cat value (I have that problem
> in d.vect.cat, for example), but there are ways to work around this.
> 
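
The idea of "any query which returns cat values" can be sketched in a few lines. This is an illustration only, not how any GRASS module is implemented. The helper cats_for_query, the stations table and its columns are hypothetical. SQLite is used through Python's sqlite3 module in place of a real GRASS database driver. How the resulting list would be handed to a module is deliberately left open.

```python
import sqlite3


def cats_for_query(conn, sql, params=()):
    """Run an arbitrary query and return the distinct values of its
    first column, sorted. The query is expected to select cat values."""
    return sorted({row[0] for row in conn.execute(sql, params)})


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stations (cat INTEGER, name TEXT, elevation REAL)")
conn.executemany(
    "INSERT INTO stations VALUES (?, ?, ?)",
    [
        (1, "valley", 120.0),
        (2, "ridge", 840.0),
        (3, "summit", 1510.0),
        (3, "summit", 1510.0),  # duplicate row for the same cat
    ],
)

cats = cats_for_query(
    conn, "SELECT cat FROM stations WHERE elevation > ?", (500,)
)
print(cats)
```

Collapsing duplicates into a set sidesteps the case of several rows carrying the same cat, but it does not address several geometries sharing one cat, which is the problem mentioned above.
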
>> I realize that many people not familiar with SQL
>> will find this difficult, but surely we could consider, as part of
>> upgrading the GUI front end, adding a simple query builder.
> 
> It should also be possible to offer both solutions.
> 
>> A view would be convenient for ongoing use, but shouldn't be necessary
>> for single time uses.
> 
> I agree totally.
> 
> [...]
> 
>> 
>> Well changing the terminology would certainly help, but the fundamental
>> problem was clearly defined by Moritz when he suggested that the
>> problem is mixing of database concepts with GIS concepts. Thus my
>> suggestion to keep database functions in the database.
> 
> Well GIS as such makes no sense if it is not understood as the link
> between geometries and data, so you always have to mix the two in one way
> or another. The question is more how to do this in a way which is the most
> efficient _and_ offers the most functionality.
> 
> 
>>> 
>>> v.buffer is a very special case, and I don't know how you would solve
>>> the question in your system: the attribute information is lost since
>>> v.buffer fuses overlapping buffers into one single buffer. As
>>> mentioned on the man page, there is no automatic way to know which cat
>>> (or keyvalue) to give to this single fused buffer.
>> 
>> My solution would work in the sense that new areas created would be
>> given a new objectid whereas areas (technically centroids associated
>> with areas) up to the point of overlap would retain their original
>> objectid and thus would have direct access to any associated attribute
>> information through a simple query.
> 
> Well, it should be no problem to reprogram v.buffer to do just that. Its
> current implementation works with the assumption that as you could
> potentially have overlaps, you treat all buffers as if they were overlaps,
> but you could obviously include some test in the code which treats buffers
> differentially. Again, I don't see how this is a problem of the model
> rather than of the implementation of a particular module.
> 
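
One way to picture "treating buffers differentially" is the simplified sketch below. It is not the v.buffer algorithm and it is coarser than the proposal quoted above: circular buffers around points that touch no other buffer keep their original id, while each group of overlapping buffers is given one new id. The function name assign_buffer_ids and the sample coordinates are hypothetical.

```python
import math


def assign_buffer_ids(points, radius):
    """points maps objectid -> (x, y). Returns objectid -> buffer id.

    Isolated buffers keep their objectid; every group of mutually
    overlapping buffers receives a single, newly allocated id.
    """
    ids = sorted(points)
    parent = {i: i for i in ids}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Two circular buffers overlap when their centres are closer
    # than twice the buffer radius.
    for n, a in enumerate(ids):
        for b in ids[n + 1:]:
            if math.dist(points[a], points[b]) < 2 * radius:
                parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)

    next_id = max(ids) + 1 if ids else 1
    result = {}
    for members in sorted(groups.values()):
        if len(members) == 1:
            result[members[0]] = members[0]  # no overlap: id retained
        else:
            for m in members:
                result[m] = next_id          # fused buffer: new id
            next_id += 1
    return result


buffer_ids = assign_buffer_ids({1: (0, 0), 2: (1, 0), 3: (10, 0)}, radius=1.0)
print(buffer_ids)
```

Here points 1 and 2 overlap and share a new id, while point 3 keeps its own id and so keeps direct access to its attribute row.
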
>> 
>> It is important to note that my objectid terminology only makes sense
>> if this value is singular and immutable.
> 
> This might actually be the fundamental point in the argument. Currently
> GRASS doesn't enforce this as a rule (well actually, IIUC, each object has
> its line number as a unique identifier, this is just not visible to the
> user). The question is whether it should enforce something like this, or
> whether the current model doesn't offer more flexibility by letting you
> limit yourself to unique ids for each object, while also allowing the use
> of non-unique ids or multiple ids.
> 
> Moritz
>