[GRASS-dev] what is the ideal way to store spatial data

Gerald Nelson gnelson at uiuc.edu
Tue Jan 1 11:50:29 EST 2008


I don't know enough to comment on the math issues specifically, but I would like to relate a conversation I had with John MacDonald of MacDonald Dettwiler (a big Canadian company that makes ground link stations etc.) while serving on an advisory panel to the US national remotely sensed data archive (it stores much of the Landsat data). I was pretty naive about what actually goes on in turning raw data as collected by a satellite into the various products that we all end up using. I was just interested in having a land cover/use data set and was arguing for the archive storing such a data set. He made two points. The first was similar to the one you make, which is that any manipulation of raw data introduces artifacts. The second was that it will always be cheaper to do processing in the future than it is today. So the data should always be archived in raw form.

The result of his logic is that the archive does in fact store data in raw form, along with the operating characteristics of the satellite that collected it. A related recommendation he made, which has not been followed as far as I can tell, is that you should also archive the algorithms of the day (with a time stamp), so that you can recreate the products, which are what usually get used.

So getting back to GRASS, it may be too much to ask of today's (and tomorrow's) CPUs to do processing on the fly. But I wouldn't want current processing constraints to be hard-wired into new versions of GRASS. Or at least I would encourage the developers to consider this issue. And I guess I would argue that the more usual user situation is one where the user knows less than the software, or at least less than the gurus who have written the software. I can guarantee that describes me!

Regards,

Jerry

---- Original message ----
>Date: Tue, 1 Jan 2008 11:10:43 +0000
>From: Glynn Clements <glynn at gclements.plus.com>  
>Subject: Re: [GRASS-dev] what is the ideal way to store spatial data  
>To: Gerald Nelson <gnelson at uiuc.edu>
>Cc: grass-dev at lists.osgeo.org
>
>
>Gerald Nelson wrote:
>
>> Since all spatial data are about describing a specific location on a
>> specific planet, usually earth, it would seem that the best way
>> conceptually to store data is with respect to a single easily defined
>> reference point such as the gravitational center of the planet. Any
>> location could then be measured with three values. x,y like latitude
>> and longitude, and z a distance measure from the reference point along
>> a ray.
>> 
>> Projections such as utm, etc, are about how to convert the 3-d data
>> described above into 2-d with a minimum of distortion. Given the speed
>> of modern computers this conversion process ought to be increasingly
>> easy to do on the fly, as needed.
>> 
>> The reason I raise this question is to ask the experts whether it
>> would make sense (for 7.x) to think of a single standard way of
>> storing data in grass and then all operations would do the conversions
>> as necessary? There are (at least) two advantages of this. One is
>> standardization of data storage in a form that is closest to a true
>> representation of the real world. A second is to reduce the potential
>> for confusion/mistakes when data are shared and the metadata are not,
>> or are inadequate. I am continually getting access to data where the
>> units are not clearly defined. But even if they are defined say as
>> some utm coordinate, there must be some error in measurement built in.
>
>Apart from wasting CPU time, conversion introduces error. Applying a
>non-affine transformation to a regular grid (i.e. raster) doesn't
>result in a regular grid. Applying a non-affine transformation to a
>straight line doesn't result in a straight line. Any spatial
>measurement which is constant for the original data (e.g. maximum
>spatial error) will cease to be constant if the data is projected.
>
>All things considered, the optimum form in which to store the data is
>the form in which the user chooses to store it. There will always be
>factors of which the user is aware but the software isn't.
>
>-- 
>Glynn Clements <glynn at gclements.plus.com>
Gerald Nelson
Professor, Dept. of Agricultural and Consumer Economics
University of Illinois, Urbana-Champaign
office: 217-333-6465 
cell: 217-390-7888
315 Mumford Hall
1301 W. Gregory
Urbana, IL 61801
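
For anyone who wants to see Glynn's point about non-affine transformations in numbers, here is a minimal Python sketch. It is only an illustration under stated assumptions: it uses a spherical Mercator formula with an assumed Earth radius, and the names mercator, affine, row_spacings and chord_deviation are made up for this example (they are not GRASS or PROJ functions). It compares what an affine map and a Mercator projection do to evenly spaced raster rows and to a segment that is straight in lon/lat.

```python
import math

R = 6371000.0  # assumed spherical Earth radius in metres (illustrative)


def mercator(lon_deg, lat_deg):
    """Spherical Mercator: a non-affine map from (lon, lat) to (x, y)."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def affine(lon_deg, lat_deg):
    """An arbitrary affine map (scale, shear, shift), used for comparison."""
    x = 2.0 * lon_deg + 0.5 * lat_deg + 10.0
    y = -1.0 * lon_deg + 3.0 * lat_deg - 5.0
    return x, y


def row_spacings(transform, lats, lon=0.0):
    """Spacing between consecutive transformed rows along one meridian."""
    ys = [transform(lon, lat)[1] for lat in lats]
    return [b - a for a, b in zip(ys, ys[1:])]


def chord_deviation(transform, p, q):
    """Distance of the transformed midpoint of segment p-q from the
    straight line through the transformed endpoints."""
    (px, py), (qx, qy) = transform(*p), transform(*q)
    mx, my = transform((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
    cross = (qx - px) * (my - py) - (qy - py) * (mx - px)
    return abs(cross) / math.hypot(qx - px, qy - py)


if __name__ == "__main__":
    lats = [0, 10, 20, 30, 40, 50, 60]  # a regular grid in latitude
    print("affine row spacings:  ", row_spacings(affine, lats))
    print("mercator row spacings:", row_spacings(mercator, lats))

    p, q = (0.0, 0.0), (60.0, 60.0)  # a straight segment in lon/lat
    print("affine chord deviation:  ", chord_deviation(affine, p, q))
    print("mercator chord deviation:", chord_deviation(mercator, p, q))
```

Under the affine map the row spacings stay equal and the midpoint stays on the chord. Under Mercator the rows spread apart toward the pole and the midpoint leaves the chord, so a grid or line stored in one system has to be resampled or densified in the other, which is where the error comes in.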

