[GRASS5] Re: [Fwd: whinging about GRASS again]
Markus Neteler
neteler at itc.it
Tue Feb 1 10:27:23 EST 2005
On Tue, Feb 01, 2005 at 09:23:06PM +1300, Hamish wrote:
> Russell:
...
> > Second, when they
> > import something into $HOME/.grass, look at its bounding box. If it's
> > 0,0 through (say) 2048,2048, then it's an xy projection. If it's
> > 45,-75 through 44,-74, it's lat/lon. If it's howevermany hundred
> > thousand by howevermany million, it's UTM. Prompt them for the
> > projection, and use the inferred value as the default. That's now the
> > projection for everything in $HOME/.grass.
>
> So what if the user is in Europe or Asia? XY or Lat-lon?
> I'm in New Zealand, we use a couple different howevermany million
> projections here and a couple of different map datums; UTM isn't used
> much if at all.
>
> The very important point is this: it is much better to make no choice
> at all rather than to start making incorrect assumptions. This way the
> user knows where the error is and what question has to be answered.
> It's a very important and well demonstrated point. Many disasters.
>
> With respect to setting locations automatically from GeoTIFFs by
> default: I've got a CD here with about 50 important maps, all with
> bogus/incorrect metadata. I don't think this is so unusual; upstream
> data sources of specialist items often have less than perfect quality
> control. Just in my one case yes, but the problem exists, and a new user
> is never going to be able to know what to trust.
> I am reminded of Excel vs. Matlab in taking an average of a series of
> data points. Excel will take the average regardless of the number of
> NaN cells; Matlab will cough blood and make you explicitly tell it
> that's what you really really want to do. Ease of use vs. imposed
> correctness isn't always a bad thing.
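Hamish's bounding-box objection can be sketched roughly as follows. This is a hypothetical Python illustration, not GRASS code; the function names and the three coordinate-system families are assumptions made up for the example. The idea is to list what the numbers cannot rule out and then refuse to pick one without the user.

```python
def plausible_coordinate_systems(west, south, east, north):
    """List every coordinate-system family the bounding box cannot rule out."""
    # Plain xy and projected coordinates can take almost any values,
    # so the numbers alone never exclude them.
    candidates = ["xy", "projected"]
    # Lat/lon is only possible if the box fits inside the valid degree ranges.
    if -180 <= west <= east <= 180 and -90 <= south <= north <= 90:
        candidates.append("lat/lon")
    return candidates


def choose_coordinate_system(bbox, user_choice=None):
    """Refuse to guess: the user must confirm one of the remaining candidates."""
    candidates = plausible_coordinate_systems(*bbox)
    if user_choice is None:
        raise ValueError(f"ambiguous bounding box, please choose one of {candidates}")
    if user_choice not in candidates:
        raise ValueError(f"{user_choice!r} is not compatible with this bounding box")
    return user_choice
```

A small scanned image with the box (0, 0, 100, 80) passes the lat/lon range check just as well as a real lat/lon map does, so the bounding box can only narrow the list; it cannot answer the question.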
Just an off-topic addition from bioinformatics about what happens
when programs decide the data structure/type/format on their own.
Have a look at this article (full text online):
"Mistaken Identifiers: Gene name errors can be introduced
inadvertently when using Excel in bioinformatics"
Zeeberg et al., BMC Bioinformatics 2004, 5:80
doi:10.1186/1471-2105-5-80
http://www.biomedcentral.com/1471-2105/5/80
"Abstract
Background
When processing microarray data sets, we recently noticed that some
gene names were being changed inadvertently to non-gene names.
Results
A little detective work traced the problem to default date format conversions
and floating-point format conversions in the very useful Excel program package.
The date conversions affect at least 30 gene names; the floating-point
conversions affect at least 2,000 if Riken identifiers are included. These
conversions are irreversible; the original gene names cannot be recovered.
...
For example, the tumor suppressor DEC1 [Deleted in Esophageal Cancer 1] [3]
was being converted to '1-DEC.'
...
For example, the RIKEN identifier "2310009E13" was converted irreversibly
to the floating-point number "2.31E+13."
"
[3] http://www.biomedcentral.com/1471-2105/5/80/figure/F1
(more screenshots at the left of the main article page)
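The floating-point case is easy to reproduce in miniature. The following is a hypothetical Python sketch of naive type guessing, not a model of what Excel does internally; `guess_type` is a name invented for the example.

```python
def guess_type(cell):
    """Naive auto-conversion: anything that parses as a number becomes a float."""
    try:
        return float(cell)
    except ValueError:
        # Not numeric, so the text is kept as it is.
        return cell


riken_id = "2310009E13"
guessed = guess_type(riken_id)

# The identifier happens to look like scientific notation, so it is
# silently turned into a number.
assert isinstance(guessed, float)

# Writing the number back out does not give the original identifier.
assert str(guessed) != riken_id

# Two different strings collapse onto the same number, so the original
# text cannot be recovered from the converted value.
assert guess_type("23100090E12") == guessed

# Ordinary text is left alone in this sketch (date guessing is not modelled).
assert guess_type("DEC1") == "DEC1"
```

Keeping such columns as text unless the user explicitly asks for a conversion avoids the problem, at the cost of one more question to answer.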
To me this sounds like a disaster.
So, please, let's avoid such automated rubbish in GRASS.
Markus