[GRASSLIST:227] Re: Reading/writing maps

Roger Bivand Roger.Bivand at nhh.no
Tue May 27 13:32:49 EDT 2003


On Tue, 27 May 2003, David Finlayson wrote:

> I suppose I should move to the developer list to keep from boring 
> everyone not interested in Python, but, I never considered myself a 
> developer...just a user trying to get things done.

Aren't we all!! In fact, without "users trying to get things done", there 
wouldn't be free software for researchers, and if we do it ourselves 
(asking the "experts" when necessary), a surprising amount gets done!

> 
> The ESRI ASCII header defines a NODATA value, typically -99999.  My 
> class stores this value (along with the rest of the header information) 
> as an attribute of the class.  Otherwise, my Python class has no 
> knowledge of Null values and will happily calculate these values just 
> like real data.  Since I used arrays from the numeric package in python 
> for efficiency it is a requirement that the whole raster by a single 
> type (i.e., floating point, integer, etc.) so I can't simply substitute 
> a Null value in for the -99999.
> 
>From the NumPy FAQ:

"What about dealing with missing observations?

An optional package "MA", distributed with Numpy and documented in the 
manual, is designed to be an almost drop-in replacement for Numeric, 
allowing you to do operations and function calls without involving array 
entries that are invalid, including sums, averages, division, etc. Note 
that MA differs from Numpy in that slices are copies, not references."

> There would be a penalty for checking each read for null values before 
> processing.  I ran into a similar problem when I wrote code to get 
> values using either row and column notation or northing and easting.  I 
> wanted to check that each entry was within the bounds of the grid before 
> attempting the operation.  When I ran the code on a large dataset, I 
> found that the code spent a huge amount of time in the bounds checking 
> methods.  It was a performance killer.  But then again, when I code, I 
> have a knack for finding the worst possible way of doing things.  (I am 
> a geologist with 1 class in numerical methods.  Very dangerous!!)
> 
Dangerous is only when you are not as careful as you obviously are. But 
missing values are representable, both missing data, and others that 
"drop out" in computation (NaNs, Infs), all of which are well-supported in 
floating point. Yes, it encumbers computation, but also guards against 
false assumptions. In fact there are only a very few functions you need, 
say G_set_c_null_value() to set the GRASS NULLs, and G_is_null_value() to 
test looping along each row before you assign it to your array in Python, 
obviously substituting the value your Python module wants for a missing 
data value. You may find it takes time making the assignments, but that 
avoiding pointless computation may save you time later on - of course 
depending on the problem. 

Roger

> Maybe later this summer I can take a stab at writing a real python 
> interface and see if people like it enough to invest energy into this. 
> I will need help with the C code however.
> 
> Cheers,
> 
> David
> 
> Russell Nelson wrote:
> > Glynn Clements writes:
> >  > One question: how do you handle nulls?
> > 
> > I don't know how David handles it, but Python has a native 'None' value.
> > 
> >  > For handling rasters, you wouldn't need to use all that many
> >  > functions; e.g. r.mapcalc only uses 49 GRASS functions.
> > 
> > Still, you never know what people are going to want to do in Python.
> > Better to just wrap everything, and let people at it.
> > 
> 
> 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: Roger.Bivand at nhh.no




More information about the grass-user mailing list