[GRASS-dev] adding lib_arraystats

Wed Feb 13 16:04:34 EST 2008

voice: 480-965-6262; fax: 480-965-7671
www: http://www.public.asu.edu/~cmbarton

On Feb 13, 2008, at 12:50 PM, grass-dev-request at lists.osgeo.org wrote:

>>> What would be particularly useful is if it's possible for GRASS to  
>>> use
>>> R functions on small amounts of data. E.g. r.series and  
>>> r.resamp.stats
>>> both compute aggregates over relatively small amounts of data.
>>
>> This is essentially what I had in mind when I first posted the  
>> suggestion--
>> using R code on simple arrays of data.
>
> This is essentially what classInterval() needs:
>
> initiate the R backend and workspace;
> put the vector of doubles into the workspace;
> load the classInt package into the workspace;
> run the classInterval function with the argument values;
> collect the break values vector from the output object;
> optionally continue to collect a vector of strings giving the RGB  
> values;
> optionally generate an empirical cumulative distribution function plot
>  of the data showing the class intervals and chosen colours as PNG;
> terminate the R backend;
>
> This could be done as a shell script using temporary files (could be  
> binary),
> from Python on supported platforms, on Windows from Python via  
> (D)COM, or using
> the R shared object. None of them will be as fast as an in-GRASS  
> solution,
> although R backend startup times and package loading are very much  
> faster now
> than previously.
>
>>
>>> If it was practical to have e.g. "r.series method=R expression=...",
>>> that would be much more useful than having to start R and load
>>> potentially hundreds of rasters into memory.
>>
>> Right-- the "not-in-memory" features of GRASS are extremely valuble.
>
> Since R operations are vectorised, it might be possible to pass a  
> block of
> values (say k rows, p columns, where p is the number of input  
> rasters) and do
> apply() on it to get k rows back, but the blocking would be on the  
> GRASS side.
> But this would be for specialist things, I expect.
>
>>
>>> r.resamp.stats only works on a single map, but as it's essentially a
>>> down-sampling tool, it's not unreasonable to use it on very large
>>> input maps.
>>
>> I like this idea. Use of a trimmed mean or other such R specialty  
>> would be
>> very nice here.
>
> Some of the specialites are rather sophisticated, but others are more
> sophisticated than needed in GIS on a regular basis. It may be that  
> the R
> functionality is more use here in supporting the implementation of  
> code within
> GRASS by permitting checking that both end up with largely the same  
> output for
> the same input.
>
> Roger

FWIW, we are going to have to require the Python numpy (or numeric)  
package along with standard Python to get all the GUI features we now  
have. A number of these functions are available in numpy. Hence, they  
would be scriptable in Python without R. Might be worth considering as  
an alternative for some things.

Michael