[GRASS-dev] adding lib_arraystats
Michael Barton
michael.barton at asu.edu
Wed Feb 13 16:04:34 EST 2008
voice: 480-965-6262; fax: 480-965-7671
www: http://www.public.asu.edu/~cmbarton
On Feb 13, 2008, at 12:50 PM, grass-dev-request at lists.osgeo.org wrote:
>>> What would be particularly useful is if it's possible for GRASS to
>>> use
>>> R functions on small amounts of data. E.g. r.series and
>>> r.resamp.stats
>>> both compute aggregates over relatively small amounts of data.
>>
>> This is essentially what I had in mind when I first posted the
>> suggestion--
>> using R code on simple arrays of data.
>
> This is essentially what classInterval() needs:
>
> initiate the R backend and workspace;
> put the vector of doubles into the workspace;
> load the classInt package into the workspace;
> run the classInterval function with the argument values;
> collect the break values vector from the output object;
> optionally continue to collect a vector of strings giving the RGB
> values;
> optionally generate an empirical cumulative distribution function plot
> of the data showing the class intervals and chosen colours as PNG;
> terminate the R backend;
>
> This could be done as a shell script using temporary files (could be
> binary),
> from Python on supported platforms, on Windows from Python via
> (D)COM, or using
> the R shared object. None of them will be as fast as an in-GRASS
> solution,
> although R backend startup times and package loading are very much
> faster now
> than previously.
>
>>
>>> If it was practical to have e.g. "r.series method=R expression=...",
>>> that would be much more useful than having to start R and load
>>> potentially hundreds of rasters into memory.
>>
>> Right-- the "not-in-memory" features of GRASS are extremely valuble.
>
> Since R operations are vectorised, it might be possible to pass a
> block of
> values (say k rows, p columns, where p is the number of input
> rasters) and do
> apply() on it to get k rows back, but the blocking would be on the
> GRASS side.
> But this would be for specialist things, I expect.
>
>>
>>> r.resamp.stats only works on a single map, but as it's essentially a
>>> down-sampling tool, it's not unreasonable to use it on very large
>>> input maps.
>>
>> I like this idea. Use of a trimmed mean or other such R specialty
>> would be
>> very nice here.
>
> Some of the specialites are rather sophisticated, but others are more
> sophisticated than needed in GIS on a regular basis. It may be that
> the R
> functionality is more use here in supporting the implementation of
> code within
> GRASS by permitting checking that both end up with largely the same
> output for
> the same input.
>
> Roger
FWIW, we are going to have to require the Python numpy (or numeric)
package along with standard Python to get all the GUI features we now
have. A number of these functions are available in numpy. Hence, they
would be scriptable in Python without R. Might be worth considering as
an alternative for some things.
Michael
More information about the grass-dev
mailing list