[GRASS-stats] Re: Accessing grass raster data from R via library
and not via file
Rainer M Krug
r.m.krug at gmail.com
Mon Jan 16 10:02:42 EST 2012
Hi Roger
possibly you have seen the attached conversation which I posted on
grass-stats mailing list.
I did some profiling, and it seems that (ignoring the plugin) the
import of GRASS data via the bin (useGDAL=TRUE) is the fastest.
I thought that reading the data was the bottleneck, but it seems
(if I interpret the profiling correctly) that the bottleneck is
not the reading (see the attached PDF of the profile data, produced
with Hadley's profr library).
The bottleneck really seems to be the function SpatialGridDataFrame.
Do you think there would be considerable scope to streamline this
function? I assume it is rather complex, so I must admit I haven't
looked at it yet.
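As background to the profile: validObject and validityMethod belong to R's S4 machinery, and new() runs a class's validity function whenever slots are supplied. The following is only a toy sketch of that mechanism using the base methods package; ToyGrid, its slots and the counter are hypothetical and are not taken from sp.

```r
library(methods)

# Counter used only to make the validity calls visible
validity_calls <- 0L

# ToyGrid is a hypothetical stand-in class, not the sp implementation
setClass(
  "ToyGrid",
  representation(values = "numeric", dims = "integer"),
  validity = function(object) {
    validity_calls <<- validity_calls + 1L
    if (length(object@values) != prod(object@dims)) {
      return("number of values does not match dims")
    }
    TRUE
  }
)

# Supplying slots to new() triggers the validity function
g <- new("ToyGrid", values = runif(6), dims = c(2L, 3L))
print(validity_calls)

# An inconsistent object is rejected by the same check
bad <- tryCatch(
  new("ToyGrid", values = runif(5), dims = c(2L, 3L)),
  error = function(e) conditionMessage(e)
)
print(bad)
```

Whether such checks could safely be reduced or skipped for SpatialGridDataFrame is a separate question that this sketch does not answer.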
The other question is:
I import all my data into GRASS, and all the spatial data I use in R is
raster data of the same cell size and extent.
So I could actually import the GRASS data as a matrix instead of a
SpatialGridDataFrame. Remembering discussions stating that data.frames
are considerably slower than matrices of the same extent, I guess this
could make many things faster in my simulations.
Is there an easy way that I could read grass raster as matrices into
R? I remember some argument as.image in the function read.asciigrid
which returned the matrix as a picture - is something similar
available to read data from GRASS, or could something like that easily
be implemented?
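To make the matrix idea concrete, here is a minimal sketch in base R of reading a headerless binary raster into a plain matrix. It assumes the file holds nrows * ncols cell values written row by row as 4-byte floats in native byte order; the file is simulated with writeBin, and nrows, ncols and the cell values are hypothetical (in practice the dimensions would come from the GRASS region). It still goes through a file, so it only illustrates the matrix part of the question.

```r
# Hypothetical dimensions; in practice taken from the GRASS region settings
nrows <- 3L
ncols <- 4L

# Stand-in for an exported map: 12 cell values written row by row as 4-byte floats
cells <- as.numeric(1:12)
path <- tempfile(fileext = ".bin")
writeBin(cells, path, size = 4L)

# Read the values back and fold them into a matrix, filling by row
con <- file(path, "rb")
values <- readBin(con, what = "numeric", n = nrows * ncols, size = 4L)
close(con)
m <- matrix(values, nrow = nrows, ncol = ncols, byrow = TRUE)
unlink(path)

print(m)
```

The byrow = TRUE argument matters because R stores matrices column by column, while the assumed file layout is row by row.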
Cheers,
Rainer
###########################################################
Profiling info:
> summaryRprof("readRAST.out")
$by.self
                     self.time self.pct total.time total.pct
"validityMethod"          0.06    27.27       0.20     90.91
"getGridIndex"            0.04    18.18       0.04     18.18
"asMethod"                0.02     9.09       0.14     63.64
"apply"                   0.02     9.09       0.02      9.09
".getClassFromCache"      0.02     9.09       0.02      9.09
".local"                  0.02     9.09       0.02      9.09
"slot<-"                  0.02     9.09       0.02      9.09
"system"                  0.02     9.09       0.02      9.09
$by.total
                       total.time total.pct self.time self.pct
"readRAST6"                  0.22    100.00      0.00     0.00
"validityMethod"             0.20     90.91      0.06    27.27
"anyStrings"                 0.20     90.91      0.00     0.00
"initialize"                 0.20     90.91      0.00     0.00
"new"                        0.20     90.91      0.00     0.00
"readBinGrid"                0.20     90.91      0.00     0.00
"SpatialGridDataFrame"       0.20     90.91      0.00     0.00
"validObject"                0.20     90.91      0.00     0.00
"asMethod"                   0.14     63.64      0.02     9.09
"as"                         0.14     63.64      0.00     0.00
"nrow"                       0.10     45.45      0.00     0.00
"SpatialPixels"              0.10     45.45      0.00     0.00
"SpatialGrid"                0.08     36.36      0.00     0.00
"SpatialPoints"              0.06     27.27      0.00     0.00
"getGridIndex"               0.04     18.18      0.04    18.18
"is"                         0.04     18.18      0.00     0.00
"apply"                      0.02      9.09      0.02     9.09
".getClassFromCache"         0.02      9.09      0.02     9.09
".local"                     0.02      9.09      0.02     9.09
"slot<-"                     0.02      9.09      0.02     9.09
"system"                     0.02      9.09      0.02     9.09
"as<-"                       0.02      9.09      0.00     0.00
".bboxCoords"                0.02      9.09      0.00     0.00
"coordinates"                0.02      9.09      0.00     0.00
"execGRASS"                  0.02      9.09      0.00     0.00
"getClassDef"                0.02      9.09      0.00     0.00
"standardGeneric"            0.02      9.09      0.00     0.00
"t"                          0.02      9.09      0.00     0.00
$sample.interval
[1] 0.02
$sampling.time
[1] 0.22
###########################################################
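The summary figures can be traced back to the raw samples in the attached readRAST.out, where each line is one call stack, sampled every 0.02 s, with the innermost call first. A minimal sketch of how self and total counts are derived, using three of the eleven sample lines; count_by_name is a hypothetical helper and this is not the actual summaryRprof implementation:

```r
# Three sample lines copied from readRAST.out (innermost call first)
samples <- c(
  '"system" "execGRASS" "readRAST6"',
  paste('"validityMethod" "anyStrings" "validObject" "initialize"',
        '"initialize" "new" "SpatialGridDataFrame" "readBinGrid" "readRAST6"'),
  paste('"getGridIndex" "initialize" "initialize" "new" "SpatialPixels"',
        '"asMethod" "as" "nrow" "validityMethod" "anyStrings" "validObject"',
        '"initialize" "initialize" "new" "SpatialGridDataFrame"',
        '"readBinGrid" "readRAST6"')
)
interval <- 20000 / 1e6  # sample.interval is given in microseconds

# Split each line into function names and drop the quotes
stacks <- lapply(strsplit(samples, " ", fixed = TRUE),
                 function(s) gsub('"', "", s, fixed = TRUE))

# Hypothetical helper: named vector of counts per function name
count_by_name <- function(x) {
  tab <- table(x)
  setNames(as.vector(tab), names(tab))
}

# self: the function at the top of the stack when the sample was taken
self_n <- count_by_name(vapply(stacks, function(s) s[1], character(1)))
# total: the function appears anywhere in the stack, counted once per sample
total_n <- count_by_name(unlist(lapply(stacks, unique)))

self_time <- self_n * interval
total_time <- total_n * interval
print(sort(total_time, decreasing = TRUE))
```

Applied to all eleven samples, the same counting finds validityMethod in 10 of 11 stacks, which is the 90.91% reported under $by.total.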
###########################################################
Versions:
> sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: i686-pc-linux-gnu (32-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] profr_0.2 proftools_0.0-4 spgrass6_0.7-4 XML_3.6-2
[5] rgdal_0.7-5 sp_0.9-91
loaded via a namespace (and not attached):
[1] digest_0.5.1 grid_2.14.0 lattice_0.19-33 tools_2.14.0
> version
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status
major          2
minor          14.0
year           2011
month          10
day            31
svn rev        57496
language       R
version.string R version 2.14.0 (2011-10-31)
>
###########################################################
On 14/01/12 10:14, Rainer M Krug wrote:
> On 13/01/12 22:19, Hamish wrote:
>> Rainer wrote:
>>> I am writing to you directly, as you are the author of the
>>> r.out.mat.
>
>> dangerous! my inbox is flooded and I might easily miss it :)
>
>
> OK - I'll push it to grass-stats so that others can get involved.
>
>>> I have a simulation, which heavily relies on reading and
>>> writing spatial data from GRASS into R and vice versa.
>>> Profiling indicated that the actual readRAST6() from R takes
>>> up a large proportion of the simulation. So I looked into
>>> alternatives, and as it is not that difficult to return data
>>> from C to R, I thought about that option.
>
>> please post to the statsgrass mailing list. Roger
>
> Done
>
>> may have a good idea about what to do. ISTR there were two
>> options, an older & slower but more compatible way and a newer
>> and faster way. you could select them via a switch. maybe your
>> system is using the old+slow way?
>
> I tried it with the useGDAL option set to TRUE and FALSE, and the
> plugin is not installed on the cluster where I finally want to run
> the simulation (in addition, I would like to avoid GDAL due to
> continuing hassles with installations etc. on the cluster;
> locally GDAL works nicely).
>
> I actually did some benchmarks between useGDAL TRUE and FALSE and
> there was a difference in the 10% range, but still not fast
> enough.
>
>
>>> The module r.out.mat actually extracts all the data from GRASS
>>> and saves it (spatial header information and the actual map
>>> data).
>
>> r.out.bin would be more appropriate, as Matlab format saves
>> column-wise not row-wise due to its Fortran origins, so you have
>> to read the entire map into memory, which doesn't work for
>> massive datasets.
>
> Thanks - I'll look at that one. The "part of map" is a useful
> consideration.
>
>
>> both are very simple, see doc/raster/r.example/ in the source
>> code. no need for libraries or anything.
>
>
>> header data is available from r.info flags and $MAPSET/cellhd/
>
> The thing is I would like to avoid using files and access GRASS
> maps as directly as possible, which, as I understand it at the
> moment, would be via C function(s) in GRASS whose return values are
> the data, and a C function(s) in R which calls these GRASS C
> functions to provide the data to R.
>
>
>
>> the alternate idea is to use R's gdal package to read the GRASS
>> raster maps directly.
>
> As mentioned earlier, I would like to avoid GDAL due to
> installation problems - but I should try this.
>
> I understand that the idea of GDAL is to convert spatial data
> between different formats, but there is (as far as I know) no way to
> get GRASS data directly into R without creating an intermediate
> format, so GRASS (disk) --> GDAL (memory) --> new format (disk) -->
> R
>
> resulting in reading from disk, converting, writing to disk and
> reading the converted data again.
>
> No problem if I am doing analysis, but if I am running my
> simulation model, data is read and written to GRASS all the time,
> and this intermediate step costs lots of time.
>
> I could do some benchmarks with the spearfish dataset next week to
> show what I mean.
>
> Cheers and thanks,
>
> Rainer
>
>
>
>> double check your region bounds and resolution are correct, or it
>> could waste a lot of time.
>
> I'll do, but I am 99% sure that they are correct and as I need
> them.
>
>
>
>
>> regards, Hamish
>
>
--
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)
Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa
Tel : +33 - (0)9 53 10 27 44
Cell: +33 - (0)6 85 62 59 98
Fax : +33 - (0)9 58 10 27 44
Fax (D): +49 - (0)3 21 21 25 22 44
email: Rainer at krugs.de
Skype: RMkrug
-------------- next part --------------
sample.interval=20000
"system" "execGRASS" "readRAST6"
".local" "coordinates" "standardGeneric" "coordinates" "SpatialPoints" "asMethod" "as" "validityMethod" "anyStrings" "validObject" "initialize" "initialize" "new" "SpatialGrid" "SpatialGridDataFrame" "readBinGrid" "readRAST6"
"apply" "t" ".bboxCoords" "SpatialPoints" "is" "SpatialPixels" "asMethod" "as" "nrow" "validityMethod" "anyStrings" "validObject" "initialize" "initialize" "new" "SpatialGrid" "SpatialGridDataFrame" "readBinGrid" "readRAST6"
"getGridIndex" "initialize" "initialize" "new" "SpatialPixels" "asMethod" "as" "nrow" "validityMethod" "anyStrings" "validObject" "initialize" "initialize" "new" "SpatialGrid" "SpatialGridDataFrame" "readBinGrid" "readRAST6"
"slot<-" "asMethod" "as" "validityMethod" "anyStrings" "validObject" "initialize" "initialize" "new" "SpatialPixels" "asMethod" "as" "nrow" "validityMethod" "anyStrings" "validObject" "initialize" "initialize" "new" "SpatialGrid" "SpatialGridDataFrame" "readBinGrid" "readRAST6"
".getClassFromCache" "getClassDef" "is" "validObject" "initialize" "initialize" "new" "SpatialPoints" "asMethod" "as" "validityMethod" "anyStrings" "validObject" "initialize" "initialize" "new" "SpatialGridDataFrame" "readBinGrid" "readRAST6"
"validityMethod" "anyStrings" "validObject" "initialize" "initialize" "new" "SpatialGridDataFrame" "readBinGrid" "readRAST6"
"validityMethod" "anyStrings" "validObject" "initialize" "initialize" "new" "SpatialGridDataFrame" "readBinGrid" "readRAST6"
"validityMethod" "anyStrings" "validObject" "initialize" "initialize" "new" "SpatialGridDataFrame" "readBinGrid" "readRAST6"
"getGridIndex" "initialize" "initialize" "new" "SpatialPixels" "asMethod" "as" "nrow" "validityMethod" "anyStrings" "validObject" "initialize" "initialize" "new" "SpatialGridDataFrame" "readBinGrid" "readRAST6"
"asMethod" "as<-" "initialize" "initialize" "new" "SpatialPixels" "asMethod" "as" "nrow" "validityMethod" "anyStrings" "validObject" "initialize" "initialize" "new" "SpatialGridDataFrame" "readBinGrid" "readRAST6"
-------------- next part --------------
A non-text attachment was scrubbed...
Name: prof_readRAST6.pdf
Type: application/pdf
Size: 5025 bytes
Desc: not available
Url : http://lists.osgeo.org/pipermail/grass-stats/attachments/20120116/47eeb97d/prof_readRAST6.pdf