[GRASS-user] Aggregation of massive number of raster layers with r.series

Pierre Roudier pierre.roudier at gmail.com
Thu May 11 14:30:28 PDT 2017


Thanks all,

I ended up having a script that tiles my overall region (using v.mkgrid). I
then loop through the tiles, and create a set of subregions on the fly
(using the save= option available for g.region). So in the end I have tiles
represented as a set of regions, named "region_[1-n]".
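
For illustration only, the tiling step could be sketched as below. This is not the script described above: instead of v.mkgrid it computes the tile bounds with plain shell arithmetic, and it only prints the g.region commands rather than running them. The helper name tile_regions, the integer bounds and the 2 x 2 layout are assumptions made for the example; real tile edges would also need to line up with the raster cell grid.

```shell
#!/bin/sh
# Dry run: print one g.region command per tile of a rectangular extent.
# Hypothetical helper: tile_regions NORTH SOUTH EAST WEST NROWS NCOLS
# Assumes integer bounds (map units) that divide evenly into the tiles.
# Tiles are numbered row by row, starting from the north-west corner.
tile_regions() {
    north=$1; south=$2; east=$3; west=$4; nrows=$5; ncols=$6
    tile_h=$(( (north - south) / nrows ))   # tile height in map units
    tile_w=$(( (east - west) / ncols ))     # tile width in map units
    i=1
    row=0
    while [ "$row" -lt "$nrows" ]; do
        n=$(( north - row * tile_h ))
        s=$(( n - tile_h ))
        col=0
        while [ "$col" -lt "$ncols" ]; do
            w=$(( west + col * tile_w ))
            e=$(( w + tile_w ))
            # save= stores the tile bounds as a named region
            echo "g.region n=$n s=$s e=$e w=$w save=region_$i --overwrite"
            i=$(( i + 1 ))
            col=$(( col + 1 ))
        done
        row=$(( row + 1 ))
    done
}

# Example: a 2 x 2 tiling of a 1000 x 1000 extent.
tile_regions 1000 0 1000 0 2 2
```

Piping the printed lines to sh inside a GRASS session would create the saved regions; printing them first makes it easy to check the bounds before touching the mapset.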

I then use the WIND_OVERRIDE env variable to process the tiles:

- On my personal machine, I can use GNU parallel:

g.list type=region pat="region_*" | parallel WIND_OVERRIDE={} r.series \
  in=`g.list rast pat="temp_*" sep=","` out=tiled_{} method=quantile \
  quantile=0.95 --o

- BUT: on the cluster, I can't use GNU parallel, so I generate one script
per region, which essentially is a one liner:

WIND_OVERRIDE=region_n r.series in=`g.list rast pat="temp_*" sep=","` \
  out=tiled_region_n method=quantile quantile=0.95 --o

This script is launched silently using GRASS_BATCH_JOB.
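
As a rough sketch of how the per-region scripts could be generated, the helper below writes one small job script per region name. The helper name write_job_scripts, the ./jobs directory and the job_<region>.sh file names are hypothetical, chosen only for this example. Nothing GRASS-related is executed at generation time; the commands are only written into the files.

```shell
#!/bin/sh
# Write one job script per saved region (no GRASS command is run here).
# Hypothetical helper: write_job_scripts OUTDIR REGION...
write_job_scripts() {
    outdir=$1; shift
    mkdir -p "$outdir"
    for region in "$@"; do
        script="$outdir/job_$region.sh"
        {
            echo '#!/bin/sh'
            echo '# Collect the input maps once, as a comma-separated list.'
            echo 'maps=$(g.list type=raster pattern="temp_*" separator=comma)'
            echo "# WIND_OVERRIDE limits the computation to $region."
            echo "WIND_OVERRIDE=$region r.series input=\"\$maps\" output=tiled_$region method=quantile quantile=0.95 --overwrite"
        } > "$script"
        chmod +x "$script"
    done
}

# Example with three made-up region names; in a GRASS session the list
# would come from: g.list type=region pattern="region_*"
write_job_scripts ./jobs region_1 region_2 region_3
```

Each generated file is then the kind of script that GRASS_BATCH_JOB would point to, one per cluster job.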

My problem now is that I got errors because several GRASS scripts are
hitting the GRASS database at the same time:

Starting GRASS GIS...
ERROR: pierre.roudier is currently running GRASS in selected mapset
(file /projects/nesi00165/nobackup/modis/grassdata/modis_ts/PERMANENT/PERMANENT/.gislock
found). Concurrent use not allowed.
You can force launching GRASS using -f flag (note that you need
permission for this operation). Have another look in the processor
manager just to be sure...
Exiting...

My question: in this instance, is it safe to use the -f flag, given these
different GRASS instances are not writing the same dataset to the DB?


On 21 April 2017 at 20:44, Blumentrath, Stefan <Stefan.Blumentrath at nina.no>
wrote:

> Hi Pierre,
>
> tiling should speed things up significantly if you process the tiles in
> parallel, provided you have multiple cores and I/O is not the bottleneck
> (e.g. a slow network connection to the data).
> Care has to be taken with the region settings, though.
>
> See e.g.:
> https://grasswiki.osgeo.org/wiki/Parallel_GRASS_jobs#Working_with_tiles
>
> Cheers
> Stefan
>
> ________________________________________
> From: grass-user <grass-user-bounces at lists.osgeo.org> on behalf of
> Pierre Roudier <pierre.roudier at gmail.com>
> Sent: Friday, 21 April 2017 00:49
> To: grass-user
> Subject: [GRASS-user] Aggregation of massive number of raster layers with
>      r.series
>
> Hi,
>
> I am trying to compute the 95th percentile of a massive grid (12+
> million pixels) for a massive number of layers (~2500 layers).
>
> I am doing the aggregation using r.series on our cluster running grass
> 7.2, but of course it takes ages (21% there after 3 days).
>
> - I tried to tile the process, but it doesn't seem to help much.
>
> - Is there any benefit for me to switch to t.rast.aggregate? My
> understanding was that it was a wrapper around r.series.
>
> - Does anyone have a fancy trick to make the aggregation go faster
> (parallelisation)?
>
> Cheers,
>
> Pierre
> _______________________________________________
> grass-user mailing list
> grass-user at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/grass-user
>