[GRASS-dev] News from pygrass modules
Pietro
peter.zamb at gmail.com
Fri Apr 5 06:57:05 PDT 2013
Hi folks!
Recently I worked on a small refactoring of the pygrass interface to
the GRASS modules.
The bad news is that the pygrass shortcuts (r.*, g.*, etc. through the
MetaModule class) at the moment are not working due to an infinite
loop that I hope to fix soon... :-)
The good news is that I added a new GridModule class, which is meant
to work only with the raster and imagery GRASS modules and lets the
user parallelize the module's work.
At the moment, on my machine, it takes around 20% less time to execute
the same module on the same region.
How does the GridModule work?
It simply creates several new mapsets, runs your command inside each
mapset on a smaller region (tile), then patches all the results
together and removes those mapsets.
In this way the bottleneck is basically your hard disk...
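To make the tiling idea concrete, here is a minimal sketch of how a
region could be cut into overlapping tiles. It is not the pygrass
implementation: the function name split_region and the half-open
(row0, row1, col0, col1) box layout are assumptions made for
illustration; only the width, height and overlap values come from the
test script at the end of this message.

```python
def split_region(rows, cols, width, height, overlap):
    """Cut a rows x cols region into tiles of at most height x width cells.

    Each tile is returned as a half-open box (row0, row1, col0, col1),
    grown by `overlap` cells on every side and clipped to the region.
    Hypothetical helper, not part of pygrass.
    """
    tiles = []
    for r0 in range(0, rows, height):
        for c0 in range(0, cols, width):
            r1 = min(r0 + height, rows)
            c1 = min(c0 + width, cols)
            tiles.append((max(r0 - overlap, 0), min(r1 + overlap, rows),
                          max(c0 - overlap, 0), min(c1 + overlap, cols)))
    return tiles


# Same numbers as the test: 64k x 64k cells, full-width tiles of 8000 rows.
tiles = split_region(64000, 64000, width=64000, height=8000, overlap=2)
print(len(tiles), "tiles, first two:", tiles[0], tiles[1])
```

Each box would then become the region of one temporary mapset in which
the module runs.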
How did I test it?
I tested it on my PC, with 8 virtual cores and an SSD, on a region of
64k x 64k cells, applying the r.slope.aspect module to a random map.
During the execution the RAM consumption was around 1 GB. I used
r.slope.aspect for the test because it requires an overlap of 2 pixels
to be computed and has more than one output to patch.
The plain GRASS module r.slope.aspect takes 3043 seconds (~50');
r.slope.aspect run through the GridModule class takes 2400 seconds
(~40'), so around 20% less... which is not bad...
But looking more carefully at the process, it seems that running the
module on each mapset takes around 6-7 minutes; the remaining part of
the time (~33 minutes) is spent patching the tiles into the final map.
This last operation is done in Python, so I think we can get further
speed improvements by rewriting this Python function in C.
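For reference, the arithmetic behind these figures can be written out
as a small sketch. The three timings are the ones quoted above; the
speed-up factor passed to total_with_faster_patch is purely
hypothetical, it is not a measurement of what a C rewrite would give.

```python
T_MODULE = 3043.0     # seconds, plain r.slope.aspect (quoted above)
T_GRID = 2400.0       # seconds, r.slope.aspect through GridModule (quoted above)
T_PATCH = 33 * 60.0   # seconds, approximate time spent patching (quoted above)

# Relative time saved by GridModule compared with the plain module.
saving = (T_MODULE - T_GRID) / T_MODULE

# Share of the GridModule run that goes into patching the tiles.
patch_share = T_PATCH / T_GRID


def total_with_faster_patch(factor):
    """Total GridModule time if only the patch step ran `factor` times faster.

    `factor` is a hypothetical speed-up, not a measured value.
    """
    return (T_GRID - T_PATCH) + T_PATCH / factor


print("time saved: %.1f%%" % (100 * saving))
print("patch share of the run: %.1f%%" % (100 * patch_share))
print("total with a 10x faster patch: %.0f s" % total_with_faster_patch(10))
```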
I'm not using r.patch because I want to exclude all the pixels that
are overlapped.
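A rough sketch of what "patching without the overlapped pixels" means,
on plain Python lists instead of GRASS raster maps. The helper names
core_box and patch, the box layout, and the toy 6 x 4 region are all
assumptions for illustration; the real pygrass function works on
raster rows and may be organised differently. The sketch also assumes
that tiles are larger than the overlap.

```python
def core_box(r0, r1, c0, c1, rows, cols, overlap):
    """Strip the overlap from a padded box, except on the region border."""
    return (r0 + (overlap if r0 > 0 else 0),
            r1 - (overlap if r1 < rows else 0),
            c0 + (overlap if c0 > 0 else 0),
            c1 - (overlap if c1 < cols else 0))


def patch(rows, cols, padded_tiles, overlap):
    """Copy only the core cells of every tile into one output grid."""
    out = [[None] * cols for _ in range(rows)]
    for (r0, r1, c0, c1), data in padded_tiles:
        k0, k1, q0, q1 = core_box(r0, r1, c0, c1, rows, cols, overlap)
        for r in range(k0, k1):
            out[r][q0:q1] = data[r - r0][q0 - c0:q1 - c0]
    return out


# Toy example: a 6 x 4 region split into two tiles of 3 rows, overlap 1.
rows, cols, overlap = 6, 4, 1
source = [[r * 10 + c for c in range(cols)] for r in range(rows)]
boxes = [(0, 4, 0, 4), (2, 6, 0, 4)]

padded_tiles = []
for r0, r1, c0, c1 in boxes:
    data = [row[c0:c1] for row in source[r0:r1]]
    k0, k1, q0, q1 = core_box(r0, r1, c0, c1, rows, cols, overlap)
    # Mark the overlap cells as bad (-1) to mimic unreliable edge values.
    for r in range(r0, r1):
        for c in range(c0, c1):
            if not (k0 <= r < k1 and q0 <= c < q1):
                data[r - r0][c - c0] = -1
    padded_tiles.append(((r0, r1, c0, c1), data))

patched = patch(rows, cols, padded_tiles, overlap)
print(patched == source)
```

None of the cells marked as bad reach the output, because every output
cell is taken from the one tile whose core contains it.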
This part is highly unstable, therefore it is not for the faint of heart.
Do you have other ideas to speed up the process?
Have fun!
Pietro
PS: How to test the new GridModule class?
{{{
#!python
# -*- coding: utf-8 -*-
#------------------------------------------------------------------
# set global variables
#set the region dimension
ROWS = 64 * 10 ** 3
COLS = 64 * 10 ** 3
# raster inputs parameters
RNAME = 'field'
#------------------------------------------------------------------
# import from standard library
import time
# import from grass
from grass.pygrass.modules import Module
from grass.pygrass.modules import GridModule
#------------------------------------------------------------------
# Create the map
print "set region"
Module('g.region', s='0', n=str(ROWS), w='0', e=str(COLS), res='1', flags='p')
print "generate random raster...", RNAME
Module('r.mapcalc', expression="%s = rand(0., 100.)" % RNAME, overwrite=True)
#------------------------------------------------------------------
# Start the test
print "start using Module"
tmstart = time.time()
Module('r.slope.aspect',  # the GRASS module name
       elevation='field', slope='slope', aspect='aspect',  # GRASS parameters
       overwrite=True)
tmend = time.time()
print 'Module need: ', tmend - tmstart
print "start using GridModule"
tgstart = time.time()
grd = GridModule('r.slope.aspect',  # the GRASS module name
                 width=COLS, height=8000, overlap=2,  # set the tiles
                 processes=None, split=False, debug=False,  # set other params
                 elevation='field', slope='slope', aspect='aspect',  # GRASS params
                 overwrite=True)
grd.run()
tgend = time.time()
print 'Grid need: ', tgend - tgstart
}}}