[GRASS-dev] News from pygrass modules

Pietro peter.zamb at gmail.com
Fri Apr 5 06:57:05 PDT 2013


Hi folks!

Recently I worked on a small refactoring of the pygrass interface to
the GRASS modules.

The bad news is that the pygrass shortcuts (r.*, g.*, etc. through the
MetaModule class) at the moment are not working due to an infinite
loop that I hope to fix soon... :-)

The good news is that I added a new GridModule class. It is supposed
to work only with the raster and imagery GRASS modules, and it allows
the user to parallelize the module's work.

At the moment, on my machine, running the same module in the same
region through the GridModule takes around 20% less time.


How does the GridModule work?

It simply creates several new mapsets, runs your command inside each
mapset on a smaller region (a tile), then patches all the results
together and removes all the mapsets.
In this way the bottleneck is basically your hard disk...
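
To make the tiling step concrete, here is a minimal sketch in plain Python. It is not the pygrass code: the function name split_region and the use of row/column indices instead of region coordinates are my own assumptions, chosen only to show how a region can be cut into tiles that carry a small overlap.

```python
def split_region(rows, cols, width, height, overlap=0):
    """Cut a rows x cols grid into tiles (hypothetical helper, not pygrass).

    Returns a list of (core, padded) pairs. Each box is
    (row_start, row_end, col_start, col_end) with exclusive ends.
    'core' is the part of the tile kept in the final map, 'padded' is the
    core grown by `overlap` cells and clipped to the grid: this is the
    smaller region where the module would actually run.
    """
    tiles = []
    for r0 in range(0, rows, height):
        r1 = min(r0 + height, rows)
        for c0 in range(0, cols, width):
            c1 = min(c0 + width, cols)
            core = (r0, r1, c0, c1)
            padded = (max(r0 - overlap, 0), min(r1 + overlap, rows),
                      max(c0 - overlap, 0), min(c1 + overlap, cols))
            tiles.append((core, padded))
    return tiles


# Same numbers as in the test script: full-width strips, 8000 rows each
tiles = split_region(64000, 64000, width=64000, height=8000, overlap=2)
```

With these numbers the region is cut into 8 horizontal strips, so each strip could go to one of the 8 virtual cores.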


How did I test it?

I tested it on my PC, with 8 virtual cores and an SSD, on a region of
64k x 64k cells, applying the r.slope.aspect module to a random map.
During the execution the RAM consumption was around 1 GB. I used
"r.slope.aspect" for the test because it requires an overlap of
2 pixels to be computed and has more than one output to patch.

The plain GRASS module r.slope.aspect takes 3043 seconds (~50');
r.slope.aspect run through the GridModule class takes 2400 seconds
(~40'), therefore around 20% less... which is not bad...
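
The arithmetic behind the "around 20%" figure, spelled out with the two timings above (the variable names are only for illustration):

```python
plain_s = 3043.0   # r.slope.aspect run directly, in seconds
grid_s = 2400.0    # r.slope.aspect run through GridModule, in seconds

# Fraction of the run time saved by GridModule (about 0.21)
saving = (plain_s - grid_s) / plain_s

# Equivalent speed-up factor (about 1.27x)
speedup = plain_s / grid_s
```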

But looking more carefully at the process, it seems that running the
module on each mapset takes around 6-7 minutes; the remaining part of
the time (~33 minutes) is spent patching the tiles into the final map.
This last operation is done in Python, therefore I think that we can
have further speed improvements by rewriting this Python function in C.
I'm not using r.patch because I want to exclude all the pixels that
are overlapped.
This part is highly unstable, therefore it is not for the faint of heart.
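
As an illustration of what "patching while excluding the overlapped pixels" means, here is a small sketch on plain Python lists. It is not the function used in pygrass: patch_tiles and the box layout (row_start, row_end, col_start, col_end, exclusive ends) are assumptions made up for this example.

```python
def patch_tiles(rows, cols, tiles, results):
    """Assemble tile results into one grid, dropping the overlap cells.

    tiles   -- list of (core, padded) boxes (hypothetical layout)
    results -- results[i] is a 2D list covering the padded box of tiles[i]
    """
    out = [[None] * cols for _ in range(rows)]
    for (core, padded), data in zip(tiles, results):
        r0, r1, c0, c1 = core
        pr0, _, pc0, _ = padded
        for r in range(r0, r1):
            # shift from grid indices to indices inside the padded tile
            src = data[r - pr0]
            out[r][c0:c1] = src[c0 - pc0:c1 - pc0]
    return out


# Tiny 4 x 3 example: two strips of 2 rows, each with 1 row of overlap
source = [[r * 10 + c for c in range(3)] for r in range(4)]
tiles = [((0, 2, 0, 3), (0, 3, 0, 3)),
         ((2, 4, 0, 3), (1, 4, 0, 3))]
# Pretend each tile result is simply the source cut on the padded box
results = [[row[pc0:pc1] for row in source[pr0:pr1]]
           for _, (pr0, pr1, pc0, pc1) in tiles]
patched = patch_tiles(4, 3, tiles, results)
```

Only the core of each tile is copied, so the overlap rows computed near the tile borders never reach the final map.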

Do you have other ideas to speed up the process?

Have fun!

Pietro

ps: How to test the new GridModule class?

{{{
#!python
# -*- coding: utf-8 -*-
#------------------------------------------------------------------
# set global variables
#set the region dimension
ROWS = 64 * 10 ** 3
COLS = 64 * 10 ** 3
# raster inputs parameters
RNAME = 'field'


#------------------------------------------------------------------
# import from standard library
import time

# import from grass
from grass.pygrass.modules import Module
from grass.pygrass.modules import GridModule


#------------------------------------------------------------------
# Create the map
print "set region"
Module('g.region', s='0', n=str(ROWS), w='0', e=str(COLS), res='1', flags='p')
print "generate random raster...", RNAME
Module('r.mapcalc', expression="%s = rand(0., 100.)" % RNAME, overwrite=True)


#------------------------------------------------------------------
# Start the test
print "start using Module"
tmstart = time.time()
Module('r.slope.aspect',                                # the GRASS module name
       elevation='field', slope='slope', aspect='aspect',    # GRASS parameters
       overwrite=True)
tmend = time.time()
print 'Module need: ', tmend - tmstart

print "start using GridModule"
tgstart = time.time()
grd = GridModule('r.slope.aspect',                      # the GRASS module name
                 width=ROWS, height=8000, overlap=2,            # set the tiles
                 processes=None, split=False, debug=False,    # set other params
                 elevation='field', slope='slope', aspect='aspect',  # GRASS params
                 overwrite=True)
grd.run()
tgend = time.time()
print 'Grid need: ', tgend - tgstart

}}}

