[GRASS-user] region issues using python multiprocessing.Pool()

Hamish hamish_b at yahoo.com
Thu Jul 11 03:44:45 PDT 2013


Eric wrote:
> Thanks for the responses, I'm looking forward to trying out your
> suggestion, Pietro. Pygrass looks interesting, but I'm a little
> confused about its relationship with grass.script.

Pietro can certainly explain it better than me, but in general,
grass.script works in the traditional way of calling grass modules to do
what those grass modules do (think pythonized bash scripts), while
pygrass abstracts a lot of that away and pythonizes the grass experience.
See:
  http://grasswiki.osgeo.org/wiki/GRASS_SoC_Ideas_2012/High_level_map_interaction

> One of the reasons I'm trying to use multiprocessing is because it
> side steps the GIL issue by not using threads, according to the
> documentation (http://docs.python.org/2/library/multiprocessing.html).
> I didn't think the grass.start_command() would use all the available
> CPUs. I've used multiprocessing with the gdal python api and it made
> use of all my cpu cores.

Each call to grass.start_command() starts just one command in one
process and returns without waiting for it to finish, so any parallelism
comes from calling it several times. For example, i.landsat.rgb starts
three parallel processes: one for each of the Red, Green, and Blue bands.
So the maximum speedup for that part of the module is just under 3x.
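
A minimal sketch of that "start several, then wait" pattern, assuming
plain subprocess.Popen as a stand-in for grass.start_command() (which
also hands back a Popen object). The band names and the echo-style child
command are placeholders for illustration only, not what i.landsat.rgb
actually runs:

```python
import subprocess
import sys


def start_band_jobs(bands):
    """Start one background process per band; return the Popen handles."""
    procs = {}
    for band in bands:
        # Placeholder child command: a Python one-liner that echoes the
        # band name. In a GRASS script this would be a start_command() call.
        procs[band] = subprocess.Popen(
            [sys.executable, "-c", "import sys; print(sys.argv[1])", band],
            stdout=subprocess.PIPE,
        )
    return procs


def wait_for_jobs(procs):
    """Block until every process has finished; collect code and output."""
    results = {}
    for band, proc in procs.items():
        out, _ = proc.communicate()
        results[band] = (proc.returncode, out.decode().strip())
    return results


if __name__ == "__main__":
    jobs = start_band_jobs(["red", "green", "blue"])  # all three run at once
    print(wait_for_jobs(jobs))
```

All three children are already running before the first wait begins,
which is where the "just under 3x" ceiling comes from.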

But it isn't hard to launch more processes that way so that all cores
are used; see the r3.in.xyz.py script and the v.surf.icw.py addon script
for examples. That method only works well for certain kinds of problems,
but it has very low overhead when it fits.
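
One way to scale that idea to all cores, sketched here under the
assumption that the jobs are independent of each other: launch them in
batches no larger than the CPU count and wait for each batch before
starting the next. This is not taken from the scripts named above;
run_in_batches() and the dummy child commands are hypothetical, and
subprocess.Popen again stands in for grass.start_command():

```python
import multiprocessing
import subprocess
import sys


def run_in_batches(commands, workers=None):
    """Run commands in parallel batches; return exit codes in input order."""
    if workers is None:
        workers = multiprocessing.cpu_count()
    returncodes = []
    for start in range(0, len(commands), workers):
        # Start up to `workers` processes without waiting on any of them...
        batch = [subprocess.Popen(cmd) for cmd in commands[start:start + workers]]
        # ...then wait for the whole batch before launching the next one.
        returncodes.extend(proc.wait() for proc in batch)
    return returncodes


if __name__ == "__main__":
    # Dummy jobs: each child just exits with status 0.
    jobs = [[sys.executable, "-c", "pass"] for _ in range(8)]
    print(run_in_batches(jobs))
```

The only bookkeeping is a list of process handles, which keeps the
overhead low, but a slow job holds up the rest of its batch.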

My worry with splitting up a single loop by rows is that the problem
often becomes I/O bound, and the overhead costs are much, much larger.
But each module has its own characteristics, so by all means, use what
works.


Hamish

