[GRASS-user] region issues using python multiprocessing.Pool()

Thu Jul 11 20:12:39 PDT 2013

Thanks again Pietro and Hamish. After looking at the code for
use_temp_region(), I believe I was using it incorrectly--I should have
called it after setting the region instead of before. With that
change, out of 7 tiles (x 8 bands = 56 images to process in the test
area), 50 completed successfully and the remaining 6 had pixel value
ranges of +/- NaN.
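
A minimal sketch of the isolation idea behind a per-process region,
assuming only that GRASS modules honour the WIND_OVERRIDE environment
variable (a named, saved region used in place of the mapset's WIND
file). The child below is a plain Python stand-in for a GRASS module,
and the region names are hypothetical; nothing here calls GRASS itself.

```python
import os
import subprocess
import sys

# Stand-in for a GRASS module: a child process that reports which named
# region it would use. Real GRASS modules read WIND_OVERRIDE themselves.
CHILD_CODE = "import os; print(os.environ.get('WIND_OVERRIDE', 'WIND'))"


def region_seen_by_child(region_name):
    """Run one child with its own copy of the environment."""
    env = os.environ.copy()             # never mutate the parent's environment
    env["WIND_OVERRIDE"] = region_name  # hypothetical saved-region name
    out = subprocess.check_output([sys.executable, "-c", CHILD_CODE], env=env)
    return out.decode().strip()


# Hypothetical per-tile region names, e.g. saved beforehand with g.region.
tile_regions = ["tmp_region_tile_%d" % i for i in range(1, 4)]
seen = [region_seen_by_child(name) for name in tile_regions]
```

Because each child gets its own environment copy, one tile's region
setting cannot leak into another tile's process or into the parent.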

Since the NaNs sound like another region issue, I followed Hamish's
advice and looked at how the mapcalc_start() function was implemented
in the i.oif module and modified the calculateTOAR function to process
all 8 bands of an image that have the same region at once. This method is
nice for the multispectral imagery (56 images completed in under 30
seconds), but unfortunately it won't help when I need to do the pan
imagery with its much higher resolution. I think Pietro's method will
be helpful here but it will take more time to wrap my head around that
implementation. :) Now I need to find out why some bands have
reflectances that exceed 1.0 after fixing a bug in my mapcalc equation
(forgot to divide by the effective bandwidth in the top of atmosphere
radiance calculation)...
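
For reference, a sketch of the two-step conversion mentioned above, in
the form commonly published for commercial high-resolution sensors. The
sensor is not named in this thread, so the function names, parameter
names, and numbers below are illustrative assumptions, not the actual
calculateTOAR code or real calibration values.

```python
import math


def toa_radiance(dn, abs_cal_factor, effective_bandwidth):
    """Band-averaged spectral radiance; note the division by bandwidth."""
    return abs_cal_factor * dn / effective_bandwidth


def toa_reflectance(radiance, esun, sun_elevation_deg, earth_sun_dist_au=1.0):
    """Unitless top-of-atmosphere reflectance from spectral radiance."""
    solar_zenith = math.radians(90.0 - sun_elevation_deg)
    numerator = math.pi * radiance * earth_sun_dist_au ** 2
    return numerator / (esun * math.cos(solar_zenith))


# Made-up inputs, chosen only to exercise the functions.
radiance = toa_radiance(dn=500, abs_cal_factor=0.01, effective_bandwidth=0.05)
reflectance = toa_reflectance(radiance, esun=1500.0, sun_elevation_deg=60.0)
```

The denominator shrinks as the sun elevation drops, so for bright
pixels the result can arithmetically exceed 1.0 even when the equation
is coded correctly; units of the calibration factor and of the solar
irradiance also have to match.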

Thanks,
Eric

On Thu, Jul 11, 2013 at 5:44 AM, Hamish <hamish_b at yahoo.com> wrote:
> Eric wrote:
>> Thanks for the responses, I'm looking forward to trying out your
>> suggestion, Pietro. Pygrass looks interesting, but I'm a little
>> confused about its relationship with grass.script.
>
> Pietro can certainly explain it better than me, but in general,
> grass.script works in the traditional way of calling grass modules to do
> what those grass modules do (think pythonized bash scripts), while
> pygrass abstracts a lot of that away and pythonizes the grass experience.
> See:
>   http://grasswiki.osgeo.org/wiki/GRASS_SoC_Ideas_2012/High_level_map_interaction
>
>> One of the reasons I'm trying to use multiprocessing is because it
>> side steps the GIL issue by not using threads, according to the
>> documentation (http://docs.python.org/2/library/multiprocessing.html).
>> I didn't think the grass.start_command() would use all the available
>> CPUs. I've used multiprocessing with the gdal python api and it made
>> use of all my cpu cores.
>
> grass.start_command() will only start one command in one process at a
> time. For example, i.landsat.rgb starts three parallel processes: one for
> each of the Red, Green, and Blue bands. So maximum speedup for that part
> of the module is just under 3x.
>
> But it isn't hard to have it launch more processes so that it uses all
> cores; see the r3.in.xyz.py script and v.surf.icw.py addon script for examples of
> that. That method only works well with a certain kind of problem, but it
> offers very low overhead cost when it fits.
>
> My worry with splitting up a single loop by rows is that the problem
> often becomes I/O bound, and the overhead costs are much much larger. But
> each module has its own characteristics, so by all means, use what works.
>
>
> Hamish
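
The launch-then-wait pattern described for i.landsat.rgb (one process
per band, all started before any is waited on) can be sketched with the
standard library alone. This assumes only that a started command
behaves like a subprocess.Popen object; the child here is a plain
Python stand-in, not a GRASS module, and the band names are
placeholders.

```python
import subprocess
import sys

# Stand-in for a GRASS module call; each child just echoes its band name.
CHILD_CODE = "import sys; print('done ' + sys.argv[1])"

bands = ["red", "green", "blue"]

# Start every child first so they run concurrently...
procs = [
    subprocess.Popen([sys.executable, "-c", CHILD_CODE, band],
                     stdout=subprocess.PIPE)
    for band in bands
]

# ...then collect results; communicate() also waits for the child to exit.
results = []
for proc in procs:
    out, _ = proc.communicate()
    results.append((proc.returncode, out.decode().strip()))
```

With three children the best case is a bit under 3x, as noted above;
using more cores means starting more children, which only pays off when
the work splits into independent pieces.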