[GRASS-dev] Parallelize a job using multiprocess python library without destroying environmental variable
Vaclav Petras
wenzeslaus at gmail.com
Mon Jun 30 07:05:21 PDT 2014
On Mon, Jun 30, 2014 at 5:21 AM, Annalisa Minelli <annagrass6 at gmail.com>
wrote:
> Hi all,
> I'm attempting to parallelize a job in a Python script using the
> multiprocessing library in grass70.
> I had a look at the following links:
> http://grasswiki.osgeo.org/wiki/Parallel_GRASS_jobs
> and http://grasswiki.osgeo.org/wiki/Parallelizing_Scripts.
>
> I would like to work in the same location but in different mapsets because
> my jobs touch the region settings, but I don't know how to set separate
> mapset for separate jobs.
>
> So far I have discovered that these processes, if run in the same mapset,
> clear all the environment variables (GISDBASE, LOCATION, MAPSET), so that
> GRASS does not start anymore and I have to restore the .grass70/rc file.
>
> Can anyone give me a hint on how to set different mapsets for different jobs?
>
>
First, look at the PyGRASS GridModule [1] to see whether it can help you.
For the general case, there is unfortunately no API. From what I understand,
you have to create a "gisrc" file somewhere, then do something like
env = os.environ.copy() and set GISRC in that copy to your custom "gisrc".
Then you change the mapset and region by standard GRASS means, but you must
pass the `env` parameter to all command/module calls (env is used by Python
subprocess to set the environment just for that one process).
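A minimal sketch of the steps above, not a tested GRASS recipe: the helper name make_session_env and the paths are made up for illustration, and a plain Python child process stands in for a GRASS module so the sketch runs without GRASS. It only shows that the private "gisrc" reaches the child while os.environ of the parent stays untouched.

```python
import os
import subprocess
import sys
import tempfile


def make_session_env(gisdbase, location, mapset):
    """Write a private gisrc file and return (env, gisrc_path).

    Hypothetical helper: env is a copy of os.environ with GISRC set to
    the new file; os.environ itself is not modified.
    """
    fd, gisrc_path = tempfile.mkstemp(prefix="gisrc_", text=True)
    with os.fdopen(fd, "w") as gisrc:
        # session variables are stored as "KEY: value" lines
        gisrc.write("GISDBASE: %s\n" % gisdbase)
        gisrc.write("LOCATION_NAME: %s\n" % location)
        gisrc.write("MAPSET: %s\n" % mapset)
    env = os.environ.copy()
    env["GISRC"] = gisrc_path
    return env, gisrc_path


# Illustrative values only; one env (and one gisrc) per parallel job.
env, gisrc_path = make_session_env("/tmp/grassdata", "my_location", "job_1")

# Stand-in for a GRASS module call: the child just reports the GISRC it
# received. With GRASS, the same env would be passed as env=env to the
# command/module calls instead.
child_gisrc = subprocess.check_output(
    [sys.executable, "-c", "import os; print(os.environ['GISRC'])"],
    env=env,
).decode().strip()
```

Each job gets its own file and its own env, so nothing is shared through the parent's environment or through the default rc file.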
Note that GISRC, GISBASE and LOCATION are (system) environment variables,
while GISDBASE, LOCATION_NAME and MAPSET are GRASS GIS session/environment
variables and are stored in the "gisrc" file. I don't know what the
LOCATION variable is for (it contains the full path to the mapset).
I would be glad to hear what others think about this.
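To make the distinction concrete, here is a small sketch that reads the session variables from a "gisrc" file, which holds "KEY: value" lines. The function name read_gisrc and the sample values are invented for the example; inside a running session, g.gisenv is the usual way to query these.

```python
import os
import tempfile


def read_gisrc(path):
    """Parse a gisrc file of 'KEY: value' lines into a dict."""
    settings = {}
    with open(path) as gisrc:
        for line in gisrc:
            if ":" not in line:
                continue
            # split on the first colon only, so values may contain colons
            key, value = line.split(":", 1)
            settings[key.strip()] = value.strip()
    return settings


# Hypothetical session file; a real one is wherever GISRC points.
with tempfile.NamedTemporaryFile("w", suffix=".gisrc", delete=False) as tmp:
    tmp.write("GISDBASE: /home/user/grassdata\n")
    tmp.write("LOCATION_NAME: nc_spm\n")
    tmp.write("MAPSET: user1\n")

session = read_gisrc(tmp.name)
os.remove(tmp.name)
```

So os.environ tells you only where the file is (GISRC); the database, location and mapset come from the file contents.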
You can of course read the source code of GridModule, the rendering in wxGUI,
g.gui.animation, or the following snippet, but I can't promise that it will
be easy to understand, and there might be a lot of imperfections.
Vaclav
# imports this excerpt presumably relies on (aliases as used below);
# epsg_code and src_mapset_name are defined elsewhere in the original code
import os
import tempfile
import grass.script.core as gcore
import grass.script.setup as gsetup
from grass.pygrass.gis import Mapset

# we rely on the tmp dir having enough space for our map
tgt_gisdbase = tempfile.mkdtemp()
# this is not needed if we use mkdtemp but why not
tgt_location = 'r.out.png.proj_location_%s' % epsg_code
# because we are using PERMANENT we don't have to create mapset explicitly
tgt_mapset_name = 'PERMANENT'

src_mapset = Mapset(src_mapset_name)

# get source (old) and set target (new) GISRC environment variable
# TODO: setting environ only for child processes could be enough and it
# would enable (?) parallel runs
src_gisrc = os.environ['GISRC']
tgt_gisrc = gsetup.write_gisrc(tgt_gisdbase,
                               tgt_location, tgt_mapset_name)
# we should use a copy and pass it but then it would not be possible to
# use create_location
os.environ['GISRC'] = tgt_gisrc
if os.environ.get('WIND_OVERRIDE'):
    old_temp_region = os.environ['WIND_OVERRIDE']
    del os.environ['WIND_OVERRIDE']
else:
    old_temp_region = None
# these lines look good but anyway when developing the module
# switching location seemed fragile and on some errors (while running
# unfinished module) location was switched in the command line
try:
    # the function itself is not safe for other (background) processes
    # (e.g. GUI), however we already switched GISRC for us
    # and child processes, so we don't influence others
    gcore.create_location(dbase=tgt_gisdbase,
                          location=tgt_location,
                          epsg=epsg_code,
                          datum=None,
                          datum_trans=None)

    # Mapset object cannot be created if the real mapset does not exist
    tgt_mapset = Mapset(gisdbase=tgt_gisdbase, location=tgt_location,
                        mapset=tgt_mapset_name)
    # set the current mapset in the library
    # we actually don't need to switch when only calling modules
    # (right GISRC is enough for them)
    tgt_mapset.current()
...
[1] http://grass.osgeo.org/grass71/manuals/pygrass/modules_grid.html
> All the best,
> Annalisa