[GRASS-user] [GRASS-dev] Parallelize a job using multiprocess python library without destroying environmental variable

Vaclav Petras wenzeslaus at gmail.com
Mon Jun 30 07:05:21 PDT 2014


On Mon, Jun 30, 2014 at 5:21 AM, Annalisa Minelli <annagrass6 at gmail.com>
wrote:

> Hi all,
> I'm attempting to parallelize a job in a python script using multiprocess
> library in grass70.
> I had a look at the following links:
> http://grasswiki.osgeo.org/wiki/Parallel_GRASS_jobs
> and http://grasswiki.osgeo.org/wiki/Parallelizing_Scripts.
>
> I would like to work in the same location but in different mapsets because
> my jobs touch the region settings, but I don't know how to set separate
> mapset for separate jobs.
>
> So far I have discovered that these processes, if run in the same mapset,
> clear all the environmental variables (GISDBASE, LOCATION, MAPSET), so
> GRASS does not start anymore and I have to restore the .grass70/rc file.
>
> Can anyone give me a hint on how to set different mapsets for different jobs?
>
>
First, check whether the PyGRASS GridModule [1] can help you.

For the general case, there is unfortunately no API. From what I understand,
you have to create a "gisrc" file somewhere, then do something like env
= copy(os.environ) and set GISRC there to your custom "gisrc". Then you
change the mapset and region by standard GRASS means, but you must pass
the `env` parameter to all command/module calls (env is used by Python
subprocess to set the environment just for that one process).
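
To make the steps above concrete, here is a minimal sketch using only the
Python standard library. The function names (write_gisrc, job_environment,
run_in_job) and the paths are made up for illustration and are not a GRASS
API. The sketch assumes that the mapset named in the file already exists and
that the three "KEY: value" lines are all the "gisrc" file needs. Removing
WIND_OVERRIDE mirrors what the snippet later in this message does.

```python
import os
import subprocess
import sys
import tempfile


def write_gisrc(gisdbase, location, mapset, directory):
    """Write a private "gisrc" file for one job and return its path."""
    fd, path = tempfile.mkstemp(prefix="gisrc_", dir=directory, text=True)
    with os.fdopen(fd, "w") as gisrc:
        gisrc.write("GISDBASE: %s\n" % gisdbase)
        gisrc.write("LOCATION_NAME: %s\n" % location)
        gisrc.write("MAPSET: %s\n" % mapset)
    return path


def job_environment(gisrc_path):
    """Return a copy of os.environ whose GISRC points to gisrc_path."""
    env = os.environ.copy()
    env["GISRC"] = gisrc_path
    # do not inherit a temporary region from the parent process
    env.pop("WIND_OVERRIDE", None)
    return env


def run_in_job(command, env):
    """Run one command with the job's environment and return its output."""
    return subprocess.check_output(command, env=env).decode().strip()


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    # hypothetical database path, location and mapset names
    gisrc = write_gisrc("/path/to/grassdata", "my_location", "job_1", workdir)
    env = job_environment(gisrc)
    # stand-in for a GRASS module call: the child process sees the private
    # GISRC while os.environ of the parent stays untouched
    child = [sys.executable, "-c", "import os; print(os.environ['GISRC'])"]
    print(run_in_job(child, env))
```

Inside a real session you would replace the stand-in command with a module
call and give every parallel job its own "gisrc" file and mapset, so that no
two jobs write to the same settings.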

Note that GISRC, GISBASE and LOCATION are (system) environmental variables
while GISDBASE, LOCATION_NAME and MAPSET are GRASS GIS session/environment
variables and are stored in the "gisrc" file. I have no idea what the
LOCATION variable is for (it contains the full path to the mapset).
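
To see the difference, you can read the session variables straight from the
file that GISRC points to. This is only a sketch: read_gisrc is a made-up
helper, not part of GRASS, and it assumes the file consists of plain
"KEY: value" lines.

```python
import os


def read_gisrc(path):
    """Parse the "KEY: value" lines of a "gisrc" file into a dict."""
    settings = {}
    with open(path) as gisrc:
        for line in gisrc:
            if ":" not in line:
                continue
            # split only on the first colon so that values which contain
            # a colon themselves (e.g. Windows paths) stay intact
            key, value = line.split(":", 1)
            settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    # GISRC is a system environment variable; the session variables
    # live in the file it names
    gisrc_path = os.environ.get("GISRC")
    if gisrc_path:
        print(read_gisrc(gisrc_path))
```

In a running session, g.gisenv prints the same session variables.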

I would be glad to hear what others think about this.

You can of course read the source code of GridModule, the rendering in wxGUI,
g.gui.animation, or the following snippet, but I can't promise that it will
be easy to understand, and there might be a lot of imperfections.

Vaclav

    # we rely on the tmp dir having enough space for our map
    tgt_gisdbase = tempfile.mkdtemp()
    # this is not needed if we use mkdtemp but why not
    tgt_location = 'r.out.png.proj_location_%s' % epsg_code
    # because we are using PERMANENT we don't have to create mapset
    # explicitly
    tgt_mapset_name = 'PERMANENT'

    src_mapset = Mapset(src_mapset_name)

    # get source (old) and set target (new) GISRC environmental variable
    # TODO: set environ only for child processes could be enough and it
    # would enable (?) parallel runs
    src_gisrc = os.environ['GISRC']
    tgt_gisrc = gsetup.write_gisrc(tgt_gisdbase,
                                   tgt_location, tgt_mapset_name)
    # we should use a copy and pass it but then it would not be possible to
    # use create_location
    os.environ['GISRC'] = tgt_gisrc
    if os.environ.get('WIND_OVERRIDE'):
        old_temp_region = os.environ['WIND_OVERRIDE']
        del os.environ['WIND_OVERRIDE']
    else:
        old_temp_region = None
    # these lines look good but anyway when developing the module
    # switching location seemed fragile and on some errors (while running
    # an unfinished module) the location was switched in the command line

    try:
        # the function itself is not safe for other (background) processes
        # (e.g. GUI), however we already switched GISRC for us
        # and child processes, so we don't influence others
        gcore.create_location(dbase=tgt_gisdbase,
                              location=tgt_location,
                              epsg=epsg_code,
                              datum=None,
                              datum_trans=None)

        # Mapset object cannot be created if the real mapset does not exist
        tgt_mapset = Mapset(gisdbase=tgt_gisdbase, location=tgt_location,
                            mapset=tgt_mapset_name)
        # set the current mapset in the library
        # we actually don't need to switch when only calling modules
        # (right GISRC is enough for them)
        tgt_mapset.current()
...



[1] http://grass.osgeo.org/grass71/manuals/pygrass/modules_grid.html



> All the best,
> Annalisa