[GRASS-user] Multiple `r.proj` requests on the same raster map(s)

Markus Metz markus.metz.giswork at gmail.com
Sun Feb 25 07:50:41 PST 2018


On Sun, Feb 25, 2018 at 9:44 AM, Nikos Alexandris <nik at nikosalexandris.net>
wrote:
>
> * Markus Neteler <neteler at osgeo.org> [2018-02-24 23:00:32 +0100]:
>
>> On Sat, Feb 24, 2018 at 10:31 PM, Markus Metz
>> <markus.metz.giswork at gmail.com> wrote:
>>>
>>> On Sat, Feb 24, 2018 at 10:06 PM, Nikos Alexandris <nik at nikosalexandris.net>
>>>>
>>>> * Markus Metz <markus.metz.giswork at gmail.com> [2018-02-24 21:39:40 +0100]:
>>>>>
>>>>> On Sat, Feb 24, 2018 at 9:25 PM, Nikos Alexandris
>>>>> <nik at nikosalexandris.net>
>>
>> ...
>>>>
>>>> Looking at the job logs, I read lots of ".gislock" lines.
>>>> It might be some permission-related issue. I partially operated directly
>>>> (with my user-id) on many Locations.
>>>>
>>>> The operator of the scheduler has, naturally, another user-id. I wonder
>>>> if I should apply GRASS_SKIP_MAPSET_OWNER_CHECK=1 everywhere.
>>>
>>>
>>> No, you need to run each process in a unique temporary mapset.
>>
>>
>> Yes, only that works. Be sure to have a sufficiently long random
>> string to be used as temporary mapset name.
>>
>> You can, for example, use the output of
>> mktemp --dry-run
>>
>> and add the machine name to it (and maybe the current time stamp
>> cleaned for special chars) to avoid race conditions if you use a
>> shared network storage.
>
>
> Thanks M1/2.
>
>
>>> Once you have
>>> the final result, change the current mapset with g.mapset to the common
>>> mapset where final results should be stored and copy the final result from
>>> the temporary mapset to the current mapset (the mapset to hold the final
>>> results).
>>
>>
>> (we have processed terabytes of LST data like this :-)
>
>
> Just a pseudo-example: it would suffice then to,
>
> save current region

This does not work: at this point you are still outside GRASS.

> for loop over something
>    ...
>
>    CURRENT_MAPSET=$(g.mapset -p)

This does not work either: you are still outside GRASS.

>
>    # a temporary Mapset
>    RANDOM_STRING=$(mktemp --dry-run |cut -d"." -f2)

now you start GRASS ...
>    grass -c $RANDOM_STRING

--exec is missing here

... and are out of GRASS again

put this in a script to be called with grass -c ... --exec myscript.sh
-->
>
>    # do something
>    r.mask vector=VectorMap where="Attribute='Here'" &&
>    g.region zoom=MASK &&
>    r.zonal.stats cover=covermap base=basemap method=average output=outputmap
>
>    # back to "valid" Mapset
>    g.mapset $CURRENT_MAPSET
>
>    g.copy raster=outputmap@${RANDOM_STRING},outputmap
>    r.stats -acp in=outputmap out=report
>    r.mask -r
<--

> restore region

This does not work: you are outside GRASS again.

All outside GRASS, try to

1. create a script with the commands and parameters to be executed, e.g.
   myscript.sh
2. create a unique name for a temporary mapset (full path) and store it in an
   env var, e.g. TMPMAPSET
3. run grass -c $TMPMAPSET --exec myscript.sh
4. remove the temporary mapset simply with rm -fr $TMPMAPSET
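
The four steps above could be wrapped roughly as follows. This is only a sketch under assumptions: GISDBASE, LOCATION and GRASS_BIN are hypothetical defaults, the function names are made up for illustration, and the random suffix comes from mktemp -u (the short form of --dry-run) combined with host name, time stamp and PID as suggested earlier in the thread. Characters outside letters, digits and "_" are replaced to keep the mapset name simple.

```shell
#!/bin/sh
# Sketch only: GISDBASE, LOCATION and GRASS_BIN are hypothetical defaults.
GISDBASE="${GISDBASE:-$HOME/grassdata}"
LOCATION="${LOCATION:-mylocation}"
GRASS_BIN="${GRASS_BIN:-grass}"    # may be e.g. grass74 on your system

# Print a mapset name built from host name, time stamp, PID and a random
# suffix; every character outside [A-Za-z0-9_] is replaced by "_".
unique_mapset_name() {
    suffix=$(mktemp -u XXXXXXXXXX)
    printf 'tmp_%s_%s_%s_%s' "$(uname -n)" "$(date +%Y%m%d%H%M%S)" "$$" "$suffix" |
        tr -c 'A-Za-z0-9_' '_'
}

# Run one script in its own temporary mapset, then delete that mapset.
# The script itself must copy its results to the target mapset before it ends.
run_in_tmp_mapset() {
    script=$1
    TMPMAPSET="$GISDBASE/$LOCATION/$(unique_mapset_name)"
    "$GRASS_BIN" -c "$TMPMAPSET" --exec "$script"
    status=$?
    rm -rf "$TMPMAPSET"
    return "$status"
}

# Usage (not executed here): run_in_tmp_mapset ./myscript.sh
```

The script passed to --exec presumably needs to be executable (or be called through sh), and anything worth keeping has to be copied out of the temporary mapset before the function removes it.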

>>> Alternatively/additionally, don't use the script grassXY to start a GRASS
>>> session, instead define the GRASS environment with custom scripts (one for
>>> the GRASS version to use, one for the database/location/mapset to use). This
>>> avoids race conditions on an HPC system. A unique temporary mapset for each
>>> process helps to avoid all sorts of concurrent access problems.
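
A rough sketch of what such custom scripts might look like, split into one function per concern as described in the quoted paragraph. The function names and all paths are hypothetical; the sketch relies on the GISBASE and GISRC environment variables and the GISDBASE / LOCATION_NAME / MAPSET keys of the rc file, it assumes the mapset already exists, and a real setup may need further variables (for example for Python).

```shell
#!/bin/sh
# Sketch only: define a GRASS environment without the startup script.

# Select the GRASS version: $1 is the GRASS installation directory.
use_grass_version() {
    GISBASE=$1
    PATH="$GISBASE/bin:$GISBASE/scripts:$PATH"
    LD_LIBRARY_PATH="$GISBASE/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export GISBASE PATH LD_LIBRARY_PATH
}

# Select database ($1), location ($2) and mapset ($3).
# A per-process rc file keeps parallel jobs from sharing session state.
use_grass_mapset() {
    GISRC=$(mktemp "${TMPDIR:-/tmp}/grassrc.XXXXXX") || return 1
    printf 'GISDBASE: %s\nLOCATION_NAME: %s\nMAPSET: %s\n' \
        "$1" "$2" "$3" > "$GISRC"
    export GISRC
}

# Usage (hypothetical paths, not executed here):
#   use_grass_version /usr/lib/grass74
#   use_grass_mapset "$HOME/grassdata" mylocation "$TMPMAPSET_NAME"
```

Each job should remove its own rc file ($GISRC) when it finishes.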
>
>
> It mostly works for me with --exec. Mostly. That is, there are missing
> or empty WIND files, here and there, and .gislock related issues.

I found a confusing message about a missing WIND file in the startup
script, fixed in trunk and relbr74 with r72277,8

.gislock issues can only appear if 1) you attempt to run several instances
in the same mapset, or 2) there is an old .gislock file left over from a
previous run that failed.
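
To look for case 2), something like the following could list the .gislock files present in the mapsets of one location. The function name is made up for illustration, and the sketch only lists files: whether a lock is really stale (no GRASS session still running in that mapset) is something you have to judge before removing it.

```shell
#!/bin/sh
# List .gislock files found in the mapsets of one location.
# $1 is the full path to the location directory.
list_gislocks() {
    for lock in "$1"/*/.gislock; do
        if [ -f "$lock" ]; then
            printf '%s\n' "$lock"
        fi
    done
}

# Usage (hypothetical path, not executed here):
#   list_gislocks "$HOME/grassdata/mylocation"
```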

g.mapset mapset=$TARGET_MAPSET
might avoid the .gislock issue, but it should only be used for copying
results with e.g.
g.copy rast=outputmap@$TMPMAPSET,outputmap

>>
>> Let's expand this Wiki section a bit with our findings (I'll try to
>> find my notes):
>>
>> https://grasswiki.osgeo.org/wiki/Parallel_GRASS_jobs#Cluster_and_Grid_computing
>>
>> markusN

It's on my TODO list...

Markus M