[GRASS-user] Multiple `r.proj` requests on the same raster map(s)

Nikos Alexandris nik at nikosalexandris.net
Sat Feb 24 16:10:00 PST 2018


* Markus Metz <markus.metz.giswork at gmail.com> [2018-02-24 22:31:41 +0100]:

>On Sat, Feb 24, 2018 at 10:06 PM, Nikos Alexandris <nik at nikosalexandris.net>
>wrote:
>>
>> * Markus Metz <markus.metz.giswork at gmail.com> [2018-02-24 21:39:40 +0100]:
>>
>>> On Sat, Feb 24, 2018 at 9:25 PM, Nikos Alexandris <nik at nikosalexandris.net>
>>> wrote:
>>>>
>>>>
>>>> Dear community,
>>>>
>>>> I am asking for help to "debug" a situation.
>>>>
>>>> One ETRS89-based location, one Mapset with hundreds of land cover raster
>>>> map tiles.
>>>>
>>>> Then, hundreds of UTM Zones-based Locations, subset of the WRS2 grid.
>>>>
>>>> Multiple `r.proj` processes are running in parallel. However, only one
>>>> at a time runs inside a given Mapset of the respective UTM-Zone-based
>>>> Location, requesting one land cover raster map tile from the
>>>> ETRS89-based Location.
>>>>
>>>> And each `r.proj` runs isolated inside an independent docker container.
>>>>
>>>> To the question. Some tiles are cross-cut by more than one UTM-Zone.
>>>> Hence, it happens that many UTM-Zones will/might request to read the
>>>> same land cover raster map tile at the same time.
>>>>
>>>> That is one write per target Mapset/Location, yet highly probable
>>>> concurrent read requests to the same raster map(s) in the source
>>>> Mapset/Location.
>>>>
>>>> Is this bad?
>>>
>>>
>>> Are you experiencing problems?
>>
>>
>> Yes.
>>
>> (It's not easy for me to have access to the logs, as I
>> don't directly have access to the scheduler. I got a copy though and I
>> am reading through.)
>>
>> Looking at the job logs, I see lots of ".gislock" lines.
>> It might be a permission-related issue. I worked directly
>> (with my own user-id) on many of the Locations.
>>
>> The operator of the scheduler naturally has another user-id. I wonder
>> if I should apply GRASS_SKIP_MAPSET_OWNER_CHECK=1 everywhere.
>
>No, you need to run each process in a unique temporary mapset. Once you
>have the final result, change the current mapset with g.mapset to the
>common mapset where final results should be stored and copy the final
>result from the temporary mapset to the current mapset (the mapset to hold
>the final results).

That's smart! Thanks for this valuable tip.
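
A minimal sketch of that workflow, assuming it runs inside an already
started GRASS session in the target (UTM) Location. The Location, Mapset
and raster names are hypothetical placeholders, and the `run` helper with
its `DRY_RUN` switch is only an illustration device so the command
sequence can be inspected without executing it:

```shell
# Sketch only: reproject one tile in a unique temporary mapset, then copy
# the result into the common mapset. All names below are hypothetical.
set -eu

SRC_LOCATION="${SRC_LOCATION:-etrs89}"     # hypothetical source Location
SRC_MAPSET="${SRC_MAPSET:-landcover}"      # hypothetical source Mapset
TILE="${TILE:-landcover_tile_001}"         # hypothetical raster map name
FINAL_MAPSET="${FINAL_MAPSET:-results}"    # hypothetical common Mapset

# Unique per process: node name plus shell PID. The node name matters in
# containers, where several processes may share the same PID.
tmp_mapset_name() {
    printf 'tmp_%s_%s\n' "$(uname -n | tr -cd 'A-Za-z0-9')" "$$"
}

# With DRY_RUN=1 the command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reproject_tile() {
    tmp_mapset="$(tmp_mapset_name)"
    # Create the temporary mapset and switch to it.
    run g.mapset -c mapset="$tmp_mapset"
    # r.proj works on the current region of this mapset; set the region
    # beforehand as your workflow requires.
    run r.proj location="$SRC_LOCATION" mapset="$SRC_MAPSET" \
        input="$TILE" output="$TILE"
    # Switch to the common mapset and copy the final result into it.
    run g.mapset mapset="$FINAL_MAPSET"
    run g.copy raster="${TILE}@${tmp_mapset},${TILE}"
    # The temporary mapset directory can be removed afterwards.
}
```

Calling `DRY_RUN=1 reproject_tile` prints the four GRASS commands in
order; without `DRY_RUN` they are executed in the current session.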

>Alternatively/additionally, don't use the script grassXY to start a GRASS
>session, instead define the GRASS environment with custom scripts (one for
>the GRASS version to use, one for the database/location/mapset to use).
>This avoids race conditions on an HPC system. A unique temporary mapset for
>each process helps to avoid all sorts of concurrent access problems.

This is something that I learned the hard way. I have to update all of
my scripts, step by step.

I wanted to have fine control and log details of processes. So, I built
up custom functions over `grassXY $MAPSET --exec`.
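
For reference, the two environment scripts Markus suggests might look
roughly like the sketch below. It follows the commonly documented way of
using GRASS without the startup script: export GISBASE, extend PATH and
LD_LIBRARY_PATH, and point GISRC to a small file naming the database,
Location and Mapset. The function names are hypothetical, and the
per-process GISRC file created with `mktemp` is my assumption, not
something stated above:

```shell
# Sketch only: set up a GRASS session by hand instead of via grassXY.
# Function names are hypothetical; adjust paths to your installation.
set -eu

# Part 1: which GRASS version to use (argument: GRASS install directory).
grass_use_version() {
    GISBASE="$1"
    PATH="$GISBASE/bin:$GISBASE/scripts:$PATH"
    LD_LIBRARY_PATH="$GISBASE/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export GISBASE PATH LD_LIBRARY_PATH
}

# Part 2: which database/location/mapset to use.
# Writes a private GISRC file for this process, so parallel jobs do not
# share session state. The mapset itself must already exist.
grass_use_mapset() {
    GISRC="$(mktemp "${TMPDIR:-/tmp}/gisrc.XXXXXX")"
    printf 'GISDBASE: %s\nLOCATION_NAME: %s\nMAPSET: %s\n' \
        "$1" "$2" "$3" > "$GISRC"
    GIS_LOCK=$$
    export GISRC GIS_LOCK
}
```

Usage would be something like `grass_use_version /usr/lib/grass74`
followed by `grass_use_mapset /data/grassdb utm32n PERMANENT` (both
paths and names are made up), after which GRASS modules can be called
directly from the shell.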

Nikos

[rest deleted]