[GRASS-user] Grass save on cluster?

Rainer M Krug r.m.krug at gmail.com
Tue Mar 24 08:22:55 EDT 2009

On Mon, Mar 23, 2009 at 6:06 PM, Markus Neteler <neteler at osgeo.org> wrote:
> On Mon, Mar 23, 2009 at 4:51 PM, Rainer M Krug <r.m.krug at gmail.com> wrote:
>> On Sat, Mar 21, 2009 at 4:31 PM, Markus Neteler <neteler at osgeo.org> wrote:
>>> On Thu, Mar 19, 2009 at 2:14 PM, Rainer M Krug <r.m.krug at gmail.com> wrote:
>>>> Hi
>>>> how safe is GRASS 6.3.2 to use on a cluster? I am starting several
>>>> instances of GRASS on a single node; they all have their own GRASS
>>>> database (but they share a linked PERMANENT, and the names of the mapset
>>>> and location are the same). I am asking because my simulation crashes
>>>> from time to time when certain files cannot be found,
>>>> and these crashes are not reproducible when only one GRASS instance
>>>> runs.
>>> I cannot say for 6.2.3, but I have run > 100k jobs in parallel with 6.4.0x,
>>> also on shared nodes (using PBS and lately only Grid Engine).
>> Are these in different mapsets? I am asking because we might be using
>> GRASS as a backend for a web application - and if it can handle that
>> many parallel jobs, it should work.
> Yes.
> I create a tmp mapset for each job. Then I use a g.copy job with a
> check for race conditions to move the calculated results into the
> target mapset. Works well. I have 128 parallel jobs,
> so far nothing lost AFAIK (the competition arises when more than
> one job tries to write to the target mapset - hence I made it a loop
> that sleeps and tries again, up to 3 times - see wiki).

That really sounds and looks nice - thanks for this good news.
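
The sleep-and-retry loop Markus describes above could be sketched roughly as
follows. This is only an illustration, not the script from the wiki: the
helper name `retry`, the delay, and the map and mapset names in the comment
are assumptions, and it assumes the copy command exits non-zero when it
fails.

```shell
#!/bin/sh
# retry MAX DELAY CMD [ARGS...]
# Run CMD up to MAX times, sleeping DELAY seconds after each failed attempt
# except the last. Returns 0 on the first success, 1 if every attempt failed.
# (Hypothetical helper; not part of GRASS.)
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -lt "$max" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Hypothetical use at the end of a job script, run from the target mapset
# (map and mapset names are placeholders):
#   retry 3 10 g.copy rast=result@tmp_job_42,result
```

Keeping the retry logic in one small function means each job script only
has to wrap its final copy step, and the number of attempts and the delay
can be tuned in one place.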

>>> I never ran into files that could not be found.
>>> Which are these files?
>> I am running a simulation, and these are layers which should have been
>> created before.
> I assume that you just need to add a
> g.mapsets add=sourcemapset
> to each job script since new (tmp) mapsets see only PERMANENT.
> Note that we fixed a bug in g.mapsets for this in GRASS 6.4.
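
In a per-job script, the g.mapsets call suggested above might be wrapped
like this. It is a sketch only: the function name `add_source_mapset` and
the `DRY_RUN` switch are assumptions added for illustration, and the
`add=` form is taken as written in the message (GRASS 6 syntax).

```shell
#!/bin/sh
# add_source_mapset NAME
# Add mapset NAME to the current mapset's search path, so that maps
# created there become visible to this job.
# Set DRY_RUN to a non-empty value to print the command instead of
# running it (hypothetical switch, handy for checking job scripts
# outside a GRASS session).
add_source_mapset() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "g.mapsets add=$1"
    else
        g.mapsets add="$1"
    fi
}

# Hypothetical use near the top of each job script, after the tmp mapset
# has been created and entered:
#   add_source_mapset sourcemapset
```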

I am actually conducting the calculations in parallel in the same
mapset - so I don't think that this applies here. I am using R to
script GRASS, and in R I am doing (or rather trying to do) some
parallel processing (r.mapcalc and calculations in R). It is not
essential to do them in parallel, so I can use it as it is now.

>>>> In addition, how safe is it to execute several r.mapcalc and other map
>>>> operations in the same mapset in parallel (obviously with different
>>>> maps)?
>>> You can do that as long as the current region isn't touched. But it is
>>> not really recommended.
>> I am definitely not changing the region, so the error must be somewhere else.
> Which error? need to know more details...

The "files not found" error - but as I said, I will come back with more
detailed questions in a few weeks' time. I might restructure my
simulation model and then I will take this into consideration.

Cheers and thanks a lot,


>>>> Again, I experienced severe problems with that, so I
>>>> abandoned it (although that increases my simulation time considerably).
>>>> I would have expected it to work, but again, I had non-reproducible
>>>> (and inconsistent) crashes.
>>>> Can somebody comment on that or provide some pointers?
>>> Sure:
>>> http://grass.osgeo.org/wiki/Parallel_GRASS_jobs
>> Thanks - this looks very interesting - I'll look into it.
> cheers
> Markus

Rainer M. Krug, Centre of Excellence for Invasion Biology,
Stellenbosch University, South Africa

