[GRASS-user] Multicore Processing and Temporary File Cleanup

Markus Neteler neteler at osgeo.org
Wed Feb 13 11:00:32 EST 2008


Joseph,

I am currently using a PBS-based cluster to process MODIS satellite data.
Some answers below:

On Feb 13, 2008 2:43 PM, joechip90 <joechip90 at googlemail.com> wrote:
>
> Dear All,
>
> I have looked around on other postings and it appears that the majority (if
> not all) of the GRASS libraries are NOT thread safe.

Yes, unfortunately true.

> Unfortunately I have a
> very large processing job that would benefit from cluster processing.  I
> have written a script that can be run on multiple processors whilst being
> very careful not to allow different processes to try to modify the same data
> at any point.  The same raster file is not accessed by different processes
> at all in fact.

Yes, that's fine. Essentially there are at least two "poor man's"
parallelization approaches that do not require modifying the GRASS source
code (a rough submission sketch for the second one follows after the list):

- split the map into spatial chunks (possibly with some overlap so that the
  results join smoothly)
- time series: run the processing of each map on a different node.
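
As a rough illustration of the second approach, a PBS submission script
could look like the sketch below. This is only a sketch built on assumptions:
the location path, the maplist.txt file, the process_one_map.sh job script,
the GRASS_BATCH_JOB mechanism of recent GRASS 6 versions and the trick of
creating a mapset by copying DEFAULT_WIND from PERMANENT are all things to
adapt to your own setup.

  #!/bin/sh
  # submit one PBS job per map listed in maplist.txt; every job then
  # works in its own mapset named after the map
  LOCATION=/grassdata/modis_location    # hypothetical database/location path

  for MAP in `cat maplist.txt` ; do
      # pre-create the per-map mapset: a plain directory plus a region
      # file copied from PERMANENT is enough for GRASS to accept it
      mkdir -p $LOCATION/$MAP
      cp $LOCATION/PERMANENT/DEFAULT_WIND $LOCATION/$MAP/WIND

      # hand map name and location to the node via the job environment
      qsub -v MAP=$MAP,LOCATION=$LOCATION process_one_map.sh
  done

process_one_map.sh would then point GRASS_BATCH_JOB at the script containing
the actual processing commands and start GRASS non-interactively with
"grass63 -text $LOCATION/$MAP".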

> However, I also realise that alone might not solve all my problems.  In any
> one process some temporary files are created (by GRASS libraries) and then
> these are deleted on statup (cleaning temporary files...).  Now I was
> wondering what these temporary files were and if there might be a problem
> with one process creating temporary files that it needs whilst another
> process starts up GRASS and deletes them.  Is there any way to call GRASS in
> a way that doesn't delete the temporary files?

You could simply modify the start script and remove the call to "clean_temp".
BUT:
I am currently processing several thousand maps for the same region (a time
series). I process each map in the same location but in a different mapset
(simply using the map name as the mapset name). At the end of the processing I
call a second batch job which only contains a g.copy command to copy the
result into a common mapset. There is a small risk of a race condition here
if two nodes finish at the same time, but this can be trapped in a loop that
checks whether the target mapset is locked and, if needed, relaunches g.copy
until it succeeds (see the sketch below).
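
Such a retry loop could look roughly like the sketch below. Again only a
sketch: the paths, the name of the common "results" mapset, the do_gcopy.sh
helper and the assumption that the GRASS start script exits with a non-zero
status when the target mapset is already locked (it finds a .gislock file
there) all need to be adapted and verified for your installation.

  #!/bin/sh
  # second batch job: copy one node's result into the common mapset,
  # retrying while another node still holds that mapset
  LOCATION=/grassdata/modis_location    # hypothetical database/location path
  TARGET=results                        # hypothetical common mapset

  # the batch script contains only the g.copy call, e.g.
  #   g.copy rast=result@$MAP,result_$MAP
  export GRASS_BATCH_JOB=$HOME/do_gcopy.sh   # hypothetical helper script

  # GRASS refuses to start a second session in a locked mapset, so simply
  # retry the startup until it (and hence g.copy) succeeds
  until grass63 -text $LOCATION/$TARGET ; do
      sleep 10
  done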

> I appreciate that I'm trying to do something that GRASS doesn't really
> support but I was hoping that it might be possible to fiddle around and find
> a way.  Any help would be gratefully received.

To some extent GRASS supports what you need.
I have drafted a related wiki page at:
http://grass.gdf-hannover.de/wiki/Parallel_GRASS_jobs

Feel free to hack that page!

Good luck,
Markus

