[GRASS-user] Multicore Processing and Temporary File Cleanup
Markus Neteler
neteler at osgeo.org
Wed Feb 13 11:00:32 EST 2008
Joseph,
I am currently using a PBS-based cluster to process MODIS satellite data.
Some answers below:
On Feb 13, 2008 2:43 PM, joechip90 <joechip90 at googlemail.com> wrote:
>
> Dear All,
>
> I have looked through other postings and it appears that the majority (if
> not all) of the GRASS libraries are NOT thread-safe.
Yes, unfortunately true.
> Unfortunately I have a
> very large processing job that would benefit from cluster processing. I
> have written a script that can be run on multiple processors whilst being
> very careful not to allow different processes to try to modify the same data
> at any point. In fact, the same raster file is not accessed by different
> processes at all.
Yes, fine. Essentially there are at least two "poor man's" approaches to
parallelization without modifying the GRASS source code:
- split the map into spatial chunks (possibly with overlap to get smooth results)
- time series: process each map on a different node (see the sketch below).
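
For the second approach, a minimal sketch of the job submission could look
like this (the map list file, the worker script name and the PBS variable
passing are assumptions for illustration, not my actual scripts):

  # Submit one PBS job per map of the time series (hypothetical names).
  # Each job is expected to create its own mapset, named after the map,
  # and run the GRASS commands there.
  for MAP in `cat modis_maps.txt` ; do
      qsub -v MAPNAME=$MAP process_map.sh
  done
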
> However, I also realise that this alone might not solve all my problems. In
> any one process some temporary files are created (by the GRASS libraries) and
> these are then deleted on startup ("cleaning temporary files..."). Now I was
> wondering what these temporary files are and whether there might be a problem
> with one process creating temporary files that it needs whilst another
> process starts up GRASS and deletes them. Is there any way to call GRASS in
> a way that doesn't delete the temporary files?
You could just modify the start script and remove the call to "clean_temp".
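In GRASS 6 that call is typically made from the session init script; something
like the following should locate it (the Init.sh path is the usual GRASS 6
location, adjust to your installation):

  # Hypothetical check: show where clean_temp is invoked at session start
  grep -n clean_temp $GISBASE/etc/Init.sh
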
BUT:
I am currently processing several thousand maps for the same region (time
series). I process each map in the same location but in a different mapset
(simply using the map name as the mapset name). At the end of the processing I
call a second batch job which only contains g.copy to copy the result into a
common mapset. There is a low risk of a race condition here in case two
nodes finish at the same time, but this could even be trapped in a loop which
checks whether the target mapset is locked and, if needed, launches g.copy
again until it succeeds, as in the sketch below.
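
A minimal sketch of that retry loop, assuming a helper script copy_result.sh
which starts a GRASS session in the common target mapset and runs a single
g.copy there, exiting non-zero if that mapset is still locked by another node
(the mapset and script names are illustrative only):

  # copy_result.sh (hypothetical) runs inside the common "results" mapset:
  #   g.copy rast=lst_$MAPNAME@$MAPNAME,lst_$MAPNAME
  #
  # Wrapper called at the end of each per-map job: relaunch the copy job
  # until it succeeds, i.e. until no other node holds the lock on "results".
  until sh copy_result.sh ; do
      echo "target mapset is locked, retrying ..."
      sleep 10
  done
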
> I appreciate that I'm trying to do something that GRASS doesn't really
> support but I was hoping that it might be possible to fiddle around and find
> a way. Any help would be gratefully received.
To some extent GRASS supports what you need.
I have drafted a related wiki page at:
http://grass.gdf-hannover.de/wiki/Parallel_GRASS_jobs
Feel free to hack that page!
Good luck,
Markus