[STATSGRASS] propagating temporary files

Roger Bivand Roger.Bivand at nhh.no
Wed Jun 7 04:37:46 EDT 2006


Dear Rohan,

On Wed, 7 Jun 2006, rsadler at cyllene.uwa.edu.au wrote:

> Dear statsgrassians,
> 
> I seem to be getting propagating temporary files that belong to GRASS  
> in both the GRASS database mapset .tmp directory and R's /tmp  
> directory. For example,
> 
> # 1930 categories
> clump of test51 in test
> 
> 0.00 0.00 0.00 0.00
> 
> is reported in GRASS's .tmp directory
> 
> whereas whole rasters as ASCII files are reported in R's /tmp.
> 
> Neither is deleted once the command line code has executed, as should be  
> the case from my understanding.
> 

More details, please! In R, sessionInfo() and system("g.version") should do 
it. If I recall correctly, this is spgrass6 with GRASS 6, but the version 
number of spgrass6 may make a difference.
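
A rough sketch of gathering those details in one go follows. Note that 
version_report() is a hypothetical helper, not part of spgrass6, and it 
assumes g.version is only on the PATH when R is started inside a GRASS 
session, so it returns NA for anything it cannot find.

```r
# Hypothetical helper (not part of spgrass6): collect the version details.
version_report <- function() {
  # packageDescription() returns NA with a warning if the package is missing
  desc <- suppressWarnings(utils::packageDescription("spgrass6"))
  # g.version is normally only on the PATH inside a GRASS session
  grass_found <- nzchar(Sys.which("g.version"))
  list(
    r = R.version.string,
    spgrass6 = if (is.list(desc)) desc$Version else NA_character_,
    grass = if (grass_found) system("g.version", intern = TRUE) else NA_character_
  )
}

str(version_report())
```

Pasting the str() output together with sessionInfo() into a reply should be 
enough to go on.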

In current spgrass6, both readFLOAT6sp() and writeRast6sp() (don't mind 
the names) call R's unlink() on the temporary file they create, which is 
made with g.tempfile in the .tmp directory under the current mapset. This 
seems to work for me for both reading and writing: a file is created and 
unlinked under the mapset .tmp for each raster moved.

Current spgrass6 does not use R's tmp; it only uses the directory that 
g.tempfile uses.
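
To check whether files are being left behind on your system, a minimal 
sketch of the create-then-unlink pattern is below. It is not the spgrass6 
source: with_tmp_file() is a hypothetical helper, and tempfile() stands in 
for the path that g.tempfile would return, so that the example also runs 
outside GRASS.

```r
# Hypothetical helper, not spgrass6 code: run `fun` on a temporary file and
# remove the file afterwards, even if `fun` stops with an error.
with_tmp_file <- function(fun, tmpdir = tempdir()) {
  path <- tempfile(tmpdir = tmpdir)  # stand-in for the g.tempfile path
  on.exit(unlink(path), add = TRUE)
  fun(path)
}

# Count the files in a scratch directory before and after, to spot leftovers.
d <- file.path(tempdir(), "mapset_tmp_demo")
dir.create(d, showWarnings = FALSE)
before <- length(list.files(d))
n <- with_tmp_file(function(p) {
  writeLines("0.00 0.00 0.00 0.00", p)
  length(readLines(p))
}, tmpdir = d)
after <- length(list.files(d))
cat("lines read:", n, " files left behind:", after - before, "\n")
```

Running the same before/after count on the real mapset .tmp around a single 
raster transfer should show whether the files accumulate there.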

Hope this helps,

Roger


> Since I am running lots of simulations this quickly junks up available  
> disk space, and I believe is contributing to a slow down in performance.
> 
> I can disappear these files at run-time, but the strange thing is that  
> this didn't appear to be a problem a couple of weeks ago.
> 
> Any suggestions before I reinstall both programs and their packages?
> 
> Regards
> Rohan
> 
>   PhD Student
>   School of Plant Biology
>   School of Mathematics and Statistics
>   Bushfire Cooperative Research Centre
>   The University of Western Australia
> 
> BTW: Joel and Roger, cheers for the below. Moving the GRASS location  
> to the local hard drives made all the difference (duh!). A working  
> speed on the cluster of 20000 simulations an hour was very useful,  
> until this latest morsel arrived.
> 
> ######################################################################
> > Without knowing much about the R interface I would guess that it may  
> > be slow due to all machines trying to access data off of the same  
> > disk. Have you got any way to measure disk reads and the network  
> > bandwidth to the machine hosting the disk?
> 
> I agree - I think the original process was also disk-bound: 500 per hour
> on one machine looks very much like 30 times 20 per hour (600 in total).
> This would imply that 10 machines would do 50-60 an hour each. So
> spreading the compute load doesn't help, if this is the case.
> 
> Roger
> 
> >
> > -Joel
> >
> > On Friday 17 March 2006 2:39 pm, rsadler at cyllene.uwa.edu.au wrote:
> > > Dear List,
> > >
> > > I have implemented Monte Carlo inference for a random closed set model
> > > that mimics the different "phases" of vegetation patterning to be
> > > found in images of a semi-arid grassland in north west Australia. The
> > > procedure is implemented using the multiple sessions capability of
> > > grass60 on a computing cluster with shared disk space. The problem is
> > > that when running a single machine alone I can generate 500
> > > simulations an hour. However, when I run all 30 machines concurrently,
> > > the simulation rate drops dramatically to 20 simulations an hour for a
> > > single machine (all machines are the same).
> > >
> > > I am first contacting the statsgrass list because the procedure uses
> > > the grass/R interface for a number of separate tasks. What I don't
> > > know is whether the slowdown is a result of the grass/R interface or
> > > whether the slowdown occurs on the grass side of things where there
> > > is some shared file that all sessions use (like .grass.bashrc but not
> > > that). The program is run as an R batch file using --vanilla and
> > > --slave, with all output being written to separate text files. All
> > > sessions use different locations and therefore different mapsets.
> > >
> > > Please advise
> > >
> > > Regards
> > > Rohan Sadler
> > >
> 
> 
> 
> _______________________________________________
> statsgrass mailing list
> statsgrass at grass.itc.it
> http://grass.itc.it/mailman/listinfo/statsgrass
> 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no



