[GRASS-dev] multi-core GRASS [was: ramdisk as mapset?]

Hamish hamish_nospam at yahoo.com
Mon Jul 23 03:12:46 EDT 2007


> Hamish wrote:
> > * I have some scripts which are very heavy on raster map disk i/o.
> >    r.cost chugs heavily on the hard drive & the script can take days
> >    to loop through. I don't want to wear a hole in it if I don't
> >    have to.
..
> > I am trying to think of a way to get the raster ops to happen all in
> > RAM
..
> > 1) [Linux] create a 2GB ramdisk using ramfs. use g.mapset to switch
> > into it, do the heavy i/o.
..
Glynn:
> It's also an inefficient use of RAM. Bear in mind that the kernel will
> automatically cache files; if you use them frequently enough, they'll
> be in RAM anyhow, and creating a RAM disk reduces the amount of RAM
> that the kernel can use for caching.

True. I could do a (pseudo) `cat -r $MAPSET/*` to pull everything into
the page cache first, but that's no faster than just letting the kernel
cache the maps as they are first used.

The ramdisk's advantage is that it warms the cache ahead of time. Next
time I run a big r.cost loop I might experiment with this and see how
much of a difference it makes.
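
If I do go down the pre-warming road, something like this untested
sketch could replace the pseudo-cat. posix_fadvise() needs a kernel new
enough to honour it, so it may be moot on my 2.4 box; "warmcache" is
just a made-up name:

    /* warmcache.c -- ask the kernel to start pulling whole files into
     * the page cache.  minimal sketch; real use would walk $MAPSET. */
    #define _XOPEN_SOURCE 600
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        int i;

        for (i = 1; i < argc; i++) {
            int fd = open(argv[i], O_RDONLY);
            int err;

            if (fd < 0) {
                perror(argv[i]);
                continue;
            }
            /* offset 0, len 0 = "through to end of file" */
            err = posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
            if (err != 0)
                fprintf(stderr, "%s: posix_fadvise: %d\n", argv[i], err);
            close(fd);
        }
        return 0;
    }

e.g. `warmcache $MAPSET/cell/* $MAPSET/fcell/*` before kicking off the
loop.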


Hamish:
> > problem: how to set group ID and mode/umask for ramdrive without
> > having to do chown+chmod as root?

Jakub Kulczynski wrote:
> You can't. Well actually you could use sudo (preferably with nopasswd 
> option) to chown+chmod the dir.

OK. I now see the `mount` man page says that ramfs has no mount options
(this is a slightly older 2.4 kernel + GNU userland). I did try setting
them in fstab; they were ignored.
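
FWIW, tmpfs (unlike ramfs) does honour size=, uid=, gid= and mode=
mount options, so an fstab line along these lines (mount point and ids
are placeholders) should avoid the root chown+chmod dance:

    # tmpfs takes the options that ramfs ignores:
    tmpfs  /mnt/grass_ram  tmpfs  size=2g,uid=1000,gid=100,mode=0770  0  0

tmpfs is also swap-backed, so an idle ramdisk doesn't pin RAM the way
ramfs does.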


Glynn:
> For programs which perform random I/O (e.g. r.cost), consider
> replacing the use of the segment library (which is rather poorly
> implemented) with the segment code from r.proj.seg.

This is the heart of my wish really. The ramfs stuff is just a
workaround for that issue.
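
For the archives, my understanding of the r.proj.seg approach is a
user-space tile cache with direct tile indexing. A stripped-down sketch
(struct and function names are made up, not the GRASS API; error checks
and edge tiles omitted, and nothing is ever evicted):

    #include <stdio.h>
    #include <stdlib.h>

    #define TILE_ROWS 64
    #define TILE_COLS 64

    struct tile {
        double cell[TILE_ROWS][TILE_COLS];
    };

    struct cache {
        FILE *fp;              /* raster stored row-major on disk */
        int rows, cols;        /* assumed multiples of the tile size */
        int tiles_across;      /* cols / TILE_COLS */
        struct tile **tiles;   /* one slot per tile; NULL = not loaded */
    };

    /* load the tile containing (row, col) on first touch */
    static struct tile *get_tile(struct cache *c, int row, int col)
    {
        int trow = row / TILE_ROWS, tcol = col / TILE_COLS;
        int idx = trow * c->tiles_across + tcol;

        if (!c->tiles[idx]) {
            int i;

            c->tiles[idx] = calloc(1, sizeof(struct tile));
            for (i = 0; i < TILE_ROWS; i++) {
                long r = (long)trow * TILE_ROWS + i;

                fseek(c->fp,
                      (r * c->cols + (long)tcol * TILE_COLS)
                          * (long)sizeof(double), SEEK_SET);
                fread(c->tiles[idx]->cell[i], sizeof(double),
                      TILE_COLS, c->fp);
            }
        }
        return c->tiles[idx];
    }

    double cache_get(struct cache *c, int row, int col)
    {
        struct tile *t = get_tile(c, row, col);

        return t->cell[row % TILE_ROWS][col % TILE_COLS];
    }

The direct tiles[idx] lookup is the point; as I understand Glynn's
complaint, the current segment lib spends its time searching for the
right segment instead.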


Dylan wrote:
> I am about to purchase a cluster of Mac Pros for filtering and
> rendering sonar data and I have been curious what has been done to
> parallelize GRASS by enterprising people.

Q: is the GRASS segmentation process inherently thread-friendly?
(I mean theoretically, not as written)

i.e. if the segment library were rewritten properly, could it use some
sort of n_proc value, set at compile time (or, better, a GIS variable
set with g.gisenv, or even a shell environment variable), to decide how
many simultaneous processes to run?
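
Something small like this would do for the lookup side (GRASS_NPROCS is
a name I just made up; _SC_NPROCESSORS_ONLN is a common but non-POSIX
sysconf extension):

    #include <stdlib.h>
    #include <unistd.h>

    /* how many workers to run: a (hypothetical) environment variable
     * wins, otherwise fall back to the number of online CPUs */
    static int get_nprocs(void)
    {
        const char *s = getenv("GRASS_NPROCS");   /* made-up name */

        if (s && atoi(s) > 0)
            return atoi(s);
    #ifdef _SC_NPROCESSORS_ONLN
        {
            long n = sysconf(_SC_NPROCESSORS_ONLN);

            if (n > 0)
                return (int)n;
        }
    #endif
        return 1;    /* safe serial default */
    }

A g.gisenv variable could override the shell variable the same way.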

Given our manpower, the best way I see to get GRASS more multi-core and
multi-processor ready is a piecemeal approach, starting with the low-
hanging fruit / biggest bang-for-buck libraries. Simultaneously we can
gradually remove as many global variables as possible from the libs.
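
The de-globalising pattern is at least mechanical (names invented):

    /* before: file-scope state, so two threads trample each other */
    static int current_row;

    /* after: each caller (and hence each thread) owns its own state */
    struct rast_state {
        int current_row;
    };

    void rast_next_row(struct rast_state *st)
    {
        st->current_row++;
    }

Tedious across the whole of libgis, but each function converted is one
that stops blocking the rest.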


I wonder if Thiery has any thoughts here, as he is probably in a better
position to fundamentally & quickly rework the architecture than we are
(i.e. less baggage to worry about). I think it is very safe to say that
for the next decade or so multi-core scaling is going to be the future
of number crunching. Eventually new paradigms and languages will arrive,
but for now we have to fight to make our serial code thread-safe....


some sort of plan of action, in order of priority:
1) [if plausible] Make the segment lib multi-proc'able. If it's currently
   crappy, then all the more reason to start rewrites here.
2) Work on quad-tree stuff (v.surf.*, r.terraflow) individually  (???)
3) Create new row-by-row libgis fns and start migrating the r.* modules,
   one by one. (what would the module logic look like instead of two for
   loops? see the sketch after this list)
4) I don't know, but I suspect that MPIing vector ops will be much, much
   harder.
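
On point 3, one possible shape, with invented names (G_process_rows is
not an existing libgis call): the module hands the library a per-row
kernel and the library owns the loop, so it is free to farm rows out
later.

    /* per-row kernel type: in/out buffers for one row */
    typedef void (*row_fn)(const double *in, double *out,
                           int ncols, void *ctx);

    /* library side owns the loop.  serial today; tomorrow it can hand
     * disjoint row blocks to n_proc workers without touching modules */
    static void G_process_rows(int nrows, int ncols,
                               double **in, double **out,
                               row_fn fn, void *ctx)
    {
        int row;

        for (row = 0; row < nrows; row++)
            fn(in[row], out[row], ncols, ctx);
    }

    /* module side shrinks to just the kernel, no for loops over rows */
    static void double_it(const double *in, double *out,
                          int ncols, void *ctx)
    {
        int col;

        for (col = 0; col < ncols; col++)
            out[col] = 2.0 * in[col];
    }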


After the segment lib & one-offs, the next big multi-proc task I see is
the row-by-row raster ops. This of course means replacing
G_{get|put}_*_row() in the raster modules with a more abstract method,
then, in some new libgis fn, splitting the map up into n_proc parts and
applying the operation to each part (a sketch of that dispatch follows
below). Worry about multi-row operations like r.neighbors later?
This is getting close to writing r.mapcalc as a lib fn. (!)
I wonder if the Python-C SWIG interface helps with prototyping?

Then slowly move as many of the ~150 raster modules to the new MPI-aware
lib fns as are suited for it, one by one. Again I think the low-hanging
fruit will be obvious: the most important modules (r.mapcalc, r.cost)
will be taken care of first, and the lesser-used raster modules can
follow on a needs basis by contributors (as long as we offer a clean
API).
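
For the splitting step above, a pthreads sketch of the dispatch (same
invented row_fn type as before; an MPI or fork() version would have the
same shape):

    #include <pthread.h>

    typedef void (*row_fn)(const double *in, double *out,
                           int ncols, void *ctx);

    struct job {
        int row0, row1;        /* half-open row range [row0, row1) */
        int ncols;
        double **in, **out;
        row_fn fn;
        void *ctx;
    };

    static void *worker(void *arg)
    {
        struct job *j = arg;
        int row;

        for (row = j->row0; row < j->row1; row++)
            j->fn(j->in[row], j->out[row], j->ncols, j->ctx);
        return NULL;
    }

    /* split [0, nrows) into n_proc contiguous blocks, one per thread */
    static void process_rows_mt(int nrows, int ncols,
                                double **in, double **out,
                                row_fn fn, void *ctx, int n_proc)
    {
        pthread_t tid[16];     /* sketch assumes n_proc <= 16 */
        struct job job[16];
        int i, per = (nrows + n_proc - 1) / n_proc;

        for (i = 0; i < n_proc; i++) {
            job[i].row0 = i * per;
            job[i].row1 = (i + 1) * per < nrows ? (i + 1) * per : nrows;
            job[i].ncols = ncols;
            job[i].in = in;
            job[i].out = out;
            job[i].fn = fn;
            job[i].ctx = ctx;
            pthread_create(&tid[i], NULL, worker, &job[i]);
        }
        for (i = 0; i < n_proc; i++)
            pthread_join(tid[i], NULL);
    }

That works for per-cell ops; the multi-row neighborhood modules would
need overlapping blocks, which is why I'd leave them for later.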


I've read that the "n" in `make -j n` should be n_procs + 1. Is that
just true for quick little jobs, where you always want one job waiting
at the door because there's a lot of overhead in creating & destroying
processes?



thoughts?
Hamish



