[GRASS-dev] ramdisk as mapset?
Glynn Clements
glynn at gclements.plus.com
Thu Jul 19 12:33:44 EDT 2007
Hamish wrote:
> * I have some scripts which are very heavy on raster map disk i/o.
> r.cost chugs heavily on the hard drive & the script can take days
> to loop through. I don't want to wear a hole in it if I don't have to.
> * I have many GB of RAM to play with (enough to hold the region as DCELL)
> * The raster modules typically don't use much ram at all. (low overheads
> to compete with for RAM)
>
> I am trying to think of a way to get the raster ops to happen all in RAM
> to save time & wear on the hard drive. (script spans a number of r.*
> modules)
>
> ideas so far:
>
> 1) [Linux] create a 2GB ramdisk using ramfs. use g.mapset to switch into
> it, do the heavy i/o. switch back to the original mapset, g.copy the
> results map back to the "real" mapset, then destroy the ramdisk.
> advantages: easy to do.
> disadvantages: it's more of a local hack than a general solution.
It's also an inefficient use of RAM. Bear in mind that the kernel will
automatically cache files; if you use them frequently enough, they'll
be in RAM anyhow, and creating a RAM disk reduces the amount of RAM
that the kernel can use for caching.
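
One way to see this on Linux is to check how much memory the kernel is currently using as page cache. The following is only a minimal sketch, assuming a Linux-style /proc/meminfo; the helper name cached_kb is made up for illustration.

```shell
#!/bin/sh
# Print the size of the kernel page cache in kB.
# Reads /proc/meminfo by default; another file in the same
# format can be passed as the first argument.
cached_kb() {
    # Match the "Cached:" field exactly, so "SwapCached:" is skipped.
    awk '$1 == "Cached:" { print $2 }' "${1:-/proc/meminfo}"
}

# Usage (hypothetical script name): compare before and after a run.
# before=$(cached_kb); ./heavy_script.sh; after=$(cached_kb)
# echo "page cache changed by $((after - before)) kB"
```

If the figure grows towards the size of the maps while the script runs, the files are already being served from RAM.
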
> mkdir /mnt/ramdrive
>
> # default max_size is 1/2 physical ram, auto-resizes 'til then
> mount -t ramfs none /mnt/ramdrive
> mkdir -p /mnt/ramdrive/tmp_mapset
> TMP_MAPSET="/mnt/ramdrive/tmp_mapset"
> ln -s "$TMP_MAPSET" "$HOME/grassdata/$LOCATION/tmp_mapset"
> cp "$HOME/grassdata/$LOCATION/$MAPSET/WIND" "$TMP_MAPSET"
> g.mapset mapset=tmp_mapset
> ...
> g.module in=map@$MAPSET out=result
> ...
> g.mapset mapset=$MAPSET
> g.copy result@tmp_mapset,result
> umount /mnt/ramdrive
>
> problem: how to set group ID and mode/umask for ramdrive without
> having to do chown+chmod as root?
Mounting filesystems inevitably requires the cooperation of root. You
can allow normal users to mount specific filesystems by adding them to
/etc/fstab with the "user" option; you can normally set the
permissions of the root directory there.
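
As a rough sketch of what such an entry might look like, the helper below only prints a candidate /etc/fstab line for root to review and add. It uses tmpfs rather than the ramfs from the script above, since tmpfs accepts size=, gid= and mode= as mount options. The function name, mount point and values are all placeholders.

```shell
#!/bin/sh
# Print a candidate /etc/fstab entry for a user-mountable,
# RAM-backed filesystem (tmpfs is an assumption, not ramfs).
# Arguments: mount point, size, numeric group ID, octal mode.
ramdisk_fstab_entry() {
    printf 'none %s tmpfs user,noauto,size=%s,gid=%s,mode=%s 0 0\n' \
        "$1" "$2" "$3" "$4"
}

# Example (placeholder values):
# ramdisk_fstab_entry /mnt/ramdrive 2g "$(id -g)" 0770
```

Once root has added the printed line, an ordinary user should be able to run "mount /mnt/ramdrive" and "umount /mnt/ramdrive" without further help.
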
> 2) Some backgrounded "grass_mapd" process to dynamically allocate and
> hold a single map in memory. It's a child of the main GRASS process so
> exiting GRASS tears it down. It could be a "virtual" map sort of like
> how a reclass map is just a wrapper for something else. This is just a
> very rough idea, probably not so easy to do; but if possible I reckon it
> would be a cool tool to have.
For programs which perform sequential I/O, you won't improve much on
the kernel's built-in caching. If you have enough RAM, it will get
used.
A large proportion of the overhead is in the processing which occurs
between read() -> G_get_*_row() and G_put_*_row() -> write(), rather
than in the "actual" I/O (i.e. read() and write()). Creating
uncompressed maps would eliminate some of this; a better
implementation of nulls would also help.
For programs which perform random I/O (e.g. r.cost), consider
replacing the use of the segment library (which is rather poorly
implemented) with the segment code from r.proj.seg.
--
Glynn Clements <glynn at gclements.plus.com>