[GRASS-dev] Re: big region r.watershed

Mon Oct 13 03:12:54 EDT 2008


Hamish wrote:
> Markus Metz wrote:
>
>> The original version uses very little memory, so assuming that GRASS
>> runs today on systems where at least 500MB RAM are available, I changed
>> the parameters for the seg mode so that more data are kept in memory,
>> which speeds it up. Looking at other modules using the segment library
>> (e.g. v.surf.contour, r.cost), it seems that there is no one universally
>> used setting; instead the segment parameters are tuned to each module.
>> The new settings work for me, but not necessarily for others, and maybe
>> using 500MB is a bit much.
>>
>
> fwiw r.terraflow has a memory= option, the default is 300mb.
> AFAIU, the bigger you make that, the smaller the on-disk temp files need
> to be (ie work-around to keep tmp files <2gb for 32bit filesystems). 
>
> a number of modules like r.in.poly have a rows= option, which I didn't
> really understand until I got into the code. (hold at most that many
> region rows (all columns) in memory at once). Interestingly the default
> value has scaled quite well over the years.
>
> and other modules like r.in.xyz have percent= (0-100) for how much of the
> map to keep in memory at once.
>
A default value that scales well over the years would be preferable, but
performance of r.watershed.fast -m is really poor if whole columns (or
rows) are kept in memory, and much better if segments have equal
dimensions. Interestingly, segments of 200 rows and 200 columns are
processed fastest, faster than e.g. 150 rows and columns or 250 rows and
columns. The more segments are kept in memory, the better.
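
One way to see why square segments might beat whole rows or columns is to
count how often a 3x3 neighbourhood lookup has to leave the current segment.
The sketch below is only an illustration, not code from r.watershed or the
segment library: the function names, the 4-byte cell size and the use of
500MB as the budget are assumptions, and map edges are ignored.

```python
def boundary_fraction(seg_rows, seg_cols):
    """Fraction of cells in one segment whose 3x3 neighbourhood
    reaches into an adjacent segment (map edges are ignored)."""
    # Cells at least one cell away from every segment edge never leave it.
    interior = max(seg_rows - 2, 0) * max(seg_cols - 2, 0)
    return 1.0 - interior / (seg_rows * seg_cols)


def segments_in_budget(budget_mb, seg_rows, seg_cols, bytes_per_cell):
    """Number of whole segments that fit into a memory budget given in MB.
    bytes_per_cell is an assumed value, not taken from r.watershed."""
    seg_bytes = seg_rows * seg_cols * bytes_per_cell
    return (budget_mb * 1024 * 1024) // seg_bytes


if __name__ == "__main__":
    # Same number of cells per segment (40000), different shapes.
    for rows, cols in [(200, 200), (4, 10000), (1, 40000)]:
        print(rows, cols, round(boundary_fraction(rows, cols), 4))
    print(segments_in_budget(500, 200, 200, 4))
```

Under these assumptions about 2% of the cells in a 200x200 segment have
neighbours in another segment, while for a one-row strip of the same size
every cell does. This does not explain why 200 beats 150 or 250; that is
an empirical observation.
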
Right now I don't want to introduce a new option to give the user
control over how much memory is used (be it MB of memory, number of rows or
percent of the map) because I want to keep all options of
r.watershed.fast identical to the original version. I'm still not happy
with the speed of the segmented version of r.watershed.fast, but at
least it is orders of magnitude faster than the in-memory version of the
original r.watershed. Maybe the iostream library that came with r.terraflow
can be used for r.watershed -m as well.
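
For comparison, the three styles of memory control mentioned in this thread
(MB of memory, number of rows, percent of the map) can all be expressed as
a number of cells held in memory. The following is a rough sketch only; the
helper names and the 4-byte cell size are assumptions and do not describe
how r.terraflow, r.in.poly or r.in.xyz actually implement their options.

```python
def cells_from_memory(memory_mb, bytes_per_cell):
    """Cells that fit into a budget given in MB (memory= style)."""
    return (memory_mb * 1024 * 1024) // bytes_per_cell


def cells_from_rows(rows, ncols):
    """Cells held when that many full region rows are kept (rows= style)."""
    return rows * ncols


def cells_from_percent(percent, nrows, ncols):
    """Cells held when a percentage of the map is kept (percent= style)."""
    return (nrows * ncols * percent) // 100


if __name__ == "__main__":
    # Hypothetical region of 20000 x 20000 cells, 4 bytes per cell.
    nrows, ncols = 20000, 20000
    cells = cells_from_memory(300, 4)
    print(cells)                         # cells that fit into 300 MB
    print(cells // ncols)                # equivalent number of full rows
    print(100 * cells / (nrows * ncols)) # equivalent percent of the map
```

A fixed MB value covers a shrinking share of the map as regions grow, while
a percent value grows with the map, which is one reason a single default is
hard to pick.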

Markus