[GRASS-user] r.watershed speed-up
markus_metz at gmx.de
Fri Aug 1 04:23:49 EDT 2008
There is now a new version of r.watershed.fast whose results are even
more similar to those of r.watershed. They are still not 100% identical,
and I can't get them any closer, but the new version comes with a
further speed increase. I repeated Moritz's test with the same
commands on GRASS 6.4.svn; results are below.
Moritz Lennert wrote:
> First test in North Carolina demo data set:
> g.region rast=elevation
> time r.watershed elevation=elevation@PERMANENT accumulation=old_accum
> drainage=old_dir basin=old_sheds stream=old_streams thresh=500
> real 19m2.744s
> user 18m41.318s
> sys 0m1.884s
> time r.watershed.fast elevation=elevation@PERMANENT
> accumulation=fast_accum drainage=fast_dir basin=fast_sheds
> stream=fast_streams thresh=500
> real 0m18.034s
> user 0m17.833s
> sys 0m0.196s
Absolute times are not really comparable between systems, but relative
differences in time should be similar. The following numbers are
calculated with real time. In the test Moritz did, r.watershed took 63x
as long as r.watershed.fast, i.e. r.watershed.fast needed only 1.6% of
the time of r.watershed.
New version: r.watershed took 127x as long as r.watershed.fast, i.e.
r.watershed.fast needed only 0.8% of the time of r.watershed.
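For anyone repeating the comparison, the ratio can be recomputed from the "real" lines printed by time. The following is only a minimal sketch: the helper names to_seconds and speedup are made up for illustration, and it assumes durations in the XmY.YYYs form shown in Moritz's output (no hours field).

```shell
# Convert a `time`-style duration such as 19m2.744s to seconds.
# Assumes the XmY.YYYs form only; an hours field is not handled.
to_seconds() {
    printf '%s\n' "$1" | LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times longer the old run took, and the share of the
# old run time that the new run needed.
speedup() {
    LC_ALL=C awk -v old="$(to_seconds "$1")" -v new="$(to_seconds "$2")" \
        'BEGIN { printf "%.0fx, %.1f%%\n", old / new, 100 * new / old }'
}

# Real times from Moritz's test quoted above.
speedup 19m2.744s 0m18.034s
```

With Moritz's numbers this reproduces the 63x and 1.6% quoted above.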
> Of the 2025000 cells in the map, 1991218 show the same direction, i.e.
> 98%. Those which have different directions are overwhelmingly low
> slope cells.
New version: 2004480 cells, i.e. 99% of all cells show the same flow
direction.
> 1833907 cells have the same accumulation value, i.e. 90%, but I guess
> this is to be expected.
New version: 1921510 cells, i.e. 95% of all cells show the same
accumulation value.
The idea is that a faster r.watershed can also be used for massive
grids, where GRASS users frequently gave up using r.watershed because it
would have taken hours or even days. I resampled "elevation" in the
North Carolina demo data set from 10m to 3m with r.resamp.rst using
default values (after the GRASS book Section 5.3.3, paragraph
"Regularized spline with tension (RST) interpolation") to generate a
fairly large map and ran the same test on the resampled map.
cells in region : 22,500,000
r.watershed took about 540x as long as r.watershed.fast, i.e.
r.watershed.fast needed only about 0.2% of the time of r.watershed (here
10h2m55s vs. 1m7s, 10 hours versus 1 minute...).
Flow direction differences:
22288539 cells, i.e. 99% of all cells show the same flow direction.
Flow accumulation differences:
20963653 cells, i.e. 93% of all cells show the same accumulation value.
Memory usage of r.watershed and r.watershed.fast: maximum of about 940MB.
I don't understand why memory usage increases after <SECTION 1a:
Initiating Memory> is completed.
Assuming that there is no longer a time constraint but only a memory
constraint (although <SECTION 4: Watershed Determination> can take some
time on large maps with a large threshold value), the upper region sizes
that r.watershed.fast can process in RAM would be *roughly* as follows,
after putting 400MB aside for the system and other open applications
(estimate based on 64-bit Linux):
1GB RAM: 14,000,000 cells
2GB RAM: 38,000,000 cells
4GB RAM: 86,000,000 cells
8GB RAM: 181,000,000 cells
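The figures above work out to roughly 45 bytes per cell once the 400MB reserve is subtracted. That per-cell value is back-calculated from the list, not taken from the r.watershed.fast source, so treat the sketch below as a rough rule of thumb only; the function name max_cells_millions is hypothetical.

```shell
# Rough upper bound on region size (in millions of cells) for a given
# amount of RAM in MB. Both constants are assumptions: 400 MB is the
# reserve mentioned in the text, and 45 bytes per cell is back-calculated
# from the listed estimates, not read from the module's source code.
max_cells_millions() {
    ram_mb=$1
    reserved_mb=400
    bytes_per_cell=45
    echo $(( (ram_mb - reserved_mb) * 1048576 / bytes_per_cell / 1000000 ))
}

for ram in 1024 2048 4096 8192; do
    echo "${ram}MB RAM: ~$(max_cells_millions "$ram") million cells"
done
```

This reproduces the 14, 38, 86 and 181 million cells listed above; other RAM sizes can be plugged in the same way.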
If you want to repeat and analyse the tests with the North Carolina demo
data set, the new r.watershed.fast is here
and the test script is below.
# original 10m elevation map: old vs. fast module
time r.watershed elevation=elevation@PERMANENT accumulation=nc_accum_old \
    drainage=nc_dir_old basin=nc_sheds_old stream=nc_streams_old thresh=500
time r.watershed.fast elevation=elevation@PERMANENT \
    accumulation=nc_accum_fast drainage=nc_dir_fast basin=nc_sheds_fast \
    stream=nc_streams_fast thresh=500
# difference maps: 1 where the two results differ, 0 where they agree
r.mapcalc nc_dir_dif='if(("nc_dir_old" - "nc_dir_fast" != 0),1,0)'
r.mapcalc nc_accum_dif='if(("nc_accum_old" - "nc_accum_fast" != 0),1,0)'
r.stats -c input=nc_dir_dif@PERMANENT
r.stats -c input=nc_accum_dif@PERMANENT
# resample to 3m and repeat the test on the larger map
r.resamp.rst input=elevation@PERMANENT ew_res=3 ns_res=3 \
    elev=elevation_rst overlap=3 zmult=1.0 tension=40.
time r.watershed elevation=elevation_rst@PERMANENT \
    accumulation=nc_rst_accum_old drainage=nc_rst_dir_old \
    basin=nc_rst_sheds_old stream=nc_rst_streams_old thresh=500
time r.watershed.fast elevation=elevation_rst@PERMANENT \
    accumulation=nc_rst_accum_fast drainage=nc_rst_dir_fast \
    basin=nc_rst_sheds_fast stream=nc_rst_streams_fast thresh=500
r.mapcalc nc_rst_dir_dif='if(("nc_rst_dir_old" - "nc_rst_dir_fast" != 0),1,0)'
r.mapcalc nc_rst_accum_dif='if(("nc_rst_accum_old" - "nc_rst_accum_fast" != 0),1,0)'
r.stats -c input=nc_rst_dir_dif@PERMANENT
r.stats -c input=nc_rst_accum_dif@PERMANENT
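The percentages of identical cells quoted earlier can be derived from the r.stats -c counts of the difference maps. The sketch below is an illustration only: percent_same is a made-up helper, and it assumes r.stats -c prints one "category count" pair per line, with "*" marking NULL cells. The sample counts are Moritz's flow direction figures (1991218 identical cells out of 2025000).

```shell
# Read "category count" lines on stdin and print the percentage of
# cells in category 0, i.e. cells that are identical in both maps.
# Lines whose category is "*" (assumed to mark NULL cells) are ignored.
percent_same() {
    LC_ALL=C awk '$1 == "0" { same = $2 }
                  $1 != "*" { total += $2 }
                  END { if (total > 0) printf "%.1f\n", 100 * same / total }'
}

# Sample input built from the counts quoted in the text; in a GRASS
# session this would instead be: r.stats -c input=nc_dir_dif | percent_same
printf '0 1991218\n1 33782\n' | percent_same
```

For the sample counts this prints 98.3, matching the 98% reported for flow direction in the first test.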