[GRASS-user] r.stream.extract error

Markus Metz markus.metz.giswork at gmail.com
Wed Nov 8 10:52:29 PST 2017


On Wed, Nov 8, 2017 at 5:29 PM, Markus Metz <markus.metz.giswork at gmail.com>
wrote:
>
>
>
> On Wed, Nov 8, 2017 at 3:47 PM, Giuseppe Amatulli <
giuseppe.amatulli at gmail.com> wrote:
> >
> > Hi Markus,
> > I have compiled GRASS 7.2.3svn
> >
> > r.stream.extract elevation=elv accumulation=upa threshold=0.5 depression=dep direction=dir stream_raster=stream memory=45000 --o --verbose
> >
> > with a small area
> >
> > expr 600 \*  310  = 186 000 and everything works fine.
> >
> > If I enlarge the area
> > expr  72870 \* 80040  = 5 832 514 800
> >
> > I get the following warning
> >
> > 16.67% of data are kept in memory
> > Will need up to 293.52 GB (300563 MB) of disk space
> > Creating temporary files...
> > Loading input raster maps...
> > 0..3..6..9..12..15..18..21..24..27..30..33..36..39..42..45..48..51..54..57..60..63..66..69..72..75..78..81..84..87..90..93..96..99..100
> > Initializing A* search...
> > 0..WARNING: Segment pagein: read EOF
> > WARNING: segment lib: put: pagein failed
> > WARNING: Unable to write segment file
> > WARNING: Segment pagein: read EOF
>
> There was a small bug in the segment library, fixed in trunk r71648. You
will need to update your local copy of GRASS 7.3.

Now also fixed for GRASS 7.2 in r71649.

Markus M

>
> Markus M
>
>
> >
> > and then the stream is not created.
> > Is this due to how I compiled GRASS 7.2.3svn, or to something else?
> >
> > Concerning the threshold: I'm using an area-based flow accumulation expressed in km2, so each pixel holds the summed upstream area in km2. Setting the threshold to 0.5 therefore means that a stream starts where the cells below have a value > 0.5. I have checked and it looks correct to me; in fact my smallest upstream basin has 7 cells of 90*90 (~ 1/2 km2).
> >
> > Thank you
> > Giuseppe
> >
> > On 1 November 2017 at 17:52, Markus Metz <markus.metz.giswork at gmail.com>
wrote:
> >>
> >>
> >>
> >> On Wed, Nov 1, 2017 at 10:41 PM, Giuseppe Amatulli <
giuseppe.amatulli at gmail.com> wrote:
> >> >
> >> > Thanks again!!
> >> >
> >> > I'm working with an area-based flow accumulation, so the 0.5 threshold means 0.5 km2, which is 90 m * 90 m * 60 cells.
> >>
> >> I forgot to mention that the unit of the threshold option is cells,
not squared map units. That means you need to change the threshold value.
> >>
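A minimal sketch of the unit conversion implied above, assuming square 90 m cells (the resolution quoted in the thread) and an accumulation map that counts cells rather than area. The helper name `area_threshold_to_cells` is hypothetical, and rounding up is an illustrative choice, not something the thread prescribes.

```python
import math

def area_threshold_to_cells(area_km2: float, cell_size_m: float) -> int:
    """Smallest whole number of square cells covering at least area_km2."""
    cell_area_m2 = cell_size_m * cell_size_m
    # 1 km2 = 1,000,000 m2
    return math.ceil(area_km2 * 1_000_000 / cell_area_m2)

# 0.5 km2 at 90 m resolution (values taken from the thread)
cells = area_threshold_to_cells(0.5, 90)
print(cells)
```

The result would then be passed as the threshold value instead of 0.5, but only if the accumulation input is expressed in cells.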
> >> Markus M
> >>
> >>
> >> > My intention is to prune back the streams later on with a machine learning procedure. I will take care not to exceed the limit of 2,147,483,647 detected stream segments.
> >> >
> >> > To reduce I/O as much as possible, I save the *.tif files in /dev/shm on each node, read them with r.external, and build the location on the fly in each node's /tmp, so it is quite fast. I will try to increase the RAM a bit.
> >> >
> >> > I will post later how it is going.
> >> > Best
> >> > Giuseppe
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On 1 November 2017 at 17:12, Markus Metz <
markus.metz.giswork at gmail.com> wrote:
> >> >>
> >> >>
> >> >>
> >> >> On Wed, Nov 1, 2017 at 7:15 PM, Giuseppe Amatulli <
giuseppe.amatulli at gmail.com> wrote:
> >> >> >
> >> >> > Thanks Markus!!
> >> >> > I will test and I will let you know how it works.
> >> >>
> >> >> Your feedback is very helpful!
> >> >> >
> >> >> > I have a few more questions:
> >> >> > 1) What is now the upper limit on the number of cells that r.stream.extract can handle?
> >> >>
> >> >> About 1.15e+18 cells.
> >> >>
> >> >> Another limitation is the number of detected stream segments. This must not be larger than 2,147,483,647 streams, therefore you need to figure out a reasonable threshold with a smaller test region. A threshold of 0.5 is definitely too small, no matter how large or small the input is. The threshold should typically be larger than 1000, but is somewhat dependent on the resolution of the input. As a rule of thumb, with a coarser resolution a smaller threshold might be suitable; with a higher resolution, the threshold should be larger. Testing different threshold values in a small subset of the full region can save a lot of time.
> >> >>
> >> >> > 2) Is the r.stream.basins add-on subject to the same limitation? If so, would it be possible to update r.stream.basins as well?
> >> >>
> >> >> The limitation in r.watershed and r.stream.extract comes from the
search for drainage directions and flow accumulation. The other r.stream.*
modules should support large input data, as long as the number of stream
segments does not exceed 2,147,483,647.
> >> >>
> >> >> > 3) Does r.stream.extract support multi-threading through OpenMP? Would it be difficult to implement?
> >> >>
> >> >> In your case, less than 13% of the temporary data are kept in memory. Parallelization with OpenMP or similar will not help here; your CPU will run at less than 20% with one thread anyway. The limit is disk I/O. You can make it faster by using more memory and/or a faster disk storage device.
> >> >>
> >> >> Markus M
> >> >> >
> >> >> > Best
> >> >> > Giuseppe
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > On 31 October 2017 at 15:54, Markus Metz <
markus.metz.giswork at gmail.com> wrote:
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> On Mon, Oct 30, 2017 at 1:42 PM, Giuseppe Amatulli <
giuseppe.amatulli at gmail.com> wrote:
> >> >> >> >
> >> >> >> > Hi,
> >> >> >> > I'm using the r.stream.extract grass command
> >> >> >> >
> >> >> >> > r.stream.extract elevation=elv accumulation=upa threshold=0.5 depression=dep direction=dir stream_raster=stream memory=35000 --o --verbose
> >> >> >> >
> >> >> >> > where elv is a raster of 142690 * 80490 = 11,485,118,100 cells
> >> >> >> >
> >> >> >> > and I get this error
> >> >> >> >
> >> >> >> > 12.97% of data are kept in memory
> >> >> >> > Will need up to 293.52 GB (300563 MB) of disk space
> >> >> >> > Creating temporary files...
> >> >> >> > Loading input raster maps...
> >> >> >> > 0..3..6..9..12..15..18..21..24..27..30..33..36..39..42..45..48..51..54..57..60..63..66..69..72..75..78..81..84..87..90..93..96..99..100
> >> >> >> > ERROR: Unable to load input raster map(s)
> >> >> >>
> >> >> >> This error is caused by integer overflow, because not all variables necessary to support such large maps were 64-bit integers.
> >> >> >>
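To illustrate the kind of overflow described above, here is a small sketch that checks the reported cell count against the signed 32-bit limit and shows what the value becomes if it is stored in a 32-bit signed integer. The thread does not say which variables in r.stream.extract were affected, so this is only a generic illustration; `wrap_int32` is a hypothetical helper that emulates two's-complement wraparound.

```python
INT32_MAX = 2**31 - 1  # 2,147,483,647

def wrap_int32(n: int) -> int:
    """Emulate storing n in a signed 32-bit integer (two's-complement wrap)."""
    return (n + 2**31) % 2**32 - 2**31

# Raster dimensions quoted in the thread; which one is rows and which
# is columns is not stated there, so the names are neutral.
dim_a, dim_b = 142690, 80490
cells = dim_a * dim_b

print(cells)                # total number of cells
print(cells > INT32_MAX)    # does the count exceed the 32-bit signed range?
print(wrap_int32(cells))    # value after 32-bit wraparound
```

A cell count or cell index that wraps to a wrong value like this can plausibly make a load step fail, which is why such quantities need 64-bit variables.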
> >> >> >> Fixed in trunk and relbr72 with r71620,1, and tested with a DEM with 172800 * 67200 = 11,612,160,000 cells: r.stream.extract finished successfully in 18 hours (not an HPC, a standard desktop machine with 32 GB of RAM and a 750 GB SSD).
> >> >> >> >
> >> >> >> > According to the help manual, memory=35000 should be set according to the overall memory available. I set the HPC upper memory limit to 40G.
> >> >> >> >
> >> >> >> > I tried several combinations of these parameters but I still get the same error.
> >> >> >> > If r.stream.extract is based on r.watershed, then the segmentation library should be able to handle a huge raster.
> >> >> >>
> >> >> >> r.stream.extract is based on a version of r.watershed that did not yet support such huge raster maps, therefore support for such huge raster maps needed to be added to r.stream.extract separately.
> >> >> >>
> >> >> >> >
> >> >> >> > Does anyone know how to overcome this limitation/error?
> >> >> >>
> >> >> >> Please use the latest GRASS 7.2 or GRASS 7.3 version from svn.
> >> >> >>
> >> >> >> Markus M
> >> >> >>
> >> >> >> >
> >> >> >> > Thank you
> >> >> >> > Best
> >> >> >> > --
> >> >> >> > Giuseppe Amatulli, Ph.D.
> >> >> >> >
> >> >> >> > Research scientist at
> >> >> >> > Yale School of Forestry & Environmental Studies
> >> >> >> > Yale Center for Research Computing
> >> >> >> > Center for Science and Social Science Information
> >> >> >> > New Haven, 06511
> >> >> >> > Teaching: http://spatial-ecology.org
> >> >> >> > Work:  https://environment.yale.edu/profile/giuseppe-amatulli/
> >> >> >> >
> >> >> >>
> >> >>
> >>
> >
> >
> >
> > --
> > Giuseppe Amatulli, Ph.D.
> >
> > Research scientist at
> > Yale School of Forestry & Environmental Studies
> > Yale Center for Research Computing
> > Center for Science and Social Science Information
> > New Haven, 06511
> > Teaching: http://spatial-ecology.net
> > Work:  https://environment.yale.edu/profile/giuseppe-amatulli/
>