[GRASS-dev] [GRASS GIS] #2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce

GRASS GIS trac at osgeo.org
Tue Jan 9 12:19:18 PST 2018


#2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce
---------------------+-------------------------
  Reporter:  dylan   |      Owner:  grass-dev@…
      Type:  defect  |     Status:  new
  Priority:  normal  |  Milestone:  7.2.3
 Component:  Raster  |    Version:  unspecified
Resolution:          |   Keywords:
       CPU:  x86-64  |   Platform:  Linux
---------------------+-------------------------

Comment (by dylan):

 Replying to [comment:30 mmetz]:
 > Replying to [comment:29 dylan]:
 > >
 > > [...] Note that I don't have any issues with any other GRASS commands,
 or (as far as I can tell) general usage on this machine. I only see these
 errors when working with GRASS commands that:
 > >
 > >   * take a long time to run: `r.sun` or `t.rast.mapcalc`
 ([http://osgeo-org.1560.x6.nabble.com/Error-reading-raster-data-for-row-xxx-only-when-using-r-series-and-t-rast-series-td5229569.html e.g. a couple of years ago])
 > >   * operate on moderately large, floating-point maps
 > >   * are done in parallel, either via GNU `parallel` or as implemented
 in the temporal suite of modules
 > >
 > > ...hence the extreme difficulty in recreating the errors or further
 debugging.
 >
 > Unfortunately, I was not able to recreate these errors with the provided
 test data and scripts.
 >
 > I still think this is some obscure disk IO error. You could try to use
 `nice`, e.g. `nice r.sun ...` and `nice r.mapcalc ...` in `daily-rad.sh`.
 At least this helps when running many GRASS modules in parallel on HPC
 systems where results are written out to one single storage device.

 Well, thank you very much for all of your patience, patches, and testing.
 I'll try the `nice` option. For now, I think that I can tolerate the much
 lower frequency of errors after switching to LZ4 compression. Perhaps the
 faster speed of LZ4 lowers the probability of concurrent write operations.

 It is still quite puzzling that this kind of error has come up on several
 different machines while tracking GRASS trunk over a 10-year period. Maybe
 this is a subtle hint that it is time to build a new workstation...

 I know this is a lot to ask, but did you try testing with ZLIB
 compression and running the test scripts multiple times? It took a couple
 of tiles before I noticed the error.
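
 A minimal sketch of such a repeated test, assuming it is run inside an
 active GRASS session and that the job exits non-zero when a module reports
 a read error. `repeat_run` is a hypothetical helper, not part of GRASS.
 `GRASS_COMPRESSOR` is the environment variable GRASS reads to choose the
 compression method for newly written raster maps, and `nice -n 10` lowers
 the CPU scheduling priority, as suggested in the quoted reply.

```shell
#!/bin/sh
# Sketch only: rerun a job several times at reduced priority and count failures.
# Assumption: new raster maps are written with ZLIB, the setting under which
# the errors were reported to be more frequent.
GRASS_COMPRESSOR=ZLIB
export GRASS_COMPRESSOR

# repeat_run N CMD [ARGS...]
# Runs CMD N times under `nice` and prints the number of runs that
# exited with a non-zero status. (Hypothetical helper.)
repeat_run() {
    n=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$n" ]; do
        # Send the job's stdout to stderr so that only the failure
        # count is printed on stdout.
        if ! nice -n 10 "$@" >&2; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Hypothetical usage with the script named in this ticket:
#   repeat_run 10 sh daily-rad.sh
```

 A count of zero only shows that no run exited with an error. A corrupt
 tile would still need to be confirmed by reading the output maps back.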

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2764#comment:31>
GRASS GIS <https://grass.osgeo.org>


