[GRASS-dev] [GRASS GIS] #2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce

GRASS GIS trac at osgeo.org
Sun Jan 7 13:21:56 PST 2018


#2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce
---------------------+-------------------------
  Reporter:  dylan   |      Owner:  grass-dev@…
      Type:  defect  |     Status:  new
  Priority:  normal  |  Milestone:  7.2.3
 Component:  Raster  |    Version:  unspecified
Resolution:          |   Keywords:
       CPU:  x86-64  |   Platform:  Linux
---------------------+-------------------------

Comment (by dylan):

 Replying to [comment:22 mmetz]:

 > If you are not using the test scripts attached to this ticket any more,
 > can you please also add data and commands that trigger this error to this
 > ticket? Thanks!

 Good point, I should probably have opened another ticket.

 Several folks have posted a number of messages to GRASS-user and GRASS-dev
 over the last couple of years, related to errors reading raster maps after
 parallel processing:

   * http://osgeo-org.1560.x6.nabble.com/Error-reading-raster-data-for-row-xxx-only-when-using-r-series-and-t-rast-series-td5229569i20.html
   * https://lists.osgeo.org/pipermail/grass-dev/2015-July/075691.html
   * https://lists.osgeo.org/pipermail/grass-dev/2015-July/075627.html
   * #2762

 This ticket was originally opened with a set of scripts to test for errors
 writing raster maps in parallel and it now seems that those tests **no
 longer generate errors** when:

    * run on an SSD
    * run on a standard HDD
    * run on mirrored HDDs (Linux RAID 1), ''as long as maps are deleted
 rather than overwritten via --o''


 This would have been a good place to close this ticket and open a new
 one.
 ----

 As of 2017-12-30, I was again encountering the hard-to-reproduce "ERROR:
 Error reading raster data for row 1949 of <map>" errors. As before, they
 occur in the context of parallel execution. I posted
 [http://osgeo-org.1560.x6.nabble.com/r-sun-daily-with-multiple-CPU-cores-error-uncompressing-raster-data-td5348054.html this message]
 to GRASS-user and then proceeded to document progress as of comment 8 in
 this ticket.

 I am attaching two scripts to illustrate the latest instance,
 `beam-rad-at-tile.sh` and `daily-rad.sh`, invoked like this:

 {{{
 # ...

 ## try ZLIB compression:
 # random errors as described in #2764
 # export GRASS_COMPRESSOR=ZLIB

 ## try LZ4 compression:
 # no errors!
 export GRASS_COMPRESSOR=LZ4

 #
 # ... looping code
 #
 bash beam-rad-at-tile.sh $tile_i

 # ...
 }}}

 Essentially, I am iterating over 5000x5000 {elevation, slope, aspect}
 tiles, computing horizon angle maps in parallel, computing daily beam
 radiance maps in parallel, summing daily maps, and then proceeding to the
 next tile. With ZLIB compression, I am randomly encountering raster read
 errors generated by `r.horizon`, `r.sun`, or `r.mapcalc` in this context.
 This seems to happen with or without NULL cells in the active tile. Errors
 are not encountered when using LZ4 compression.
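
 Because the errors are intermittent, a small harness that repeats a command
 under each compressor setting and counts failed runs may help compare ZLIB
 and LZ4. The following is only a minimal sketch: `count_failures` is a
 hypothetical helper, not part of the attached scripts, and it assumes the
 command being tested exits with a non-zero status whenever a GRASS module
 reports an error.

```shell
# Hypothetical helper (not part of the attached scripts).
# Usage: count_failures COMPRESSOR RUNS COMMAND [ARGS...]
# Runs COMMAND the given number of times with GRASS_COMPRESSOR set to
# COMPRESSOR and prints how many runs exited with a non-zero status.
# The function body is a subshell, so its variables do not leak.
count_failures() (
    compressor=$1
    runs=$2
    shift 2
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # The variable is set for this one command only.
        # Assumption: a raster read error makes the command exit non-zero.
        if ! GRASS_COMPRESSOR="$compressor" "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
)
```

 For example, `count_failures ZLIB 20 bash beam-rad-at-tile.sh $tile_i`
 would print how many of 20 runs failed with ZLIB; repeating the call with
 `LZ4` gives the comparison.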

 The system is an 8-core Intel i7 950 @ 3.07 GHz with the GRASS
 database/mapset residing on an SSD.

 I'll post one of the {elevation, slope, aspect} tiles if anyone is
 interested in tinkering with them. They seem to work fine both within
 GRASS and, after exporting, in other GIS software.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2764#comment:23>
GRASS GIS <https://grass.osgeo.org>


