[GRASS-dev] [GRASS GIS] #2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce

GRASS GIS trac at osgeo.org
Mon Jan 8 09:46:17 PST 2018


#2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce
---------------------+-------------------------
  Reporter:  dylan   |      Owner:  grass-dev@…
      Type:  defect  |     Status:  new
  Priority:  normal  |  Milestone:  7.2.3
 Component:  Raster  |    Version:  unspecified
Resolution:          |   Keywords:
       CPU:  x86-64  |   Platform:  Linux
---------------------+-------------------------

Comment (by dylan):

 Replying to [comment:26 mmetz]:
 >[...]
 > That means r.sun has created corrupt output. BTW, considering that you
 > are running several instances of r.sun in parallel, I wonder if you
 > compiled GRASS with openmp and use the nprocs option of r.sun. In this
 > case you would have several instances of r.sun and each instance of r.sun
 > would be multi-threaded: no speed gain, more sources of potential errors.

 Note that this sometimes happens in the follow-up step where the output
 from `r.sun` is converted to MJ/sq.m by `r.mapcalc`: "beam.106"
 (`r.mapcalc`) vs. "temp_beam_106" (`r.sun`).

 I had previously compiled using --with-openmp. I'll try disabling it. So
 far, I haven't used the `nprocs` argument to `r.sun`.


 > >
 > > Well crud, just got this after a 10 hour run, returned by `r.series`:
 > >
 > > {{{
 > > WARNING: LZ4 decompression error
 > > ERROR: Error uncompressing fp raster data for row 3929 of <beam.106>: error
 > >        code -1
 > > }}}
 > >
 > > That is the first error using LZ4 compression after many successful
 > > tiles. I wonder if the faster compression results in a lower probability
 > > of row corruption? Within my current project, I seem to be encountering
 > > corrupt rows about 0.0001% of the time: 2 rows out of (5000 rows * 365
 > > calls to `r.sun`).
 >
 > Chances are very small that ZLIB and LZ4 have the same bug. It rather
 > seems to be a write error when writing several files at (nearly) the same
 > time.

 Right. Are there any functions below put_row.c where concurrent writes
 could be trouble? Is there anything else that I can do to help test?

 I just posted an
 [http://soilmap2-1.lawr.ucdavis.edu/dylan/temp/example-maps.tgz example tileset]
 containing elevation, slope, and aspect maps.
 Sorry for the large file sizes. Note that it takes my machine about 10
 hours per tile, with errors surfacing in the final call to `r.series`.
 I'll re-write my code to check the intermediate files in the meantime.
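
 A minimal sketch of such a check on the intermediate maps, assuming that
 `r.univar` reads every row of a raster and exits with a non-zero status
 when a row cannot be decompressed. The function name
 `find_unreadable_maps` and the injectable `run` parameter are hypothetical,
 and the commands only work inside a GRASS session:

```python
import subprocess


def find_unreadable_maps(map_names, run=subprocess.run):
    """Return the names of raster maps that r.univar fails to read.

    Assumption: r.univar has to read every row of the map, so a row that
    cannot be decompressed makes the module exit with a non-zero status.
    """
    unreadable = []
    for name in map_names:
        # `run` is injectable so the loop can be exercised without GRASS.
        result = run(
            ["r.univar", "--quiet", "map=" + name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.PIPE,
        )
        if result.returncode != 0:
            unreadable.append(name)
    return unreadable
```

 Calling this on the `r.sun` outputs (e.g. "temp_beam_106") before
 `r.mapcalc`, and again on the `r.mapcalc` outputs (e.g. "beam.106") before
 `r.series`, should show which step first produces an unreadable map.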

 Thanks!

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2764#comment:27>
GRASS GIS <https://grass.osgeo.org>


