[GRASS-dev] [GRASS GIS] #2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce

GRASS GIS trac at osgeo.org
Thu Jan 4 10:28:56 PST 2018


#2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce
---------------------+-------------------------
  Reporter:  dylan   |      Owner:  grass-dev@…
      Type:  defect  |     Status:  new
  Priority:  normal  |  Milestone:  7.2.3
 Component:  Raster  |    Version:  unspecified
Resolution:          |   Keywords:
       CPU:  x86-64  |   Platform:  Linux
---------------------+-------------------------

Comment (by dylan):

 Replying to [comment:15 mmetz]:
 > Replying to [comment:14 neteler]:
 > > Replying to [comment:12 dylan]:
 > > > Relevant [https://www.zlib.net/manual.html zlib manual page].
 > >
 > > For ticket completion, here is the related ''grass-user'' message:
 > >
 > > https://lists.osgeo.org/pipermail/grass-user/2018-January/077572.html
 > >
 > > On Wed, Jan 3, 2018 at 8:41 PM, Dylan Beaudette wrote:
 > > > Update: after applying the latest patch, I now see
 > > >
 > > > ERROR: Decompression failed with error -1
 > > >
 > > > I found the map that fails decompression. Is there any way to inspect
 > > > the map in order to search for more clues as to what is wrong with it
 > > > or how it might have happened?
 > > >
 > > >
 > > > All of the maps in this project are using the default ZLIB
 > > > compression, along with compressed NULL files. Looking over the zlib
 > > > manual (https://www.zlib.net/manual.html), I see several references to
 > > > an error code of "-1":
 > > >
 > > > ----------------------------
 > > > #define Z_ERRNO        (-1)
 > > >
 > > > Z_ERRNO if there is an error writing the flushed data
 > > >
 > > > Z_ERRNO on a file operation error
 > > >
 > > > ZEXTERN const char * ZEXPORT gzerror OF((gzFile file, int *errnum));
 >
 > We can't use gzerror() because libgis does not operate on a gzFile.
 > Instead we could use the undocumented zError() function, now in the
 > updated patch.
 >
 > You will need at least trunk r71890 for this patch. Maybe trunk r71890
 > or later without the attached patch fixes the problem already.
 >
 > The errors must be somehow related to G_read_compressed() because only
 > FCELL/DCELL maps use G_read_compressed() while CELL maps use G_expand()
 > directly.


 Thanks Markus. This is all starting to (maybe?) make sense, given the
 clues collected over the last couple of years:

   * errors only reported at ''read-time'', from maps generated by several
 different modules
   * FCELL / DCELL maps occasionally contain corrupt ZLIB-compressed rows
 when created in parallel
   * CELL maps not affected
   * the likelihood of corrupt rows seems to be a function of the number of
 cells involved: number of maps * number of cells * number of iterations *
 number of concurrent processes
   * this only seems to happen when working with relatively large maps
 (3000 x 3000 cells or larger)
   * the `*** buffer overflow detected ***` message encountered while
 reading the corrupt row
   * the zlib error number
   * the fact that I have not yet encountered this problem with LZ4
 compression
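
 A minimal sketch of why a bad row would only surface at read time, as
 in the first clue above. It uses Python's standard `zlib` module rather
 than the GRASS C library, and the names `compress_row` and `try_read`
 are hypothetical. It assumes the failure mode is a compressed row stored
 with too short a length, which is only one possible explanation.

```python
import struct
import zlib


def compress_row(values):
    """Pack a row of doubles (DCELL-like) and compress it with zlib."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return zlib.compress(raw)


def try_read(blob):
    """Return 'ok' if the blob decompresses, else a short failure message."""
    try:
        zlib.decompress(blob)
        return "ok"
    except zlib.error as exc:
        return f"failed: {exc}"


# One row of a 3000-column map.
row = [float(i) * 0.5 for i in range(3000)]
good = compress_row(row)

# Assumption: simulate a writer that recorded too short a length for the row.
# Storing these bytes succeeds silently; nothing is checked at write time.
bad = good[: len(good) - 8]

status_good = try_read(good)
status_bad = try_read(bad)
print(status_good)
print(status_bad)
```

 Nothing complains when the truncated bytes are stored; the error only
 appears once something tries to decompress them.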

 I think that I was still encountering errors, post-r71890. My latest build
 (with ZLIB-related errors) was r7200.

 Could it be that the zlib library's length function isn't multi-thread
 safe and occasionally reports the wrong length in the context of writing a
 row of data? A simple, but quite heavy-handed test would be a debugging
 option for reading the last written row as a sanity check. I don't have
 the skills to implement this, so just an idea.
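
 A rough sketch of the read-back idea, again in Python with the standard
 `zlib` module and not the actual libgis code. `write_row_verified` and
 `RowVerificationError` are hypothetical names, and the output is assumed
 to be a seekable binary stream opened for both reading and writing.

```python
import io
import zlib


class RowVerificationError(Exception):
    """Raised when a row just written cannot be read back intact."""


def write_row_verified(out, raw_row):
    """Compress raw_row, write it to out, then re-read and verify it.

    Returns the number of compressed bytes written.
    """
    compressed = zlib.compress(raw_row)
    start = out.tell()
    out.write(compressed)
    out.flush()
    end = out.tell()

    # Read back exactly the bytes that were stored for this row.
    out.seek(start)
    stored = out.read(end - start)
    out.seek(end)  # restore the position for the next row

    try:
        restored = zlib.decompress(stored)
    except zlib.error as exc:
        raise RowVerificationError(f"row at offset {start}: {exc}") from exc
    if restored != raw_row:
        raise RowVerificationError(f"row at offset {start}: data mismatch")
    return len(compressed)


# Usage with an in-memory stream standing in for the raster file.
buf = io.BytesIO()
row_bytes = bytes(8 * 3000)  # one all-zero row of 3000 doubles
written = write_row_verified(buf, row_bytes)
print(written, "compressed bytes written and verified")
```

 The check costs one extra read and decompression per row, so it would
 only make sense as an optional debugging switch.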

 I'll try the latest patch today and report back.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2764#comment:16>
GRASS GIS <https://grass.osgeo.org>


