[GRASS-dev] [GRASS GIS] #2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce
GRASS GIS
trac at osgeo.org
Thu Jan 4 10:28:56 PST 2018
#2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce
---------------------+-------------------------
Reporter: dylan | Owner: grass-dev@…
Type: defect | Status: new
Priority: normal | Milestone: 7.2.3
Component: Raster | Version: unspecified
Resolution: | Keywords:
CPU: x86-64 | Platform: Linux
---------------------+-------------------------
Comment (by dylan):
Replying to [comment:15 mmetz]:
> Replying to [comment:14 neteler]:
> > Replying to [comment:12 dylan]:
> > > Relevant [https://www.zlib.net/manual.html zlib manual page].
> >
> > For ticket completion, here the related ''grass-user'' message:
> >
> > https://lists.osgeo.org/pipermail/grass-user/2018-January/077572.html
> >
> > On Wed, Jan 3, 2018 at 8:41 PM, Dylan Beaudette wrote:
> > > Update: after applying the latest patch, I now see
> > >
> > > ERROR: Decompression failed with error -1
> > >
> > > I found the map that fails decompression. Is there any way to inspect
> > > the map in order to search for more clues as to what is wrong with it
> > > or how it might have happened?
> > >
> > >
> > > All of the maps in this project are using the default ZLIB
> > > compression, along with compressed NULL files. Looking over the zlib
> > > manual (https://www.zlib.net/manual.html), I see several references to
> > > an error code of "-1":
> > >
> > > ----------------------------
> > > #define Z_ERRNO (-1)
> > >
> > > Z_ERRNO if there is an error writing the flushed data
> > >
> > > Z_ERRNO on a file operation error
> > >
> > > ZEXTERN const char * ZEXPORT gzerror OF((gzFile file, int *errnum));
>
> We can't use gzerror() because libgis does not operate on a gzFile.
> Instead we could use the undocumented zError() function, now in the
> updated patch.
>
> You will need at least trunk r71890 for this patch. Maybe trunk r71890
> or later without the attached patch fixes the problem already.
>
> The errors must be somehow related to G_read_compressed() because only
> FCELL/DCELL maps use G_read_compressed() while CELL maps use G_expand()
> directly.

Thanks Markus. This is all starting to (maybe?) make sense, given the
clues collected over the last couple of years:

 * errors are only reported at ''read-time'', from maps generated by
   several different modules
 * FCELL / DCELL maps occasionally contain corrupt ZLIB-compressed rows
   when created in parallel
 * CELL maps are not affected
 * the likelihood of corrupt rows seems to be a function of the number of
   cells involved: number of maps * number of cells * number of
   iterations * number of concurrent processes
 * this only seems to happen when working with relatively large maps
   (3000 x 3000 cells or larger)
 * the `*** buffer overflow detected ***` message is encountered while
   reading the corrupt row
 * the zlib error number
 * I have not yet encountered this problem with LZ4 compression

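To illustrate the first clue (errors only showing up at read time), here is
a rough sketch using Python's standard zlib module, not libgis. The row
contents and the two kinds of damage are invented for the example, and
`try_decompress` is a hypothetical helper. The error codes Python reports
here are not necessarily the -1 seen in GRASS.

```python
import zlib


def try_decompress(blob):
    """Return (ok, detail): the payload on success, else zlib's error text."""
    try:
        return True, zlib.decompress(blob)
    except zlib.error as exc:
        return False, str(exc)


# Stand-in for one raster row; real FCELL/DCELL rows would hold floats.
row = bytes(range(256)) * 100
good = zlib.compress(row)

# Assumed damage 1: the stored row is shorter than what was compressed,
# as could happen if a wrong length were recorded at write time.
truncated = good[:-8]

# Assumed damage 2: the final byte of the Adler-32 trailer is flipped.
bad_checksum = good[:-1] + bytes([good[-1] ^ 0xFF])

results = {
    "good": try_decompress(good),
    "truncated": try_decompress(truncated),
    "bad_checksum": try_decompress(bad_checksum),
}

for name, (ok, detail) in results.items():
    print(name, "OK" if ok else f"FAILED: {detail}")
```

In both damaged cases nothing complains until the row is decompressed,
which would be consistent with problems only surfacing at read time.
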
I think that I was still encountering errors, post-r71890. My latest build
(with ZLIB-related errors) was r7200.

Could it be that the zlib library's length function isn't thread-safe and
occasionally reports the wrong length when a row of data is being written?
A simple, but quite heavy-handed, test would be a debugging option that
reads back the last written row as a sanity check. I don't have the skills
to implement this, so it is just an idea.

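The read-back idea could look roughly like the sketch below. It is written
with Python's standard zlib module purely to show the logic; it is not the
libgis row writer, and `write_row_checked` and the row layout (little-endian
doubles) are assumptions made for the example.

```python
import io
import struct
import zlib


def write_row_checked(fh, row_values):
    """Compress one row of doubles, append it to fh, then read it back.

    Hypothetical helper for illustration only. fh must be a seekable
    binary file object opened for both reading and writing.
    Returns (offset, compressed_length) of the stored row.
    """
    raw = struct.pack(f"<{len(row_values)}d", *row_values)
    packed = zlib.compress(raw)

    offset = fh.tell()
    fh.write(packed)
    fh.flush()

    # Sanity check: re-read exactly the bytes just written and decompress.
    fh.seek(offset)
    stored = fh.read(len(packed))
    fh.seek(offset + len(packed))  # leave the position ready for the next row

    try:
        restored = zlib.decompress(stored)
    except zlib.error as exc:
        raise IOError(f"row at offset {offset} failed decompression: {exc}") from exc
    if restored != raw:
        raise IOError(f"row at offset {offset} does not match the data written")
    return offset, len(packed)


# Usage with an in-memory file; a real test would use a file on disk.
buf = io.BytesIO()
first = write_row_checked(buf, [1.5, 2.5, 3.0])
second = write_row_checked(buf, [0.0] * 3000)
print(first, second)
```

This doubles the I/O per row, so it would only make sense behind a
debugging switch, but it would catch a bad row at the moment it is written
rather than much later when some other module reads the map.
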
I'll try the latest patch today and report back.
--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2764#comment:16>
GRASS GIS <https://grass.osgeo.org>