[GRASS-dev] [GRASS GIS] #2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce
GRASS GIS
trac at osgeo.org
Tue Jan 9 12:19:18 PST 2018
#2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce
---------------------+-------------------------
Reporter: dylan | Owner: grass-dev@…
Type: defect | Status: new
Priority: normal | Milestone: 7.2.3
Component: Raster | Version: unspecified
Resolution: | Keywords:
CPU: x86-64 | Platform: Linux
---------------------+-------------------------
Comment (by dylan):
Replying to [comment:30 mmetz]:
> Replying to [comment:29 dylan]:
> >
> > [...] Note that I don't have any issues with any other GRASS commands,
> > or (as far as I can tell) general usage on this machine. I only see these
> > errors when working with GRASS commands that:
> >
> > * take a long time to run: `r.sun` or `t.rast.mapcalc`
> >   ([http://osgeo-org.1560.x6.nabble.com/Error-reading-raster-data-for-row-xxx-only-when-using-r-series-and-t-rast-series-td5229569.html e.g. a couple of years ago])
> > * operate on moderately large, floating-point maps
> > * are done in parallel, either via GNU `parallel` or as implemented
> >   in the temporal suite of modules
> >
> > ...hence the extreme difficulty in recreating the errors or further
> > debugging.
>
> Unfortunately, I was not able to recreate these errors with the provided
> test data and scripts.
>
> I still think this is some obscure disk IO error. You could try to use
> `nice`, e.g. `nice r.sun ...` and `nice r.mapcalc ...` in `daily-rad.sh`.
> At least this helps when running many GRASS modules in parallel on HPC
> systems where results are written out to one single storage device.

Well, thank you very much for all of your patience, patches, and testing.
I'll try the `nice` option. For now, I think that I can tolerate the much
lower frequency of errors after switching to LZ4 compression. Perhaps the
faster speed of LZ4 lowers the probability of concurrent write operations.

It is still quite puzzling that this kind of error has come up on several
different machines while tracking GRASS trunk over a 10-year period. Maybe
this is a subtle hint that it is time to build a new workstation...

I know this is a lot to ask, but did you try testing with ZLIB
compression and running it multiple times? It took a couple of tiles
before I noticed the error.
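
For anyone wanting to repeat that test, a minimal shell sketch follows. It assumes an active GRASS session, that `GRASS_COMPRESSOR` is the environment variable selecting the raster compression method, and that the repeated command exits non-zero when it hits a bad row. `run_repeated` is a hypothetical helper, not part of GRASS. Since these errors have shown up when reading rasters, the repeated command would probably also need a step that reads the outputs back (for example `r.univar` on each result).

```shell
#!/bin/sh
# Hypothetical helper (not part of GRASS): run a command N times at low
# CPU priority via `nice` and print the number of runs that failed.
run_repeated() {
    runs=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # `nice` passes through the exit status of the command it runs
        if ! nice "$@"; then
            failures=$((failures + 1))
            echo "run $i failed" >&2
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Assumption: newly written rasters use the method named here (ZLIB or LZ4)
export GRASS_COMPRESSOR=ZLIB

# Example, from inside a GRASS session (daily-rad.sh is the script
# mentioned in this ticket; the run count of 5 is arbitrary):
#   run_repeated 5 sh daily-rad.sh
```

Switching the variable to `LZ4` and repeating the same number of runs would give a rough comparison of error frequency between the two compressors, though with errors this rare a handful of runs may not show a difference.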
--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2764#comment:31>
GRASS GIS <https://grass.osgeo.org>