[GRASS-dev] Error reading raster data for row xxx (only when using r.series and t.rast.series)

Wed Oct 14 21:45:08 PDT 2015

On Wed, Oct 14, 2015 at 12:55 PM, Dylan Beaudette
<dylan.beaudette at gmail.com> wrote:
> On Wed, Oct 14, 2015 at 10:50 AM, Dylan Beaudette
> <dylan.beaudette at gmail.com> wrote:
>> Some additional clues:
>>
>> The original stack was 365 maps with 3105 x 7025 cells.
>>
>> 1. zooming into a smaller region (30 x 40 cells) and running
>> t.rast.series 100x resulted in 100 "correct" maps, no errors.
>>
>> 2. returning to the full extent and running t.rast.series 30x on the
>> first 31 maps resulted in 30 "correct" maps, no errors.
>>
>> 3. returning to the full extent and running t.rast.series 30x on the
>> last 31 maps resulted in 30 "correct" maps, no errors
>>
>>
>> So, it seems that t.rast.series (r.series) is throwing an error, or
>> generating wrong output, when:
>>
>> a large set of maps is supplied as input and the region has a
>> moderate number of total cells.
>>
>> Yeah, I know, that isn't very specific. I will try re-compiling with
>> debugging and no optimization next.
>>
>> Dylan
>>
>>
>
> More data,
>
> 1. re-compiled with CFLAGS="-g -Wall":
>  * Multiple runs of t.rast.series with the full stack (365 maps with
> 3105 x 7025 cells), no errors.
>  * each run required about 8.5 minutes to complete
>
> 2. re-compiled with  CFLAGS="-O2 -mtune=native -march=native" LDFLAGS="-s":
>  * 10x tests with full stack, no errors
>  * each run required about 3.5 minutes
>
> 3. re-run original script (see listing below)
>  * random errors from t.rast.series
>
> This doesn't make much sense to me. The only difference between my
> latest "tests" and the original code is that the input to
> t.rast.series was static over the course of my "tests", vs. dynamic
> within the original code (see below). I purposely selected a stack
> that caused t.rast.series to throw an error for my tests.
>

OK, this does make sense--t.rast.series (r.series) was not the source
of the problems. I was able to verify this by running t.univar on the
output from the previous step:

>   # NOTE: 4 CPUs so that external disk isn't thrashed
>   gdd_max_C=30
>   gdd_min_C=10
>   gdd_base_C=10
>   t.rast.mapcalc --q --o nprocs=4 input=tmin_subset,tmax_subset \
>     output=gdd basename=gdd \
>     expr="max(((min(tmax_subset, $gdd_max_C) + max(tmin_subset, $gdd_min_C)) / 2.0) - $gdd_base_C, 0)"

... which means that t.rast.mapcalc was generating one (or more)
outputs with some kind of problem, which was then causing t.univar and
t.rast.series to fail.
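
One way to narrow down which of the t.rast.mapcalc outputs are damaged is to read every map in full and note the ones where the read fails. A minimal sketch follows. The helper names find_bad_maps and check_map are hypothetical, the map names in the usage comment are only illustrative, and the sketch assumes that r.univar exits non-zero on a map that triggers the "Error reading raster data" message.

```shell
#!/bin/sh
# find_bad_maps CHECKER MAP...
# Runs CHECKER on every MAP and prints the names for which it fails.
# Returns non-zero if at least one map failed the check.
find_bad_maps() {
    checker="$1"
    shift
    bad=0
    for map in "$@"; do
        if ! "$checker" "$map" >/dev/null 2>&1; then
            echo "$map"
            bad=1
        fi
    done
    return "$bad"
}

# Assumed checker: r.univar has to read every row of the map, so a
# damaged map should make it exit with an error.
check_map() {
    r.univar map="$1"
}

# Example, inside a GRASS session (map names are illustrative):
#   find_bad_maps check_map gdd_1 gdd_2 gdd_3
```

Running this right after the t.rast.mapcalc step, and before t.rast.series, would show whether the broken maps already exist at that point.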

The inputs to t.rast.mapcalc are files that have been registered with
r.external. I suspect that the multiple concurrent r.mapcalc instances
may be to blame. I don't have an explanation other than some evidence
from the last time I encountered this type of issue. The workflow then
was:

1. spawn 8 concurrent processes via backgrounding: r.sun -> r.mapcalc

2. when finished with daily solar models, sum maps with r.series
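
With backgrounded jobs like these, a failing job is easy to miss, because the shell does not report the exit status of a background process unless asked. The following is only a sketch of how the two steps above could be guarded. It assumes each day's r.sun -> r.mapcalc chain is wrapped in a single shell function (the hypothetical model_day, whose body is not shown in this thread); run_batch is likewise a hypothetical helper, and it starts one job per argument without any further limit on concurrency.

```shell
#!/bin/sh
# run_batch WORKER ARG...
# Starts "WORKER ARG" in the background for every ARG, waits for each
# job, and reports the ARGs whose job exited non-zero on stderr.
# Returns 1 if any job failed. ARGs must not contain whitespace.
run_batch() {
    worker="$1"
    shift
    started=""
    for arg in "$@"; do
        "$worker" "$arg" &
        started="$started $!:$arg"     # remember pid and argument together
    done
    status=0
    for job in $started; do
        pid=${job%%:*}
        arg=${job#*:}
        if ! wait "$pid"; then         # wait returns that job's exit status
            echo "failed: $arg" >&2
            status=1
        fi
    done
    return "$status"
}

# Example (model_day is hypothetical):
#   run_batch model_day 1 2 3 4 5 6 7 8 || echo "do not run r.series yet"
```

This does not explain why a job fails, but it would at least show whether the broken map is produced by a job that exited with an error or by one that appeared to finish cleanly.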

I would occasionally encounter the "Error reading raster data for row
xxx" error from r.series in this case and assume that r.series had
somehow broken the map in question.

It would seem that concurrent use of r.mapcalc may be worth
investigating... however, it is strange that it only occurs sometimes.
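
Because the failure is intermittent, a single clean run says little either way. A rough way to compare, for example, nprocs=1 against nprocs=4 is to repeat the same step several times and count how often it fails. A minimal sketch, where count_failures is a hypothetical helper and the command passed to it is assumed to exit non-zero when its output cannot be read back:

```shell
#!/bin/sh
# count_failures N COMMAND [ARG...]
# Runs COMMAND N times and prints how many of the runs exited non-zero.
count_failures() {
    n="$1"
    shift
    failures=0
    i=0
    while [ "$i" -lt "$n" ]; do
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example (gdd_step is hypothetical: it would run the t.rast.mapcalc
# step with the given nprocs and then read every output map in full):
#   count_failures 10 gdd_step 1
#   count_failures 10 gdd_step 4
```

If the count is zero with one process and non-zero with four, that would point toward the concurrent r.mapcalc instances rather than r.series.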

Oddly enough, I didn't have problems with maps generated with the
following (similar) code:

# spring frost
# if tmin never drops below 0 before the start of summer, then the
# last "spring frost" is on day 0
# NOTE: 2 CPUs so that disk isn't thrashed
t.rast.mapcalc --o -n nprocs=2 input=tmin output=spring_frost \
basename=spring_frost \
expr="if(start_doy() < 182, if(tmin < 0, start_doy(), 0), null())"

# fall frost
# NOTE: 2 CPUs so that disk isn't thrashed
t.rast.mapcalc --o -n nprocs=2 input=tmin output=fall_frost \
basename=fall_frost \
expr="if(start_doy() > 213, if(tmin < 0, start_doy(), 365), null())"

Dylan