[GRASSLIST:8036] Re: error reading large rasters in 5.4/6.0

Glynn Clements glynn at gclements.plus.com
Tue Aug 23 17:25:05 EDT 2005


Andrew Danner wrote:

> After looking into the problem of reading very large rasters, I believe
> I have isolated the bug. The problem only happens when reading row 0 and
> triggers the following block in flate.c:186
> 
>  else if (b[0] != G_ZLIB_COMPRESSED_YES)
>     {
>         /* We're not at the start of a row */
>         G_free (b);
>         return -1;
>     }
> 
> I think the problem is a bug in writing rasters, not reading them. When
> creating a new raster, it writes a vector of row offsets to the fcell
> file in the function G__write_row_ptrs. It attempts to guess the size of
> the file offsets. 
> 
>  int nbytes = sizeof(off_t);
> ...
>  if (nbytes > 4 && fcb->row_ptr[nrows] <= 0xffffffff)
>     nbytes = 4;
> ...
>  len = (nrows + 1) * nbytes + 1;
>  b = buf = G_malloc(len);
> 
> The problem is that when a new raster is created fcb->row_ptr[nrows]=0,
> so nbytes is always 4. Later, when writing the rows and actually
> computing the offsets, if fcb->row_ptr[nrows] > 0xffffffff, it bumps
> nbytes up to 8, but row 0 was written to the file assuming 4-byte
> offsets, so row 0 becomes corrupted if the raster is larger than 4GB. 

Right. I'd overlooked the fact that G__write_row_ptrs() gets called
twice.

> Commenting out the "if" block seems to fix the problem. 

The reader will handle 8-byte offsets even if off_t is only 4 bytes, so
long as the offsets themselves don't exceed the 32-bit range.

There isn't any practical way to determine in advance whether the
offsets will fit into 4 bytes, so removing the "if" block is the right
fix.
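
To make the size mismatch concrete, here is a minimal sketch of the
arithmetic. It reuses the "len = (nrows + 1) * nbytes + 1" expression
quoted above; the helper names are hypothetical and not part of the
library, and it assumes (as the row 0 corruption suggests) that row data
starts immediately after the table reserved by the first call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size of the row-pointer table in bytes, using the
 * expression quoted from G__write_row_ptrs: one leading byte plus
 * (nrows + 1) offsets of nbytes each. */
static size_t row_ptr_table_len(size_t nrows, size_t nbytes)
{
    assert(nbytes == 4 || nbytes == 8);
    return (nrows + 1) * nbytes + 1;
}

/* Hypothetical helper: number of bytes of row data overwritten when the
 * table was reserved with one offset width and rewritten with another.
 * Assumes row data begins right after the reserved table. */
static size_t clobbered_bytes(size_t nrows, size_t reserved_nbytes,
                              size_t final_nbytes)
{
    size_t reserved = row_ptr_table_len(nrows, reserved_nbytes);
    size_t final_len = row_ptr_table_len(nrows, final_nbytes);

    return final_len > reserved ? final_len - reserved : 0;
}
```

For a raster with 1000 rows, the table is 4005 bytes with 4-byte offsets
and 8009 bytes with 8-byte offsets, so rewriting it with the wider
offsets would overwrite the first 4004 bytes of row data. Using the same
width on both calls makes the difference zero.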

> Should I submit a bug report?

No need; I've committed the fix to CVS.

One other bug which I've just noticed: the reader checks whether an
offset exceeds the range of an off_t, but only in the sense that it
checks whether it occupies too many bytes.

This check is wrong for files between 2GiB and 4GiB, where the offset
fits into 4 bytes, but exceeds the 31-bit (signed) range of a 32-bit
off_t. There needs to be an "if (offset < 0)" check in there somewhere.
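
A minimal sketch of such a check, not the actual reader code: the
function name, the int32_t stand-in for a 32-bit off_t, and the
big-endian byte order are all assumptions made for illustration. Rather
than testing "offset < 0" after the conversion, it compares the decoded
value against the signed maximum before narrowing, which catches the
same 2GiB to 4GiB case without relying on how an out-of-range value
converts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 32-bit signed off_t (assumption for illustration). */
typedef int32_t off32_t;

/* Hypothetical helper: decode an nbytes-wide big-endian offset.
 * Returns 0 on success, or -1 if the value does not fit in off32_t.
 * This rejects both offsets with non-zero high bytes and offsets in
 * the 2GiB..4GiB range that fit in 4 bytes but not in 31 bits. */
static int decode_offset32(const unsigned char *p, size_t nbytes,
                           off32_t *out)
{
    uint64_t v = 0;
    size_t i;

    assert(nbytes >= 1 && nbytes <= 8);

    for (i = 0; i < nbytes; i++)
        v = (v << 8) | p[i];

    if (v > (uint64_t)INT32_MAX)
        return -1;

    *out = (off32_t)v;
    return 0;
}
```

With this, a 4-byte offset of 0x7fffffff is accepted, 0x80000000 is
rejected, and an 8-byte offset is accepted as long as its value stays
within the signed 32-bit range.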

-- 
Glynn Clements <glynn at gclements.plus.com>



