[GRASSLIST:8035] Re: error reading large rasters in 5.4/6.0

Glynn Clements glynn at gclements.plus.com
Tue Aug 23 17:08:03 EDT 2005


Andrew Danner wrote:

> > > I'm having problems working with some large raster files in GRASS, even
> > > though I have enabled large file support. 
> > > 
> > > 
> > > I get the following problems:
> > > 
> > > 1) "r.info test" displays the wrong number of total cells. e.g., 
> > 
> > Right. Even if you enable large file support, many values are still
> > limited to the range of a signed 32-bit integer (i.e. 2^31-1). This
> > will start to manifest itself on maps with more than 2^31-1 cells;
> > although most operations should still work, you will get oddities such
> > as r.info displaying a negative cell count.
> > 
> > Unfortunately, this isn't likely to change any time soon; there are
> > too many individual cases which would need to be changed.
> 
> If there is an accepted way to fix this, I'll offer to patch up some of
> the affected programs. r.info is an easy fix--one line in a printf.

Not quite. You also have to perform the calculation in 64 bits. Right
now, the only code anywhere which uses 64-bit types on a 32-bit system
(where "long" is 32 bits) is the low-level raster I/O code in lib/gis
(opencell.c, get_row.c, put_row.c and format.c, and G.h), which uses
off_t, which may be 64 bits depending upon compiler switches.

Code which counts cells will use 32-bit arithmetic, which will
overflow on large maps.
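
A rough sketch of the difference (not GRASS code; cell_count_32 and
cell_count_64 are made-up names). The 32-bit version multiplies in
uint32_t so that the wraparound is well defined, since signed overflow
is undefined behaviour in C:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the cell count as 32-bit arithmetic would
 * compute it. The result wraps modulo 2^32 on large maps. */
static uint32_t cell_count_32(int rows, int cols)
{
    assert(rows >= 0 && cols >= 0);
    return (uint32_t)rows * (uint32_t)cols;
}

/* Hypothetical helper: widen one operand first, so the multiply
 * itself is carried out in 64 bits. */
static int64_t cell_count_64(int rows, int cols)
{
    assert(rows >= 0 && cols >= 0);
    return (int64_t)rows * cols;
}
```

For a 70000 x 70000 map, cell_count_32 returns 605032704 while
cell_count_64 returns 4900000000. Casting the product after the
multiply would not help; the cast has to be applied to an operand.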

> The only issue is what is the most portable way to print a 64-bit
> int?

C99 has <inttypes.h> which defines macros for printf/scanf format
specifiers for various integer types, e.g. PRId64 for int64_t. This
will expand to either "ld" or "lld" depending upon whether int64_t is
an alias for "long" or "long long".

If the compiler doesn't support C99, you're out of luck. It may not
have a 64-bit integer type, or be able to print such.
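
A minimal sketch of the <inttypes.h> approach, assuming a C99 compiler
(format_total_cells is a hypothetical helper, not an existing GRASS
function):

```c
#include <assert.h>
#include <inttypes.h>   /* int64_t, PRId64 (C99) */
#include <stdio.h>

/* Hypothetical helper: format the total cell count into buf.
 * Returns the number of characters snprintf would have written. */
static int format_total_cells(char *buf, size_t size, int rows, int cols)
{
    int64_t total;

    assert(buf != NULL && size > 0);
    total = (int64_t)rows * cols;   /* 64-bit multiply */
    /* PRId64 is a string literal, so it concatenates with the format. */
    return snprintf(buf, size, "  Total Cells:  %" PRId64, total);
}
```

Using int64_t rather than off_t keeps the result independent of
whether large file support was enabled at build time.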

> current:
> sprintf (line, "  Total Cells:  %ld", (long)cellhd.rows * cellhd.cols);
> 
> my standard Linux way
> sprintf (line, "  Total Cells:  %lld", ((off_t)cellhd.rows) *  
>                                                    cellhd.cols);
> 
> A more portable way?
> sprintf (line, "  Total Cells:  %" PRId64, ((off_t)cellhd.rows) *
>                                                    cellhd.cols);

1. You aren't guaranteed that the platform provides the C99 PRI*
macros.

2. You aren't guaranteed that off_t is 64 bits; large file support is
optional, and needs to remain so until we fix all of the file I/O
code, not just the low-level raster I/O code (or find a way to
implement it such that only code which has been updated to handle
large files will use them). Currently, I see 103 files which use
lseek, fseek or ftell.

3. Ideally, it should work regardless of whether large file support is
enabled. Due to compression, you could have a map with more than 2^31
cells even if you can't have a file larger than 2GiB (i.e. off_t is 32
bits).
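
One possible workaround for display purposes only, offered as an
assumption rather than an agreed approach: compute the count in a
double, which needs neither off_t, nor a 64-bit integer type, nor the
PRI* macros. The result is exact as long as the product stays below
2^53. format_total_cells_c89 is a hypothetical name:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: C89-only formatting of the total cell count.
 * The caller must supply a buffer large enough for the line. */
static int format_total_cells_c89(char *buf, int rows, int cols)
{
    double total;

    assert(buf != NULL);
    total = (double)rows * (double)cols;    /* exact below 2^53 */
    /* "%.0f" prints the value with no fractional digits. */
    return sprintf(buf, "  Total Cells:  %.0f", total);
}
```

This only covers printing; code which needs to index or count cells
exactly would still require a real 64-bit integer type.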

I think that we should sort out these issues generally before anyone
starts trying to fix specific pieces of code.

-- 
Glynn Clements <glynn at gclements.plus.com>



