[GRASS-dev] r.in.xyz: really big ints

Glynn Clements glynn at gclements.plus.com
Fri Aug 29 16:07:15 EDT 2008


Andrew Danner wrote:

> >> Apparently `wc` on their system can count higher than 2^31, can we?
> >>
> >> Is it as simple as replacing printf %d with %u ?? (seems to work)
> >> In that case should the variable be defined as "unsigned int" for
> >> correctness? (%u seems to work correctly with plain signed int in a
> >> little test program I wrote) Then we wait for the first 160GB dataset...
> 
> %u only delays the problem to ~4 billion points. What you really want is 
> a %lld and to store the line count as a 64-bit int. ISO C99 also 
> includes <inttypes.h> which specifies %PRId64. Is there an agreed upon 
> 64-bit int type in GRASS?

Currently, we try to avoid requiring anything beyond ANSI C89, which
means that we can't assume the existence of a 64-bit integer type.

> I have found a couple of places in the past 
> where row and column were 32-bit and the number of cells would overflow 
> because the calculation is done using 32-bit arithmetic. It would be 
> nice to have 64-bit support even on 32-bit architectures in GRASS 7. I 
> recall r.info had some issue in printing cell counts.

My inclination would be to declare e.g. count_t and COUNT_FMT, which
would be long long and %lld where available, long and %ld otherwise.
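
A rough sketch of what that might look like follows. The names count_t
and COUNT_FMT are the ones proposed above; HAVE_LONG_LONG_INT is assumed
here to be a macro set by the configure script, and format_count() is a
hypothetical helper used only to show the format macro in use. None of
this is existing GRASS code.

```c
#include <stdio.h>

/* HAVE_LONG_LONG_INT is assumed to come from the configure step;
 * it is not defined by the C standard itself. */
#ifdef HAVE_LONG_LONG_INT
typedef long long count_t;
#define COUNT_FMT "%lld"
#else
typedef long count_t;
#define COUNT_FMT "%ld"
#endif

/* Hypothetical helper: write "points: <n>" into buf and return buf.
 * The caller must supply a buffer large enough for the text. */
static char *format_count(char *buf, count_t n)
{
    /* Adjacent string literals are concatenated at compile time,
     * so COUNT_FMT slots directly into the format string. */
    sprintf(buf, "points: " COUNT_FMT, n);
    return buf;
}
```

Keeping the type and its format string in one place means callers never
hard-code %d or %ld for a counter, so widening the type later is a
one-line change.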

Even so, it will probably take some effort to catch bugs caused by
code which doesn't correctly handle integer types other than "int".
E.g.:

	offset = (off_t)((row * cols + col) * sizeof(CELL));

which should be e.g.:

	offset = ((off_t)row * cols + col) * sizeof(CELL);

If you use the former, the compiler won't complain, and you won't
discover the bug unless you actually test with >2GiB of data.
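
To make the difference concrete, here is a small sketch of the corrected
form wrapped in a function. cell_offset() is a hypothetical name, and
CELL is declared locally as a stand-in for the GRASS type, assumed to be
a 4-byte int. It also assumes a 64-bit off_t (on 32-bit systems that
usually means building with _FILE_OFFSET_BITS=64).

```c
#include <sys/types.h>

typedef int CELL;   /* stand-in for the GRASS CELL type (assumed 4 bytes) */

/* Hypothetical helper: byte offset of cell (row, col) in a raster
 * with `cols` columns. Casting row to off_t *before* multiplying
 * promotes the whole expression, so row * cols is never computed
 * in plain int. */
static off_t cell_offset(int row, int cols, int col)
{
    return ((off_t)row * cols + col) * sizeof(CELL);
}
```

With row = 50000 and cols = 50000, row * cols is 2.5e9, which exceeds
the largest 32-bit int (2147483647), so the first form overflows before
the cast is ever applied. The form above yields 10000000000 bytes for
col = 0.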

-- 
Glynn Clements <glynn at gclements.plus.com>

