[GRASS-dev] r.in.xyz: really big ints

Hamish hamish_b at yahoo.com
Fri Aug 29 03:19:35 EDT 2008


Hi,

A user has just successfully imported a 92GB LIDAR data file with r.in.xyz
(2.4 billion data points; 4.5hrs). This has exposed a cosmetic bug: the
number of points processed is reported to the user as -1871174186.

The raster output is fine AFAIK, but the broken status message ain't a
good look.

wc -l reports 2,424,200,605 points, which is bigger than (IIUC) the 32-bit
limit for a C90 int of 2,147,483,647 (2^31 - 1). I do not know if the
hardware/OS/build was 32 or 64 bit. The "line" variable is defined simply
as "int".

Apparently `wc` on their system can count higher than 2^31; can we?

Is it as simple as replacing printf %d with %u? (It seems to work.)
In that case, should the variable be defined as "unsigned int" for
correctness? (%u seems to work correctly with a plain signed int in a
little test program I wrote.) Then we wait for the first 160GB dataset...

I could rewrite it to store the number of lines as a double and printf
%.0f, but I hope for a cleaner solution.


side idea:
Would it be possible to add a flag to g.version to report some build
info? Like: 32/64 bits, endianness, build date, svn checkout date (if
applicable), `uname -a` of build machine, LFS, nls, and in general 
./configure feature report stuff, ...

?

thanks,
Hamish

