[GRASS-dev] r.in.xyz: really big ints

Andrew Danner adanner at cs.swarthmore.edu
Fri Aug 29 09:29:46 EDT 2008


Helena Mitasova wrote:
> 
> On Aug 29, 2008, at 3:19 AM, Hamish wrote:
> 
>> Hi,
>>
>> a user has just successfully imported a 92GB LIDAR data file with
>> r.in.xyz (2.4 billion data points; 4.5hrs). This has exposed a
>> cosmetic bug, the number of points processed is reported to the user
>> as -1871174186.
>>
>> The raster output is fine AFAIK, but the broken status message ain't a
>> good look.
>>
>> wc -l reports 2,424,200,605 points which is bigger than (IIUC) the 32bit
>> limit for c90 int of 2,147,483,647. I do not know if the
>> hardware/OS/build was 32 or 64 bit. The "line" variable is defined
>> simply as "int".
> 
> It was a 64 bit system with GRASS6.4 compiled from the 8/22/08 version.
> Doug, you could add more details and maybe also post your comparison
> of GRASS performance on linux versus MS Windows.
> 
> Helena
> 
>>
>> Apparently `wc` on their system can count higher than 2^31, can we?
>>
>> Is it as simple as replacing printf %d with %u ?? (seems to work)
>> In that case should the variable be defined as "unsigned int" for
>> correctness? (%u seems to work correctly with plain signed int in a
>> little test program I wrote) Then we wait for the first 160GB dataset...
>>

%u only delays the problem to ~4.29 billion points (2^32). What you
really want is to store the line count as a 64-bit int and print it with
%lld. ISO C99 also provides <inttypes.h>, which defines the PRId64 format
macro for int64_t. Is there an agreed-upon 64-bit int type in GRASS? I
have found a couple of places in the past where the row and column counts
were 32-bit and the number of cells would overflow because the
multiplication is done in 32-bit arithmetic. It would be nice to have
64-bit support even on 32-bit architectures in GRASS 7. I recall r.info
had some issue printing cell counts.
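
To make that concrete, here is a rough C99 sketch of what I mean. The
helper names format_count() and cell_count() are made up for this mail
and are not existing GRASS functions; the point is only the int64_t
storage, the PRId64 format macro, and widening before the multiply.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit count into buf.
 * PRId64 expands to the correct conversion specifier for int64_t,
 * so this works on both 32-bit and 64-bit builds. */
int format_count(char *buf, size_t size, int64_t n)
{
    return snprintf(buf, size, "%" PRId64, n);
}

/* Hypothetical helper: number of cells in a rows x cols raster.
 * The cast widens rows to 64 bits BEFORE the multiplication, so the
 * product is computed in 64-bit arithmetic and does not overflow
 * the way (rows * cols) would with plain 32-bit ints. */
int64_t cell_count(int rows, int cols)
{
    return (int64_t)rows * cols;
}
```

With a counter declared as int64_t, format_count(buf, sizeof buf, line)
would give "2424200605" for the wc -l figure above, and
cell_count(50000, 50000) gives 2500000000, which does not fit in a
32-bit int. Using printf("%lld", (long long)line) is the equivalent
without <inttypes.h>.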

>> I could rewrite it to store the number of lines as a double and printf
>> %.0f, but hope for a cleaner solution.
>>
>>
>> side idea:
>> Would it be possible to add a flag to g.version to report some build
>> info? Like: 32/64 bits, endianness, build date, svn checkout date (if
>> applicable), `uname -a` of build machine, LFS, nls, and in general
>> ./configure feature report stuff, ...
> 
> having that would be greatly appreciated
>>

+1
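
For the 32/64 bit and endianness part, something along these lines
would probably do. This is only a sketch: build_bits(),
is_little_endian() and print_build_info() are names I invented here,
not anything that exists in g.version, and the output format is just a
placeholder.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Pointer width of this build in bits (32 or 64 on common platforms). */
int build_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Returns 1 on a little-endian machine, 0 otherwise.
 * Looks at the first byte of a 16-bit value holding 1. */
int is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Print a one-line build report; __DATE__ is the compile date
 * supplied by the C preprocessor. */
void print_build_info(FILE *fp)
{
    fprintf(fp, "bits=%d endian=%s build_date=%s\n",
            build_bits(),
            is_little_endian() ? "little" : "big",
            __DATE__);
}
```

The svn checkout date, uname output and ./configure feature list would
have to be captured at build time instead; they cannot be detected at
runtime like this.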

>> ?
>>
>> thanks,
>> Hamish

