[GRASS-dev] Re: [GRASS-user] Large vector files

Hamish hamish_nospam at yahoo.com
Sun Oct 8 20:01:12 EDT 2006

[moved to the devel list]
> > > I am always looking for feedback on how r.in.xyz goes with massive
> > > input data. (>2gb? >4gb?)
> > r.in.xyz doesn't use LFS, so it will be limited to 2Gb on 32-bit
> > systems (any system where "long" is 32 bits). As it uses ANSI stdio
> > functions (including ftell/fseek), extending it to support large
> > files would be non-trivial.

It's inherent in the purpose of the module that it be LFS compliant.

I disagree with "extending it to support large files is non-trivial":

All the filesize, ftell, fseek calls don't need to be there and can 
easily be #ifdef'd out if required. They are just there for the
(somewhat lame & inaccurate; but fast, lightweight, and non-critical)
guess at the total number of lines in the input file to pass to

But G_percent() is most interesting when the processing will take a long
time, so it would be nice to have it there for large files.
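
For illustration, here is a rough sketch of how such a cheap guess could
work: sample the first few lines, take their average length, and divide
the file size by it. This is not the actual r.in.xyz code;
estimate_line_count() and its parameters are hypothetical names, and the
caller is assumed to obtain the file size by whatever means is available.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from r.in.xyz.
 * Reads up to max_sample lines from the current position of fp and
 * extrapolates a total line count from their average length.
 * Lines longer than the buffer are counted as several short lines,
 * which is acceptable for a progress-bar estimate. */
unsigned long long estimate_line_count(FILE *fp,
                                       unsigned long long file_bytes,
                                       int max_sample)
{
    char buff[1024];
    unsigned long long sampled_bytes = 0;
    int sampled_lines = 0;

    while (sampled_lines < max_sample &&
           fgets(buff, sizeof buff, fp) != NULL) {
        sampled_bytes += strlen(buff);
        sampled_lines++;
    }

    if (sampled_lines == 0 || sampled_bytes == 0)
        return 0;   /* empty input: nothing to estimate */

    /* total bytes / average bytes per line */
    return file_bytes * (unsigned long long)sampled_lines / sampled_bytes;
}
```

The result is only as good as the assumption that the first lines are
typical of the whole file, which is fine for a percentage display and
nothing more.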

This is the critical loop:
 while( 0 != G_getl2(buff, BUFFSIZE-1, in_fd) ) { ... }

Besides that it's just fopen() and fclose() -- it is very simple really.
All the other scanning stuff is optional.
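
To make the point concrete, here is a minimal sketch of that kind of
sequential read loop with no ftell()/fseek() anywhere. It uses plain
fgets() as a stand-in for G_getl2(), and scan_points() is a made-up name;
the real module does far more with each record. The sketch only shows
that the loop itself never needs a file offset; it does not by itself
make fopen() large-file aware.

```c
#include <stdio.h>

#define BUFFSIZE 256

/* Hypothetical stand-in for the module's main loop.
 * Reads "x y z" records one line at a time, sums z, and returns the
 * number of records parsed. No seeking or position queries are used. */
unsigned long long scan_points(FILE *in_fd, double *zsum)
{
    char buff[BUFFSIZE];
    unsigned long long n = 0;
    double x, y, z;

    *zsum = 0.0;
    while (fgets(buff, BUFFSIZE, in_fd) != NULL) {
        /* skip blank, comment, or otherwise malformed lines */
        if (sscanf(buff, "%lf %lf %lf", &x, &y, &z) != 3)
            continue;
        *zsum += z;
        n++;
    }
    return n;
}
```

The record counter is an unsigned long long so that it cannot wrap on
inputs with more than 2^32 lines, even where "long" is 32 bits.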

> Attached is a quick patch to enable LFS.  It's "poorly" implemented
> with fseeko/ftello, so I'm not sure if I should commit it.

Thanks Brad. The patch looks good to my untrained eye; my only query is
whether those calls should be made conditional on USE_LARGEFILES, as
fseeko() & co. are not ANSI C:

       These  functions  are found on SysV-like systems.  They are not
       present in libc4, libc5, glibc 2.0 but  available  since  glibc

       The fseeko and ftello functions conform to SUSv2.
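
As a sketch of what that conditionalization might look like (this is not
Brad's patch, and the file_pos_t, file_seek, file_tell and stream_size
names are invented for the example), the SUSv2 calls could be selected
only when USE_LARGEFILES is defined, with the ANSI ones as the fallback.
How USE_LARGEFILES gets defined by the build system is assumed here.

```c
#ifdef USE_LARGEFILES
/* glibc feature macro requesting a 64-bit off_t on 32-bit systems;
 * it must be defined before any system header is included. */
#define _FILE_OFFSET_BITS 64
#endif

#include <stdio.h>

#ifdef USE_LARGEFILES
#include <sys/types.h>                 /* off_t */
typedef off_t file_pos_t;
#define file_seek(fp, off, whence) fseeko((fp), (off), (whence))
#define file_tell(fp)              ftello(fp)
#else
typedef long file_pos_t;               /* ANSI C fallback, 2 GB limit
                                          where long is 32 bits */
#define file_seek(fp, off, whence) fseek((fp), (off), (whence))
#define file_tell(fp)              ftell(fp)
#endif

/* Hypothetical helper: size of an open stream in bytes, or -1 on error.
 * The current read position is restored before returning. */
file_pos_t stream_size(FILE *fp)
{
    file_pos_t here = file_tell(fp);
    file_pos_t size;

    if (here < 0)
        return -1;
    if (file_seek(fp, 0, SEEK_END) != 0)
        return -1;
    size = file_tell(fp);
    if (file_seek(fp, here, SEEK_SET) != 0)
        return -1;
    return size;
}
```

Keeping the choice behind one typedef and two macros means the rest of
the code does not need its own #ifdef blocks.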

I don't have:
 - a 64 bit machine
 - a dataset that large
 - a [funded] research project that needs it
 - any real experience with LFS

so it is as it is, and I welcome improvements from anybody with
something from the above list.

