[GRASS-user] r.in.xyz: Could open text file ~ 2.5GB

Glynn Clements glynn at gclements.plus.com
Sat Oct 21 08:12:25 EDT 2006


Hamish wrote:

> > > I'm trying to open a text file to scan for extents:
> > > 
> > > r.in.xyz -s input=2006MB_GarryTrough_1N.txt
> > > output=2006MB_GarryTrough_1N method=mean type=FCELL fs=space  x=1
> > > y=2 z=3 percent=100
> 
> Glynn wrote:
> >      fseek(in_fd, 0L, SEEK_END);
> >      filesize = ftell(in_fd);
> > +    if (filesize < 0)
> > +	filesize = 0x7FFFFFFF;
> >      rewind(in_fd);
> 
> 
> 
> Hi,
> 
> sorry I am busy with other commitments and don't have time to get into
> the discussion more ...
> 
> 
> just a thought though, we really don't need to store the actual
> filesize, we could just as well store filesize/10 or filesize/1024 and
> then adjust the other calculations for that. We just need the ratio for
> G_percent(), not the exact numbers.
> 
> then for the 2gb<filesize<4gb case, maybe something like
> 
>     if (ftell(in_fd) < 0)
> 	filesize_div10 = -1 * (0x7FFFFFFF - ftell(in_fd))/10;
> 
> ? (not sure which direction the negative result from ftell() goes)
> 
> or store it as a double... ?

The problem is that ftell() returns the result as a (signed) long. If
the result won't fit into a long, it returns -1 (and sets errno to
EOVERFLOW).
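
As a minimal sketch of the check in the patch quoted above (the helper
name size_for_progress() is hypothetical, and LONG_MAX stands in for the
0x7FFFFFFF constant, which is the same value where long is 32 bits):

```c
#include <limits.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of r.in.xyz: return a file size that is
 * only meant to drive a progress indicator. If the size cannot be
 * determined (e.g. ftell() returns -1 because the offset does not fit
 * into a long), fall back to LONG_MAX instead of a negative value.
 */
static long size_for_progress(FILE *fp)
{
    long size = -1;

    if (fseek(fp, 0L, SEEK_END) == 0)
        size = ftell(fp);       /* -1 on failure, errno may be EOVERFLOW */

    if (size < 0)
        size = LONG_MAX;        /* progress will be wrong, but not negative */

    rewind(fp);                 /* back to the start for the real read */
    return size;
}
```

With the fallback the percentage shown is inaccurate for such files, but
the import itself is not affected.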

This can only happen if _FILE_OFFSET_BITS is set to 64, so that fopen()
is redirected to fopen64(). Otherwise, fopen() will simply refuse to
open files larger than 2GiB. (Apparently, this isn't true on some
versions of Mac OS X, which open the file anyway and then fail on
fseek/ftell once you've passed the 2GiB mark.)

If you want to obtain the current offset for a file whose size exceeds
the range of a signed long, you instead have to use the (non-ANSI)
ftello() function, which returns the offset as an off_t. But before we
do that, we would need to add configure checks so that we don't try to
use ftello() on systems which don't provide it.
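
Roughly, the guarded version could look like the sketch below. This is
only an illustration: HAVE_FTELLO is a hypothetical macro which a
configure check would have to define, the function name is made up, and
it assumes that large-file support (_FILE_OFFSET_BITS=64) is enabled via
the compiler flags. The size is returned as a double, which is enough
for computing a percentage:

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: return the file size as a double, or -1.0 if it
 * cannot be determined. HAVE_FTELLO is an assumed configure-time macro;
 * without it we fall back to the ANSI fseek()/ftell() pair, which fails
 * once the offset no longer fits into a long.
 */
static double file_size_as_double(FILE *fp)
{
#ifdef HAVE_FTELLO
    off_t size;

    if (fseeko(fp, 0, SEEK_END) != 0)
        return -1.0;
    size = ftello(fp);          /* offset as off_t, 64 bits with LFS */
#else
    long size;

    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1.0;
    size = ftell(fp);           /* -1 if the offset overflows a long */
#endif

    rewind(fp);
    return size < 0 ? -1.0 : (double) size;
}
```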

In other words, doing it right is non-trivial, and the benefit is
relatively minor (accurate progress indication for large files).

-- 
Glynn Clements <glynn at gclements.plus.com>



