[GRASS-dev] vector large file support

Glynn Clements glynn at gclements.plus.com
Fri Feb 6 02:30:06 EST 2009


Markus Metz wrote:

> >> 3) solve the fseek/fseeko and ftell/ftello problem. Get inspiration from 
> >> libgis and LFS-safe modules? Or as suggested in the grass wiki on LFS, add
> >> extern off_t G_ftell(FILE *fp);
> >> extern int G_fseek(FILE *stream, off_t offset, int whence);
> >> for global use?
> >>     
> >
> > That would be my preference.
> >   
> How about
> 
> extern off_t G_ftell(FILE *fp)
> {
> #ifdef HAVE_LARGEFILES
>     return (ftello(fp);
> #else
>     return (ftell(fp);
> #endif     
> }

Yep, other than the extraneous open parenthesis (2 open, 1 close).

G_fseek() is slightly more tricky, as you need to check for an
out-of-range value rather than silently truncating.
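
A minimal sketch of what that range check could look like, with the
parenthesis fixed in G_ftell() as well. It assumes HAVE_LARGEFILES is
the configure macro (as in the snippet above), that fseeko()/ftello()
exist whenever it is defined, and that returning -1 with errno set to
EOVERFLOW is an acceptable way to report the failure. The error
convention is an assumption for illustration, not a decision made in
this thread.

```c
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>

/* Sketch only: HAVE_LARGEFILES is assumed to be set by configure. */
off_t G_ftell(FILE *fp)
{
#ifdef HAVE_LARGEFILES
    return ftello(fp);
#else
    return (off_t) ftell(fp);
#endif
}

int G_fseek(FILE *fp, off_t offset, int whence)
{
#ifdef HAVE_LARGEFILES
    return fseeko(fp, offset, whence);
#else
    /* fseek() takes a long; refuse any offset that does not survive
     * the round trip through long instead of silently truncating it. */
    long loff = (long) offset;

    if ((off_t) loff != offset) {
        errno = EOVERFLOW;      /* assumed error convention */
        return -1;
    }
    return fseek(fp, loff, whence);
#endif
}
```

When off_t and long have the same width the check can never fire, so
it costs nothing on such platforms.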

> >> 4) figure out if coor file size really needs to be stored in coor and 
> >> topo. coor file size doesn't say a lot about the number of features 
> >> because coor can contain a high proportion of dead lines (a problem in 
> >> itself, vector TODO). If it does not need to be stored in coor and topo, 
> >> how does removing coor file size info affect reading and writing of coor 
> >> and topo? Are there hard-coded offsets for reading these files?
> >>     
> >
> > No idea here.
> 
> I think coor file size is stored as safety check but not needed to read 
> features. That safety check is probably there for a good reason.
> The problem is that the offset of a given feature in the coor file is 
> stored in topo and must be properly retrieved, i.e. the number of bytes 
> used to store a given feature offset must match the off_t size of the 
> current library. A topo file written with LFS support will thus only be 
> readable by old libraries or grass compiled without LFS support after 
> rebuilding topology. The other way around, a vector that was written 
> without LFS support can be opened with new LFS-enabled libraries only 
> after rebuilding topology. The off_t size used to write the vector must 
> be stored somewhere in the header info of topo (most important) and 
> coor. If this size does not match the current off_t size, topology must 
> be rebuilt.
> Letting older libraries know that topology needs to be rebuilt is the 
> easiest part...
> 
> Nonsense or going in the right direction?

I think that the code which reads these files needs functions to
read/write off_t values at the size used by the file, not the size
used by the code.

I.e. if the code is built for 64-bit off_t, it should still be able to
directly read/write files using a 32-bit off_t. Code built for 32-bit
off_t should also directly read/write files which use a 64-bit off_t,
subject to the constraint that only 31 bits are non-zero (if you have
a 32-bit off_t, attempting to open a file >=2GiB will fail, as will
attempting to enlarge a file beyond that size).
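
As a rough sketch of that idea: the helpers below encode and decode an
offset using the byte width recorded in the file rather than
sizeof(off_t). The names put_offset() and get_offset() are
hypothetical, the width is assumed to be 4 or 8 and to come from the
file header, and little-endian byte order is assumed purely for
illustration (the real format may handle byte order differently).

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper: store `value` in `width` bytes (4 or 8),
 * little-endian. Fails if the value does not fit in the file's width. */
int put_offset(unsigned char *buf, size_t width, off_t value)
{
    uint64_t v;
    size_t i;

    if (value < 0 || (width != 4 && width != 8))
        return -1;
    v = (uint64_t) value;
    if (width == 4 && v > (uint64_t) INT32_MAX)
        return -1;              /* needs more than 31 bits */
    for (i = 0; i < width; i++)
        buf[i] = (unsigned char) ((v >> (8 * i)) & 0xff);
    return 0;
}

/* Hypothetical helper: read a `width`-byte offset into the native
 * off_t. Fails if the stored value does not fit, e.g. a 64-bit offset
 * with more than 31 significant bits read by a 32-bit off_t build. */
int get_offset(const unsigned char *buf, size_t width, off_t *value)
{
    uint64_t v = 0;
    uint64_t limit;
    size_t i;

    if (width != 4 && width != 8)
        return -1;
    for (i = 0; i < width; i++)
        v |= (uint64_t) buf[i] << (8 * i);
    /* The usable range is the smaller of the file's and the build's. */
    if (width == 4 || sizeof(off_t) < 8)
        limit = (uint64_t) INT32_MAX;
    else
        limit = (uint64_t) INT64_MAX;
    if (v > limit)
        return -1;
    *value = (off_t) v;
    return 0;
}
```

With helpers along these lines, a topo file would only need
rebuilding when an offset genuinely does not fit, not merely because
the off_t sizes differ.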

-- 
Glynn Clements <glynn at gclements.plus.com>

