[GRASS-dev] vector large file support

Markus Metz markus.metz.giswork at googlemail.com
Thu Feb 5 08:42:52 EST 2009



Glynn Clements wrote:
> Markus Metz wrote:
>
>   
>> I think I understand. So according to the grass wiki the steps to enable 
>> large file support would be
>>
>> 1) add
>> ifneq ($(USE_LARGEFILES),)
>> EXTRA_CFLAGS = -D_FILE_OFFSET_BITS=64
>> endif
>>
>> to all relevant Makefiles
>>     
>
> Yes. Although this should probably come after 2 and 3 ;) First, make
> the code safe for LFS, *then* enable it.
>   
No chronological order intended, I'm relying on your help :-)
>   
>> 2) use off_t where appropriate, and take care with type casting. file 
>> offset is used in various different places in the vector library, a bit 
>> of work to get off_t usage right.
>>     
>
> Using off_t isn't a problem; it's when you generate an off_t from
> smaller types that care needs to be taken.
>   
Yes, that's what I meant. The vector library uses file offsets a lot, 
all of type long so far. It is doable to update the vector library to 
use off_t with proper casting, as you pointed out previously, but file 
offsets are an integral part of grass vector topology and must be 
handled with care; I'm aware of that.
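
To illustrate the casting concern, here is a minimal sketch. The helper 
name record_offset and its parameters are hypothetical and not part of 
the vector library; the point is only that widening must happen before 
the arithmetic, not after it.

```c
#include <assert.h>
#include <sys/types.h>          /* off_t */

/* Hypothetical helper: byte offset of record number `index` when every
 * record occupies `record_size` bytes. */
off_t record_offset(int index, int record_size)
{
    assert(index >= 0 && record_size >= 0);

    /* Widen BEFORE multiplying. Writing (off_t) (index * record_size)
     * would do the multiplication in int and could overflow before the
     * cast is applied. This only helps when off_t is wider than int,
     * e.g. when compiled with -D_FILE_OFFSET_BITS=64. */
    return (off_t) index * (off_t) record_size;
}
```

With a 64-bit off_t, record_offset(100000, 100000) yields 10000000000, 
a value that does not fit in a 32-bit int.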
>   
>> 3) solve the fseek/fseeko and ftell/ftello problem. Get inspiration from 
>> libgis and LFS-safe modules? Or as suggested in the grass wiki on LFS, add
>> extern off_t G_ftell(FILE *fp);
>> extern int G_fseek(FILE *stream, off_t offset, int whence);
>> for global use?
>>     
>
> That would be my preference.
>   
How about

extern off_t G_ftell(FILE *fp)
{
#ifdef HAVE_LARGEFILES
    return ftello(fp);
#else
    return ftell(fp);
#endif
}

That is the current solution in the iostream library for fseek/fseeko. I 
know it's not that easy according to "The issues" in the grass wiki on 
LFS support. Apparently, configuring with --enable-largefile does not 
guarantee that fseeko and ftello are available, although most modern 
systems should have them. ftello is currently not used in devbr_6 and 
trunk.
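
A possible shape for the wrapper pair, as a sketch only. It assumes a 
separate configure-time macro, here called HAVE_FSEEKO (an assumption, 
not something the current build system is known to define), so that 
enabling large files and having fseeko/ftello are tested independently. 
The fallback refuses offsets that do not fit into the long that fseek() 
takes.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>          /* off_t */

/* Sketch of the proposed wrappers. HAVE_FSEEKO is an assumed macro that
 * a configure check would define when fseeko()/ftello() exist. */
off_t G_ftell(FILE *fp)
{
    assert(fp != NULL);
#ifdef HAVE_FSEEKO
    return ftello(fp);
#else
    return (off_t) ftell(fp);   /* limited to the range of long */
#endif
}

int G_fseek(FILE *fp, off_t offset, int whence)
{
    assert(fp != NULL);
#ifdef HAVE_FSEEKO
    return fseeko(fp, offset, whence);
#else
    /* fseek() takes a long: fail instead of silently truncating. */
    if ((off_t) (long) offset != offset) {
        errno = EOVERFLOW;
        return -1;
    }
    return fseek(fp, (long) offset, whence);
#endif
}
```

Callers would then use G_fseek() and G_ftell() everywhere and never 
touch fseek/fseeko or ftell/ftello directly.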
>   
>> 4) figure out if coor file size really needs to be stored in coor and 
>> topo. coor file size doesn't say a lot about the number of features 
>> because coor can contain a high proportion of dead lines (a problem in 
>> itself, vector TODO). If it does not need to be stored in coor and topo, 
>> how does removing coor file size info affect reading and writing of coor 
>> and topo? Are there hard-coded offsets for reading these files?
>>     
>
> No idea here.
>   
I think coor file size is stored as safety check but not needed to read 
features. That safety check is probably there for a good reason.
The problem is that the offset of a given feature in the coor file is 
stored in topo and must be properly retrieved, i.e. the number of bytes 
used to store a given feature offset must match the off_t size of the 
current library. A topo file written with LFS support will thus only be 
readable by old libraries or grass compiled without LFS support after 
rebuilding topology. The other way around, a vector that was written 
without LFS support can be opened with new LFS-enabled libraries only 
after rebuilding topology. The off_t size used to write the vector must 
be stored somewhere in the header info of topo (most important) and 
coor. If this size does not match the current off_t size, topology must 
be rebuilt.
Letting older libraries know that topology needs to be rebuilt is the 
easiest part...
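
A rough sketch of the size check, under the assumption that a single 
byte holding sizeof(off_t) is added to the header. The function names 
and the one-byte layout are hypothetical and do not describe the 
existing topo or coor format.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>          /* off_t */

/* Hypothetical: store the off_t size used by the writing library as one
 * byte at the current file position. Returns 0 on success, -1 on error. */
int write_off_t_size(FILE *fp)
{
    unsigned char size = (unsigned char) sizeof(off_t);

    assert(fp != NULL);
    return fwrite(&size, 1, 1, fp) == 1 ? 0 : -1;
}

/* Hypothetical: read the stored size back and compare it with the
 * off_t size of the running library.
 * Returns 1 if the sizes match, 0 if topology must be rebuilt,
 * -1 if the byte could not be read. */
int off_t_size_matches(FILE *fp)
{
    unsigned char size;

    assert(fp != NULL);
    if (fread(&size, 1, 1, fp) != 1)
        return -1;
    return size == sizeof(off_t) ? 1 : 0;
}
```

A library opening a vector would call off_t_size_matches() on the 
header first and request a topology rebuild when it returns 0.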

Nonsense or going in the right direction?

I'm afraid it will be a long way to get LFS into the vector libraries.


