[GRASS-dev] vector large file support
Markus Metz
markus.metz.giswork at googlemail.com
Thu Feb 5 08:42:52 EST 2009
Glynn Clements wrote:
> Markus Metz wrote:
>
>
>> I think I understand. So according to the grass wiki the steps to enable
>> large file support would be
>>
>> 1) add
>> ifneq ($(USE_LARGEFILES),)
>> EXTRA_CFLAGS = -D_FILE_OFFSET_BITS=64
>> endif
>>
>> to all relevant Makefiles
>>
>
> Yes. Although this should probably come after 2 and 3 ;) First, make
> the code safe for LFS, *then* enable it.
>
No chronological order intended, I'm relying on your help :-)
>
>> 2) use off_t where appropriate, and take care with type casting. file
>> offset is used in various different places in the vector library, a bit
>> of work to get off_t usage right.
>>
>
> Using off_t isn't a problem; it's when you generate an off_t from
> smaller types that care needs to be taken.
>
Yes, that's what I meant, because the vector library uses file offsets a
lot, all of type long so far. Updating the vector library to use off_t
with proper casting, as you pointed out previously, is doable, but file
offsets are an integral part of GRASS vector topology and must be
handled with care; I'm aware of that.
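
To illustrate the casting issue, here is a minimal sketch, not taken from
the vector library. The helper record_offset() and its parameters are
hypothetical; the point is only that the widening to off_t has to happen
before the arithmetic, not after it.

```c
#define _FILE_OFFSET_BITS 64    /* normally set via EXTRA_CFLAGS, see step 1 */
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: byte offset of record n when every record
 * occupies rec_size bytes. */
static off_t record_offset(int n, int rec_size)
{
    /* Widen one operand first so the multiplication is done in off_t.
     * Writing (off_t)(n * rec_size) would multiply in int and could
     * overflow before the cast is applied. */
    return (off_t)n * rec_size;
}
```

With a 64-bit off_t, record_offset(100000, 100000) then yields an offset
beyond 2 GB instead of an overflowed int.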
>
>> 3) solve the fseek/fseeko and ftell/ftello problem. Get inspiration from
>> libgis and LFS-safe modules? Or as suggested in the grass wiki on LFS, add
>> extern off_t G_ftell(FILE *fp);
>> extern int G_fseek(FILE *stream, off_t offset, int whence);
>> for global use?
>>
>
> That would be my preference.
>
How about
off_t G_ftell(FILE *fp)
{
    /* use ftello() when built with large file support */
#ifdef HAVE_LARGEFILES
    return ftello(fp);
#else
    return (off_t) ftell(fp);
#endif
}
That is the current solution in the iostream library for fseek/fseeko. I
know it's not that easy according to the grass wiki on LFS support, "The
issues". Apparently, configuring with --enable-largefile does not
guarantee that fseeko and ftello are available, although most modern
systems should have them. ftello is currently not used in devbr_6 and trunk.
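
A possible sketch of both wrappers that takes this into account. It
assumes a separate HAVE_FSEEKO macro set by configure when fseeko/ftello
were actually found (that macro name is an assumption here, it is not in
the current GRASS headers as far as I know). Without it, the fallback
refuses offsets that do not fit into a long instead of truncating them.

```c
#define _FILE_OFFSET_BITS 64
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <sys/types.h>

/* HAVE_FSEEKO is assumed to be defined by configure only when
 * fseeko() and ftello() are really available. */
int G_fseek(FILE *fp, off_t offset, int whence)
{
#ifdef HAVE_FSEEKO
    return fseeko(fp, offset, whence);
#else
    /* fseek() takes a long: fail rather than silently truncate */
    if (offset > LONG_MAX || offset < LONG_MIN) {
        errno = EOVERFLOW;
        return -1;
    }
    return fseek(fp, (long)offset, whence);
#endif
}

off_t G_ftell(FILE *fp)
{
#ifdef HAVE_FSEEKO
    return ftello(fp);
#else
    return (off_t)ftell(fp);
#endif
}
```

Callers would then use G_fseek()/G_ftell() everywhere and never call
fseek()/ftell() directly.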
>
>> 4) figure out if coor file size really needs to be stored in coor and
>> topo. coor file size doesn't say a lot about the number of features
>> because coor can contain a high proportion of dead lines (a problem in
>> itself, vector TODO). If it does not need to be stored in coor and topo,
>> how does removing coor file size info affect reading and writing of coor
>> and topo? Are there hard-coded offsets for reading these files?
>>
>
> No idea here.
>
I think coor file size is stored as a safety check but is not needed to
read features. That safety check is probably there for a good reason.
The problem is that the offset of a given feature in the coor file is
stored in topo and must be properly retrieved, i.e. the number of bytes
used to store a given feature offset must match the off_t size of the
current library. A topo file written with LFS support will thus be
readable by old libraries, or by GRASS compiled without LFS support, only
after rebuilding topology. The other way around, a vector that was written
without LFS support can be opened with new LFS-enabled libraries only
after rebuilding topology. The off_t size used to write the vector must
be stored somewhere in the header info of topo (most important) and
coor. If this size does not match the current off_t size, topology must
be rebuilt.
Letting older libraries know that topology needs to be rebuilt is the
easiest part...
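
A rough sketch of the size check, only to show the idea. The one-byte
field and both function names are hypothetical; this is not the actual
topo or coor header layout.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical: store sizeof(off_t) of the writing library as one byte. */
int write_off_t_size(FILE *fp)
{
    unsigned char n = (unsigned char)sizeof(off_t);

    return fwrite(&n, 1, 1, fp) == 1 ? 0 : -1;
}

/* Hypothetical: returns 1 if the stored offsets can be read as they are,
 * 0 if topology must be rebuilt, -1 on read error. */
int topo_offsets_usable(FILE *fp)
{
    unsigned char n;

    if (fread(&n, 1, 1, fp) != 1)
        return -1;

    return n == sizeof(off_t) ? 1 : 0;
}
```

A return value of 0 would be the point where the library asks for
topology to be rebuilt.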
Nonsense or going in the right direction?
I'm afraid it will be a long way to get LFS into the vector libraries.