[GRASS-dev] vector large file support
Markus Metz
markus.metz.giswork at googlemail.com
Tue Feb 10 05:54:01 EST 2009
Glynn Clements wrote:
>
>>> Right. The files are always written big-endian, so the high word will
>>> always be first in the file.
>>>
>> I'm not so sure about that, why is byte order stored in the topo header?
>> Byte order for writing out is determined just before writing topo/cidx.
>>
>
> Okay; maybe that's just the internal default. I'm looking at
> dig_init_portable(), which copies the *_cnvrt arrays for big-endian
> and reverses them for little-endian.
>
I noticed that too, after my reply. Sometimes I can't keep up with
reading that code.
> I don't know why that code needs to be so obtuse.
>
That vector lib code has some potential for cleaning up...
>
>>> As well as checking that the high word is zero, you also need to check
>>> that the low word is <= 0x7fffffff (off_t is signed, hence the limit
>>> being 2GiB not 4GiB).
>>>
>> OK. Additionally the whole thing should not be negative, that would be
>> an invalid offset.
>>
>
> I'm assuming that the individual words are treated as unsigned.
>
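The range check discussed above can be sketched as follows. This is only an illustration: offset_from_words() is a hypothetical name, not a function in the vector library, and it assumes the 64-bit value has already been split into two unsigned 32-bit words with the byte order already converted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the GRASS vector library.
 * Accepts a stored 64-bit offset, given as two unsigned 32-bit words,
 * only if it fits into a signed 32-bit off_t.
 * Returns 0 and stores the offset in *out on success, -1 otherwise. */
int offset_from_words(uint32_t high, uint32_t low, int32_t *out)
{
    assert(out != NULL);

    /* High word must be zero, and the low word must not exceed
     * 0x7fffffff because off_t is signed (limit 2 GiB, not 4 GiB). */
    if (high != 0 || low > 0x7fffffffu)
        return -1;

    *out = (int32_t)low;
    return 0;
}
```

Because both words are treated as unsigned, a negative result cannot occur once the two conditions hold.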
Glynn, your idea to set off_t size according to the size of the coor
file was *absolutely brilliant*!!!
That means that offset values will be stored as 64bit only if the coor
file is larger than 2GB and only if the native off_t size supports that.
That also means that the topo/cidx files can stay 100% identical to the
current version if the coor file is < 2GB. Otherwise they get changed
(larger headers), but then the current libs don't have a chance to read
this large vector anyway, exiting with an error message. That also means
we don't have to make a plan on how to read 64bit offset values when the
libs support only 32bit values, because this indicates a large coor file
that is unreadable with a 32bit-off_t lib anyway.
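A minimal sketch of that decision, assuming the only inputs are the coor file size and the native off_t size in bytes. choose_port_off_t_size() is a hypothetical name used for illustration, not the actual library code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the GRASS vector library.
 * Returns the number of bytes used to store each offset in topo/cidx:
 *   4  if the coor file fits within signed 32-bit offsets,
 *   8  if it does not and the native off_t is 64-bit,
 *  -1  if it does not and the native off_t is only 32-bit
 *      (the vector cannot be read by this library anyway). */
int choose_port_off_t_size(int64_t coor_size, int native_off_t_size)
{
    assert(native_off_t_size == 4 || native_off_t_size == 8);

    if (coor_size > INT32_MAX) {
        if (native_off_t_size < 8)
            return -1;
        return 8;
    }
    return 4;
}
```

With this rule, topo/cidx files for coor files below the 2 GB limit keep 4-byte offsets and so stay unchanged.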
I have implemented dig__fread_port_O() and dig__fwrite_port_O() with
variable off_t size, added the appropriate tests to diglib/test.c and
the tests pass (although test.tmp is now larger than test.ok). It
is easy to determine the appropriate off_t size and adjust the topo/cidx
headers accordingly.
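To illustrate what a variable off_t size means for the portable format, here is a rough sketch that packs an offset into 4 or 8 bytes, most significant byte first. put_offset_be() and get_offset_be() are hypothetical names working on a byte buffer; they are not the signatures of dig__fwrite_port_O() or dig__fread_port_O(), and the big-endian order is assumed only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not the GRASS API. */

/* Store a non-negative offset in `size` bytes (4 or 8),
 * most significant byte first. */
void put_offset_be(unsigned char *buf, int64_t value, size_t size)
{
    size_t i;

    assert(size == 4 || size == 8);
    assert(value >= 0);
    assert(size == 8 || value <= INT32_MAX);

    for (i = 0; i < size; i++)
        buf[i] = (unsigned char)((uint64_t)value >> (8 * (size - 1 - i)));
}

/* Read back an offset stored by put_offset_be(). */
int64_t get_offset_be(const unsigned char *buf, size_t size)
{
    uint64_t v = 0;
    size_t i;

    assert(size == 4 || size == 8);

    for (i = 0; i < size; i++)
        v = (v << 8) | buf[i];

    return (int64_t)v;
}
```

The size argument would come from the topo/cidx header, so a reader knows how many bytes each stored offset occupies.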
G_fseek and G_ftell also work, thanks!
BTW, display of grass7 with cairo appears to be half the speed of
grass65, I think it was faster last week (can't remember the revision).
Not sure if this justifies a complaint.