[GRASS-dev] vector large file support

Markus Metz markus.metz.giswork at googlemail.com
Tue Feb 10 05:54:01 EST 2009


Glynn Clements wrote:
>   
>>> Right. The files are always written big-endian, so the high word will
>>> always be first in the file.
>>>       
>> I'm not so sure about that, why is byte order stored in the topo header? 
>> Byte order for writing out is determined just before writing topo/cidx.
>>     
>
> Okay; maybe that's just the internal default. I'm looking at
> dig_init_portable(), which copies the *_cnvrt arrays for big-endian
> and reverses them for little-endian.
>   
I noticed that too, after my reply. Sometimes I can't keep up with 
reading that code.
> I don't know why that code needs to be so obtuse.
>   
That vector lib code has some potential for cleaning up...
>   
>>> As well as checking that the high word is zero, you also need to check
>>> that the low word is <= 0x7fffffff (off_t is signed, hence the limit
>>> being 2GiB not 4GiB).
>>>       
>> OK. Additionally the whole thing should not be negative, that would be 
>> an invalid offset.
>>     
>
> I'm assuming that the individual words are treated as unsigned.
>   
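
For reference, the check discussed above (high word zero, low word at most 0x7fffffff, both words treated as unsigned) might look roughly like this. It is only a sketch: offset_fits_32bit() is a made-up name, not an existing function in the vector lib, and it assumes the 64-bit value has already been split into two 32-bit words.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the GRASS vector library.
 * hi and lo are the two unsigned 32-bit words of a stored 64-bit offset.
 * Returns 1 and stores the value in *out if it fits a signed 32-bit
 * off_t, otherwise returns 0 and leaves *out untouched. */
int offset_fits_32bit(uint32_t hi, uint32_t lo, int32_t *out)
{
    if (hi != 0)
        return 0;               /* offset is 4 GiB or more */
    if (lo > 0x7fffffffU)
        return 0;               /* would be negative as a signed 32-bit value */
    *out = (int32_t)lo;
    return 1;
}
```
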
Glynn, your idea to set off_t size according to the size of the coor 
file was *absolutely brilliant*!!!
That means that offset values will be stored as 64-bit only if the coor 
file is larger than 2GB and only if the native off_t size supports that. 
It also means that the topo/cidx files can stay 100% identical to the 
current version if the coor file is < 2GB. Otherwise they change 
(larger headers), but the current libs have no chance of reading 
such a large vector anyway and exit with an error message. Nor do we 
need a plan for reading 64-bit offset values when the libs support only 
32-bit values, because 64-bit offsets indicate a large coor file 
that is unreadable with a 32-bit off_t lib anyway.
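
The size decision described above could be sketched roughly as follows. This is an illustration only, not the actual code: choose_offset_size() and MAX_32BIT_OFFSET are made-up names, and I am assuming the limit is the largest value a signed 32-bit off_t can hold.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed limit: largest value a signed 32-bit off_t can represent. */
#define MAX_32BIT_OFFSET INT64_C(0x7fffffff)

/* Hypothetical helper, not part of the GRASS vector library.
 * Returns the number of bytes to use per stored offset (4 or 8),
 * or 0 if the coor file is too large for the native off_t. */
int choose_offset_size(int64_t coor_size, size_t native_off_t_size)
{
    if (coor_size <= MAX_32BIT_OFFSET)
        return 4;               /* topo/cidx stay identical to the current format */
    if (native_off_t_size >= 8)
        return 8;               /* large coor file, larger headers */
    return 0;                   /* large coor file, 32-bit off_t: unreadable */
}
```

In real code the second argument would presumably be sizeof(off_t); it is passed in here only so that both cases can be tried on one machine.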

I have implemented dig__fread_port_O() and dig__fwrite_port_O() with 
variable off_t size and added the appropriate tests to diglib/test.c. 
The tests pass (except that test.tmp is now larger than test.ok). It 
is easy to determine the appropriate off_t size and adjust the topo/cidx 
headers accordingly.
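
To illustrate the idea of a variable offset size (this is not the code of dig__fread_port_O() or dig__fwrite_port_O(), just a minimal sketch), an offset can be packed into 4 or 8 bytes like this. The names put_offset_be() and get_offset_be() are made up, the sketch works on a memory buffer instead of a file, and it assumes big-endian output and non-negative offsets, whereas the real byte order is determined at write time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the low nbytes (4 or 8) of a non-negative
 * offset into buf, most significant byte first. */
void put_offset_be(unsigned char *buf, int64_t offset, size_t nbytes)
{
    uint64_t v = (uint64_t)offset;
    size_t i;

    for (i = 0; i < nbytes; i++)
        buf[i] = (unsigned char)(v >> (8 * (nbytes - 1 - i)));
}

/* Hypothetical helper: read an nbytes-wide (4 or 8) big-endian
 * offset back from buf. */
int64_t get_offset_be(const unsigned char *buf, size_t nbytes)
{
    uint64_t v = 0;
    size_t i;

    for (i = 0; i < nbytes; i++)
        v = (v << 8) | buf[i];
    return (int64_t)v;
}
```

The same value round-trips with either width as long as it fits, which is why small vectors can keep the 4-byte layout unchanged.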

G_fseek and G_ftell also work, thanks!

BTW, display in grass7 with cairo appears to be half the speed of 
grass65. I think it was faster last week (can't remember the revision). 
Not sure if this justifies a complaint.


