[Gdal-dev] Tif file getting larger and larger

Vincent Schut schut at sarvision.nl
Thu Aug 14 08:20:38 EDT 2008


Fodder wrote:
> 
> 
> Ah, dear, so if I want to frequently change values in a TIF file, I basically
> can't use compression as the file will quickly get larger than the
> uncompressed version. Is this a fair comment?

Right. The combination of tiled compression (where a line counts as a 
tile too) and TIFF's internal file layout simply does not allow for easy 
rewriting of data: a tile's compressed size changes when its values 
change, and the new data may no longer fit in the space that is reserved 
for that tile in the TIFF file. When that happens, the tile is written 
to newly reserved space at the end of the file and the old space is left 
unused, so the file grows. One could rewrite the TIFF driver/library to 
change the layout of the file instead of simply adding new reserved 
space for that tile, but that would usually mean that a large part of 
the file has to be rewritten, which would be a big performance hit (and 
a big rewrite of the code).
A workaround could be to regularly rewrite the file using 
gdal_translate, which will re-layout all tiles and discard the unused 
'old' space. A simple 'gdal_translate -co <compression options> 
file.tiff file_new.tiff' should do the trick.
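
To make the growth mechanism concrete, here is a minimal sketch in 
Python. It is a toy model only, not libtiff's or GDAL's actual code: 
the class ToyTiledFile and its methods are made up for illustration, 
and zlib stands in for the tile compression. A tile is overwritten in 
place when the new compressed data fits its reserved slot and appended 
at the end otherwise; repacked() plays the role of the gdal_translate 
rewrite.

```python
import os
import zlib


class ToyTiledFile:
    """Toy append-on-overflow tile store (hypothetical, not libtiff)."""

    def __init__(self):
        self.data = bytearray()  # stands in for the bytes of the file
        self.slots = {}          # tile index -> (offset, reserved, used)

    def write_tile(self, index, raw):
        packed = zlib.compress(raw)
        slot = self.slots.get(index)
        if slot is not None and len(packed) <= slot[1]:
            # New data fits the reserved slot: overwrite in place.
            offset, reserved, _ = slot
            self.data[offset:offset + len(packed)] = packed
            self.slots[index] = (offset, reserved, len(packed))
        else:
            # First write, or the data no longer fits: append at the end.
            # Any previous slot for this tile becomes dead space.
            offset = len(self.data)
            self.data += packed
            self.slots[index] = (offset, len(packed), len(packed))

    def read_tile(self, index):
        offset, _, used = self.slots[index]
        return zlib.decompress(bytes(self.data[offset:offset + used]))

    def repacked(self):
        """Copy only the live tiles into a fresh store (a full rewrite)."""
        fresh = ToyTiledFile()
        for index in sorted(self.slots):
            fresh.write_tile(index, self.read_tile(index))
        return fresh


store = ToyTiledFile()
for i in range(4):
    store.write_tile(i, bytes(4096))       # constant tiles compress well
size_before = len(store.data)

for i in range(4):
    store.write_tile(i, os.urandom(4096))  # noisy tiles no longer fit
size_after = len(store.data)

compact = store.repacked()
print(size_before, size_after, len(compact.data))
```

In this model the store only ever grows until it is repacked, which is 
the behaviour described above; how much a real file grows depends on 
the data and the compression method.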

That being said, my opinion on compressing: disk space is cheap 
nowadays, and though CPU cycles are too, IMHO compression is often not 
worth the effort, especially on files that you often need to read or 
write. Compression can be a large performance hit. In particular, don't 
compress files that you need random write access to. Of course there 
are exceptions to this, and you could very well be one... But generally, 
I think this holds.

In case someone is interested, personally I use these optimization rules:
- random write access: don't compress; choose strips or tiles, and pixel 
or band interleaving, according to the read/write algorithm;
- mainly intended for reading/viewing: don't compress, square tiles 
(256x256), add pyramids;
- intended for clients that use commercial GIS/RS software: pixel 
interleaved, strip oriented, no compression, no strange data types;
- space is really really really an issue, I am sure I don't need any 
random write access to it, and I am the only one who wants to read/view 
it: band interleaved, usually tiled, deflate compression. Sometimes add 
(compressed) pyramids.
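
For what it's worth, the rules above could be written down roughly as 
follows. This is only a sketch: the profile names and the two helper 
functions are made up for illustration, and the GTiff creation options 
(COMPRESS, TILED, BLOCKXSIZE, BLOCKYSIZE, INTERLEAVE) and the 
COMPRESS_OVERVIEW config option are assumed to behave as documented in 
current GDAL releases. The functions only build the command lines; they 
do not run anything.

```python
# Hypothetical profile names for the four rules above, mapped to
# GTiff creation options.
PROFILES = {
    # Add TILED/INTERLEAVE via `extra` to match the access pattern.
    "random_write": ["COMPRESS=NONE"],
    "viewing": ["COMPRESS=NONE", "TILED=YES",
                "BLOCKXSIZE=256", "BLOCKYSIZE=256"],
    "exchange": ["COMPRESS=NONE", "TILED=NO", "INTERLEAVE=PIXEL"],
    "archive": ["COMPRESS=DEFLATE", "TILED=YES", "INTERLEAVE=BAND"],
}


def translate_command(profile, src, dst, extra=()):
    """Build a gdal_translate argument list for one of the profiles."""
    args = ["gdal_translate", "-of", "GTiff"]
    for option in list(PROFILES[profile]) + list(extra):
        args += ["-co", option]
    return args + [src, dst]


def overview_command(path, levels=(2, 4, 8, 16), compress=None):
    """Build a gdaladdo argument list that adds pyramids (overviews)."""
    args = ["gdaladdo"]
    if compress:
        # Assumed config option for compressing the overviews.
        args += ["--config", "COMPRESS_OVERVIEW", compress]
    return args + ["-r", "average", path] + [str(level) for level in levels]


print(" ".join(translate_command("archive", "file.tiff", "file_new.tiff")))
print(" ".join(overview_command("file_new.tiff", compress="DEFLATE")))
```

Either list can be handed to subprocess.run(cmd, check=True) once the 
options have been checked against the GDAL version in use.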

Cheers,
Vincent.
> 
> Also, I've noticed that after 5-8 re-writes of the whole dataset (RasterIO
> the whole grid from a buffer), the dataset seems to get corrupted anyway.
> 
> 
> 
> Maciej Sieczka wrote:
>>
>> Related?: http://trac.osgeo.org/gdal/ticket/1688 (some more info in a 
>> duplicate http://trac.osgeo.org/gdal/ticket/1689).
>>
>> Maciek
>>
>> -- 
>> Maciej Sieczka
>> www.sieczka.org
> 


