[gdal-dev] Clarification on what is not addressed by GDALFlushCache

Sean Gillies sean at mapbox.com
Wed Oct 9 16:51:47 PDT 2019


Thanks, Even. I've flushed out rasterio's usage of FlushCache.

On Thu, Oct 3, 2019 at 11:04 AM Even Rouault <even.rouault at spatialys.com>
wrote:

> On jeudi 3 octobre 2019 10:14:45 CEST Sean Gillies wrote:
> > In the comments above FlushCache() in gcore/gdaldataset.cpp it is said:
> >
> >  * Using this method does not prevent use from calling GDALClose()
> >  * to properly close a dataset and ensure that important data not
> >  * addressed by FlushCache() is written in the file.
>
> > Does it vary by
> > format and driver?
>
> Of course, wouldn't be fun otherwise. For some formats, it might result in
> a completely consistent dataset, and in others, in something that can't be
> opened at all. So what it does, beyond evicting 'dirty' blocks from the
> cache, is mostly an implementation detail.
>
> > What exactly is the important data that is not addressed?
>
> In the case of GeoTIFF, FlushCache() will for example ensure that all
> tile/strip data is flushed to disk, but the TileByteCount/TileOffset index
> arrays are not updated, and for a file that was just created, they will be
> at their zero default value, making the dataset appear to be empty to a
> reader that would try to open it at that point.
>
> If generating a large dataset, you can for example call FlushCache() at
> regular intervals to make sure that there is sufficient space on the
> storage device (but the global block cache will also flush when it is
> saturated). This might be a way of keeping memory use from reaching the
> GDAL_CACHEMAX threshold. But this can also result in suboptimal behaviour
> if you call it at an inappropriate point. For example, if you write to a
> JPEG-compressed tiled TIFF, and your write pattern is row by row, then
> flushing before you reach a row number that is a multiple of the tile
> height will flush partially written blocks (their top will contain real
> data, and the bottom zeroes). So those blocks will be later decompressed
> and recompressed, causing unnecessary quality loss.
>
> FlushCache() is automatically called by the dataset destructor, so my tip
> would be: "do not use FlushCache() unless you know you need it"
>
> Even
>
> --
> Spatialys - Geospatial professional services
> http://www.spatialys.com
>

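For the archive, a minimal sketch of the tile-alignment point quoted above:
when writing row by row, call FlushCache() only once the number of rows
written is a multiple of the block (tile) height, so no half-filled block is
flushed. The helper names (aligned_flush_rows, write_rows) and the write_row
callback are hypothetical, not part of GDAL or rasterio; the dataset argument
is assumed to be any object exposing a FlushCache() method, such as an
osgeo.gdal.Dataset.

```python
def aligned_flush_rows(total_rows, block_height, blocks_per_flush=1):
    """Return the row counts at which a flush will not split a block.

    Only exact multiples of block_height * blocks_per_flush are returned;
    a trailing partial block is left for the final close of the dataset.
    """
    if block_height <= 0 or blocks_per_flush <= 0:
        raise ValueError("block_height and blocks_per_flush must be positive")
    step = block_height * blocks_per_flush
    return list(range(step, total_rows + 1, step))


def write_rows(dataset, rows, block_height, write_row, blocks_per_flush=1):
    """Write rows one at a time, flushing only at block-aligned row counts.

    dataset   -- any object with a FlushCache() method (assumption: for
                 example an osgeo.gdal.Dataset opened for writing)
    write_row -- hypothetical caller-supplied callback, called as
                 write_row(dataset, row_index, row)
    Returns the list of row counts after which FlushCache() was called.
    """
    if block_height <= 0 or blocks_per_flush <= 0:
        raise ValueError("block_height and blocks_per_flush must be positive")
    step = block_height * blocks_per_flush
    flushed_at = []
    for index, row in enumerate(rows):
        write_row(dataset, index, row)
        written = index + 1
        # Flush only when every block touched so far is completely written.
        if written % step == 0:
            dataset.FlushCache()
            flushed_at.append(written)
    return flushed_at
```

With the GDAL Python bindings, the block height could be taken from the
second element of band.GetBlockSize(). Rows after the last aligned flush are
deliberately not flushed here; per the thread, they and the remaining
metadata are written when the dataset is properly closed (in the Python
bindings, typically by dropping the last reference to it).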

-- 
Sean Gillies