[gdal-dev] /vsicurl caching behavior
Even Rouault
even.rouault at spatialys.com
Fri Jan 5 03:21:12 PST 2018
Hi,
> 1) There are a lot of knobs that can be used to tune the thing that are not
> documented. For example CPL_VSIL_CURL_USE_CACHE. Is it on purpose?
Yes, the disk cache is an experiment that, as far as I know, isn't used anywhere, and it is
likely not in a finished state, as you noticed in your points below, which are all valid and
should be addressed if someone wanted to make it production ready.
>
> 6) If VSI_CACHE is enabled the data is cached twice in memory (papsRegions
> and VSICachedFile). Is it wanted?
The scopes of the caches are not the same. papsRegions is a global cache shared by all
/vsicurl/ handles, and it persists (in memory) after they are closed (so that if the same
filename is closed and re-opened in sequence, already read parts can be reused), whereas
VSICachedFile is associated with a single file handle.
I guess there could be some optimizations to avoid those duplications, but that could
substantially complicate the code, which is already non-trivial.
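To make the difference in scope concrete, here is a minimal toy model in Python. It is only a sketch: the names Remote, Handle, shared_regions and CHUNK are hypothetical and do not mirror the actual GDAL code. It just shows a cache shared by all handles next to a cache that lives and dies with one handle.

```python
# Toy model only: every name here is hypothetical, not GDAL's.
CHUNK = 16  # region size in bytes for this sketch


class Remote:
    """Stands in for the HTTP server and counts range requests."""

    def __init__(self, data):
        self.data = data
        self.requests = 0

    def fetch(self, offset):
        self.requests += 1
        return self.data[offset:offset + CHUNK]


# Global cache shared by all handles, keyed by (filename, offset).
# It plays the role described for papsRegions.
shared_regions = {}


class Handle:
    """One open file handle with its own private cache
    (the role described for VSICachedFile)."""

    def __init__(self, filename, remote):
        self.filename = filename
        self.remote = remote
        self.private = {}  # dies with the handle

    def read_chunk(self, offset):
        if offset in self.private:
            return self.private[offset]
        key = (self.filename, offset)
        if key not in shared_regions:
            shared_regions[key] = self.remote.fetch(offset)
        # Both caches now hold the chunk: the duplication raised in
        # point 6 (Python merely shares one bytes object here).
        self.private[offset] = shared_regions[key]
        return self.private[offset]

    def close(self):
        self.private.clear()  # the shared cache is left untouched


def demo():
    shared_regions.clear()
    remote = Remote(bytes(range(64)))
    name = "/vsicurl/http://example.com/a.tif"
    h1 = Handle(name, remote)
    h1.read_chunk(0)
    h1.read_chunk(0)  # served from the private cache
    h1.close()
    h2 = Handle(name, remote)
    h2.read_chunk(0)  # private cache is empty, shared cache hits
    return remote.requests
```

In this model, demo() performs a single remote request even though the file is opened twice, because the second handle finds the region in the shared cache.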
>
> 7) If the file's content is modified, it's the total mess. We'll end up
> having portions of the file having the old data while the rest has the new
> data. I'm quite sure the GeoTiff we end up with won't be very valid.
Indeed. But the mess would also happen with no caching mechanism if a file is modified while
being read. Even for a local file, GDAL uses the glibc FILE buffering API, so even if you modify
some portions of a GeoTIFF that haven't been read yet, if you have already read nearby regions,
there's a chance you'll read old data in part.
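The local-file case can be illustrated with Python's own buffered reader, used here as an analogy for glibc FILE buffering (it is not the code path GDAL uses). The sketch assumes CPython's usual behaviour of filling its buffer with more bytes than the first small read asks for; the file size, offsets and the function name stale_read_demo are arbitrary choices for the illustration.

```python
import os
import tempfile


def stale_read_demo():
    """Return (original bytes, bytes seen by the early reader,
    bytes seen by a fresh reader after the modification)."""
    original = bytes(range(64))
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(original)

        with open(path, "rb", buffering=4096) as reader:
            # A small read; the buffered reader pulls in a larger block.
            head = reader.read(8)

            # Another writer modifies a region not yet returned to us.
            with open(path, "r+b") as writer:
                writer.seek(16)
                writer.write(b"\xff" * 8)

            # Remaining bytes as seen through the already-filled buffer.
            rest = reader.read()

        with open(path, "rb") as fresh:
            current = fresh.read()
        return original, head + rest, current
    finally:
        os.remove(path)
```

Comparing the second and third return values shows whether the early reader saw the modification; with a file this small, the whole content normally sits in the buffer after the first read.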
>
> 8) In the case discussed in 7), CPL_VSIL_CURL_NON_CACHED will just purge
> the data from 1 of the 3 caches: papsRegions. The vsil_cache and the disk will
> still cache the content.
CPL_VSIL_CURL_NON_CACHED prevents the content of a file from being preserved in the
papsRegions cache when a file handle is closed and re-opened. And VSICachedFile is only valid
during the lifetime of the file handle. So I don't think there's an issue there. Perhaps the
name CPL_VSIL_CURL_NON_CACHED is a bit misleading: there's always some cache
(otherwise /vsicurl/ performance would be just too horrible), it is just that it doesn't survive
file closing.
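As a rough illustration of "cached while open, dropped on close", here is a small Python sketch. It is a toy under assumptions: the non_cached set, the purge happening at close time, and all class and function names are hypothetical and are not how GDAL implements or parses CPL_VSIL_CURL_NON_CACHED.

```python
# Toy model only: names and the purge-on-close timing are assumptions.
shared_regions = {}  # survives handle close (role of papsRegions)
non_cached = set()   # filenames flagged as "non cached"


class Handle:
    def __init__(self, filename, fetch):
        self.filename = filename
        self.fetch = fetch  # callable doing the (simulated) remote read

    def read(self, offset):
        key = (self.filename, offset)
        if key not in shared_regions:
            shared_regions[key] = self.fetch(offset)
        return shared_regions[key]

    def close(self):
        # For flagged files, nothing is kept once the handle is closed.
        if self.filename in non_cached:
            stale = [k for k in shared_regions if k[0] == self.filename]
            for key in stale:
                del shared_regions[key]


def count_fetches(filename):
    """Open, read the same offset twice, close; do that two times."""
    calls = []

    def fetch(offset):
        calls.append(offset)
        return b"x"

    for _ in range(2):
        h = Handle(filename, fetch)
        h.read(0)
        h.read(0)  # cached while the handle is open, flagged or not
        h.close()
    return len(calls)


def demo():
    shared_regions.clear()
    non_cached.clear()
    non_cached.add("/vsicurl/http://example.com/b.tif")
    return (count_fetches("/vsicurl/http://example.com/a.tif"),
            count_fetches("/vsicurl/http://example.com/b.tif"))
```

In this model the unflagged file is fetched once across both open/close cycles, while the flagged file is fetched once per cycle: the repeated read inside one handle is still served from cache.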
Even
--
Spatialys - Geospatial professional services
http://www.spatialys.com