[gdal-dev] Cannot open S3 files after upload
Even Rouault
even.rouault at spatialys.com
Wed Jun 21 02:02:25 PDT 2017
Matt,
> My actual problem is a bit more specific than being unable to open S3 files
> after upload. The actual problem is that within the same Python session, I
> can open a file off S3 with the vsis3 driver, but then if I upload a new
> file that previously did not exist (using boto3), gdal does not see it as a
> valid file.
Yes, I'm aware of that issue. /vsicurl/ and related file systems such as /vsis3/ cache both
metadata (file size and date, directory listings) and data (chunks of files). /vsicurl/ was
designed at a time when web resources didn't change much, so changes were unlikely within a
single GDAL session. With cloud offerings, this is no longer the case.
A few weeks ago I added to trunk a CPL_VSIL_CURL_NON_CACHED config option that can
be set to disable caching for a file or set of files.
See https://trac.osgeo.org/gdal/wiki/ConfigOptions#CPL_VSIL_CURL_NON_CACHED
So in your example, if you set
CPL_VSIL_CURL_NON_CACHED=/vsis3/put_here_the_bucket_name , that will work.
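From Python, the same option can presumably be set with gdal.SetConfigOption() instead of an
environment variable. A minimal sketch, assuming a GDAL build that already includes this
option; the helper names and the bucket name are hypothetical, and the gdal module is passed
in as an argument so the snippet does not depend on a particular installation:

```python
def non_cached_prefix(bucket):
    """Build the /vsis3/ prefix under which caching should be disabled."""
    return "/vsis3/" + bucket


def disable_caching_for_bucket(gdal_module, bucket):
    """Set CPL_VSIL_CURL_NON_CACHED for one bucket (hypothetical helper).

    gdal_module is expected to be the module obtained from
    `from osgeo import gdal`.
    """
    prefix = non_cached_prefix(bucket)
    gdal_module.SetConfigOption("CPL_VSIL_CURL_NON_CACHED", prefix)
    return prefix


# Typical use (requires the GDAL Python bindings):
#   from osgeo import gdal
#   disable_caching_for_bucket(gdal, "put_here_the_bucket_name")
```

Set the option before the first open of a file under that prefix, so that nothing is cached
for it in the first place.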
I've also just added, per https://trac.osgeo.org/gdal/ticket/6937, a new
VSICurlClearCache() function (bound to SWIG as gdal.VSICurlClearCache()). So if you call
gdal.VSICurlClearCache() just after the s3.meta.client.upload_file() call, that will work too.
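A rough sketch of that sequence, assuming a GDAL version that has VSICurlClearCache() and a
boto3 S3 client such as s3.meta.client; the helper name and its arguments are hypothetical,
and both the gdal module and the client are passed in rather than imported here:

```python
def upload_then_open(gdal_module, s3_client, local_path, bucket, key):
    """Upload a file to S3, drop GDAL's curl caches, then open it via /vsis3/.

    gdal_module: the module from `from osgeo import gdal`.
    s3_client:   a boto3 S3 client, e.g. s3.meta.client for a boto3 resource.
    """
    # 1. Upload the new object (boto3 signature: Filename, Bucket, Key).
    s3_client.upload_file(local_path, bucket, key)
    # 2. Clear the cached metadata and data, so that a stale directory
    #    listing does not hide the object that was just uploaded.
    gdal_module.VSICurlClearCache()
    # 3. Open the new object through the /vsis3/ file system.
    return gdal_module.Open("/vsis3/{}/{}".format(bucket, key))


# Typical use (requires boto3 and the GDAL Python bindings):
#   import boto3
#   from osgeo import gdal
#   s3 = boto3.resource("s3")
#   ds = upload_then_open(gdal, s3.meta.client, "new.tif", "my-bucket", "new.tif")
```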
Both mechanisms are complementary.
CPL_VSIL_CURL_NON_CACHED is useful in scenarios where you don't know when the server
content can change (some other process or machine changes it behind your back). Its
advantage is that it doesn't require any code modification (it was typically designed for the
MapServer use case). Its drawback is that you lose all caching when the same file is opened,
closed, opened, closed, ... several times during the process.
VSICurlClearCache() gives you more control if you are the one deciding when uploads happen.
I've also backported VSICurlClearCache() to the 2.2 branch.
As far as VSI_CACHE=TRUE is concerned, its caching is scoped to a single VSI file
handle instance. It can be useful if the global 16 MB vsicurl cache isn't big enough for very
large files.
Even
--
Spatialys - Geospatial professional services
http://www.spatialys.com