[gdal-dev] Problem accessing NASA Cloud Optimized GeoTIFF data
Even Rouault
even.rouault at spatialys.com
Fri Oct 1 03:09:04 PDT 2021
Aaron,
I see a suspicious "* WARNING: failed to save cookies in ~/cookies.txt:
Failed writing received data to disk/application" that comes from curl.
I'm not sure if it is due to the ~/ not being expanded or something
else. This might depend on your curl version.
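If the unexpanded ~/ does turn out to be the cause (this is not confirmed), one way to rule it out is to resolve the home directory before the path reaches GDAL. The following is a minimal sketch under that assumption; cookie_config and as_cli_args are hypothetical helper names, not part of GDAL or curl.

```python
import os


def cookie_config(cookie_name="cookies.txt"):
    """Return GDAL cookie options that point at an absolute path."""
    # expanduser resolves "~" in Python, so curl never sees a literal "~/".
    path = os.path.join(os.path.expanduser("~"), cookie_name)
    return {"GDAL_HTTP_COOKIEFILE": path, "GDAL_HTTP_COOKIEJAR": path}


def as_cli_args(config):
    """Flatten the options into '--config KEY VALUE' arguments for gdalinfo."""
    args = []
    for key, value in config.items():
        args += ["--config", key, value]
    return args


if __name__ == "__main__":
    # Prints the arguments to append to a gdalinfo command line.
    print(" ".join(as_cli_args(cookie_config())))
```

The printed arguments can be pasted after the gdalinfo command in place of the ~/cookies.txt variants.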
Using --config CPL_VSIL_CURL_USE_HEAD FALSE is also needed for me with
GDAL master on Ubuntu 20.04 / curl 7.68.0 to make your request work. The
behavior of the HEAD request with CloudFront is quite strange (returning
a 206 code).
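For the rasterio case, the same setting can be passed through rasterio.Env. This is only a sketch, assuming the options from the quoted message with CPL_VSIL_CURL_USE_HEAD enabled; gdal_http_options and read_profile are hypothetical helper names, and a .netrc file with Earthdata Login credentials is still required.

```python
import os


def gdal_http_options(cookie_path):
    """GDAL options from the quoted message, with HEAD requests switched off."""
    return {
        "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
        "CPL_VSIL_CURL_USE_HEAD": "FALSE",
        "GDAL_HTTP_COOKIEFILE": cookie_path,
        "GDAL_HTTP_COOKIEJAR": cookie_path,
    }


def read_profile(url, options):
    """Open the remote file inside a rasterio environment and return its profile."""
    # Imported here so gdal_http_options() can be used without rasterio installed.
    import rasterio

    with rasterio.Env(**options):
        with rasterio.open(url) as ds:
            return ds.profile


if __name__ == "__main__":
    cookies = os.path.expanduser("~/.edl_cookies")
    print(gdal_http_options(cookies))
```

Calling read_profile(s30_url, gdal_http_options(cookies)) performs the actual network request, so it is left out of the main block above.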
If you use the very latest curl 7.79.0, you need to use GDAL master (or
the top of the release/3.3 branch) to get the fix from
https://github.com/OSGeo/gdal/pull/4502. While testing your use case,
though, I realized this fix caused an issue with the
CPL_VSIL_CURL_USE_HEAD=FALSE case, which is going to be fixed per
https://github.com/OSGeo/gdal/pull/4578.
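A quick way to tell whether a given setup falls into the curl 7.79.0 case is to compare version numbers. The sketch below is illustrative only: needs_pr_4502 is a hypothetical helper, and it assumes that every curl release from 7.79.0 onward is affected, whereas only 7.79.0 itself is named above.

```python
def parse_version(text):
    """Turn a dotted version string such as '7.79.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def needs_pr_4502(curl_version):
    """Return True if this curl version needs a GDAL build that includes PR 4502."""
    # Assumption: 7.79.0 and anything newer behaves the same way.
    return parse_version(curl_version) >= (7, 79, 0)


if __name__ == "__main__":
    for version in ("7.68.0", "7.79.0"):
        print(version, needs_pr_4502(version))
```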
Even
On 30/09/2021 at 23:20, Aaron Friesz wrote:
> Hi,
>
> I was recently updating the HLS tutorial to access version 2.0 and
> discovered that the tutorial is no longer able to access the HLS data
> in the cloud from a local Python environment. I've been trying to
> troubleshoot for a couple days now with no positive results. The
> failure can be boiled down to just accessing an HLS COG file using
> rasterio in Python (note, a .netrc file with Earthdata login
> credentials is needed to access data).
>
> ```
> import rasterio as rio
> import os
>
> s30_url = 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T13TEK.2021261T175011.v2.0/HLS.S30.T13TEK.2021261T175011.v2.0.B04.tif'
>
> rio_env = rio.Env(GDAL_DISABLE_READDIR_ON_OPEN='EMPTY_DIR',
> #CPL_VSIL_CURL_ALLOWED_EXTENSIONS='tif',
> #CPL_VSIL_CURL_USE_HEAD='FALSE',
> CPL_DEBUG='ON',
> CPL_CURL_VERBOSE='ON',
> GDAL_HTTP_UNSAFESSL='YES',
> GDAL_HTTP_COOKIEFILE=os.path.expanduser('~/.edl_cookies'),
> GDAL_HTTP_COOKIEJAR=os.path.expanduser('~/.edl_cookies'))
>
> rio_env.__enter__()
>
> ds = rio.open(s30_url)
>
> ```
> While running a local python env (via conda) the result of the
> rio.open() is a '206 status', which essentially fails the command. If
> I add CPL_VSIL_CURL_USE_HEAD='FALSE' to the rio environment, I get a
> 'file format not supported' error, which seems to be a red herring.
> I'm curious if you've run into this type of thing before with COGs in
> the cloud.
>
> The same behavior can be observed when just using the gdal command
> line utilities.
> ```
> #without --config CPL_VSIL_CURL_USE_HEAD FALSE
> > gdalinfo
> /vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif
> --config GDAL_HTTP_COOKIEFILE ~/cookies.txt --config
> GDAL_HTTP_COOKIEJAR ~/cookies.txt --config
> GDAL_DISABLE_READDIR_ON_OPEN EMPTY_DIR --config CPL_CURL_VERBOSE ON
>
> ...
> #output
> * Mark bundle as not supporting multiuse
> < HTTP/1.1 206 Partial Content
> < Content-Type: image/tiff
> < Content-Length: 1
> < Connection: keep-alive
> < Date: Thu, 30 Sep 2021 20:52:14 GMT
> < Last-Modified: Tue, 20 Jul 2021 20:27:33 GMT
> < ETag: "85d0b4b089083135d26188a51e187788-1"
> < x-amz-server-side-encryption: AES256
> < Accept-Ranges: bytes
> < Content-Range: bytes 0-0/14178318
> < Server: AmazonS3
> < X-Edge-Origin-Shield-Skipped: 0
> < X-Cache: Miss from cloudfront
> < Via: 1.1 130ce7c752c5865952ded89032560b33.cloudfront.net (CloudFront)
> < X-Amz-Cf-Pop: MIA3-C3
> < X-Amz-Cf-Id: qEq_2x3pF25dN5YMclkgRwhyTozXDexPdaQwOyJaI2VvIKhNnajozQ==
> <
> * Connection #1 to host d1nklfio7vscoe.cloudfront.net left intact
> ERROR 11: HTTP response code: 206
> gdalinfo failed - unable to open
> '/vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif'.
> ```
>
> ```
> #with --config CPL_VSIL_CURL_USE_HEAD FALSE
> > gdalinfo
> /vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif
> --config GDAL_HTTP_COOKIEFILE ~/cookies.txt --config
> GDAL_HTTP_COOKIEJAR ~/cookies.txt --config
> GDAL_DISABLE_READDIR_ON_OPEN EMPTY_DIR --config
> CPL_VSIL_CURL_USE_HEAD FALSE --config CPL_CURL_VERBOSE ON
>
> ...
> #output
> * Mark bundle as not supporting multiuse
> < HTTP/1.1 200 OK
> < Content-Type: image/tiff
> < Content-Length: 14178318
> < Connection: keep-alive
> < Date: Thu, 30 Sep 2021 21:12:52 GMT
> < Last-Modified: Tue, 20 Jul 2021 20:27:33 GMT
> < ETag: "85d0b4b089083135d26188a51e187788-1"
> < x-amz-server-side-encryption: AES256
> < Accept-Ranges: bytes
> < Server: AmazonS3
> < X-Edge-Origin-Shield-Skipped: 0
> < X-Cache: Miss from cloudfront
> < Via: 1.1 8a771ca27e5a3c9e06b12b7af5d25aa4.cloudfront.net (CloudFront)
> < X-Amz-Cf-Pop: MIA3-C3
> < X-Amz-Cf-Id: Xzv3KWP6glYk68odo5d4KAPnzxsNUFjHRHe90EFXOKNGeYL3ea7jiA==
> * Failed writing header
> * Closing connection 2
> * schannel: shutting down SSL/TLS connection with
> d1nklfio7vscoe.cloudfront.net port 443
> * WARNING: failed to save cookies in ~/cookies.txt: Failed writing
> received data to disk/application
> ERROR 4:
> `/vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif'
> not recognized as a supported file format.
> gdalinfo failed - unable to open
> '/vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif'.
> ```
>
> Strangely, I'm able to use the original rio/gdal configuration options
> if I'm running the code in a cloud environment, that is:
>
> > gdalinfo
> /vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif
> --config GDAL_HTTP_COOKIEFILE ~/cookies.txt --config
> GDAL_HTTP_COOKIEJAR ~/cookies.txt --config
> GDAL_DISABLE_READDIR_ON_OPEN EMPTY_DIR --config CPL_CURL_VERBOSE ON
>
> ...works fine in the USGS Pangeo and Openscapes 2i2c instance in
> us-west-2. I get the failures on both Windows and macOS, but my
> colleagues seem to be able to run the code just fine from Linux within
> a Docker container.
>
> I've tried the gdalinfo commands using the gdal versions 3.1.4 through
> 3.2.2. Oddly, v3.2.0 succeeds in printing out the gdalinfo content,
> but I've been unable to create a conda environment with that version
> of gdal.
>
> I'm talking to our NASA cloud team as well to see if there was
> anything on the backend that changed recently just to cover that
> angle. I'm hoping, though, that I'm missing something obvious in the
> local configurations.
>
> Any guidance would be appreciated.
>
> -Aaron
--
http://www.spatialys.com
My software is free, but my time generally not.