[gdal-dev] Problem accessing NASA Cloud Optimized GeoTIFF data

Aaron Friesz amfriesz at gmail.com
Fri Oct 1 07:34:43 PDT 2021


Hey Even,

Sorry, yeah, the cookies warning goes away if I add the full path. The rest
of the log stays the same though.
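
For anyone following along, here is a minimal sketch of what "adding the full path" could look like when the gdalinfo call is assembled from Python. It assumes the cookie file lives at ~/cookies.txt as in the commands quoted below; `build_gdalinfo_command` is a hypothetical helper written for illustration, not part of GDAL or rasterio. The point is only that "~" is expanded before curl ever sees the path.

```python
import os

def build_gdalinfo_command(url, cookie_file="~/cookies.txt", use_head=False):
    """Return a gdalinfo argument list with the cookie path fully expanded.

    Hypothetical helper: the config option names come from this thread,
    the function itself is only an illustration.
    """
    # Expand "~" ourselves so curl receives an absolute path on every platform.
    cookie_path = os.path.abspath(os.path.expanduser(cookie_file))
    config = {
        "GDAL_HTTP_COOKIEFILE": cookie_path,
        "GDAL_HTTP_COOKIEJAR": cookie_path,
        "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
        "CPL_CURL_VERBOSE": "ON",
    }
    if not use_head:
        # Skip the HEAD request, as suggested in the quoted reply.
        config["CPL_VSIL_CURL_USE_HEAD"] = "FALSE"

    args = ["gdalinfo", "/vsicurl/" + url]
    for key, value in config.items():
        args += ["--config", key, value]
    return args

hls_url = (
    "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/"
    "HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/"
    "HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif"
)
command = build_gdalinfo_command(hls_url)
print(" ".join(command))
```

The resulting list can be handed to `subprocess.run(command)` on a machine where gdalinfo is installed and a .netrc with Earthdata login credentials is in place.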

-Aaron

On Fri, Oct 1, 2021 at 6:09 AM Even Rouault <even.rouault at spatialys.com>
wrote:

> Aaron,
>
> I see a suspicious "* WARNING: failed to save cookies in ~/cookies.txt:
> Failed writing received data to disk/application" that comes from curl. I'm
> not sure if it is due to the ~/ not being expanded or something else. This
> might depend on your curl version.
>
> Using --config CPL_VSIL_CURL_USE_HEAD FALSE is also needed for me with
> GDAL master on Ubuntu 20.04 / curl 7.68.0 to make your request work. The
> behavior of the HEAD request with CloudFront is quite strange (it returns a
> 206 code).
>
> If you use the very latest curl 7.79.0, you need to use GDAL master (or
> top of release/3.3 branch) to get https://github.com/OSGeo/gdal/pull/4502
> , but while testing your use case, I realized this fix caused an issue with
> the CPL_VSIL_CURL_USE_HEAD=FALSE case, which is going to be fixed per
> https://github.com/OSGeo/gdal/pull/4578.
>
> Even
> On 30/09/2021 at 23:20, Aaron Friesz wrote:
>
> Hi,
>
> I was recently updating the HLS tutorial to access version 2.0 and
> discovered that the tutorial is no longer able to access the HLS data in
> the cloud from a local Python environment. I've been trying to troubleshoot
> for a couple days now with no positive results. The failure can be boiled
> down to just accessing an HLS COG file using rasterio in Python (note, a
> .netrc file with Earthdata login credentials is needed to access data).
>
> ```
> import rasterio as rio
> import os
>
> s30_url = 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T13TEK.2021261T175011.v2.0/HLS.S30.T13TEK.2021261T175011.v2.0.B04.tif'
>
> rio_env = rio.Env(GDAL_DISABLE_READDIR_ON_OPEN='EMPTY_DIR',
>                   #CPL_VSIL_CURL_ALLOWED_EXTENSIONS='tif',
>                   #CPL_VSIL_CURL_USE_HEAD='FALSE',
>                   CPL_DEBUG='ON',
>                   CPL_CURL_VERBOSE='ON',
>                   GDAL_HTTP_UNSAFESSL='YES',
>                   GDAL_HTTP_COOKIEFILE=os.path.expanduser('~/.edl_cookies'),
>                   GDAL_HTTP_COOKIEJAR=os.path.expanduser('~/.edl_cookies'))
>
> rio_env.__enter__()
>
> ds = rio.open(s30_url)
>
> ```
> In a local Python environment (via conda), rio.open() fails with a
> '206 status' error. If I add CPL_VSIL_CURL_USE_HEAD='FALSE' to the rio
> environment, I get a 'file format not supported' error instead, which seems
> to be a red herring. I'm curious if you've run into this type of thing
> before with COGs in the cloud.
>
> The same behavior can be observed when just using the gdal command line
> utilities.
> ```
> #without --config CPL_VSIL_CURL_USE_HEAD FALSE
> > gdalinfo /vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif --config GDAL_HTTP_COOKIEFILE ~/cookies.txt --config GDAL_HTTP_COOKIEJAR ~/cookies.txt --config GDAL_DISABLE_READDIR_ON_OPEN EMPTY_DIR --config CPL_CURL_VERBOSE ON
>
> ...
> #output
> * Mark bundle as not supporting multiuse
> < HTTP/1.1 206 Partial Content
> < Content-Type: image/tiff
> < Content-Length: 1
> < Connection: keep-alive
> < Date: Thu, 30 Sep 2021 20:52:14 GMT
> < Last-Modified: Tue, 20 Jul 2021 20:27:33 GMT
> < ETag: "85d0b4b089083135d26188a51e187788-1"
> < x-amz-server-side-encryption: AES256
> < Accept-Ranges: bytes
> < Content-Range: bytes 0-0/14178318
> < Server: AmazonS3
> < X-Edge-Origin-Shield-Skipped: 0
> < X-Cache: Miss from cloudfront
> < Via: 1.1 130ce7c752c5865952ded89032560b33.cloudfront.net (CloudFront)
> < X-Amz-Cf-Pop: MIA3-C3
> < X-Amz-Cf-Id: qEq_2x3pF25dN5YMclkgRwhyTozXDexPdaQwOyJaI2VvIKhNnajozQ==
> <
> * Connection #1 to host d1nklfio7vscoe.cloudfront.net left intact
> ERROR 11: HTTP response code: 206
> gdalinfo failed - unable to open '/vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif'.
> ```
>
> ```
> #with --config CPL_VSIL_CURL_USE_HEAD FALSE
> > gdalinfo /vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif --config GDAL_HTTP_COOKIEFILE ~/cookies.txt --config GDAL_HTTP_COOKIEJAR ~/cookies.txt --config GDAL_DISABLE_READDIR_ON_OPEN EMPTY_DIR --config CPL_VSIL_CURL_USE_HEAD FALSE --config CPL_CURL_VERBOSE ON
>
> ...
> #output
> * Mark bundle as not supporting multiuse
> < HTTP/1.1 200 OK
> < Content-Type: image/tiff
> < Content-Length: 14178318
> < Connection: keep-alive
> < Date: Thu, 30 Sep 2021 21:12:52 GMT
> < Last-Modified: Tue, 20 Jul 2021 20:27:33 GMT
> < ETag: "85d0b4b089083135d26188a51e187788-1"
> < x-amz-server-side-encryption: AES256
> < Accept-Ranges: bytes
> < Server: AmazonS3
> < X-Edge-Origin-Shield-Skipped: 0
> < X-Cache: Miss from cloudfront
> < Via: 1.1 8a771ca27e5a3c9e06b12b7af5d25aa4.cloudfront.net (CloudFront)
> < X-Amz-Cf-Pop: MIA3-C3
> < X-Amz-Cf-Id: Xzv3KWP6glYk68odo5d4KAPnzxsNUFjHRHe90EFXOKNGeYL3ea7jiA==
> * Failed writing header
> * Closing connection 2
> * schannel: shutting down SSL/TLS connection with d1nklfio7vscoe.cloudfront.net port 443
> * WARNING: failed to save cookies in ~/cookies.txt: Failed writing received data to disk/application
> ERROR 4: `/vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif' not recognized as a supported file format.
> gdalinfo failed - unable to open '/vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif'.
> ```
>
> Strangely, I'm able to use the original rio/gdal configuration options if
> I'm running the code in a cloud environment, that is:
>
> > gdalinfo /vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T10TEK.2021192T184511.v2.0/HLS.L30.T10TEK.2021192T184511.v2.0.B04.tif --config GDAL_HTTP_COOKIEFILE ~/cookies.txt --config GDAL_HTTP_COOKIEJAR ~/cookies.txt --config GDAL_DISABLE_READDIR_ON_OPEN EMPTY_DIR --config CPL_CURL_VERBOSE ON
>
> ...works fine in the USGS Pangeo and Openscapes 2i2c instance in
> us-west2.  I get the failures on both Windows and macOS, but my colleagues
> seem to be able to run the code just fine from Linux within a Docker
> container.
>
> I've tried the gdalinfo commands using GDAL versions 3.1.4 through
> 3.2.2. Oddly, v3.2.0 succeeds in printing out the gdalinfo content, but
> I've been unable to create a conda environment with that version of gdal.
>
> I'm talking to our NASA cloud team as well to see if there was anything on
> the backend that changed recently just to cover that angle. I'm hoping,
> though, that I'm missing something obvious in the local configurations.
>
> Any guidance would be appreciated.
>
> -Aaron
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>
> -- http://www.spatialys.com
> My software is free, but my time generally not.
>
>