[gdal-dev] Issue with GDAL VSICURL Passing Authorization Header to Redirected Pre-signed S3 URLs
Even Rouault
even.rouault at spatialys.com
Sun Sep 22 05:07:38 PDT 2024
Hi Pradeep
Could you give https://github.com/OSGeo/gdal/pull/10857 a try?
Even
On 22/09/2024 at 07:37, Pradeep kumar via gdal-dev wrote:
>
> Dear GDAL Developers,
>
>
> I hope this message finds you well.
>
>
> I am experiencing an issue with GDAL’s VSICURL virtual file system
> where the Authorization header is being passed to redirected
> pre-signed S3 URLs, leading to errors when accessing AWS S3.
>
>
> *Background:*
>
>
> In my use case, I have a proxy URL that, when called with an
> Authorization header, validates the token, verifies authorization, and
> then generates a pre-signed URL to an S3 object. The proxy then
> redirects the client to this pre-signed URL. When using GDAL with this
> reverse proxy server and setting the configuration option --config
> CPL_VSIL_CURL_USE_S3_REDIRECT NO, everything works correctly. However,
> the issue is that GDAL makes too many round trips, and each time I end
> up generating a new pre-signed URL. Ideally, I would like GDAL to send
> the initial request and then reuse the pre-signed URLs for subsequent
> requests.
>
>
> This desired behavior is achieved with --config
> CPL_VSIL_CURL_USE_S3_REDIRECT YES. However, the problem now is that
> GDAL, on subsequent requests to the pre-signed URL, is also passing
> the Authorization header. I am using the configuration --config
> GDAL_HTTP_HEADERS "Authorization: Bearer xxxx" to set the initial
> header, so the Authorization header is being sent with every request.
>
>
> When the Authorization header is sent to AWS S3 along with the
> pre-signed URL, AWS returns the following error:
>
>
> ```Only one auth mechanism allowed; only the X-Amz-Algorithm query
> parameter, Signature query string parameter or the Authorization
> header should be specified.```
>
>
> *Observed Behavior:*
>
>
> •*First Request (301 redirect, then 200 OK):* GDAL sends a request to the proxy URL with
> the Authorization header. The proxy validates the token and redirects
> to the pre-signed S3 URL. GDAL follows the redirect, and since the
> pre-signed URL is accessed without the Authorization header, AWS S3
> responds with a *200 OK*. This behavior is as expected.
>
> •*Subsequent Requests (400 Bad Request):* GDAL reuses the pre-signed
> URL but includes the Authorization header in the request. AWS S3,
> seeing both the pre-signed URL’s query parameters and the
> Authorization header, returns a *400 Bad Request* error, stating that
> only one authentication mechanism is allowed.
>
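The conflict described in the second bullet can be checked mechanically before a request is sent. The following is a minimal Python sketch, not GDAL code: the helper name `has_conflicting_s3_auth` is hypothetical, and the query parameters treated as pre-signed markers (`X-Amz-Algorithm`, `Signature`) are taken only from the S3 error message quoted above.

```python
from urllib.parse import urlsplit, parse_qs

# Query parameters that the S3 error message names as an auth mechanism.
PRESIGNED_MARKERS = {"x-amz-algorithm", "signature"}


def has_conflicting_s3_auth(url, headers):
    """Return True if a request would carry both a pre-signed query string
    and an Authorization header (hypothetical helper, not a GDAL API)."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    query_keys = {key.lower() for key in query}
    presigned = bool(query_keys & PRESIGNED_MARKERS)
    has_auth_header = any(name.lower() == "authorization" for name in headers)
    return presigned and has_auth_header
```

With the redacted URL from the logs and the `Authorization: Bearer xxxx` header, this returns `True`, which corresponds to the request that S3 rejects with 400.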
>
> *Expected Behavior:*
>
>
> I expect that GDAL’s VSICURL should send the initial request with the
> Authorization header to the proxy URL. Upon receiving the redirect to
> the pre-signed URL, it should not include the Authorization header in
> subsequent requests to AWS S3. This would allow AWS S3 to accept the
> pre-signed URL without conflicts.
>
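The expected behavior amounts to a header policy applied whenever a redirect target is reused. Below is a minimal Python sketch of one such policy, assuming the rule "drop Authorization when the redirect leaves the original host". The function name `headers_for_redirect` is hypothetical, and this is not a description of how GDAL or the pull request mentioned at the top of this message implements it.

```python
from urllib.parse import urlsplit


def headers_for_redirect(original_url, redirect_url, headers):
    """Return the headers to send to redirect_url, dropping Authorization
    when the redirect points at a different host (hypothetical policy)."""
    same_host = urlsplit(original_url).hostname == urlsplit(redirect_url).hostname
    if same_host:
        return dict(headers)
    # Cross-host redirect: keep everything except the Authorization header.
    return {k: v for k, v in headers.items() if k.lower() != "authorization"}
```

Under this policy a request to example.com keeps the Bearer token, while a request to the S3 host goes out with only the remaining headers, which matches what the first request's log shows when the redirect is followed directly.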
>
> *Questions:*
>
>
> •Is there a way to configure GDAL so that it does not pass the
> Authorization header to the redirected pre-signed URLs while retaining
> it for the initial request?
>
> •If this feature is not currently available, would it be feasible to
> implement such functionality?
>
> •Are there any plans to address the handling of HTTP redirect response
> codes 301 and 307 in future GDAL releases to better support this use case?
>
>
>
> *GDAL CLI Command:*
>
> ```
> gdalinfo --debug on --config CPL_CURL_VERBOSE YES --config
> GDAL_DISABLE_READDIR_ON_OPEN EMPTY_DIR --config
> CPL_VSIL_CURL_USE_S3_REDIRECT YES --config
> GDAL_HTTP_HEADERS="Authorization: Bearer xxxx"
> /vsicurl/https://example.com/stac/collections/items/assets?path=s3://your-bucket/red.tif
> ```
>
>
> *Example GDAL Logs:*
>
>
> Below are the GDAL logs illustrating the issue (sensitive information
> has been redacted):
>
>
> *First Request (200 OK):*
>
> ```
> HTTP: libcurl/8.7.1 (SecureTransport) LibreSSL/3.3.6 zlib/1.2.12 nghttp2/1.61.0
> HTTP: GDAL was built against curl 8.4.0, but is running against 8.7.1.
> CURL_INFO_TEXT: [HTTP/2] [1] OPENED stream for https://example.com/stac/collections/items/assets?path=s3://your-bucket/red.tif
> CURL_INFO_TEXT: [HTTP/2] [1] [:method: HEAD]
> CURL_INFO_TEXT: [HTTP/2] [1] [:scheme: https]
> CURL_INFO_TEXT: [HTTP/2] [1] [:authority: example.com]
> CURL_INFO_TEXT: [HTTP/2] [1] [:path: /stac/collections/items/assets/foo?path=s3://your-bucket/red.tif]
> CURL_INFO_TEXT: [HTTP/2] [1] [user-agent: GDAL/3.9.2]
> CURL_INFO_TEXT: [HTTP/2] [1] [accept: */*]
> CURL_INFO_TEXT: [HTTP/2] [1] [authorization: Bearer [REDACTED]]
> CURL_INFO_HEADER_OUT: HEAD /stac/collections/items/assets/foo?path=s3://your-bucket/red.tif HTTP/2
> Host: example.com
> User-Agent: GDAL/3.9.2
> Accept: */*
> Authorization: Bearer [REDACTED]
>
> CURL_INFO_TEXT: Request completely sent off
> CURL_INFO_HEADER_IN: HTTP/2 301
> CURL_INFO_HEADER_IN: date: Sun, 22 Sep 2024 04:44:28 GMT
> CURL_INFO_HEADER_IN: content-type: text/plain; charset=utf-8
> CURL_INFO_HEADER_IN: content-length: 43
> CURL_INFO_HEADER_IN: location: https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868
> CURL_INFO_HEADER_IN: x-content-length: 176995703
> CURL_INFO_HEADER_IN: apigw-requestid: [REDACTED]
> CURL_INFO_HEADER_IN:
> CURL_INFO_TEXT: Ignoring the response-body
> CURL_INFO_TEXT: Connection #0 to host example.com left intact
> CURL_INFO_TEXT: Issue another request to this URL: 'https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868'
> CURL_INFO_TEXT: Couldn't find host s3-bucket-placeholder in the .netrc file; using defaults
> CURL_INFO_TEXT: Host s3-bucket-placeholder:443 was resolved.
> CURL_INFO_TEXT: Trying [REDACTED]...
> CURL_INFO_TEXT: Connected to s3-bucket-placeholder ([REDACTED]) port 443
> CURL_INFO_TEXT: SSL connection using TLSv1.3 / [REDACTED]
> CURL_INFO_TEXT: Server certificate:
> CURL_INFO_TEXT: subject: CN=*.s3.amazonaws.com
> CURL_INFO_TEXT: start date: Apr 22 00:00:00 2024 GMT
> CURL_INFO_TEXT: expire date: Apr 7 23:59:59 2025 GMT
> CURL_INFO_TEXT: issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M01
> CURL_INFO_TEXT: SSL certificate verify ok.
> CURL_INFO_HEADER_OUT: HEAD /red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868 HTTP/1.1
> Host: s3-bucket-placeholder
> User-Agent: GDAL/3.9.2
> Accept: */*
>
> CURL_INFO_TEXT: Request completely sent off
> CURL_INFO_HEADER_IN: HTTP/1.1 200 OK
> CURL_INFO_HEADER_IN: x-amz-id-2: [REDACTED]
> CURL_INFO_HEADER_IN: x-amz-request-id: [REDACTED]
> CURL_INFO_HEADER_IN: Date: Sun, 22 Sep 2024 04:44:29 GMT
> CURL_INFO_HEADER_IN: Last-Modified: Tue, 17 Sep 2024 17:44:40 GMT
> CURL_INFO_HEADER_IN: ETag: "[REDACTED]"
> CURL_INFO_HEADER_IN: x-amz-server-side-encryption: AES256
> CURL_INFO_HEADER_IN: Accept-Ranges: bytes
> CURL_INFO_HEADER_IN: Content-Type: image/tiff
> CURL_INFO_HEADER_IN: Server: AmazonS3
> CURL_INFO_HEADER_IN: Content-Length: 176995703
> CURL_INFO_HEADER_IN:
> CURL_INFO_TEXT: Connection #1 to host s3-bucket-placeholder left intact
> VSICURL: Effective URL: https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868
> VSICURL: Will use redirect URL for the next 3599 seconds
> VSICURL: GetFileSize(https://example.com/stac/collections/items/assets/foo?path=s3://your-bucket/red.tif)=176995703 response_code=200
> ```
>
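The "3599 seconds" in the last lines of the log above is consistent with the `Expires` query parameter of the pre-signed URL, read as a Unix timestamp and compared with the `Date` header of the S3 response. A minimal Python sketch of that arithmetic follows. The function name `seconds_until_expiry` is hypothetical and GDAL may derive the value differently.

```python
import calendar
from urllib.parse import urlsplit, parse_qs


def seconds_until_expiry(presigned_url, now_epoch):
    """Seconds left before the URL's Expires timestamp (Unix epoch),
    or None if the URL carries no Expires parameter."""
    values = parse_qs(urlsplit(presigned_url).query).get("Expires")
    if not values:
        return None
    return int(values[0]) - int(now_epoch)


# Date header of the S3 response in the log: Sun, 22 Sep 2024 04:44:29 GMT
now = calendar.timegm((2024, 9, 22, 4, 44, 29, 0, 0, 0))
url = "https://s3-bucket-placeholder/red.tif?Signature=SIG&Expires=1726983868"
print(seconds_until_expiry(url, now))  # 3599
```

In other words, the proxy in this trace issued a URL valid for one hour, and one second had elapsed by the time S3 answered the HEAD request.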
>
> *Subsequent Request (400 Bad Request):*
>
> ```
> VSICURL: Using redirect URL as it looks to be still valid (3599 seconds left)
> VSICURL: Downloading 0-16383 (https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868)...
> CURL_INFO_TEXT: Couldn't find host s3-bucket-placeholder in the .netrc file; using defaults
> CURL_INFO_TEXT: Found bundle for host: 0x600001aa8270 [serially]
> CURL_INFO_TEXT: Can not multiplex, even if we wanted to
> CURL_INFO_TEXT: Re-using existing connection with host s3-bucket-placeholder
> CURL_INFO_HEADER_OUT: GET /red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868 HTTP/1.1
> Host: s3-bucket-placeholder
> User-Agent: GDAL/3.9.2
> Accept: */*
> Authorization: Bearer [REDACTED]
> Range: bytes=0-16383
>
> CURL_INFO_TEXT: Request completely sent off
> CURL_INFO_HEADER_IN: HTTP/1.1 400 Bad Request
> CURL_INFO_HEADER_IN: x-amz-request-id: [REDACTED]
> CURL_INFO_HEADER_IN: x-amz-id-2: [REDACTED]
> CURL_INFO_HEADER_IN: Content-Type: application/xml
> CURL_INFO_HEADER_IN: Transfer-Encoding: chunked
> CURL_INFO_HEADER_IN: Date: Sun, 22 Sep 2024 04:44:28 GMT
> CURL_INFO_HEADER_IN: Server: AmazonS3
> CURL_INFO_HEADER_IN: Connection: close
> CURL_INFO_HEADER_IN:
> CURL_INFO_TEXT: Closing connection
> VSICURL: Got response_code=400
> ERROR 4: `/vsicurl/https://example.com/stac/collections/items/assets/foo?path=s3://your-bucket/red.tif' not recognized as being in a supported file format.
> gdalinfo failed - unable to open '/vsicurl/https://example.com/stac/collections/items/assets/foo?path=s3://your-bucket/red.tif'.
> ```
>
> *Additional Information:*
>
>
> If you require any further details or clarification, please let me
> know. I would be happy to provide more information. Additionally, if
> necessary, I can create an issue on the GDAL GitHub repository to
> track this problem.
>
>
> Thank you very much for your time and assistance. I appreciate any
> guidance or suggestions you can provide to help resolve this issue.
>
>
> Best Regards,
> Pradeep Gulla
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
--
http://www.spatialys.com
My software is free, but my time generally not.