[gdal-dev] Issue with GDAL VSICURL Passing Authorization Header to Redirected Pre-signed S3 URLs

Even Rouault even.rouault at spatialys.com
Sun Sep 22 05:07:38 PDT 2024


Hi Pradeep

Can you give https://github.com/OSGeo/gdal/pull/10857 a try?

Even

On 22/09/2024 at 07:37, Pradeep kumar via gdal-dev wrote:
>
> Dear GDAL Developers,
>
>
> I hope this message finds you well.
>
>
> I am experiencing an issue with GDAL’s VSICURL virtual file system 
> where the Authorization header is being passed to redirected 
> pre-signed S3 URLs, leading to errors when accessing AWS S3.
>
>
> *Background:*
>
>
> In my use case, I have a proxy URL that, when called with an 
> Authorization header, validates the token, verifies authorization, and 
> then generates a pre-signed URL to an S3 object. The proxy then 
> redirects the client to this pre-signed URL. When using GDAL with this 
> reverse proxy server and setting the configuration option --config 
> CPL_VSIL_CURL_USE_S3_REDIRECT NO, everything works correctly. However, 
> the issue is that GDAL makes too many round trips, and each time I end 
> up generating a new pre-signed URL. Ideally, I would like GDAL to send 
> the initial request and then reuse the pre-signed URLs for subsequent 
> requests.
>
>
> This desired behavior is achieved with --config 
> CPL_VSIL_CURL_USE_S3_REDIRECT YES. However, the problem now is that 
> GDAL, on subsequent requests to the pre-signed URL, is also passing 
> the Authorization header. I am using the configuration --config 
> GDAL_HTTP_HEADERS "Authorization: Bearer xxxx" to set the initial 
> header, so the Authorization header is being sent with every request.
>
>
> When the Authorization header is sent to AWS S3 along with the 
> pre-signed URL, AWS returns the following error:
>
>
> ```Only one auth mechanism allowed; only the X-Amz-Algorithm query 
> parameter, Signature query string parameter or the Authorization 
> header should be specified.```
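
To make the rule in that error message concrete, here is a minimal Python sketch that flags a request carrying both query-string authentication and an Authorization header. The function name and the set of query keys are illustrative assumptions taken from the error text; this is neither GDAL nor AWS code.

```python
from urllib.parse import urlsplit, parse_qs

# Query parameters that mark a URL as carrying its own signature.
# Names are taken from the S3 error message; the set is an assumption.
SIGNED_QUERY_KEYS = {"x-amz-algorithm", "signature"}


def has_conflicting_auth(url, headers):
    """Return True if the request has both query-string auth and an Authorization header."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    query_keys = {key.lower() for key in query}
    signed_in_query = bool(query_keys & SIGNED_QUERY_KEYS)
    has_auth_header = any(name.lower() == "authorization" for name in headers)
    return signed_in_query and has_auth_header
```

With this check, a pre-signed URL requested together with a Bearer token is reported as conflicting, while the same URL without the header is not.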
>
>
> *Observed Behavior:*
>
>
> •*First Request (301 redirect, then 200 OK):* GDAL sends a request to the proxy URL with 
> the Authorization header. The proxy validates the token and redirects 
> to the pre-signed S3 URL. GDAL follows the redirect, and since the 
> pre-signed URL is accessed without the Authorization header, AWS S3 
> responds with a *200 OK*. This behavior is as expected.
>
> •*Subsequent Requests (400 Bad Request):* GDAL reuses the pre-signed 
> URL but includes the Authorization header in the request. AWS S3, 
> seeing both the pre-signed URL’s query parameters and the 
> Authorization header, returns a *400 Bad Request* error, stating that 
> only one authentication mechanism is allowed.
>
>
> *Expected Behavior:*
>
>
> I expect that GDAL’s VSICURL should send the initial request with the 
> Authorization header to the proxy URL. Upon receiving the redirect to 
> the pre-signed URL, it should not include the Authorization header in 
> subsequent requests to AWS S3. This would allow AWS S3 to accept the 
> pre-signed URL without conflicts.
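
The expected behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming that "a different host than the original URL" is the criterion for dropping the header; it does not describe how GDAL or the pull request implements it, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def headers_for_request(original_url, target_url, headers):
    """Return the headers to send to target_url.

    The Authorization header is kept only when target_url is on the same
    host as original_url (assumed criterion, for illustration only).
    """
    original_host = urlsplit(original_url).hostname
    target_host = urlsplit(target_url).hostname
    if original_host == target_host:
        return dict(headers)
    # Different host (e.g. the pre-signed S3 URL): drop Authorization.
    return {k: v for k, v in headers.items() if k.lower() != "authorization"}
```

Called with the proxy URL as both arguments, the headers pass through unchanged; called with the pre-signed S3 URL as the target, only the Authorization entry is removed.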
>
>
> *Questions:*
>
>
> •Is there a way to configure GDAL so that it does not pass the 
> Authorization header to the redirected pre-signed URLs while retaining 
> it for the initial request?
>
> •If this feature is not currently available, would it be feasible to 
> implement such functionality?
>
> •Are there any plans to address the handling of HTTP redirect response 
> codes 301 and 307 in future GDAL releases to better support this use case?
>
>
>
> *GDAL CLI Command:*
>
> ```
> gdalinfo --debug on --config CPL_CURL_VERBOSE YES --config 
> GDAL_DISABLE_READDIR_ON_OPEN EMPTY_DIR --config 
> CPL_VSIL_CURL_USE_S3_REDIRECT YES --config 
> GDAL_HTTP_HEADERS="Authorization: Bearer xxxx" 
> /vsicurl/https://example.com/stac/collections/items/assets?path=s3://your-bucket/red.tif
> ```
>
>
> *Example GDAL Logs:*
>
>
> Below are the GDAL logs illustrating the issue (sensitive information 
> has been redacted):
>
>
> *First Request (200 OK):*
>
> ```
>
> HTTP: libcurl/8.7.1 (SecureTransport) LibreSSL/3.3.6 zlib/1.2.12 
> nghttp2/1.61.0
> HTTP: GDAL was built against curl 8.4.0, but is running against 8.7.1.
> CURL_INFO_TEXT: [HTTP/2] [1] OPENED stream for 
> https://example.com/stac/collections/items/assets?path=s3://your-bucket/red.tif
> CURL_INFO_TEXT: [HTTP/2] [1] [:method: HEAD]
> CURL_INFO_TEXT: [HTTP/2] [1] [:scheme: https]
> CURL_INFO_TEXT: [HTTP/2] [1] [:authority: example.com]
> CURL_INFO_TEXT: [HTTP/2] [1] [:path: 
> /stac/collections/items/assets/foo?path=s3://your-bucket/red.tif]
> CURL_INFO_TEXT: [HTTP/2] [1] [user-agent: GDAL/3.9.2]
> CURL_INFO_TEXT: [HTTP/2] [1] [accept: */*]
> CURL_INFO_TEXT: [HTTP/2] [1] [authorization: Bearer [REDACTED]]
> CURL_INFO_HEADER_OUT: HEAD 
> /stac/collections/items/assets/foo?path=s3://your-bucket/red.tif HTTP/2
> Host: example.com
> User-Agent: GDAL/3.9.2
> Accept: */*
> Authorization: Bearer [REDACTED]
>
> CURL_INFO_TEXT: Request completely sent off
> CURL_INFO_HEADER_IN: HTTP/2 301
> CURL_INFO_HEADER_IN: date: Sun, 22 Sep 2024 04:44:28 GMT
> CURL_INFO_HEADER_IN: content-type: text/plain; charset=utf-8
> CURL_INFO_HEADER_IN: content-length: 43
> CURL_INFO_HEADER_IN: location: 
> https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868
> CURL_INFO_HEADER_IN: x-content-length: 176995703
> CURL_INFO_HEADER_IN: apigw-requestid: [REDACTED]
> CURL_INFO_HEADER_IN:
> CURL_INFO_TEXT: Ignoring the response-body
> CURL_INFO_TEXT: Connection #0 to host example.com left intact
> CURL_INFO_TEXT: Issue another request to this URL: 
> 'https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868'
> CURL_INFO_TEXT: Couldn't find host s3-bucket-placeholder in the .netrc 
> file; using defaults
> CURL_INFO_TEXT: Host s3-bucket-placeholder:443 was resolved.
> CURL_INFO_TEXT:   Trying [REDACTED]...
> CURL_INFO_TEXT: Connected to s3-bucket-placeholder ([REDACTED]) port 443
> CURL_INFO_TEXT: SSL connection using TLSv1.3 / [REDACTED]
> CURL_INFO_TEXT: Server certificate:
> CURL_INFO_TEXT:  subject: CN=*.s3.amazonaws.com
> CURL_INFO_TEXT:  start date: Apr 22 00:00:00 2024 GMT
> CURL_INFO_TEXT:  expire date: Apr  7 23:59:59 2025 GMT
> CURL_INFO_TEXT:  issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M01
> CURL_INFO_TEXT:  SSL certificate verify ok.
> CURL_INFO_HEADER_OUT: HEAD 
> /red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868 
> HTTP/1.1
> Host: s3-bucket-placeholder
> User-Agent: GDAL/3.9.2
> Accept: */*
>
> CURL_INFO_TEXT: Request completely sent off
> CURL_INFO_HEADER_IN: HTTP/1.1 200 OK
> CURL_INFO_HEADER_IN: x-amz-id-2: [REDACTED]
> CURL_INFO_HEADER_IN: x-amz-request-id: [REDACTED]
> CURL_INFO_HEADER_IN: Date: Sun, 22 Sep 2024 04:44:29 GMT
> CURL_INFO_HEADER_IN: Last-Modified: Tue, 17 Sep 2024 17:44:40 GMT
> CURL_INFO_HEADER_IN: ETag: "[REDACTED]"
> CURL_INFO_HEADER_IN: x-amz-server-side-encryption: AES256
> CURL_INFO_HEADER_IN: Accept-Ranges: bytes
> CURL_INFO_HEADER_IN: Content-Type: image/tiff
> CURL_INFO_HEADER_IN: Server: AmazonS3
> CURL_INFO_HEADER_IN: Content-Length: 176995703
> CURL_INFO_HEADER_IN:
> CURL_INFO_TEXT: Connection #1 to host s3-bucket-placeholder left intact
> VSICURL: Effective URL: 
> https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868
> VSICURL: Will use redirect URL for the next 3599 seconds
> VSICURL: 
> GetFileSize(https://example.com/stac/collections/items/assets/foo?path=s3://your-bucket/red.tif)=176995703 
>  response_code=200
>
> ```
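
The "3599 seconds" in the log above is consistent with the `Expires` query parameter of the pre-signed URL minus the time of the S3 response. A small Python sketch of that arithmetic follows; the function name is hypothetical, and this is not a claim about how GDAL computes the value internally.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs


def seconds_until_expiry(presigned_url, now):
    """Return Expires (Unix epoch, from the query string) minus now, or None if absent."""
    values = parse_qs(urlsplit(presigned_url).query).get("Expires")
    if not values:
        return None
    return int(values[0]) - int(now)


# Values taken from the log: Expires=1726983868, S3 response dated
# Sun, 22 Sep 2024 04:44:29 GMT. The signature value is a placeholder.
url = "https://s3-bucket-placeholder/red.tif?Signature=placeholder&Expires=1726983868"
response_time = datetime(2024, 9, 22, 4, 44, 29, tzinfo=timezone.utc).timestamp()
remaining = seconds_until_expiry(url, response_time)  # 3599
```

A URL without an `Expires` parameter yields `None`, which a caller could treat as "do not cache the redirect".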
>
>
> *Subsequent Request (400 Bad Request):*
>
> ```
>
> VSICURL: Using redirect URL as it looks to be still valid (3599 
> seconds left)
> VSICURL: Downloading 0-16383 
> (https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868)...
> CURL_INFO_TEXT: Couldn't find host s3-bucket-placeholder in the .netrc 
> file; using defaults
> CURL_INFO_TEXT: Found bundle for host: 0x600001aa8270 [serially]
> CURL_INFO_TEXT: Can not multiplex, even if we wanted to
> CURL_INFO_TEXT: Re-using existing connection with host 
> s3-bucket-placeholder
> CURL_INFO_HEADER_OUT: GET 
> /red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868 
> HTTP/1.1
> Host: s3-bucket-placeholder
> User-Agent: GDAL/3.9.2
> Accept: */*
> Authorization: Bearer [REDACTED]
> Range: bytes=0-16383
>
> CURL_INFO_TEXT: Request completely sent off
> CURL_INFO_HEADER_IN: HTTP/1.1 400 Bad Request
> CURL_INFO_HEADER_IN: x-amz-request-id: [REDACTED]
> CURL_INFO_HEADER_IN: x-amz-id-2: [REDACTED]
> CURL_INFO_HEADER_IN: Content-Type: application/xml
> CURL_INFO_HEADER_IN: Transfer-Encoding: chunked
> CURL_INFO_HEADER_IN: Date: Sun, 22 Sep 2024 04:44:28 GMT
> CURL_INFO_HEADER_IN: Server: AmazonS3
> CURL_INFO_HEADER_IN: Connection: close
> CURL_INFO_HEADER_IN:
> CURL_INFO_TEXT: Closing connection
> VSICURL: Got response_code=400
> ERROR 4: 
> `/vsicurl/https://example.com/stac/collections/items/assets/foo?path=s3://your-bucket/red.tif' 
> not recognized as being in a supported file format.
> gdalinfo failed - unable to open 
> '/vsicurl/https://example.com/stac/collections/items/assets/foo?path=s3://your-bucket/red.tif'.
>
> ```
>
> *Additional Information:*
>
>
> If you require any further details or clarification, please let me 
> know. I would be happy to provide more information. Additionally, if 
> necessary, I can create an issue on the GDAL GitHub repository to 
> track this problem.
>
>
> Thank you very much for your time and assistance. I appreciate any 
> guidance or suggestions you can provide to help resolve this issue.
>
>
> Best Regards,
> Pradeep Gulla
>

-- 
http://www.spatialys.com
My software is free, but my time generally not.