[gdal-dev] Issue with GDAL VSICURL Passing Authorization Header to Redirected Pre-signed S3 URLs

Pradeep kumar parthivpradeep at gmail.com
Sun Sep 22 17:18:24 PDT 2024


Hi Even,


Thank you very much for quickly looking into this issue and providing a
fix. I appreciate your prompt response and assistance.


I’ve tested with `GDAL 3.10.0dev-58cc577359, released 2024/09/22 (debug
build)`, and the following command now works without any issues:


```

gdalinfo --debug on \
         --config CPL_CURL_VERBOSE YES \
         --config GDAL_DISABLE_READDIR_ON_OPEN EMPTY_DIR \
         --config CPL_VSIL_CURL_USE_HEAD NO \
         --config GDAL_HTTP_HEADERS="Authorization: Bearer xxxx" \
         "/vsicurl/https://example.com/stac/collections/items/assets?path=s3://your-bucket/red.tif"

```

This has resolved the problem I was encountering with the Authorization
header being passed to the redirected pre-signed S3 URLs.
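
For anyone following the thread, the behaviour I was after can be sketched in a few lines of Python. This is only an illustration of the idea, not the logic of the actual GDAL change (that is in the pull request linked in the quoted message below). The function name `headers_for_request` and the rule "drop Authorization whenever the host differs from the originally requested one" are assumptions made for this sketch.

```python
from urllib.parse import urlsplit


def headers_for_request(original_url, target_url, headers):
    """Return the headers to send to target_url.

    Assumption for this sketch: credentials configured for the original
    host are dropped whenever the request goes to a different host,
    such as a pre-signed S3 URL reached through a redirect.
    """
    original_host = urlsplit(original_url).netloc.lower()
    target_host = urlsplit(target_url).netloc.lower()
    if original_host == target_host:
        return dict(headers)
    # Different host: keep everything except the Authorization header.
    return {k: v for k, v in headers.items() if k.lower() != "authorization"}


proxy = "https://example.com/stac/collections/items/assets?path=s3://your-bucket/red.tif"
presigned = "https://s3-bucket-placeholder/red.tif?Signature=REDACTED&Expires=1726983868"
configured = {"Authorization": "Bearer xxxx", "Accept": "*/*"}

print(headers_for_request(proxy, proxy, configured))
# {'Authorization': 'Bearer xxxx', 'Accept': '*/*'}
print(headers_for_request(proxy, presigned, configured))
# {'Accept': '*/*'}
```

With a rule like this, the proxy still receives the bearer token, while the pre-signed URL is requested with its query-string signature as the only authentication mechanism.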


Thank you again for your help and for the excellent support. Your work is
greatly appreciated.


Best regards,


Pradeep Gulla




On Sun, Sep 22, 2024 at 5:07 AM Even Rouault <even.rouault at spatialys.com>
wrote:

> Hi Pradeep
>
> Can you give a try to https://github.com/OSGeo/gdal/pull/10857 ?
>
> Even
> On 22/09/2024 at 07:37, Pradeep kumar via gdal-dev wrote:
>
> Dear GDAL Developers,
>
>
> I hope this message finds you well.
>
>
> I am experiencing an issue with GDAL’s VSICURL virtual file system where
> the Authorization header is being passed to redirected pre-signed S3
> URLs, leading to errors when accessing AWS S3.
>
>
> *Background:*
>
>
> In my use case, I have a proxy URL that, when called with an Authorization
> header, validates the token, verifies authorization, and then generates a
> pre-signed URL to an S3 object. The proxy then redirects the client to this
> pre-signed URL. When using GDAL with this reverse proxy server and setting
> the configuration option --config CPL_VSIL_CURL_USE_S3_REDIRECT NO,
> everything works correctly. However, the issue is that GDAL makes too many
> round trips, and each time I end up generating a new pre-signed URL.
> Ideally, I would like GDAL to send the initial request and then reuse the
> pre-signed URLs for subsequent requests.
>
>
> This desired behavior is achieved with --config
> CPL_VSIL_CURL_USE_S3_REDIRECT YES. However, the problem now is that GDAL,
> on subsequent requests to the pre-signed URL, is also passing the
> Authorization header. I am using the configuration --config
> GDAL_HTTP_HEADERS "Authorization: Bearer xxxx" to set the initial header,
> so the Authorization header is being sent with every request.
>
>
> When the Authorization header is sent to AWS S3 along with the pre-signed
> URL, AWS returns the following error:
>
>
> ```
> Only one auth mechanism allowed; only the X-Amz-Algorithm query parameter, Signature query string parameter or the Authorization header should be specified.
> ```
>
>
> *Observed Behavior:*
>
>
> • *First Request (301 redirect, then 200 OK):* GDAL sends a request to the
> proxy URL with the Authorization header. The proxy validates the token and
> redirects to the pre-signed S3 URL. GDAL follows the redirect, and since the
> pre-signed URL is accessed without the Authorization header, AWS S3 responds
> with *200 OK*. This behavior is as expected.
>
> • *Subsequent Requests (400 Bad Request):* GDAL reuses the pre-signed URL
> but includes the Authorization header in the request. AWS S3, seeing both
> the pre-signed URL’s query parameters and the Authorization header,
> returns a *400 Bad Request* error, stating that only one authentication
> mechanism is allowed.
>
>
> *Expected Behavior:*
>
>
> I expect that GDAL’s VSICURL should send the initial request with the
> Authorization header to the proxy URL. Upon receiving the redirect to the
> pre-signed URL, it should not include the Authorization header in
> subsequent requests to AWS S3. This would allow AWS S3 to accept the
> pre-signed URL without conflicts.
>
>
> *Questions:*
>
>
> • Is there a way to configure GDAL so that it does not pass the
> Authorization header to the redirected pre-signed URLs while retaining it
> for the initial request?
>
> • If this feature is not currently available, would it be feasible to
> implement such functionality?
>
> • Are there any plans to address the handling of HTTP redirect response
> codes 301 and 307 in future GDAL releases to better support this use case?
>
>
>
> *GDAL CLI Command:*
>
>
> ```
> gdalinfo --debug on \
>          --config CPL_CURL_VERBOSE YES \
>          --config GDAL_DISABLE_READDIR_ON_OPEN EMPTY_DIR \
>          --config CPL_VSIL_CURL_USE_S3_REDIRECT YES \
>          --config GDAL_HTTP_HEADERS="Authorization: Bearer xxxx" \
>          "/vsicurl/https://example.com/stac/collections/items/assets?path=s3://your-bucket/red.tif"
> ```
>
>
> *Example GDAL Logs:*
>
>
> Below are the GDAL logs illustrating the issue (sensitive information has
> been redacted):
>
>
> *First Request (200 OK):*
>
> ```
>
> HTTP: libcurl/8.7.1 (SecureTransport) LibreSSL/3.3.6 zlib/1.2.12 nghttp2/1.61.0
> HTTP: GDAL was built against curl 8.4.0, but is running against 8.7.1.
> CURL_INFO_TEXT: [HTTP/2] [1] OPENED stream for https://example.com/stac/collections/items/assets?path=s3://your-bucket/red.tif
> CURL_INFO_TEXT: [HTTP/2] [1] [:method: HEAD]
> CURL_INFO_TEXT: [HTTP/2] [1] [:scheme: https]
> CURL_INFO_TEXT: [HTTP/2] [1] [:authority: example.com]
> CURL_INFO_TEXT: [HTTP/2] [1] [:path: /stac/collections/items/assets/foo?path=s3://your-bucket/red.tif]
> CURL_INFO_TEXT: [HTTP/2] [1] [user-agent: GDAL/3.9.2]
> CURL_INFO_TEXT: [HTTP/2] [1] [accept: */*]
> CURL_INFO_TEXT: [HTTP/2] [1] [authorization: Bearer [REDACTED]]
> CURL_INFO_HEADER_OUT: HEAD /stac/collections/items/assets/foo?path=s3://your-bucket/red.tif HTTP/2
> Host: example.com
> User-Agent: GDAL/3.9.2
> Accept: */*
> Authorization: Bearer [REDACTED]
>
> CURL_INFO_TEXT: Request completely sent off
> CURL_INFO_HEADER_IN: HTTP/2 301
> CURL_INFO_HEADER_IN: date: Sun, 22 Sep 2024 04:44:28 GMT
> CURL_INFO_HEADER_IN: content-type: text/plain; charset=utf-8
> CURL_INFO_HEADER_IN: content-length: 43
> CURL_INFO_HEADER_IN: location: https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868
> CURL_INFO_HEADER_IN: x-content-length: 176995703
> CURL_INFO_HEADER_IN: apigw-requestid: [REDACTED]
> CURL_INFO_HEADER_IN:
> CURL_INFO_TEXT: Ignoring the response-body
> CURL_INFO_TEXT: Connection #0 to host example.com left intact
> CURL_INFO_TEXT: Issue another request to this URL: 'https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868'
> CURL_INFO_TEXT: Couldn't find host s3-bucket-placeholder in the .netrc file; using defaults
> CURL_INFO_TEXT: Host s3-bucket-placeholder:443 was resolved.
> CURL_INFO_TEXT:   Trying [REDACTED]...
> CURL_INFO_TEXT: Connected to s3-bucket-placeholder ([REDACTED]) port 443
> CURL_INFO_TEXT: SSL connection using TLSv1.3 / [REDACTED]
> CURL_INFO_TEXT: Server certificate:
> CURL_INFO_TEXT:  subject: CN=*.s3.amazonaws.com
> CURL_INFO_TEXT:  start date: Apr 22 00:00:00 2024 GMT
> CURL_INFO_TEXT:  expire date: Apr  7 23:59:59 2025 GMT
> CURL_INFO_TEXT:  issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M01
> CURL_INFO_TEXT:  SSL certificate verify ok.
> CURL_INFO_HEADER_OUT: HEAD /red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868 HTTP/1.1
> Host: s3-bucket-placeholder
> User-Agent: GDAL/3.9.2
> Accept: */*
>
> CURL_INFO_TEXT: Request completely sent off
> CURL_INFO_HEADER_IN: HTTP/1.1 200 OK
> CURL_INFO_HEADER_IN: x-amz-id-2: [REDACTED]
> CURL_INFO_HEADER_IN: x-amz-request-id: [REDACTED]
> CURL_INFO_HEADER_IN: Date: Sun, 22 Sep 2024 04:44:29 GMT
> CURL_INFO_HEADER_IN: Last-Modified: Tue, 17 Sep 2024 17:44:40 GMT
> CURL_INFO_HEADER_IN: ETag: "[REDACTED]"
> CURL_INFO_HEADER_IN: x-amz-server-side-encryption: AES256
> CURL_INFO_HEADER_IN: Accept-Ranges: bytes
> CURL_INFO_HEADER_IN: Content-Type: image/tiff
> CURL_INFO_HEADER_IN: Server: AmazonS3
> CURL_INFO_HEADER_IN: Content-Length: 176995703
> CURL_INFO_HEADER_IN:
> CURL_INFO_TEXT: Connection #1 to host s3-bucket-placeholder left intact
> VSICURL: Effective URL: https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868
> VSICURL: Will use redirect URL for the next 3599 seconds
> VSICURL: GetFileSize(https://example.com/stac/collections/items/assets/foo?path=s3://your-bucket/red.tif)=176995703 response_code=200
>
> ```
>
>
> *Subsequent Request (400 Bad Request):*
>
> ```
>
> VSICURL: Using redirect URL as it looks to be still valid (3599 seconds left)
> VSICURL: Downloading 0-16383 (https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868)...
> CURL_INFO_TEXT: Couldn't find host s3-bucket-placeholder in the .netrc file; using defaults
> CURL_INFO_TEXT: Found bundle for host: 0x600001aa8270 [serially]
> CURL_INFO_TEXT: Can not multiplex, even if we wanted to
> CURL_INFO_TEXT: Re-using existing connection with host s3-bucket-placeholder
> CURL_INFO_HEADER_OUT: GET /red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868 HTTP/1.1
> Host: s3-bucket-placeholder
> User-Agent: GDAL/3.9.2
> Accept: */*
> Authorization: Bearer [REDACTED]
> Range: bytes=0-16383
>
> CURL_INFO_TEXT: Request completely sent off
> CURL_INFO_HEADER_IN: HTTP/1.1 400 Bad Request
> CURL_INFO_HEADER_IN: x-amz-request-id: [REDACTED]
> CURL_INFO_HEADER_IN: x-amz-id-2: [REDACTED]
> CURL_INFO_HEADER_IN: Content-Type: application/xml
> CURL_INFO_HEADER_IN: Transfer-Encoding: chunked
> CURL_INFO_HEADER_IN: Date: Sun, 22 Sep 2024 04:44:28 GMT
> CURL_INFO_HEADER_IN: Server: AmazonS3
> CURL_INFO_HEADER_IN: Connection: close
> CURL_INFO_HEADER_IN:
> CURL_INFO_TEXT: Closing connection
> VSICURL: Got response_code=400
> ERROR 4: `/vsicurl/https://example.com/stac/collections/items/assets/foo?path=s3://your-bucket/red.tif' not recognized as being in a supported file format.
> gdalinfo failed - unable to open '/vsicurl/https://example.com/stac/collections/items/assets/foo?path=s3://your-bucket/red.tif'.
>
> ```
>
> *Additional Information:*
>
>
> If you require any further details or clarification, please let me know. I
> would be happy to provide more information. Additionally, if necessary, I
> can create an issue on the GDAL GitHub repository to track this problem.
>
>
> Thank you very much for your time and assistance. I appreciate any
> guidance or suggestions you can provide to help resolve this issue.
>
>
> Best Regards,
> Pradeep Gulla
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>
> -- http://www.spatialys.com
> My software is free, but my time generally not.
>
>
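
A side note on the quoted log: the line `VSICURL: Will use redirect URL for the next 3599 seconds` is consistent with the `Expires=1726983868` query parameter of the pre-signed URL. The sketch below assumes that `Expires` is an absolute Unix timestamp, as it appears to be in this style of URL, and takes "now" from the `Date` header of the S3 response (04:44:29 GMT). How GDAL itself derives the validity period is not shown in this thread; `seconds_until_expiry` is a hypothetical helper, and the URL is shortened to the parameters that matter here.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def seconds_until_expiry(presigned_url, now):
    """Seconds left before the pre-signed URL expires.

    Assumes the URL carries an `Expires` query parameter holding an
    absolute Unix timestamp (an assumption of this sketch).
    """
    query = parse_qs(urlsplit(presigned_url).query)
    expires = int(query["Expires"][0])
    return expires - int(now.timestamp())


url = ("https://s3-bucket-placeholder/red.tif"
       "?response-content-type=image%2Ftiff&Expires=1726983868")
# Date header of the S3 response in the log: Sun, 22 Sep 2024 04:44:29 GMT
now = datetime(2024, 9, 22, 4, 44, 29, tzinfo=timezone.utc)

print(seconds_until_expiry(url, now))  # 3599
```

A client that caches the redirect target would need to fall back to the proxy URL once this value reaches zero. URLs that use `X-Amz-Date` and `X-Amz-Expires` instead would need a different calculation.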

-- 
Best Regards,
Pradeep Gulla