[gdal-dev] writing parquet to /vsiaz/ fails when chunked
snorris at hillcrestgeo.ca
Fri Apr 4 11:40:40 PDT 2025
Perhaps another hapless user question:
Using ogr2ogr, I can write very small parquet files to /vsiaz/<container>/<file>.parquet without issue.
But with larger files (that ogr2ogr tries to write in chunks), Azure does not accept the first PUT request and ogr2ogr fails with:
ERROR 1: PUT of /vsiaz/<container>/<file>.parquet failed
ERROR 1: WriteColumnChunk() failed for <column>: Error while writing
ERROR 1: FileWriter::Close() failed with Only 28 out of 84 columns are initialized
ARROW: Memory pool (writer layer): bytes_allocated = 0
ARROW: Memory pool (writer layer): max_memory = 56286976
GDAL: GDALClose(/vsiaz/<container>/<file>.parquet, this=0x5910350ebb10)
GDAL: In GDALDestroy - unloading GDAL shared library.
Error: Process completed with exit code 1.
With --config CPL_CURL_VERBOSE=YES, this request/response looks to be the issue:
CURL_INFO_HEADER_OUT: PUT /<container>/<file>.parquet?<token>
User-Agent: GDAL/3.10.2
Accept: */*
Content-Length: 0
x-ms-blob-type: AppendBlob
x-ms-date: Thu, 03 Apr 2025 23:02:46 GMT
CURL_INFO_TEXT: Request completely sent off
CURL_INFO_HEADER_IN: HTTP/1.1 409 The blob type is invalid for this operation.
CURL_INFO_HEADER_IN: Content-Length: 228
CURL_INFO_HEADER_IN: Content-Type: application/xml
CURL_INFO_HEADER_IN: Server: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
CURL_INFO_HEADER_IN: x-ms-request-id: c24de328-b01e-003a-1eec-a49ba5000000
CURL_INFO_HEADER_IN: x-ms-version: 2022-11-02
CURL_INFO_HEADER_IN: x-ms-error-code: InvalidBlobType
CURL_INFO_HEADER_IN: Date: Thu, 03 Apr 2025 23:02:45 GMT
- no issue with ogr2ogr write of the same data as FlatGeobuf to /vsiaz/ (and the resulting blob type is AppendBlob)
- no issue with `az copy` write of the same parquet to azure blob
- no issue with ogr2ogr write of the same parquet to /vsis3/
- same issue on macOS/GDAL 3.10.2 and ghcr.io/osgeo/gdal:ubuntu-full-3.10.0
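For anyone comparing their own CPL_CURL_VERBOSE output against the above, here is a minimal Python sketch that pulls out the three fields that matter here: the blob type sent in the request, the HTTP status, and the Azure error code. The function name `summarize_curl_log` and the sample text are hypothetical, and the sketch assumes each header sits on its own line as in the log above; it is not part of GDAL.

```python
import re

# Excerpt shaped like the verbose output quoted above (placeholders kept as-is).
SAMPLE_LOG = """\
CURL_INFO_HEADER_OUT: PUT /<container>/<file>.parquet?<token>
Content-Length: 0
x-ms-blob-type: AppendBlob
CURL_INFO_HEADER_IN: HTTP/1.1 409 The blob type is invalid for this operation.
CURL_INFO_HEADER_IN: x-ms-error-code: InvalidBlobType
"""

RESPONSE_PREFIX = "CURL_INFO_HEADER_IN:"


def summarize_curl_log(log_text):
    """Return the request blob type, HTTP status, and Azure error code found in the log."""
    summary = {"blob_type": None, "status": None, "error_code": None}
    for raw in log_text.splitlines():
        line = raw.strip()
        # Response headers carry a prefix; drop it so both directions parse alike.
        if line.startswith(RESPONSE_PREFIX):
            line = line[len(RESPONSE_PREFIX):].strip()
        status = re.match(r"HTTP/\d(?:\.\d)?\s+(\d{3})", line)
        if status:
            summary["status"] = int(status.group(1))
        elif line.lower().startswith("x-ms-blob-type:"):
            summary["blob_type"] = line.split(":", 1)[1].strip()
        elif line.lower().startswith("x-ms-error-code:"):
            summary["error_code"] = line.split(":", 1)[1].strip()
    return summary


if __name__ == "__main__":
    print(summarize_curl_log(SAMPLE_LOG))
```

If several requests appear in one log, this keeps only the last value seen for each field, so it is best run on the excerpt around the failing PUT.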
I'd file an issue, but if it were a bug I'd think it would already have been found by other users?
thanks
Simon