[gdal-dev] writing parquet to /vsiaz/ fails when chunked

snorris at hillcrestgeo.ca snorris at hillcrestgeo.ca
Fri Apr 4 11:40:40 PDT 2025


Perhaps another hapless user question:

Using ogr2ogr, I can write very small parquet files to /vsiaz/<container>/<file>.parquet without issue.
But with larger files (that ogr2ogr tries to write in chunks), Azure does not accept the first PUT request and ogr2ogr fails with:

ERROR 1: PUT of /vsiaz/<container>/<file>.parquet failed
ERROR 1: WriteColumnChunk() failed for <column>: Error while writing
ERROR 1: FileWriter::Close() failed with Only 28 out of 84 columns are initialized
ARROW: Memory pool (writer layer): bytes_allocated = 0
ARROW: Memory pool (writer layer): max_memory = 56286976
GDAL: GDALClose(/vsiaz/<container>/<file>.parquet, this=0x5910350ebb10)
GDAL: In GDALDestroy - unloading GDAL shared library.
Error: Process completed with exit code 1.

With --config CPL_CURL_VERBOSE=YES, this request/response pair looks to be the issue:

CURL_INFO_HEADER_OUT: PUT /<container>/<file>.parquet?<token>
User-Agent: GDAL/3.10.2
Accept: */*
Content-Length: 0
x-ms-blob-type: AppendBlob
x-ms-date: Thu, 03 Apr 2025 23:02:46 GMT

CURL_INFO_TEXT: Request completely sent off
CURL_INFO_HEADER_IN: HTTP/1.1 409 The blob type is invalid for this operation.
CURL_INFO_HEADER_IN: Content-Length: 228
CURL_INFO_HEADER_IN: Content-Type: application/xml
CURL_INFO_HEADER_IN: Server: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
CURL_INFO_HEADER_IN: x-ms-request-id: c24de328-b01e-003a-1eec-a49ba5000000
CURL_INFO_HEADER_IN: x-ms-version: 2022-11-02
CURL_INFO_HEADER_IN: x-ms-error-code: InvalidBlobType
CURL_INFO_HEADER_IN: Date: Thu, 03 Apr 2025 23:02:45 GMT
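
For anyone triaging a similar log, here is a minimal Python sketch that reduces the exchange above to the fields that matter: the request method and blob type, and the response status and error code. It is not part of GDAL. The function name `summarize_curl_log` and the trimmed `SAMPLE_LOG` are illustrative assumptions, and the parser only assumes the line layout shown in this post (response headers prefixed with `CURL_INFO_HEADER_IN:`, request headers unprefixed after the first line).

```python
import re

# Trimmed copy of the verbose output quoted in this post (placeholders kept as-is).
SAMPLE_LOG = """\
CURL_INFO_HEADER_OUT: PUT /<container>/<file>.parquet?<token>
User-Agent: GDAL/3.10.2
Content-Length: 0
x-ms-blob-type: AppendBlob

CURL_INFO_TEXT: Request completely sent off
CURL_INFO_HEADER_IN: HTTP/1.1 409 The blob type is invalid for this operation.
CURL_INFO_HEADER_IN: x-ms-error-code: InvalidBlobType
"""


def summarize_curl_log(log_text):
    """Return method, blob type, HTTP status and Azure error code found in the log."""
    summary = {"method": None, "blob_type": None, "status": None, "error_code": None}
    for raw in log_text.splitlines():
        line = raw.strip()
        # Response headers are the lines tagged CURL_INFO_HEADER_IN.
        is_response = line.startswith("CURL_INFO_HEADER_IN:")
        # Drop any leading "CURL_INFO_...:" tag so only the HTTP content remains.
        line = re.sub(r"^CURL_INFO_[A-Z_]+:\s*", "", line)
        if is_response:
            status = re.match(r"HTTP/\S+\s+(\d{3})", line)
            if status:
                summary["status"] = int(status.group(1))
            elif line.lower().startswith("x-ms-error-code:"):
                summary["error_code"] = line.split(":", 1)[1].strip()
        else:
            method = re.match(r"(PUT|GET|POST|HEAD|DELETE)\s", line)
            if method:
                summary["method"] = method.group(1)
            elif line.lower().startswith("x-ms-blob-type:"):
                summary["blob_type"] = line.split(":", 1)[1].strip()
    return summary


if __name__ == "__main__":
    print(summarize_curl_log(SAMPLE_LOG))
```

Run against the excerpt, this reports a PUT with blob type AppendBlob answered by status 409 and error code InvalidBlobType, i.e. the zero-length request that creates the blob is what Azure rejects.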

- no issue with ogr2ogr write of the same data as FlatGeobuf to /vsiaz/ (and the resulting blob type is AppendBlob)
- no issue with `az copy` write of the same parquet to azure blob
- no issue with ogr2ogr write of the same parquet to /vsis3/
- same issue on macOS/GDAL 3.10.2 and ghcr.io/osgeo/gdal:ubuntu-full-3.10.0

I'd file an issue, but if this were a bug I'd think it would already have been found by other users?

Thanks,
Simon
