[gdal-dev] GRIB file being scanned despite .idx being present

Daniel Evans daniel.fred.evans at gmail.com
Thu Oct 9 07:26:45 PDT 2025


Hi all,

I am attempting to read a single band from a NOAA GRIB2 file on S3 that has
an associated .idx file. The GRIB driver documentation states that the
existence of such an .idx file allows a file to be opened without reading
all bands.

However, looking at the CPL_CURL_VERBOSE=True logs, it appears that GDAL is
still paging through the file from the start until reaching the requested
band.

GDAL identifies the existence of the .idx file:

DEBUG:CPLE_None in GRIB: Reading inventories from sidecar file /vsis3/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012.idx
DEBUG:CPLE_None in S3: Downloading 0-41215 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012.idx)...

But it then appears to scan the file from the start until it has passed the
requested band:

DEBUG:CPLE_None in S3: Downloading 16384-999423 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012)...
DEBUG:CPLE_None in S3: Downloading 999424-2965503 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012)...
DEBUG:CPLE_None in S3: Downloading 2965504-6897663 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012)...
[...]
DEBUG:S3: Downloading 449626112-450461695 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012)...
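
To put a number on how much of the file is fetched, the ranges reported in
the debug output can be summed. This is only a minimal sketch: the function
names are hypothetical, and it assumes that each "Downloading A-B" entry is
an inclusive byte range, as in an HTTP Range request.

```python
import re

# Matches the "Downloading <start>-<end>" part of a GDAL S3 debug line.
RANGE_RE = re.compile(r"Downloading (\d+)-(\d+)")


def downloaded_ranges(log_text):
    """Return (start, end) byte ranges found in the captured log text."""
    return [(int(a), int(b)) for a, b in RANGE_RE.findall(log_text)]


def total_bytes(ranges):
    """Sum the sizes of the ranges, assuming both ends are inclusive."""
    return sum(end - start + 1 for start, end in ranges)
```

Feeding the captured log into downloaded_ranges() and comparing
total_bytes() with the size of the requested message shows how much extra
data was transferred.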

Band 636 is listed in the .idx with offset 443333308, and band 637 with
offset 444174665. The total file size is 545533166 bytes.
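
For reference, the byte range that a single message occupies can be worked
out from the .idx alone. The sketch below is an illustration under
assumptions, not GDAL's implementation: it assumes the usual wgrib2-style
layout of "number:offset:rest" per line, that the GDAL band number matches
the message number in the .idx, and that a message ends one byte before the
next one starts. The function names are hypothetical.

```python
def parse_idx(text):
    """Parse inventory lines into (number, offset, rest) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Only the first two colon-separated fields are needed here.
        number, offset, rest = line.split(":", 2)
        entries.append((number, int(offset), rest))
    return entries


def message_byte_range(entries, number, file_size):
    """Return the inclusive (start, end) byte range of one message."""
    for i, (num, offset, _rest) in enumerate(entries):
        if num == str(number):
            if i + 1 < len(entries):
                # The message ends just before the next message begins.
                end = entries[i + 1][1] - 1
            else:
                # The last message runs to the end of the file.
                end = file_size - 1
            return offset, end
    raise KeyError(f"message {number} not found in inventory")
```

With the two offsets quoted above, band 636 would span bytes 443333308 to
444174664, i.e. 444174665 - 443333308 = 841357 bytes, which is a small
fraction of the data shown being downloaded in the log.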


Do I need to do something extra to trigger GDAL to read only the requested
band based on the .idx? Are some GRIB/.idx files not able to be loaded in
this way?

I am running via rasterio v1.4.3, which is using GDAL v3.9.3. My code is
below; the file is in a public NOAA-hosted bucket.

Cheers,
Daniel

###

import logging
import rasterio

logging.basicConfig(format="%(levelname)s:%(message)s", level=logging.DEBUG)

with rasterio.Env(USE_IDX=True, AWS_VIRTUAL_HOSTING=False, CPL_DEBUG=True,
                  CPL_CURL_VERBOSE=True):
    with rasterio.open(
        "s3://noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012"
    ) as ds:
        band = ds.read(636)

###