[gdal-dev] GRIB file being scanned despite .idx being present
Daniel Baston
dbaston at gmail.com
Thu Oct 9 08:36:17 PDT 2025
FWIW, the following snippet is working with gdal master:
from osgeo import gdal

with gdal.config_options({"AWS_NO_SIGN_REQUEST": "True", "CPL_DEBUG": "True",
                          "CPL_CURL_VERBOSE": "True"}):
    ds = gdal.Open("/vsis3/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012")
    band = ds.GetRasterBand(636)
    x = band.ReadAsArray()
    print(x.mean())
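
For comparison with the download ranges in the quoted logs, here is a rough sketch of how much data a single band should need if the .idx offsets are honoured. It assumes the sidecar uses the wgrib2-style inventory layout (message number, then byte offset, separated by colons) with one field per message; `parse_idx` and `Message` are hypothetical helper names, not GDAL API, and the sample lines carry only the offsets quoted in this thread.

```python
from typing import NamedTuple


class Message(NamedTuple):
    number: int  # message number from the inventory (assumed to match the band number)
    start: int   # byte offset of the first byte of the message
    end: int     # byte offset of the last byte of the message (inclusive)


def parse_idx(text: str, file_size: int) -> list[Message]:
    """Turn inventory lines of the form 'number:offset:...' into byte ranges."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Only the first two colon-separated fields are needed here.
        number, offset = line.split(":", 2)[:2]
        entries.append((int(number), int(offset)))

    messages = []
    for i, (number, start) in enumerate(entries):
        # A message ends one byte before the next one starts;
        # the last message runs to the end of the file.
        next_start = entries[i + 1][1] if i + 1 < len(entries) else file_size
        messages.append(Message(number, start, next_start - 1))
    return messages


# Offsets and file size are the values quoted in this thread. The text after
# the second colon is a placeholder, not the real inventory content.
SAMPLE_IDX = """\
636:443333308:<rest of inventory line>
637:444174665:<rest of inventory line>
"""
FILE_SIZE = 545533166

band = parse_idx(SAMPLE_IDX, FILE_SIZE)[0]
print(f"Range: bytes={band.start}-{band.end}")
print(f"{band.end - band.start + 1} bytes")
```

Under those assumptions band 636 occupies bytes 443333308-444174664, i.e. 841357 bytes, whereas the quoted log shows requests running from near the start of the file up to byte 450461695.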
Dan
On Thu, Oct 9, 2025 at 10:27 AM Daniel Evans via gdal-dev <gdal-dev at lists.osgeo.org> wrote:
> Hi all,
>
> I am attempting to read a single band from a NOAA GRIB2 file on S3, with
> an associated .idx file. The GRIB driver documentation states that the
> existence of such an .idx file allows a file to be opened without reading
> all bands.
>
> However, looking at the CPL_CURL_VERBOSE=True logs, it appears that GDAL
> is still paging through the file from the start until reaching the
> requested band.
>
> GDAL identifies the existence of the .idx file:
>
> DEBUG:CPLE_None in GRIB: Reading inventories from sidecar file /vsis3/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012.idx
> DEBUG:CPLE_None in S3: Downloading 0-41215 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012.idx)...
>
> But it then appears to scan the file from the start until it has passed
> the requested band:
>
> DEBUG:CPLE_None in S3: Downloading 16384-999423 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012)...
> DEBUG:CPLE_None in S3: Downloading 999424-2965503 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012)...
> DEBUG:CPLE_None in S3: Downloading 2965504-6897663 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012)...
> [...]
> DEBUG:S3: Downloading 449626112-450461695 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012)...
>
> Band 636 is listed in the .idx with offset 443333308, and band 637 with
> offset 444174665. The total file size is 545533166 bytes.
>
>
> Do I need to do something extra to trigger GDAL to read only the requested
> band based on the .idx? Are some GRIB/.idx files not able to be loaded in
> this way?
>
> I am running via rasterio v1.4.3 which is using GDAL v3.9.3. My code is
> below, the file is in a public NOAA-hosted bucket.
>
> Cheers,
> Daniel
>
> ###
>
> import logging
> import rasterio
>
> logging.basicConfig(format="%(levelname)s:%(message)s",
>                     level=logging.DEBUG)
>
> with rasterio.Env(USE_IDX=True, AWS_VIRTUAL_HOSTING=False, CPL_DEBUG=True,
>                   CPL_CURL_VERBOSE=True):
>     with rasterio.open("s3://noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012") as ds:
>         band = ds.read(636)
>
> ###
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>