<div dir="ltr"><div>Hmm, yes - I see it jumping straight to the relevant band when run via a locally compiled GDAL 3.11.4 using your code, but when using rasterio built on top of that same GDAL 3.11.4, it's paging through the whole file. Seems like there's something with how my Python environment/code configures things when using rasterio, or something that rasterio configures, that is modifying the behaviour.</div><div><br></div><div>Thoughts on where to look welcome, but it doesn't appear to be a GDAL-level problem.</div><div><br></div><div>Cheers,</div><div>Daniel</div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Thu, 9 Oct 2025 at 16:36, Daniel Baston <<a href="mailto:dbaston@gmail.com">dbaston@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>FWIW, the following snippet is working with gdal master:</div><div><br></div><div>from osgeo import gdal<br><br>with gdal.config_options({"AWS_NO_SIGN_REQUEST":"True", "CPL_DEBUG":"True", "CPL_CURL_VERBOSE":"True"}):<br> ds = gdal.Open("/vsis3/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012")<br> band = ds.GetRasterBand(636)<br> x = band.ReadAsArray()<br> print(x.mean())<br><br></div><div>Dan</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Oct 9, 2025 at 10:27 AM Daniel Evans via gdal-dev <<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Hi all,</div><div><br></div><div>I am attempting to read a single band from a NOAA GRIB2 file on S3, with an associated .idx file. 
The GRIB2 driver documentation states that the existence of such an .idx file allows a file to be opened without reading all bands.</div><div><br></div><div>However, looking at the CPL_CURL_VERBOSE=True logs, it appears that GDAL is still paging through the file from the start until reaching the requested band.</div><div><br></div><div>GDAL identifies the existence of the .idx file:</div><div><br></div><div>DEBUG:CPLE_None in GRIB: Reading inventories from sidecar file /vsis3/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012.idx<br>DEBUG:CPLE_None in S3: Downloading 0-41215 (<a href="https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012.idx" target="_blank">https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012.idx</a>)...</div><div><br></div><div>But it then appears to scan the file from the start until it has passed the requested band:</div><div><br></div><div>DEBUG:CPLE_None in S3: Downloading 16384-999423 (<a href="https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012" target="_blank">https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012</a>)...</div><div>DEBUG:CPLE_None in S3: Downloading 999424-2965503 (<a href="https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012" target="_blank">https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012</a>)...<br>DEBUG:CPLE_None in S3: Downloading 2965504-6897663 (<a href="https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012" target="_blank">https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012</a>)...<br>[...]</div><div>DEBUG:S3: Downloading 449626112-450461695 (<a href="https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012" target="_blank">https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012</a>)...<br><br></div>
<div>Band 636 is listed in the .idx with offset 443333308, and band 637 with offset 444174665. The total file size is 545533166 bytes.</div><div><br></div><div><br></div><div>Do I need to do something extra to trigger GDAL to read only the requested band based on the .idx? Are some GRIB/.idx files not able to be loaded in this way?</div><div><br></div><div>I am running via rasterio v1.4.3, which is using GDAL v3.9.3. My code is below; the file is in a public NOAA-hosted bucket.</div><div><br></div><div>Cheers,</div><div>Daniel</div><div><br></div><div>###</div><div><br></div>import logging<br>import rasterio<br><br>logging.basicConfig(format="%(levelname)s:%(message)s", level=logging.DEBUG)<br><br>with rasterio.Env(USE_IDX=True, AWS_VIRTUAL_HOSTING=False, CPL_DEBUG=True, CPL_CURL_VERBOSE=True):<br>&nbsp;&nbsp;&nbsp;&nbsp;with rasterio.open("s3://noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012") as ds:<br><div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;band = ds.read(636)</div><div><br></div><div>###</div><div><br></div></div>
</blockquote></div>
</blockquote></div>
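<div>To check which byte range a direct read of a single band should request, the offsets quoted from the .idx can be turned into a range. This is a minimal sketch, assuming the sidecar uses the wgrib2-style colon-separated inventory layout (message number, byte offset, then descriptive fields) and that the GDAL band number matches the message number. The function names and the descriptive fields in the sample lines are placeholders; only the two offsets and the file size come from the thread.</div>

```python
def parse_idx(text):
    """Map message number -> starting byte offset from inventory text.

    Assumes wgrib2-style lines: "<msg>:<offset>:<other fields...>".
    Lines whose first two fields are not plain integers are skipped.
    """
    offsets = {}
    for line in text.splitlines():
        parts = line.split(":")
        if len(parts) < 2:
            continue
        if not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        offsets[int(parts[0])] = int(parts[1])
    return offsets


def band_byte_range(offsets, band, file_size):
    """Return the inclusive (start, end) byte range for one message.

    The end is one byte before the next larger offset, or the last
    byte of the file when no later message exists.
    """
    start = offsets[band]
    later = [o for o in offsets.values() if o > start]
    end = (min(later) if later else file_size) - 1
    return start, end


# Offsets and file size are from the thread; the remaining fields
# are placeholders, not the real inventory entries.
sample = (
    "636:443333308:d=2025091800:VAR_A:placeholder level:placeholder fcst:\n"
    "637:444174665:d=2025091800:VAR_B:placeholder level:placeholder fcst:\n"
)
FILE_SIZE = 545533166

offsets = parse_idx(sample)
start, end = band_byte_range(offsets, 636, FILE_SIZE)
print(f"bytes={start}-{end} ({end - start + 1} bytes)")
```

<div>Comparing this range with the "Downloading" lines in the CPL_DEBUG output gives a quick way to tell whether a given environment is seeking directly to the band or paging through the file from the start.</div>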