[gdal-dev] gdal_translate is slow

Even Rouault even.rouault at spatialys.com
Wed Sep 4 08:52:59 PDT 2019


On mercredi 4 septembre 2019 17:26:14 CEST Denis Rykov wrote:
> Thanks for quick reply, I've uploaded grib file here:
> https://transfer.sh/5JCVX/download.grib

Turns out that my guess wasn't so bad after all. The uncompressed file size is
3601x1801x14(bands)x8(bytes_per_pixel) = 693 MB
whereas a GRIB dataset has an internal cache by default of only 100 MB.
As you write to a pixel-interleaved GTiff, there's constant back and forth 
between bands when reading chunks and thus the GRIB cache has no effect.
So 2 possible workarounds:
- increase GRIB_CACHEMAX to 1000 for example. This only works for a GRIB dataset
that can fit uncompressed in memory.
- add "-co interleave=band" to generate a band-interleaved geotiff. That one 
can work with an arbitrarily large GRIB file
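
To make the numbers and the two workarounds above concrete, here is a small
Python sketch. It assumes that "MB" means 1024*1024 bytes (which is what makes
the figure come out at 693), that the input is the download.grib file linked
above, and that out.tif is just a placeholder output name. The command lines
are only built and printed, not executed.

```python
import math

# Dimensions quoted above: width x height x bands x bytes per pixel.
width, height, bands, bytes_per_pixel = 3601, 1801, 14, 8
size_bytes = width * height * bands * bytes_per_pixel

# Assumption: 1 MB = 1024 * 1024 bytes.
size_mb = size_bytes / (1024 * 1024)
min_cache_mb = math.ceil(size_mb)

default_cache_mb = 100  # default GRIB cache size mentioned above
fits_in_default_cache = size_mb <= default_cache_mb

# Workaround 1: raise the GRIB cache (dataset must fit uncompressed in memory).
cmd_bigger_cache = [
    "gdal_translate", "--config", "GRIB_CACHEMAX", "1000",
    "download.grib", "out.tif",  # out.tif is a placeholder name
]

# Workaround 2: write a band-interleaved GeoTIFF (works for any input size).
cmd_band_interleaved = [
    "gdal_translate", "-co", "INTERLEAVE=BAND",
    "download.grib", "out.tif",
]

print(f"uncompressed size: {size_bytes} bytes, about {min_cache_mb} MB")
print(f"fits in default cache: {fits_in_default_cache}")
print(" ".join(cmd_bigger_cache))
print(" ".join(cmd_band_interleaved))
```

The same config option can also be set as an environment variable instead of
being passed with --config.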

I've committed an improvement, so if you now run master with --debug on,
you'll see a hint:
"""
GRIB: Maximum band cache size reached for this dataset. Caching only one band 
at a time from now, which can negatively affect performance. Consider 
increasing GRIB_CACHEMAX to a higher value (in MB), at least 693 in that 
instance
"""

As far as I can see in
https://github.com/mapbox/rasterio/blob/e1c984bf0e4a18e039569bb3bafe6667bb5b3a69/rasterio/rio/convert.py#L76
rasterio presumably ingests the whole dataset into memory (Sean, correct me if
I'm wrong), and thus those caching issues don't trigger since the input dataset
will be read band-per-band.
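
To illustrate why the read order matters, here is a toy Python model of a band
cache. It is not GDAL's actual implementation: it only mimics the behaviour
described in the debug message (once the limit is reached, only one band is
cached at a time). The capacity of 2 bands comes from 100 MB divided by roughly
49.5 MB per band, and the number of chunks (100) is an arbitrary value chosen
for the illustration.

```python
def count_decodes(access_order, cache_capacity):
    """Count band decodes for a toy cache that, once full, keeps one band only."""
    cached = set()
    degraded = False  # becomes True the first time the capacity is exceeded
    decodes = 0
    for band in access_order:
        if band in cached:
            continue  # cache hit, nothing to decode
        decodes += 1
        if degraded or len(cached) >= cache_capacity:
            degraded = True
            cached = {band}  # from now on, keep only the most recent band
        else:
            cached.add(band)
    return decodes


n_bands = 14
n_chunks = 100   # arbitrary number of chunks per band (illustration only)
capacity = 2     # about 100 MB / 49.5 MB per band

# Pixel-interleaved output: every chunk needs all bands in turn.
pixel_order = [b for _ in range(n_chunks) for b in range(n_bands)]
# Band-per-band reading: all chunks of one band, then the next band.
band_order = [b for b in range(n_bands) for _ in range(n_chunks)]

print("pixel-interleaved, small cache:", count_decodes(pixel_order, capacity))
print("band-per-band, small cache:   ", count_decodes(band_order, capacity))
print("pixel-interleaved, big cache: ", count_decodes(pixel_order, n_bands))
```

In this model the pixel-interleaved order with the small cache decodes a band
on every single access, whereas reading band-per-band, or giving the cache room
for all 14 bands, decodes each band only once.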

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

