[gdal-dev] Multi-threading GDALRasterBand::RasterIO() ?

Even Rouault even.rouault at spatialys.com
Mon Nov 1 14:05:45 PDT 2021


Yes, that is correct regarding multithreading. Regarding GRIB and 
performance issues, you must be aware that the GRIB driver, when accessing 
a single pixel of a band, needs to decompress the data for the whole band. 
Hence there is a per-dataset cache of band data, which defaults to 100 MB 
(you can increase it by setting the GRIB_CACHEMAX config option to a number 
in megabytes). So the most performant access pattern for GRIB is to read 
band by band, and not all the bands of one line at a time.
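
To make the difference between the two access patterns concrete, here is a 
small toy model in C++. It is only a sketch and not the GRIB driver's code: 
the BandCache type, the helper names, the least-recently-used eviction and 
the reading of "MB" as 10^6 bytes are all assumptions made for illustration. 
It simply counts how many whole-band decompressions each loop order triggers 
when only a few decoded bands fit in the cache.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Toy model (assumption, not GDAL code): a cache that holds at most
// `capacity` decoded bands and evicts the least recently used one.
struct BandCache {
    std::size_t capacity;
    std::deque<int> resident;  // front = most recently used band
    long decodes = 0;          // number of whole-band decompressions

    explicit BandCache(std::size_t cap) : capacity(cap) {
        assert(capacity > 0);
    }

    // Simulate reading any pixel or line of `band`.
    void touch(int band) {
        auto it = std::find(resident.begin(), resident.end(), band);
        if (it != resident.end()) {
            resident.erase(it);  // hit: only refresh its position
        } else {
            ++decodes;           // miss: the whole band is decompressed
            if (resident.size() >= capacity) {
                resident.pop_back();  // evict the least recently used band
            }
        }
        resident.push_front(band);
    }
};

// How many Float64 bands of width x height pixels fit in cacheBytes.
std::size_t bandsThatFit(std::size_t cacheBytes, std::size_t width,
                         std::size_t height) {
    return cacheBytes / (width * height * sizeof(double));
}

// Outer loop over lines, inner loop over bands (all bands of a line).
long decodesLineByLine(int bands, int lines, std::size_t capacity) {
    BandCache cache(capacity);
    for (int y = 0; y < lines; ++y)
        for (int b = 0; b < bands; ++b)
            cache.touch(b);
    return cache.decodes;
}

// Outer loop over bands, inner loop over lines (band by band).
long decodesBandByBand(int bands, int lines, std::size_t capacity) {
    BandCache cache(capacity);
    for (int b = 0; b < bands; ++b)
        for (int y = 0; y < lines; ++y)
            cache.touch(b);
    return cache.decodes;
}
```

With the figures from the original report (1800x1000 pixels, 49 Float64 
bands), one decoded band is about 14.4 MB, so in this model only 6 bands fit 
in a 100 MB cache. decodesLineByLine(49, 1000, 6) then counts 49000 
decompressions, against 49 for decodesBandByBand(49, 1000, 6). If one takes 
the reported 26 ms per band as the cost of a decompression, 49000 of them 
come to roughly 21 minutes, which is at least consistent with the import 
time that was observed.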

On 01/11/2021 at 21:58, Simon Eves wrote:
> You can ignore this.
>
> I have rather belatedly found the documentation that says that one 
> must open a GDALDataset per thread, even if it's on the same file.
>
> The multi-threading now works just fine.
>
> Interestingly, we're not actually doing that with our existing geo 
> importer. I guess it's OK because we're pulling the OGRFeatures out 
> with the process thread, and only converting and loading them with the 
> child threads. I guess I really ought to rewrite that code too now. Sigh.
>
> As you were...
>
> Simon
>
> On Sun, Oct 31, 2021 at 4:27 PM Simon Eves <simon.eves at omnisci.com> wrote:
>
>     We are writing a raster importer, and finding that
>     GDALRasterBand::RasterIO() is unexpectedly slow for some GRIB2 files.
>
>     We have a file which is about 1800x1000 pixels, with 49 bands of
>     type DOUBLE. The file is about 47MB on disc.
>
>     Reading all the bands of a single scanline from this file takes
>     about 1300ms, which is about 26ms per band, hence the entire file
>     takes around 20 minutes to import. All the time seems to be spent
>     in the RasterIO() call, even though it's not doing any raster
>     rescaling or data format conversion (1:1 pixels, fetching as
>     GDT_Float64).
>
>     So, I figured we'd try multi-threading it, but evidently the call
>     is not thread-safe. Here is just one of various stack traces it
>     will throw.
>
>     libc.so.6!raise (Unknown Source:0)
>     libc.so.6!abort (Unknown Source:0)
>     libc.so.6![Unknown/Just-In-Time compiled code] (Unknown Source:0)
>     libgdal.so.28!GRIBRasterBand::UncacheData(GRIBRasterBand * const
>     this) (/build/scripts/gdal-3.2.2/frmts/grib/gribdataset.cpp:948)
>     libgdal.so.28!GRIBRasterBand::LoadData(GRIBRasterBand * const
>     this) (/build/scripts/gdal-3.2.2/frmts/grib/gribdataset.cpp:730)
>     libgdal.so.28!GRIBRasterBand::LoadData(GRIBRasterBand * const
>     this) (/build/scripts/gdal-3.2.2/frmts/grib/gribdataset.cpp:697)
>     libgdal.so.28!GRIBRasterBand::IReadBlock(GRIBRasterBand * const
>     this, int nBlockYOff, void * pImage)
>     (/build/scripts/gdal-3.2.2/frmts/grib/gribdataset.cpp:803)
>     libgdal.so.28!GDALRasterBand::GetLockedBlockRef(int
>     bJustInitialize, int nYBlockOff, int nXBlockOff, GDALRasterBand *
>     const this) (/build/scripts/gdal-3.2.2/gcore/gdal_priv.h:963)
>     libgdal.so.28!GDALRasterBand::GetLockedBlockRef(GDALRasterBand *
>     const this, int nXBlockOff, int nYBlockOff, int bJustInitialize)
>     (/build/scripts/gdal-3.2.2/gcore/gdalrasterband.cpp:1238)
>     libgdal.so.28!GDALRasterBand::IRasterIO(GDALRasterBand * const
>     this, GDALRWFlag eRWFlag, int nXOff, int nYOff, int nXSize, int
>     nYSize, void * pData, int nBufXSize, int nBufYSize, GDALDataType
>     eBufType, GSpacing nPixelSpace, GSpacing nLineSpace,
>     GDALRasterIOExtraArg * psExtraArg)
>     (/build/scripts/gdal-3.2.2/gcore/rasterio.cpp:149)
>     libgdal.so.28!GDALRasterBand::RasterIO(GDALRasterBand * const
>     this, GDALRWFlag eRWFlag, int nXOff, int nYOff, int nXSize, int
>     nYSize, void * pData, int nBufXSize, int nBufYSize, GDALDataType
>     eBufType, GSpacing nPixelSpace, GSpacing nLineSpace,
>     GDALRasterIOExtraArg * psExtraArg)
>     (/build/scripts/gdal-3.2.2/gcore/gdalrasterband.cpp:372)
>     import_export::Importer::<lambda(size_t, int)>::operator()(size_t,
>     int) const(const import_export::Importer::<lambda(size_t, int)> *
>     const __closure, const size_t thread_id, const int y)
>     (/home/simon.eves/work/omniscidb-internal/ImportExport/Importer.cpp:5721)
>     ...
>
>     All of the parameters to the call are either constant or
>     uncontended simple variables, and obviously there is a unique data
>     buffer (pData) per thread.
>
>     Is there anything we can do to make this work?
>
>     I was intending to look into the lower level block-based API, in
>     the hope that it will be faster, but have not yet done so.
>
>     This is all with a local static build of GDAL 3.2.2 on Ubuntu
>     20.04 with GCC 9.
>
>     Yours,
>
>     Simon Eves
>
>     -- 
>     <http://www.omnisci.com/>
>     Simon Eves
>     Senior Graphics Engineer, Rendering Group
>     100 Montgomery St (5th Floor), San Francisco, CA 94104, USA
>     Email: simon.eves at omnisci.com | Cell: +1 (415) 902-1996
>

-- 
http://www.spatialys.com
My software is free, but my time generally not.
