[gdal-dev] Intermittent memory-corruption crashes in GRIB2 import

Simon Eves simon.eves at omnisci.com
Sun Nov 14 16:32:52 PST 2021


We have recently implemented a geo raster importer, and all seems fine,
except that we hit an issue with a particular GRIB2 file from the NOAA
website, where we get an inconsistent crash inside GDAL after a few hundred
scanlines.

We have seen two different crashes inside GDAL, and a third in one of our
code paths, but given that there is a memory corruption, the latter is
perhaps unsurprising.

All crashes report "double free or corruption (fasttop)".

We are multi-threading the reading, but using an OGRDataSource per thread.
The child threads are only calling GetRasterBand(), GetRasterDataType() and
RasterIO(), and only on one band at a time.

The GRIB2 file is 103MB with 73 Float64 bands, but only 2345x1597 "pixels".

We tried converting the file to GeoTIFF with gdal_translate (no options,
just in and out) and it took 28 minutes (on a ~2017 i7 quad 4.2), which is
surprising, as we have other GRIB2 files (between 2 and 12MB) which convert
"instantly". The resulting GeoTIFF is much bigger (21x) but seems to import
reliably, has basically the same schema (as reported by gdalinfo) and
results in the same data when imported into our system.
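
The 21x growth is about what uncompressed Float64 storage would predict. A
rough check, assuming "103MB" means 103 x 10^6 bytes and that the GeoTIFF was
written uncompressed (the GeoTIFF driver's default when no creation options
are given); the function names are only illustrative:

```cpp
#include <cassert>
#include <cstdint>

// Uncompressed size in bytes of a raster with the given dimensions.
std::uint64_t rawRasterBytes(std::uint64_t width, std::uint64_t height,
                             std::uint64_t bands, std::uint64_t bytesPerSample) {
    return width * height * bands * bytesPerSample;
}

// Ratio of the uncompressed size to the size of the source file on disk.
double expansionRatio(std::uint64_t rawBytes, std::uint64_t sourceBytes) {
    assert(sourceBytes > 0);
    return static_cast<double>(rawBytes) / static_cast<double>(sourceBytes);
}
```

With the figures above, rawRasterBytes(2345, 1597, 73, 8) gives 2,187,059,560
bytes, roughly 21.2 times 103 x 10^6, so the GeoTIFF size is consistent with
the raster simply being stored uncompressed.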

We only get the crash occasionally, and have only been able to trap it in
the debugger a couple of times, with nothing obviously wrong.

Here is a link to the GRIB2 file in question:

https://drive.google.com/file/d/12Fo6jnIhxzCvnSsup9n0kHVKy9lrHD2l/view?usp=sharing

Attached is the most common stack trace.

With a DEBUG build of GDAL, it looks like it's crashing trying to do a
realloc() on "buffer", which is NULL, although that is supposedly a copy of
"&errBuffer" from the frame above, which seems fine.

Gonna try robustifying that code and see what happens...
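
One possible reading of this trace, which nothing above confirms, is that
"errBuffer" is a single static shared by every thread, in which case
concurrent errSprintf() calls could realloc() the same pointer. A minimal
sketch of one way to robustify such a buffer is to give each thread its own
copy. Here errAppend, errTake, runWorkers and tlsErrBuffer are hypothetical
names, not part of GDAL or degrib:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a static error buffer: one instance per thread,
// so concurrent appends never touch the same allocation.
static thread_local std::string tlsErrBuffer;

// Append a printf-style message to the calling thread's buffer.
void errAppend(const char* fmt, ...) {
    assert(fmt != nullptr);
    va_list ap;
    va_start(ap, fmt);
    va_list apCopy;
    va_copy(apCopy, ap);  // a va_list can only be walked once, so keep a copy
    const int needed = std::vsnprintf(nullptr, 0, fmt, ap);  // measure only
    va_end(ap);
    if (needed > 0) {
        std::vector<char> tmp(static_cast<std::size_t>(needed) + 1);
        std::vsnprintf(tmp.data(), tmp.size(), fmt, apCopy);
        tlsErrBuffer.append(tmp.data(), static_cast<std::size_t>(needed));
    }
    va_end(apCopy);
}

// Return the calling thread's accumulated message and reset its buffer.
std::string errTake() {
    std::string out;
    out.swap(tlsErrBuffer);
    return out;
}

// Run numThreads workers that each log perThread messages, and collect what
// each worker found in its own buffer.
std::vector<std::string> runWorkers(int numThreads, int perThread) {
    std::vector<std::string> results(static_cast<std::size_t>(numThreads));
    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        workers.emplace_back([t, perThread, &results] {
            for (int i = 0; i < perThread; ++i) {
                errAppend("t%d;", t);
            }
            // Each worker writes only its own slot, so no locking is needed.
            results[static_cast<std::size_t>(t)] = errTake();
        });
    }
    for (auto& w : workers) w.join();
    return results;
}
```

If the buffer really is shared, each worker in a setup like this should see
only its own messages, whereas a plain static buffer would interleave them or
corrupt the heap. A mutex around the shared buffer would be the other obvious
option, at the cost of serialising error reporting.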

This is all with GDAL 3.2.2 on Ubuntu 20.04 LTS with GCC 9.

-- 
<http://www.omnisci.com/>
Simon Eves
Senior Graphics Engineer, Rendering Group
100 Montgomery St (5th Floor), San Francisco, CA 94104, USA


Email: simon.eves at omnisci.com | Cell:  +1 (415) 902-1996
-------------- next part --------------
libc.so.6!raise (Unknown Source:0)
libc.so.6!abort (Unknown Source:0)
libc.so.6![Unknown/Just-In-Time compiled code] (Unknown Source:0)
libc.so.6!realloc (Unknown Source:0)
libgdal.so.28!AllocSprintf(char ** Ptr, size_t * LenBuff, const char * fmt, struct __va_list_tag * ap) (/build/scripts/gdal-3.2.2/frmts/grib/degrib/degrib/myerror.c:126)
libgdal.so.28!errSprintf(const char * fmt) (/build/scripts/gdal-3.2.2/frmts/grib/degrib/degrib/myerror.c:393)
libgdal.so.28!ElemNameProb(int templat, uChar genID, int * convert, char ** unit, char ** comment, char ** name, double upperProb, double lowerProb, uChar probType, uChar timeIncrType, uChar timeRangeUnit, sInt4 lenTime, uChar subcat, uChar cat, int prodType, uShort2 subcenter, uChar mstrVersion) (/build/scripts/gdal-3.2.2/frmts/grib/degrib/degrib/metaname.cpp:2778)
libgdal.so.28!ParseElemName(uChar mstrVersion, uShort2 center, uShort2 subcenter, int prodType, int templat, int cat, int subcat, sInt4 lenTime, uChar timeRangeUnit, uChar statProcessID, uChar timeIncrType, uChar genID, uChar probType, double lowerProb, double upperProb, uChar derivedFcst, char ** name, char ** comment, char ** unit, int * convert, sChar percentile, uChar genProcess, sChar f_fstValue, double fstSurfValue, sChar f_sndValue, double sndSurfValue) (/build/scripts/gdal-3.2.2/frmts/grib/degrib/degrib/metaname.cpp:3542)
libgdal.so.28!MetaParse(grib_MetaData * meta, sInt4 * is0, sInt4 ns0, sInt4 * is1, sInt4 ns1, sInt4 * is2, sInt4 ns2, float * rdat, sInt4 nrdat, sInt4 * idat, sInt4 nidat, sInt4 * is3, sInt4 ns3, sInt4 * is4, sInt4 ns4, sInt4 * is5, sInt4 ns5, sInt4 grib_len, float xmissp, float xmisss, int simpVer, int simpWWA) (/build/scripts/gdal-3.2.2/frmts/grib/degrib/degrib/metaparse.cpp:2461)
libgdal.so.28!ReadGrib2Record(VSILFILE * fp, sChar f_unit, double ** Grib_Data, uInt4 * grib_DataLen, grib_MetaData * meta, IS_dataType * IS, int subgNum, double majEarth, double minEarth, int simpVer, int simpWWA, sInt4 * f_endMsg, LatLon * lwlf, LatLon * uprt) (/build/scripts/gdal-3.2.2/frmts/grib/degrib/degrib/degrib2.cpp:1199)
libgdal.so.28!GRIBRasterBand::ReadGribData(VSILFILE * fp, vsi_l_offset start, int subgNum, double ** data, grib_MetaData ** metaData) (/build/scripts/gdal-3.2.2/frmts/grib/gribdataset.cpp:930)
libgdal.so.28!GRIBRasterBand::LoadData(GRIBRasterBand * const this) (/build/scripts/gdal-3.2.2/frmts/grib/gribdataset.cpp:746)
libgdal.so.28!GRIBRasterBand::LoadData(GRIBRasterBand * const this) (/build/scripts/gdal-3.2.2/frmts/grib/gribdataset.cpp:697)
libgdal.so.28!GRIBRasterBand::IReadBlock(GRIBRasterBand * const this, int nBlockYOff, void * pImage) (/build/scripts/gdal-3.2.2/frmts/grib/gribdataset.cpp:803)
libgdal.so.28!GDALRasterBand::GetLockedBlockRef(int bJustInitialize, int nYBlockOff, int nXBlockOff, GDALRasterBand * const this) (/build/scripts/gdal-3.2.2/gcore/gdal_priv.h:963)
libgdal.so.28!GDALRasterBand::GetLockedBlockRef(GDALRasterBand * const this, int nXBlockOff, int nYBlockOff, int bJustInitialize) (/build/scripts/gdal-3.2.2/gcore/gdalrasterband.cpp:1238)
libgdal.so.28!GDALRasterBand::IRasterIO(GDALRasterBand * const this, GDALRWFlag eRWFlag, int nXOff, int nYOff, int nXSize, int nYSize, void * pData, int nBufXSize, int nBufYSize, GDALDataType eBufType, GSpacing nPixelSpace, GSpacing nLineSpace, GDALRasterIOExtraArg * psExtraArg) (/build/scripts/gdal-3.2.2/gcore/rasterio.cpp:149)
libgdal.so.28!GDALRasterBand::RasterIO(GDALRasterBand * const this, GDALRWFlag eRWFlag, int nXOff, int nYOff, int nXSize, int nYSize, void * pData, int nBufXSize, int nBufYSize, GDALDataType eBufType, GSpacing nPixelSpace, GSpacing nLineSpace, GDALRasterIOExtraArg * psExtraArg) (/build/scripts/gdal-3.2.2/gcore/gdalrasterband.cpp:372)
import_export::RasterImporter::getRawPixels(import_export::RasterImporter * const this, const uint32_t thread_idx, const uint32_t band_idx, const int y_start, const int num_rows, std::vector<std::byte, std::allocator<std::byte> > & raw_pixel_bytes) (/home/simon.eves/work/omniscidb-internal/ImportExport/RasterImporter.cpp:427)
import_export::Importer::<lambda(size_t, int, int)>::operator()(size_t, int, int) const(const import_export::Importer::<lambda(size_t, int, int)> * const __closure, const size_t thread_idx, const int y_start, const int y_end) (/home/simon.eves/work/omniscidb-internal/ImportExport/Importer.cpp:5737)
std::__invoke_impl<std::tuple<import_export::ImportStatus, std::array<float, 3> >, import_export::Importer::importGDALRaster(import_export::ColumnNameToSourceNameMapType, const Catalog_Namespace::SessionInfo*)::<lambda(size_t, int, int)>, long unsigned int, int, int>(std::__invoke_other, import_export::Importer::<lambda(size_t, int, int)> &&)(import_export::Importer::<lambda(size_t, int, int)> && __f) (/usr/include/c++/9/bits/invoke.h:60)
std::__invoke<import_export::Importer::importGDALRaster(import_export::ColumnNameToSourceNameMapType, const Catalog_Namespace::SessionInfo*)::<lambda(size_t, int, int)>, long unsigned int, int, int>(import_export::Importer::<lambda(size_t, int, int)> &&)(import_export::Importer::<lambda(size_t, int, int)> && __fn) (/usr/include/c++/9/bits/invoke.h:96)
std::thread::_Invoker<std::tuple<import_export::Importer::importGDALRaster(import_export::ColumnNameToSourceNameMapType, const Catalog_Namespace::SessionInfo*)::<lambda(size_t, int, int)>, long unsigned int, int, int> >::_M_invoke<0, 1, 2, 3>(std::_Index_tuple<0, 1, 2, 3>)(std::thread::_Invoker<std::tuple<import_export::Importer::importGDALRaster(import_export::ColumnNameToSourceNameMapType, const Catalog_Namespace::SessionInfo*)::<lambda(size_t, int, int)>, long unsigned int, int, int> > * const this) (/usr/include/c++/9/thread:244)
std::thread::_Invoker<std::tuple<import_export::Importer::importGDALRaster(import_export::ColumnNameToSourceNameMapType, const Catalog_Namespace::SessionInfo*)::<lambda(size_t, int, int)>, long unsigned int, int, int> >::operator()(void)(std::thread::_Invoker<std::tuple<import_export::Importer::importGDALRaster(import_export::ColumnNameToSourceNameMapType, const Catalog_Namespace::SessionInfo*)::<lambda(size_t, int, int)>, long unsigned int, int, int> > * const this) (/usr/include/c++/9/thread:251)
...