[gdal-dev] memory leak in GRIB reader (with Python bindings)

Even Rouault even.rouault at mines-paris.org
Thu May 24 17:06:21 EDT 2012


Chris,

OK, I can reproduce your issue.

The GRIB driver actually caches all the raster data from a band the first time 
you access it, and never releases it. This is to speed up successive RasterIO 
operations on a band, which is a nice feature generally. But if you iterate 
over all the bands, it means that GDAL will end up allocating (number_of_bands 
* x_size * y_size * sizeof(double)) bytes. In your case: 1129 * 720 * 360 * 
8 = 2.3 GB indeed.

I'm going to try to find a fix where GDAL wouldn't cache more than XXX bytes 
from a dataset to avoid this situation.

In the meantime, you can perhaps try reworking your algorithm to iterate over a 
limited number of bands (say, 100) at a time. "At a time" means that 
between each iteration you close and re-open the dataset. (GDAL will nicely 
release the memory it has cached when the dataset is closed.) Or, more simply, 
close and re-open the dataset each time you process a band (opening your 
dataset doesn't seem to be that slow).
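A rough sketch of what I mean, in Python. The helper names (band_chunks, read_in_chunks, process_band) and the chunk size are just illustrative; adapt to your script:

```python
def band_chunks(band_count, chunk_size):
    """Yield (first, last) 1-based band ranges of at most chunk_size bands."""
    for start in range(1, band_count + 1, chunk_size):
        yield start, min(start + chunk_size - 1, band_count)


def read_in_chunks(path, process_band, chunk_size=100):
    """Read every band, closing and re-opening the dataset between chunks
    so the GRIB driver's per-band cache is released each time."""
    from osgeo import gdal  # requires the GDAL Python bindings

    ds = gdal.Open(path)
    band_count = ds.RasterCount
    ds = None  # closing the dataset frees the cached band data

    for first, last in band_chunks(band_count, chunk_size):
        ds = gdal.Open(path)
        for i in range(first, last + 1):
            process_band(ds.GetRasterBand(i).ReadAsArray())
        ds = None  # release the cache before the next chunk
```

With 1129 bands and a chunk size of 100, at most 100 * 720 * 360 * 8 bytes (~200 MB) of band data should be cached at any moment instead of the full 2.3 GB.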

Best regards,

Even

Le jeudi 24 mai 2012 21:55:15, Chris Barker a écrit :
> Hi folks,
> 
> I'm finding what appears to be a memory leak, using the GRIB reader,
> with the python bindings.
> 
> What I'm trying to do is read the data one band at a time, then throw
> it away and read the next band -- there are 1129 bands in the file at
> hand, and I can't hold it all in memory (32 bit still...)
> 
> However, when I do this, memory use just keeps climbing.
> 
> Should the memory be freed? I would expect so.
> 
> I'm using RasterBand.ReadAsArray()
> 
> Is this a leak? or is supposed to keep it around in memory?
> 
> Either way, is there a way to force it to release that memory? (I'm
> already doing an explicit del and a gc.collect() call, so I don't think
> it's a Python reference-counting issue.)
> 
> I've enclosed a simple test script -- watch the memory climb.
> 
> The data file is too big (186MB) to enclose here; you can get it here:
> 
> http://nomads.ncep.noaa.gov/pub/data/nccf/com/cfs/prod/cfs/cfs.20120522/18/time_grib_01/ocnu5.01.2012052218.daily.grb2
> 
> If you want to give this a try.
> 
> 
> (note -- GRIB is giving some pretty good compression -- this climbs to
> 2.3GB when I read it)
> 
> I could actually live with the 2.3GB -- but in my real use case, I'm
> reading two of these at the same time, so I max out what I can do with
> 32 bit python...
> 
> GDAL 1.8.1
> Python 2.7 (32 bit Intel)
> OS-X 10.6
> I think it's the Kyng Chaos build of GDAL.
> 
> Thanks,
>    -Chris

