[gdal-dev] memory leak in GRIB reader (with Python bindings)
Even Rouault
even.rouault at mines-paris.org
Thu May 24 17:06:21 EDT 2012
Chris,
ok I reproduce your issue.
The GRIB driver actually caches all the raster data from a band the first time
you access it, and never releases it. This is to speed up successive RasterIO
operations on a band, which is a nice feature generally. But if you iterate
over all the bands, it means that GDAL will end up allocating (number_of_bands
* x_size * y_size * sizeof(double)) bytes. In your case: 1129 * 720 * 360 *
8 = about 2.3 GB indeed.
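The arithmetic above can be checked with a few lines of Python. This is only a
sketch of the estimate: the function name grib_cache_bytes is made up for
illustration, and it assumes every band is cached as 8-byte doubles, as
described above.

```python
def grib_cache_bytes(n_bands, x_size, y_size, bytes_per_value=8):
    """Estimate the memory held once every band has been read once.

    Assumes each band is cached whole as doubles (8 bytes per value).
    The function name is hypothetical; it is not part of GDAL.
    """
    return n_bands * x_size * y_size * bytes_per_value


# Figures from this thread: 1129 bands of 720 x 360 values.
total = grib_cache_bytes(1129, 720, 360)
print("%d bytes, roughly %.1f GB" % (total, total / 1e9))
```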
I'm going to try to find a fix where GDAL wouldn't cache more than XXX bytes
from a dataset to avoid this situation.
In the meantime, you can perhaps try reworking your algorithm to iterate on a
limited number of bands (let's say 100) at a time. "At a time" means that
between each iteration you close and re-open the dataset. (GDAL releases the
memory it has cached when the dataset is closed.) Or, more simply, close and
re-open the dataset each time you process a band (opening your dataset doesn't
seem to be slow).
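A minimal sketch of that workaround with the Python bindings follows. The
names process_in_batches, process_band and open_dataset are hypothetical
helpers introduced for illustration. It assumes that dropping the last Python
reference to the dataset (ds = None) is what closes it, and that gdal.Open,
RasterCount, GetRasterBand and ReadAsArray are used as in your script.

```python
def process_in_batches(path, process_band, batch_size=100, open_dataset=None):
    """Read every band, re-opening the dataset between batches of bands.

    process_band(index, array) is a caller-supplied callback (hypothetical).
    open_dataset defaults to gdal.Open; it is a parameter only so that the
    loop can be exercised without GDAL installed.
    """
    if open_dataset is None:
        from osgeo import gdal  # imported lazily, only when actually needed
        open_dataset = gdal.Open

    ds = open_dataset(path)
    if ds is None:
        raise IOError("could not open %s" % path)
    n_bands = ds.RasterCount
    ds = None  # close: this releases whatever the driver has cached

    for start in range(1, n_bands + 1, batch_size):
        ds = open_dataset(path)
        stop = min(start + batch_size, n_bands + 1)
        for index in range(start, stop):  # GDAL band numbers start at 1
            band = ds.GetRasterBand(index)
            process_band(index, band.ReadAsArray())
            band = None  # do not keep band objects around
        ds = None  # close again before starting the next batch
    return n_bands


# Example call (the file name is the one from this thread):
# process_in_batches("ocnu5.01.2012052218.daily.grb2",
#                    lambda i, data: None, batch_size=100)
```

With batch_size=1 this becomes the simpler variant of closing and re-opening
for every band. The cached data should then be bounded by roughly batch_size
bands rather than by the whole file.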
Best regards,
Even
On Thursday 24 May 2012 21:55:15, Chris Barker wrote:
> Hi folks,
>
> I'm finding what appears to be a memory leak, using the GRIB reader,
> with the Python bindings.
>
> What I'm trying to do is read the data one band at a time, then throw
> it away and read the next band -- there are 1129 bands in the file at
> hand, and I can't hold it all in memory (32 bit still...)
>
> However, when I do this, memory use just keeps climbing.
>
> Should the memory be freed? I would expect so.
>
> I'm using RasterBand.ReadAsArray()
>
> Is this a leak? Or is it supposed to keep it around in memory?
>
> Either way, is there a way to force it to release that memory? (I'm
> already doing an explicit del and gc.collect call, so I don't think
> it's a Python reference counting issue.)
>
> I've enclosed a simple test script -- watch the memory climb.
>
> The data file is too big (186MB) to enclose here, you can get it here:
>
> http://nomads.ncep.noaa.gov/pub/data/nccf/com/cfs/prod/cfs/cfs.20120522/18/
> time_grib_01/ocnu5.01.2012052218.daily.grb2
>
> If you want to give this a try.
>
>
> (note -- GRIB is giving some pretty good compression -- this climbs to
> 2.3GB when I read it)
>
> I could actually live with the 2.3GB -- but in my real use case, I'm
> reading two of these at the same time, so I max out what I can do with
> 32 bit python...
>
> GDAL 1.8.1
> Python 2.7 (32 bit Intel)
> OS-X 10.6
> I think it's the KyngChaos build of GDAL.
>
> Thanks,
> -Chris