[gdal-dev] memory leak in GRIB reader (with Python bindings)
Chris Barker
chris.barker at noaa.gov
Thu May 24 17:18:04 EDT 2012
Even,
Thanks so much!
> ok I reproduce your issue.
>
> The GRIB driver actually caches all the raster data from a band the first time
> you access it, and never releases it.
I just tested reading only a subset of the band (because I don't need
the whole thing), and it used exactly the same amount of memory --
which fits this data model.
> This is to speed-up successive RasterIO
> operations on a band, which is a nice feature generally.
Maybe -- but what is GDAL's usual policy? I would have expected it not
to read the data until you ask for it, and I would have expected to keep
a copy myself if I wanted to use it again.
> But if you iterate
> over all the bands, it means that GDAL will end up allocating (number_of_bands
> * x_size * y_size * sizeof(double) ) bytes. In your case : 1129 * 720 * 360 *
> 8 = 2.3 GB indeed.
yup.
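
For reference, the arithmetic quoted above can be checked with a few lines of Python. This is only a sketch of the estimate: the function name `grib_cache_bytes` is made up for illustration, and the 8 bytes per value assumes the driver holds each band as doubles, as described in the quoted text.

```python
def grib_cache_bytes(n_bands, x_size, y_size, bytes_per_value=8):
    """Estimate memory held if every band is cached in full.

    bytes_per_value=8 assumes the cache stores doubles, as stated
    in the quoted message; it is not read from GDAL itself.
    """
    return n_bands * x_size * y_size * bytes_per_value


# Figures from this thread: 1129 bands of 720 x 360 cells.
total = grib_cache_bytes(1129, 720, 360)
print(f"{total} bytes = {total / 1e9:.2f} GB")
```
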
> I'm going to try to find a fix where GDAL wouldn't cache more than XXX bytes
> from a dataset to avoid this situation.
That would be great -- I can see caching a full band, so you could
pull out pieces efficiently, but caching the entire thing seems like a
bad idea. Even in the single-band case, I'd expect the user to pull the
whole thing if s/he wanted that.
> In the meantime, you can perhaps try reworking your algorithm to iterate on a
> limited number of bands (let's say 100) at a time. "At a time" means that
> between each iteration you close and re-open the dataset.
Actually, the code already does that (I'm using this to translate to
another file format, and break it up into smaller chunks). I hadn't
thought to close the dataset in between -- that will be easy to do.
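
A minimal sketch of that workaround with the Python bindings might look like the following. The names `band_chunks`, `process_in_chunks` and `handle_band` are hypothetical, the chunk size of 100 is just the figure suggested above, and it assumes that dropping the last reference to the dataset closes it and lets the driver free its cached bands.

```python
def band_chunks(n_bands, chunk_size=100):
    """Yield (first, last) 1-based, inclusive band ranges."""
    for first in range(1, n_bands + 1, chunk_size):
        yield first, min(first + chunk_size - 1, n_bands)


def process_in_chunks(path, handle_band, chunk_size=100):
    """Call handle_band(index, array) for every band, re-opening
    the dataset between chunks so cached band data can be released.
    """
    # Imported here so band_chunks() stays usable without GDAL installed.
    from osgeo import gdal

    ds = gdal.Open(path)
    if ds is None:
        raise IOError("could not open %s" % path)
    n_bands = ds.RasterCount
    ds = None  # close before starting the chunked passes

    for first, last in band_chunks(n_bands, chunk_size):
        ds = gdal.Open(path)
        for i in range(first, last + 1):
            handle_band(i, ds.GetRasterBand(i).ReadAsArray())
        # Assumption: releasing the only reference closes the dataset.
        ds = None
```

With 1129 bands and a chunk size of 100 this opens the file 12 times for the data passes, the last pass covering bands 1101 to 1129.
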
Thanks for the very fast reply!
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov