[gdal-dev] Memory use in GDALDriver::CreateCopy()

Even Rouault even.rouault at mines-paris.org
Wed Jan 13 14:46:22 EST 2010


Ozy,

The interesting info is that your input image is JPEG2000 compressed.
This explains why you were able to read a scanline-oriented NITF with
a block width > 9999. My guess would be that the leak is in the
JPEG2000 driver in question, so this may be more a problem on the
reading side than on the writing side. You can check that by running:

gdalinfo -checksum NITF_IM:0:input.ntf

If you see the memory increasing again and again, there's definitely a
problem. In case you have GDAL configured with several JPEG2000
drivers, you'll have to find out which one is used: JP2KAK (Kakadu
based), JP2ECW (ECW SDK based), JPEG2000 (Jasper based, but I doubt
you're using it with such a big dataset) or JP2MRSID. Normally, they
are selected in the order I've described (JP2KAK first, etc.). As
you're on Linux, it might be worth running valgrind to see if it
reports leaks. As it might be very slow on such a big dataset, you
could try translating just a smaller window of your input dataset,
like

valgrind --leak-check=full gdal_translate NITF_IM:0:input.ntf output.tif 
-srcwin 0 0 37504 128

I've selected GeoTIFF as the output format; the output format
shouldn't matter if the problem is indeed in the reading part. As far
as the window size is concerned, it's difficult to guess which value
will show the leak.
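
If you're not sure which JPEG2000 driver gets picked in your build,
something along these lines should help (just a sketch, I haven't
tested the exact commands on your setup): list the registered drivers,
then re-run the checksum with a given driver disabled through the
GDAL_SKIP configuration option and see whether the memory behaviour
changes:

gdalinfo --formats | grep -iE 'jp2|jpeg2000'
GDAL_SKIP=JP2ECW gdalinfo -checksum NITF_IM:0:input.ntf

If the growth disappears when a particular driver is skipped, that
driver is the likely suspect.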

Filing a ticket with your findings on GDAL Trac might be appropriate.

It might be good to try with GDAL trunk first though, in case the leak
has been fixed since 1.6.2. The 1.7.0 beta2 source archive can be
found here: http://download.osgeo.org/gdal/gdal-1.7.0b2.tar.gz
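
Building it is the usual configure/make routine; roughly (untested
here, and the name of the extracted directory may differ slightly):

wget http://download.osgeo.org/gdal/gdal-1.7.0b2.tar.gz
tar xzf gdal-1.7.0b2.tar.gz
cd gdal-1.7.0*
./configure
make
make install

You can pass --prefix to configure if you'd rather not overwrite your
system GDAL.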

Best regards,

Even

ozy sjahputera wrote:
> Hi Even,
>
> yes, I tried:
> gdal_translate -of "NITF" -co "ICORDS=G" -co "BLOCKXSIZE=128" -co 
> "BLOCKYSIZE=128"  NITF_IM:0:input.ntf output.ntf
>
> I monitored the memory use using top and it was steadily increasing 
> till it reached 98.4% (I have 8GB of RAM and 140 GB of local disk for 
> swap etc.) before the node died (not just the program, but the whole 
> system just stopped responding).
>
> My GDAL version is 1.6.2.
>
> gdalinfo on this image shows the raster size of (37504, 98772) and 
> Block=37504x1. 
> The image is compressed with the JPEG2000 option and contains two
> subdatasets (data and cloud data; I used only the data subdataset
> for the gdal_translate test).
>
> Band info from gdalinfo:
> Band 1 Block=37504x1 Type=UInt16, ColorInterp=Gray
>
> Ozy
>
> On Tue, Jan 12, 2010 at 5:38 PM, Even Rouault
> <even.rouault at mines-paris.org> wrote:
>
>     Ozy,
>
>     Did you try with gdal_translate -of NITF src.tif output.tif -co
>     BLOCKSIZE=128? Does it give similar results?
>
>     I'm a bit surprised that you even managed to read a 40Kx100K
>     NITF file organized as scanlines. Until very recently there was
>     a limit that prevented reading blocks with one dimension bigger
>     than 9999. This was fixed recently in trunk (see ticket
>     http://trac.osgeo.org/gdal/ticket/3263 ) and in branches/1.6, but
>     the fix has not made it into an official release yet. So which
>     GDAL version are you using?
>
>     Does the output of gdalinfo on your scanline-oriented input NITF
>     give something like:
>     Band 1 Block=40000x1 Type=Byte, ColorInterp=Gray
>
>     Is your input NITF compressed or uncompressed?
>
>     Anyway, with latest trunk, I've simulated creating a similarly
>     large NITF image with the following Python snippet:
>
>     import gdal
>     ds = gdal.GetDriverByName('NITF').Create('scanline.ntf', 40000, 100000)
>     ds = None
>
>     and then creating the tiled NITF :
>
>     gdal_translate -of NITF scanline.ntf tiled.ntf -co BLOCKSIZE=128
>
>     The memory consumption is very reasonable (less than 50 MB: the
>     default block cache size of 40 MB plus temporary buffers), so I'm
>     not sure why you would see increasing memory use.
>
>     ozy sjahputera wrote:
>     > I was trying to make a copy of a very large NITF image (about
>     > 40Kx100K pixels) using GDALDriver::CreateCopy(). The new file
>     > was set to have a different block size (the input was a
>     > scanline image, the output is to have a 128x128 block size).
>     > The program keeps getting killed by the system (Linux). I
>     > monitored the memory use of the program as it was executing
>     > CreateCopy and the memory use was steadily increasing as the
>     > progress indicator from CreateCopy was moving forward.
>     >
>     > Why does CreateCopy() use so much memory? I have not perused the
>     > source code of CreateCopy() yet, but I am guessing it employs
>     > RasterIO() to perform the read/write?
>     >
>     > I was trying different sizes for the GDAL cache: 64MB, 256MB,
>     > 512MB, 1GB, and 2GB. The program got killed with all of these
>     > cache sizes. In fact, my Linux box became unresponsive when I
>     > set GDALSetCacheMax() to 64MB.
>     >
>     > Thank you.
>     > Ozy
>     >