[gdal-dev] Slow createCopy()

Even Rouault even.rouault at mines-paris.org
Tue Nov 25 14:13:54 EST 2008


Tom,

I've created a 1.2 GB NITF image with the following characteristics:
Driver: NITF/National Imagery Transmission Format
Files: big.ntf
Size is 30000, 20000
....
Band 1 Block=1024x1024 Type=UInt16, ColorInterp=Gray

What is the size (width x height) of your input image?

Then, with GDAL 1.4.2, I ran:
"gdal_translate big.ntf big.tif"

and yes, it was going to take hours because of cache thrashing.

Then I retried with Frank's suggestion:
"gdal_translate --config GDAL_CACHEMAX 200 big.ntf big.tif"

and it took about 1 minute 30 seconds, so I'm not sure why you don't see a
major improvement.

(In fact you just need 30000 (= image width) * 1024 (= block height) * 2
(= sizeof(uint16)) ~= 60 MB of cache, so a 59 MB cache thrashes but a 61 MB
one does not.)
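
To make the arithmetic above explicit, here is a small Python sketch. It is only an illustration and does not call GDAL. The function name is made up for this example. The second figure rests on two assumptions of mine: that the cache stores whole 1024x1024 blocks (so the last, partial column of blocks is padded), and that the cache size is counted in units of 1024*1024 bytes.

```python
import math

def cache_needed_bytes(width, block_w, block_h, bytes_per_sample):
    """Bytes needed to keep one full row of blocks of a single band in cache.

    Returns (raw, padded):
      raw    = width * block_h * bytes_per_sample (the formula from the mail)
      padded = same, but rounding the width up to whole blocks
               (assumption: the cache stores whole blocks)
    """
    raw = width * block_h * bytes_per_sample
    blocks_per_row = math.ceil(width / block_w)
    padded = blocks_per_row * block_w * block_h * bytes_per_sample
    return raw, padded

# Values taken from the gdalinfo output quoted above: 30000 pixels wide,
# 1024x1024 blocks, UInt16 (2 bytes per sample).
raw, padded = cache_needed_bytes(30000, 1024, 1024, 2)
print(raw)               # 61440000 bytes, i.e. roughly 60 MB
print(padded)            # 62914560 bytes
print(padded // 2**20)   # 60
```

Under those assumptions one row of blocks needs 60 units of 1024*1024 bytes, which would be consistent with 59 thrashing and 61 not.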

The CreateCopy() in GDAL 1.4.X was a bit suboptimal when the input is tiled
and the output TIFF is scanline oriented. This was fixed in GDAL 1.5.0 and
later, where you no longer need a huge cache size.
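
To illustrate why a cache that is slightly too small is so much worse than one that is just big enough, here is a toy simulation in Python. It is not GDAL's cache implementation. It assumes a least-recently-used block cache and a reader that walks a tiled image scanline by scanline, and the function name and the tiny image size are invented for the example.

```python
from collections import OrderedDict

def count_block_loads(width, height, block, capacity):
    """Count block loads when reading a tiled image one scanline at a time.

    capacity is the number of blocks the (toy) LRU cache can hold.
    """
    cache = OrderedDict()
    loads = 0
    blocks_per_row = -(-width // block)  # ceiling division
    for y in range(height):
        by = y // block
        for bx in range(blocks_per_row):
            key = (bx, by)
            if key in cache:
                cache.move_to_end(key)         # mark as most recently used
            else:
                loads += 1                     # block must be read and decoded
                cache[key] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
    return loads

# Toy image: 40x16 pixels, 8x8 blocks, so 5 blocks per row and 10 blocks total.
print(count_block_loads(40, 16, 8, 5))  # 10: every block is loaded once
print(count_block_loads(40, 16, 8, 4))  # 80: every block reloaded per scanline
```

With room for a full row of blocks, each block is loaded once. With one block less, every scanline reloads every block it touches, which is the thrashing pattern described above.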

A workaround would be to tile the resulting GeoTIFF image, like this:
"gdal_translate big.ntf big.tif -co TILED=YES -co BLOCKXSIZE=1024 -co BLOCKYSIZE=1024"

This runs in about 1 minute 30 seconds too.

And with GDAL 1.5.3, "gdal_translate big.ntf big.tif" takes 1 minute 30
seconds too.

Anyway, I'd suggest retrying with a more recent GDAL release (1.5.3 or
1.6.0beta) to see if it improves things. Otherwise you should post a more
comprehensive code snippet that reproduces your problem.

You mention the use of a VRT, but we don't know what's inside it. That is
probably the difference between the above experiments and your use case.

Best regards,

Even

On Tuesday 25 November 2008 13:14:26, you wrote:
> Thanks for your help, this did help to speed it up, but it's still taking
> 1.5 hours (in debug mode, but still) to put out a 1.2 gb GeoTiff image.
>
> the input image is a one band (Type=UInt16) 1.2 gb NITF, with Band 1 Block
> = 1024X1024.  If there's any other information that would be useful, let
> me know.
>
> Is there anything else that could be choking it up?
>
> > And even if your input and output datasets are scanline oriented, you are
> > using a VRT as an intermediate, and VRT has blocks of size 128x128...
> >
> > On Monday 24 November 2008 23:10:13, Frank Warmerdam wrote:
> >> Tom V. wrote:
> >> > Hi,
> >> >
> >> > I'm using gdal 1.4.2 to write out NITF and GeoTiff images.  The
> >> > createCopy() call takes only a few minutes to write out a small file,
> >> > 300 or so mb's, but takes anywhere from 1.5-3 hours to write out a
> >> > single band 1.2 gb NITF or GeoTiff.  I have not tried a larger
> >> > multi-spectral image.
> >> >
> >> > The call looks like:
> >> > GDALDatasetH hOutDS = GDALCreateCopy( hDriver, pszDest, (GDALDatasetH)
> >> > poVDS, false, papszCreateOptions, pfnProgress, NULL );
> >> >
> >> > with papszCreateOptions and pfnProgress being NULL in this case,
> >> > hDriver being the appropriate driver, and poVDS being a vrtdataset.
> >> >
> >> > If anyone has any idea why the larger files would take so long, that
> >> > would be awesome!
> >>
> >> Tom,
> >>
> >> You didn't indicate much about the configuration of the input and output
> >> files.  But I suspect cache thrashing.  If that is the case, upping the
> >> memory cache size will help (dramatically).
> >>
> >> Try setting the GDAL_CACHEMAX environment variable to 200 before
> >> running, or call GDALSetCacheMax( 200 * 1024 * 1024 ) (the function
> >> takes a size in bytes).  This would use up to 200MB for the
> >> intermediate block cache.  This is particularly important when going
> >> from scanline oriented formats to tile oriented formats or the other
> >> way around.
> >>
> >> If this isn't already in the FAQ, it really needs to be!
> >>
> >> Best regards,



