[gdal-dev] gdal_merge is very slow

Leith Bade leith at leithalweapon.geek.nz
Fri Jul 16 21:19:58 EDT 2010


Hi,

I am trying to use gdal_merge to mosaic a very large topo GeoTIFF set.

Uncompressed, the data set is 60GB, but I keep it stored with DEFLATE
compression, which brings it under 10GB.

Once mosaicked, the uncompressed file will be 125GB because of the large
regions of nodata generated. Unfortunately this is too big to store on my
HDD, so I need to apply DEFLATE to it as well.

I am experimenting with only 1/2 the data set at the moment with this
command:
gdal_merge -co COMPRESS=DEFLATE -co ZLEVEL=9 -co BIGTIFF=YES -o NI-50.tif *-00.tif

On my AMD Phenom II X4 @ 3.2GHz, 64 bit Windows 7, 4GB DDR3 CAS-7 RAM
(basically a decently specced PC gaming machine),
it has been running for 16 hours and has so far only made it 60% of the way
(124 of 204 files).

The other issue is that the output file is already 54GB, so I will likely
run out of disk space before it finishes. This suggests that no DEFLATE
compression is happening at all!
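
As a rough check on the figures above, here is a minimal sketch that
extrapolates the final run time and output size. It assumes both grow
linearly with the number of files merged, which is only an approximation
(file sizes and the amount of nodata between them will vary).

```python
# Figures quoted in this message.
files_done = 124
files_total = 204
hours_elapsed = 16
size_so_far_gb = 54

# Assumption: time and output size scale linearly with files merged.
fraction_done = files_done / files_total
projected_hours = hours_elapsed / fraction_done
projected_size_gb = size_so_far_gb / fraction_done

print(f"fraction done: {fraction_done:.0%}")
print(f"projected run time: {projected_hours:.1f} hours")
print(f"projected output size: {projected_size_gb:.1f} GB")
```

Under that assumption the half-size run would take a little over a day and
end up somewhere short of 90GB.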

It is currently maxing out only 1 CPU core, so I assume it is trying to run
DEFLATE but somehow failing to compress at all?

Looking at gdal_merge.py I think the major performance issue is the order in
which it copies the data. Currently it copies an entire image at a time (1
colour channel at a time). Thus DEFLATE will not work due to the rather
random write pattern to each scanline.

I think a much faster method would be to first calculate the destination
rectangle for each file and keep these in some sort of list. After this,
generate the destination file 1 scanline at a time, calculating which source
images intersect it and working left to right, either filling with solid
color or copying scanlines from the source image.
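
To make the idea concrete, here is a minimal sketch in plain Python. It is
not how gdal_merge.py works today and it does not use GDAL at all: the Tile
class, merge_by_scanline() and the nested lists of pixel values are
hypothetical stand-ins for the source files and their destination
rectangles. It also assumes every tile fits inside the output and that,
where tiles overlap, later tiles overwrite earlier ones.

```python
from typing import Iterator, List, NamedTuple


class Tile(NamedTuple):
    """Hypothetical stand-in for one source image placed in the mosaic."""
    x_off: int              # left edge of the destination rectangle
    y_off: int              # top edge of the destination rectangle
    rows: List[List[int]]   # pixel values, one list per scanline

    @property
    def width(self) -> int:
        return len(self.rows[0])

    @property
    def height(self) -> int:
        return len(self.rows)


def merge_by_scanline(tiles: List[Tile], out_width: int, out_height: int,
                      nodata: int = 0) -> Iterator[List[int]]:
    """Yield output scanlines top to bottom, so writes are strictly linear."""
    for y in range(out_height):
        # Start from a solid nodata line, then copy in every intersecting tile.
        row = [nodata] * out_width
        for tile in tiles:
            if tile.y_off <= y < tile.y_off + tile.height:
                src = tile.rows[y - tile.y_off]
                # Assumes the tile lies fully inside the output width.
                row[tile.x_off:tile.x_off + tile.width] = src
        yield row


# Two small 2x2 tiles placed in a 5x3 mosaic.
tiles = [
    Tile(x_off=0, y_off=0, rows=[[1, 1], [1, 1]]),
    Tile(x_off=3, y_off=1, rows=[[2, 2], [2, 2]]),
]
mosaic = list(merge_by_scanline(tiles, out_width=5, out_height=3))
for line in mosaic:
    print(line)
```

In a real tool each yielded line would be written straight to the output
band instead of being collected in memory, so only one scanline per band
needs to be held at a time.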

This allows DEFLATE to work far more efficiently, as the write pattern is
horizontally linear.

What are your views/suggestions/etc.?

Thanks,
Leith Bade
leith at leithalweapon.geek.nz