[gdal-dev] optimizing GTI performance (follow-up on #14063)

Even Rouault even.rouault at spatialys.com
Mon Mar 16 10:47:43 PDT 2026


Vincent,

The recent pull requests #14089 and #14094 were mostly tested on local 
datasets because that is faster and gives more reproducible timings, 
but it is true that there is a significant difference in performance 
with remote datasets.

This comes from the fact that gdal_translate (and the underlying 
GDALDatasetCopyWholeRaster() method) and gdalwarp use different 
strategies to compute their processing chunks. From my 
testing, you can significantly improve performance with 
gdal_translate on the GTI by using:

* -co INTERLEAVE=PIXEL. Otherwise, as both the input and output datasets 
are band interleaved, GDALDatasetCopyWholeRaster() will proceed band by 
band, which does not leave much opportunity for the GTiff driver to 
parallelize pixel acquisition. This is the main factor at play.

* and -co COMPRESS=ZSTD (or another method of your liking). This will 
force GDALDatasetCopyWholeRaster() to operate on chunks that are at 
least one block height tall.

* and increasing the tile height, to 1024 for example (-co BLOCKYSIZE=1024).
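
As a rough sketch, the three options above could be combined with the 
gdal_translate test from the original message as follows. This is not a 
benchmarked command: the file names and window are taken from that 
message, the 1024 tile height and ZSTD are only the example values 
mentioned above, and ZSTD assumes a GDAL build with ZSTD support. The 
CO and CMD variable names are just illustrative.

```shell
# Creation options suggested above (example values, not tuned).
CO="-co TILED=YES -co BLOCKXSIZE=512 -co BLOCKYSIZE=1024"
CO="$CO -co INTERLEAVE=PIXEL -co COMPRESS=ZSTD"

# Same window, input and output names as the gdal_translate test
# in the original message.
CMD="gdal_translate -projwin -5.8 5.8 -5.6 5.6 $CO"
CMD="$CMD test_1tiff.gti.gpkg gt_result_gti_allbands.tif"

# Print the command for review; nothing is executed here.
echo "$CMD"
```

Once the printed command looks right, it can be timed with 
"time $CMD" and compared against the band-interleaved runs.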

Even


On 16/03/2026 at 16:58, Vincent Schut via gdal-dev wrote:
> This is a follow-up on https://github.com/OSGeo/gdal/issues/14063.
>
> I decided to take it to the list instead, because I'm not sure
> whether this is an issue or simply a discussion on performance. (Let
> me know if you'd rather take this to gh issues again; happy to do so).
>
> Please read the original issue first, to avoid wasting time on
> questions that were already answered there :-)
>
> Short context: trying to get good performance out of a GTI mosaic
> which wraps (potentially a lot of) remote tifs (on AWS S3). The tifs
> are in UTM, the mosaic is reprojecting to epsg:4326.
>
> After the performance optimizations made by Even as response to
> #14063 (https://github.com/OSGeo/gdal/pull/14089 and
> https://github.com/OSGeo/gdal/pull/14094), I did some testing. The
> results were rather... confusing.
>
> All these tests were done with a fresh gdal master build (GDAL
> 3.13.0dev-28b33d81c7, released 2026/03/15).
>
> What I expected (hoped):
> - wrapping a (single) remote tif with a GTI should not affect
>   performance too much
> - gdal_translate to have better or similar performance on the GTI
>   than gdalwarp, especially with using -oo WARPING_MEMORY_SIZE. As
>   Even said, using gdalwarp means the warping logic is used twice,
>   where it is only needed once. Using gdal_translate should therefore
>   show better performance.
>
> What I found:
> - gdalwarp on the GTI is about 50% slower than on the remote tiff
>   directly (but maybe that is simply the cost of having a GTI
>   wrapper? I hoped it would be less)
> - gdal_translate is still waaay slower (10x) than gdalwarp. Using
>   WARPING_MEMORY_SIZE hardly seems to make a difference.
> - this is only/mainly when targeting remote tifs. When using local
>   files, the timings are very different. But: the remote case is the
>   case we need, and what we want to optimize. So I focus on that.
>
> My tests were on a 22-core (according to "nproc") machine, 64G ram.
> Results will probably vary with number of cores, network speed,
> physical location (I'm in Europe, these tifs are probably in a
> bucket in the US), and also the size of the block extracted. All
> tests below extract the same 0.2 degree block.
>
> These are the commands I ran as test:
>
> # export config options
> export GDAL_CACHEMAX=1GB \
>   GDAL_NUM_THREADS=ALL_CPUS \
>   DISABLE_READDIR_ON_OPEN=EMPTY_DIR \
>   CPL_VSIL_CURL_ALLOWED_EXTENSIONS="tiff" \
>   AWS_NO_SIGN_REQUEST=YES
>
> # create a GTI wrapping 1 remote tif
> gdal driver gti create --resolution 0.00006,0.00006 --ot Int8 \
>   --band-count 64 --nodata -128 --dst-crs epsg:4326 --of gpkg \
>   /vsis3/us-west-2.opendata.source.coop/tge-labs/aef/v1/annual/2024/30N/xpzba7dllw4la2007-0000008192-0000000000.tiff \
>   test_1tiff.gti.gpkg
>
> # start of tests
> # each test extracts the same 0.2 x 0.2 degree block.
>
> # gdalwarp on the remote tif directly, doing the exact same
> # reprojecting as the GTI is doing
> gdalwarp -overwrite -te -5.8 5.6 -5.6 5.8 -tr 0.00006 -0.00006 \
>   -t_srs epsg:4326 -wm 1G -co tiled=yes -co blockxsize=512 \
>   -co blockysize=512 -co interleave=band \
>   /vsis3/us-west-2.opendata.source.coop/tge-labs/aef/v1/annual/2024/30N/xpzba7dllw4la2007-0000008192-0000000000.tiff \
>   result_tiff_allbands.tif
> # time: 0m33s
>
> # gdalwarp on the GTI which wraps the same tif
> gdalwarp -overwrite -te -5.8 5.6 -5.6 5.8 -wm 1G -co tiled=yes \
>   -co blockxsize=512 -co blockysize=512 -co interleave=band \
>   test_1tiff.gti.gpkg result_gti_allbands.tif
> # time: 0m47s
>
> # gdal_translate on the GTI with -oo WARPING_MEMORY_SIZE set to 1G
> gdal_translate -oo WARPING_MEMORY_SIZE=1GB -projwin -5.8 5.8 -5.6 5.6 \
>   -co tiled=yes -co blockxsize=512 -co blockysize=512 \
>   -co interleave=band test_1tiff.gti.gpkg gt_result_gti_allbands.tif
> # time: 5m31s
>
> # gdal_translate on the GTI without the -oo WARPING_MEMORY_SIZE option
> gdal_translate -projwin -5.8 5.8 -5.6 5.6 -co tiled=yes \
>   -co blockxsize=512 -co blockysize=512 -co interleave=band \
>   test_1tiff.gti.gpkg gt_result_gti_allbands.tif
> # time: 5m16s
>
> --
> Vincent Schut
> Remote Sensing Software Engineer
> +31 302272679 ~ Maliebaan 22 | 3581CP | Utrecht | Netherlands
> Linkedin <https://www.linkedin.com/company/satelligence/> ~
> satelligence.com <http://www.satelligence.com>
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev

--
http://www.spatialys.com
My software is free, but my time generally not.