TIFF performance: overviews and internal tiles: Surprise!

Gregor Mosheh gregor at HOSTGIS.COM
Sat Sep 1 15:59:30 PDT 2007

This test compares the effect of overviews and internal tiling on the 
performance of a TIFF raster data source.


The data is USGS DOQQs (black-n-white) of San Francisco, California. 
Several DOQQs were downloaded and merged into a single GeoTIFF using 

I then created three copies of the TIFF, with overviews, with internal 
tiling, and with both:
    cp original.tif overviews.tif
    gdaladdo overviews.tif 2 4 8 16 32
    gdal_translate -co "TILED=YES" original.tif tiled.tif
    gdal_translate -co "TILED=YES" original.tif tiledandoverviews.tif
    gdaladdo tiledandoverviews.tif 2 4 8 16 32

This set of overview levels gives the following overview sizes, in 
pixels, for this particular TIFF:
    5984x21140, 2992x10570, 1496x5285, 748x2643, 374x1322
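
For anyone checking these numbers, here is a small sketch of how the 
sizes follow from the overview factors. It assumes a base raster of 
11968x42280 pixels (my inference: twice the first overview, which also 
matches the extent below at 1 m per pixel) and that each dimension is 
rounded up, which is consistent with the sizes listed. The function 
name overview_size is made up for illustration.

```shell
#!/bin/bash
# Print the overview size for a base raster and one decimation factor,
# rounding each dimension up (ceiling division).
overview_size() {
    local width=$1 height=$2 factor=$3
    echo "$(( (width + factor - 1) / factor ))x$(( (height + factor - 1) / factor ))"
}

# Assumed base size: 11968x42280 (twice the first overview listed above).
for factor in 2 4 8 16 32; do
    overview_size 11968 42280 "$factor"
done
```

With that assumed base size, the loop prints the five sizes listed 
above.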

All images use the same projection: EPSG 26910, aka UTM zone 10N with 
NAD83 datum.
The spatial extent observed is:
    543577.000 4150151.000 555545.000 4192431.000

A mapfile is created specifying four layers, named BARE, OVERVIEW, 
TILE, and BOTH, each one reading from the corresponding raster. No 
reprojection is being done in the mapfile.


original.tif            483 MB
overviews.tif           646 MB
tiled.tif               487 MB
tiledandoverviews.tif   651 MB

As was expected, adding overviews increases the file size by some 33% 
while internal tiling adds only 4-5 MB to the file size.
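
The 33% figure is roughly what a pyramid of overviews should cost: a 
level with decimation factor f holds about 1/f^2 of the base pixels. A 
minimal sketch of that arithmetic, assuming uncompressed data and 
ignoring TIFF directory overhead (overview_overhead is a made-up helper 
name):

```shell
#!/bin/bash
# Sum 1/f^2 over the given overview factors and print it as a
# percentage of the base image size.
overview_overhead() {
    awk 'BEGIN {
        sum = 0
        for (i = 1; i < ARGC; i++) sum += 1 / (ARGV[i] * ARGV[i])
        printf "%.1f%%\n", 100 * sum
    }' "$@"
}

overview_overhead 2 4 8 16 32
```

For the factors used here the sum is 341/1024, so the script prints 
33.3%, close to the growth from 483 MB to 646 MB observed above.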


# full view of the region
shp2img -m mapfile.map -l BARE     -o bare-fullview.png
Time: 0.484s
shp2img -m mapfile.map -l OVERVIEW -o overview-fullview.png
Time: 0.460s
shp2img -m mapfile.map -l TILE     -o tile-fullview.png
Time: 1.187s
shp2img -m mapfile.map -l BOTH     -o both-fullview.png
Time: 0.457s

# 1 square kilometer pulled from the map
shp2img -m mapfile.map -l BARE     -e 549561 4150150 550561 4151150 -o 
Time: 1.072s
shp2img -m mapfile.map -l OVERVIEW -e 549561 4150150 550561 4151150 -o 
Time: 1.066s
shp2img -m mapfile.map -l TILE     -e 549561 4150150 550561 4151150 -o 
Time: 1.034s
shp2img -m mapfile.map -l BOTH     -e 549561 4150150 550561 4151150 -o 
Time: 1.042s

# a 3km square pulled from the map
shp2img -m mapfile.map -l BARE     -e 547561 4149150 550561 4152150 -o 
Time: 0.839s
shp2img -m mapfile.map -l OVERVIEW -e 547561 4149150 550561 4152150 -o 
Time: 0.822s
shp2img -m mapfile.map -l TILE     -e 547561 4149150 550561 4152150 -o 
Time: 0.840s
shp2img -m mapfile.map -l BOTH     -e 547561 4149150 550561 4152150 -o 
Time: 0.825s

At the full view, tiling actually hurt performance; presumably the 
reader had to seek to every tile and so saved no time or seeks anyway.

At the close-up views, I was quite surprised to see that tiles and 
overviews did indeed have an effect, but that the effect was only in the 
dozens of milliseconds.
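
With differences this small, a single run per layer may not separate a 
real effect from noise. One possible way to average several runs is 
sketched below. The helper name time_avg and the output path in the 
example are hypothetical, and the sketch assumes bash and GNU date 
(for the %N sub-second format).

```shell
#!/bin/bash
# Run a command several times and print the mean wall-clock time in
# seconds. Relies on GNU date for sub-second timestamps (%N).
time_avg() {
    local runs=$1; shift
    local i start end
    start=$(date +%s.%N)
    for (( i = 0; i < runs; i++ )); do
        "$@" > /dev/null 2>&1
    done
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" -v n="$runs" \
        'BEGIN { printf "%.3f\n", (e - s) / n }'
}

# Example (hypothetical output path):
#   time_avg 10 shp2img -m mapfile.map -l TILE -o /tmp/tile-fullview.png
```

The figure includes process startup, as the single-run timings above 
presumably do, so it is only useful for comparing layers against each 
other.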

I then repeated this experiment by fetching several more DOQQs and 
merging them into a 1 GB TIFF, then repeating the same generation and 
testing steps as above.

* The tiling and overview size increase was basically the same: 4 MB 
fixed growth for tiles, and 33% growth for overviews.

* For the full views, all times remained in the same ratio, but increased 
by 50%. Not bad, considering that there was a 100% increase in file size.

* For both the 1km and 3km extraction, the times were basically the same 
as for the 500 MB test above. The increased file size made a difference 
of 16ms in the most dramatic case, which was that of the "bare" 
GeoTIFF. Other increases were 7 and 10ms.

I then repeated again with a 2 GB GeoTIFF (just keep adding counties til 
I hit the limit, right?).

Full view:
   BARE       0.806s
   OVERVIEW   0.759s
   TILE       2.238s
   BOTH       0.747s
3km square:
   BARE       0.867s
   OVERVIEW   0.832s
   TILE       0.840s
   BOTH       0.828s

Again the same results!

The conclusions are quite surprising:

* The presence of overviews makes little difference in performance: 
about 1/20 of a second in the most dramatic case.

* Internal tiling seems likewise ineffectual, a difference of several 

* However, internal tiling will hurt performance substantially in a case 
where the request ends up grabbing the entire image anyway.

Is anybody else able to replicate these findings, or to achieve 
different results in a controlled experiment? Our hardware is rather 
beefy (8 CPU cores and a 4-disk RAID-5) so perhaps that's confounding my 
experiment? Perhaps these performance enhancers are mostly useful on a 
slower single-disk system?

