TIFF performance: overviews and internal tiles: Surprise!
Gregor Mosheh
gregor at HOSTGIS.COM
Sat Sep 1 15:59:30 PDT 2007
This test compares the effect of overviews and internal tiling on the
performance of a TIFF raster data source.
*** SETUP
The data is USGS DOQQs (black-n-white) of San Francisco, California.
Several DOQQs were downloaded and merged into a single GeoTIFF using
gdal_merge.py.
I then created three copies of the TIFF, with overviews, with internal
tiling, and with both:
cp original.tif overviews.tif
gdaladdo overviews.tif 2 4 8 16 32
gdal_translate -co "TILED=YES" original.tif tiled.tif
gdal_translate -co "TILED=YES" original.tif tiledandoverviews.tif
gdaladdo tiledandoverviews.tif 2 4 8 16 32
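Before timing anything, it may be worth confirming that each copy really has the structure its name claims. A minimal sketch, assuming gdalinfo is on the PATH; tiff_structure is a hypothetical helper name, and it only filters gdalinfo's report down to the block-size and overview lines:

```sh
# Hypothetical filter: keep only the lines of a gdalinfo report that show
# the block (strip or tile) size and the overview list.
tiff_structure() {
    grep -E 'Block=|Overviews:'
}

# Usage (requires GDAL and the four files from the post):
#   for f in original.tif overviews.tif tiled.tif tiledandoverviews.tif; do
#       echo "== $f"; gdalinfo "$f" | tiff_structure
#   done
```

With TILED=YES and no explicit block size, GDAL's GeoTIFF driver defaults to 256x256 blocks, so the tiled copies should report Block=256x256.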
For this particular TIFF, these overview levels give the following
overview resolutions:
5984x21140, 2992x10570, 1496x5285, 748x2643, 374x1322
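These sizes follow from dividing the full-resolution dimensions by each gdaladdo level and rounding up. A minimal sketch, assuming the merged raster is 11968x42280 pixels (not stated in the post; inferred by doubling the first overview, and consistent with the extent given below at 1 m per pixel); overview_dim is a hypothetical helper:

```sh
# Hypothetical helper: ceiling division of a pixel dimension by an overview level.
overview_dim() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Assumed full-resolution size (inferred, not stated in the post).
WIDTH=11968
HEIGHT=42280

for level in 2 4 8 16 32; do
    echo "level $level: $(overview_dim "$WIDTH" "$level")x$(overview_dim "$HEIGHT" "$level")"
done
```

Rounding up rather than down is what reproduces the 748x2643 and 374x1322 entries.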
All images use the same projection: EPSG 26910, aka UTM zone 10N with
NAD83 datum.
The spatial extent observed is:
543577.000 4150151.000 555545.000 4192431.000
A mapfile is created specifying four layers, named BARE, OVERVIEW,
TILE, and BOTH, each one reading from the corresponding raster. No
reprojection is being done in the mapfile.
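The mapfile itself was not posted. A minimal sketch of what it might look like, written out from the shell; the NAME and SIZE values are assumptions, while the layer names, file names, and extent come from the text above:

```sh
MAPFILE=mapfile.map

# Map header. NAME and SIZE are assumptions; the post does not give them.
cat > "$MAPFILE" <<'EOF'
MAP
  NAME "tiff_test"
  SIZE 600 600
  IMAGETYPE PNG
  EXTENT 543577 4150151 555545 4192431
EOF

# One RASTER layer per test file. No PROJECTION blocks are written,
# so nothing is reprojected.
for pair in BARE:original.tif OVERVIEW:overviews.tif TILE:tiled.tif BOTH:tiledandoverviews.tif; do
    name=${pair%%:*}
    file=${pair#*:}
    cat >> "$MAPFILE" <<EOF
  LAYER
    NAME "$name"
    TYPE RASTER
    STATUS ON
    DATA "$file"
  END
EOF
done

echo 'END' >> "$MAPFILE"
```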
*** STORAGE SPACE
original.tif 483 MB
overviews.tif 646 MB
tiled.tif 487 MB
tiledandoverviews.tif 651 MB
As was expected, adding overviews increases the file size by some 33%
while internal tiling adds only 4-5 MB to the file size.
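The roughly one-third growth is what the pixel counts predict: an overview at level n holds 1/n^2 as many pixels as the base image. A quick check, assuming file size scales with pixel count (that is, the TIFFs are uncompressed, which the post does not state); overview_overhead is a hypothetical helper:

```sh
# Fraction of the base image's pixel count added by a set of overview levels.
overview_overhead() {
    printf '%s\n' "$@" |
        awk '{ sum += 1 / ($1 * $1) } END { printf "%.3f\n", sum }'
}

# Predicted overhead for the levels passed to gdaladdo.
overview_overhead 2 4 8 16 32    # prints 0.333

# Measured growth from the sizes reported above (MB).
awk 'BEGIN { printf "%.3f\n", (646 - 483) / 483 }'    # prints 0.337
```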
*** RUNTIME SPEED
# full view of the region
shp2img -m mapfile.map -l BARE -o bare-fullview.png
Time: 0.484s
shp2img -m mapfile.map -l OVERVIEW -o overview-fullview.png
Time: 0.460s
shp2img -m mapfile.map -l TILE -o tile-fullview.png
Time: 1.187s
shp2img -m mapfile.map -l BOTH -o both-fullview.png
Time: 0.457s
# 1 square kilometer pulled from the map
shp2img -m mapfile.map -l BARE -e 549561 4150150 550561 4151150 -o bare-1km.png
Time: 1.072s
shp2img -m mapfile.map -l OVERVIEW -e 549561 4150150 550561 4151150 -o overview-1km.png
Time: 1.066s
shp2img -m mapfile.map -l TILE -e 549561 4150150 550561 4151150 -o tile-1km.png
Time: 1.034s
shp2img -m mapfile.map -l BOTH -e 549561 4150150 550561 4151150 -o both-1km.png
Time: 1.042s
# a 3km square pulled from the map
shp2img -m mapfile.map -l BARE -e 547561 4149150 550561 4152150 -o bare-3km.png
Time: 0.839s
shp2img -m mapfile.map -l OVERVIEW -e 547561 4149150 550561 4152150 -o overview-3km.png
Time: 0.822s
shp2img -m mapfile.map -l TILE -e 547561 4149150 550561 4152150 -o tile-3km.png
Time: 0.840s
shp2img -m mapfile.map -l BOTH -e 547561 4149150 550561 4152150 -o both-3km.png
Time: 0.825s
At the full view, tiling on its own actually hurt performance (1.187s
versus 0.484s for the bare file); presumably this was due to it seeking
through every tile and so not saving any time or seeks anyway.
At the close-up views, I was quite surprised to see that tiles and
overviews did indeed have an effect, but that the effect was only a few
dozen milliseconds at most.
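Gaps this small can be hard to separate from run-to-run variation, so repeating each render a few times may help. A minimal sketch, assuming GNU date (for its %N nanosecond field) and awk; time_runs is a hypothetical helper, not part of MapServer:

```sh
# Run a command several times and print the wall-clock seconds of each run.
# Assumes GNU date; other platforms need a different clock source.
time_runs() {
    runs=$1
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s.%N)
        "$@" > /dev/null 2>&1
        end=$(date +%s.%N)
        awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
        i=$((i + 1))
    done
}

# Usage (requires shp2img and the mapfile from the post):
#   for layer in BARE OVERVIEW TILE BOTH; do
#       echo "== $layer"
#       time_runs 5 shp2img -m mapfile.map -l "$layer" -o "$layer-fullview.png"
#   done
```

The first run of each layer may also include the cost of pulling the file into the operating system's cache, so it is worth reading the individual times rather than only an average.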
I then repeated this experiment by fetching several more DOQQs and
merging them into a 1 GB TIFF, then repeating the same generation and
testing steps as above.
* The tiling and overview size increase was basically the same: 4 MB
fixed growth for tiles, and 33% growth for overviews.
* For the full views, all times remained in the same ratio, but increased
by 50%. Not bad, considering that there was a 100% increase in file size.
* For both the 1km and 3km extractions, the times were basically the same
as for the 500 MB test above. The increased file size made a difference
of 0.016s in the most dramatic case, which was that of the "bare"
GeoTIFF. Other increases were 7 and 10ms.
I then repeated again with a 2 GB GeoTIFF (just keep adding counties til
I hit the limit, right?).
Full view:
BARE 0.806s
OVERVIEW 0.759s
TILE 2.238s
BOTH 0.747s
3km square:
BARE 0.867s
OVERVIEW 0.832s
TILE 0.840s
BOTH 0.828s
Again the same results!
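Expressed relative to the bare file, the full-view numbers make the pattern easier to see. A small sketch using awk; relative_times is a hypothetical helper and the figures are the ones reported above:

```sh
# Print each layer's time and its ratio to the first row (the baseline).
relative_times() {
    awk 'NR == 1 { base = $2 } { printf "%-8s %.3fs  %.2fx\n", $1, $2, $2 / base }'
}

# Full-view times (seconds) copied from the 2 GB results above.
printf '%s\n' 'BARE 0.806' 'OVERVIEW 0.759' 'TILE 2.238' 'BOTH 0.747' | relative_times
```

On these figures the tiled-only file takes roughly 2.8 times as long as the bare file, while both variants with overviews come out about 6 to 7% faster.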
The conclusions are quite surprising:
* The presence of overviews makes little difference in performance:
about 1/20 of a second in the most dramatic case.
* Internal tiling seems likewise ineffectual for close-up views: a
difference of a few dozen milliseconds at most.
* However, internal tiling will hurt performance substantially in a case
where the request ends up grabbing the entire image anyway.
Is anybody else able to replicate these findings, or to achieve
different results in a controlled experiment? Our hardware is rather
beefy (8 CPU cores and a 4-disk RAID-5) so perhaps that's confounding my
experiment? Perhaps these performance enhancers are mostly useful on a
slower single-disk system?