[Gdal-dev] Advice on Raster Formats
Bill Binko
bill at binko.net
Tue Aug 30 02:30:03 EDT 2005
Hello everyone,
I need some advice on how to store and process raster images I'm working
with. The images have generally come in as either ECW (or JPEG 2000 -- my
choice) or MrSID format. They are aerial images (DOQQs, generally at
1-meter resolution).
I need to be able to serve these up efficiently using MapServer (through
GDAL, of course). The output will be a 24-bit JPEG (with the overlay data
unfortunately compressed as well) that will be sent over the wire to the
client.
My problem is that both the ECW and the MrSIDs take a very long time to
decompress, but GeoTIFF takes an ungodly amount of disk space. Here are
some numbers I just grabbed at random:
I took the same data in GeoTIFF, ECW, and MrSID formats and decompressed
them (to MEM which I thought would be a good comparison to reading into
mapserver).
$ time gdal_translate -of MEM Q2918se.tiff /dev/null
Input file size is 6337, 7082
0...10...20...30...40...50...60...70...80...90...100 - done.
0.72user 0.52system 0:02.79elapsed 44%CPU (0avgtext+0avgdata
0maxresident)k 0inputs+0outputs (0major+37327minor)pagefaults 0swaps
$ time gdal_translate -of MEM Q2918se.sid /dev/null
Input file size is 6337, 7082
0...10...20...30...40...50...60...70...80...90...100 - done.
55.80user 7.78system 1:12.57elapsed 87%CPU (0avgtext+0avgdata
0maxresident)k 0inputs+0outputs (3major+631879minor)pagefaults 0swaps
$ time gdal_translate -of MEM Q2918se.ecw /dev/null
Input file size is 6337, 7082
0...10...20...30...40...50...60...70...80...90...100 - done.
25.92user 1.43system 0:29.80elapsed 91%CPU (0avgtext+0avgdata
0maxresident)k 0inputs+0outputs (0major+38181minor)pagefaults 0swaps
As you can see, MrSID takes more than twice as long as ECW (about 73
seconds elapsed versus 30), while GeoTIFF finishes in under 3 seconds.
The flip side (disk space) can be seen here in a directory listing:
19226192 Aug 30 02:05 Q2918se.ecw
7601858 Aug 30 02:21 Q2918se.sid
134983605 Aug 30 02:00 Q2918se.tiff
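
To put the trade-off in one place, here is a small sketch that turns the
elapsed times and file sizes quoted above into ratios against the GeoTIFF.
The numbers are copied from this message; the variable and function names
are just illustrative.

```python
# Elapsed wall-clock seconds from the three `time` runs above
# (1:12.57 for MrSID is 72.57 seconds).
elapsed_s = {"tiff": 2.79, "sid": 72.57, "ecw": 29.80}

# File sizes in bytes from the directory listing above.
size_bytes = {"tiff": 134983605, "sid": 7601858, "ecw": 19226192}


def relative_to(values, baseline):
    """Return each value divided by the baseline entry's value."""
    return {name: value / values[baseline] for name, value in values.items()}


# How many times longer each format takes to decode than GeoTIFF.
slowdown = relative_to(elapsed_s, "tiff")

# How many times smaller each file is than the GeoTIFF.
compression = {name: size_bytes["tiff"] / size for name, size in size_bytes.items()}

for name in ("tiff", "ecw", "sid"):
    print(f"{name:5s} {slowdown[name]:6.1f}x decode time   "
          f"{compression[name]:5.1f}:1 size reduction")
```

On these figures MrSID is roughly 26 times slower to decode than the
GeoTIFF but about 18 times smaller, while ECW is roughly 11 times slower
and 7 times smaller.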
What I'm considering is keeping the full-scale files in MrSID format, and
keeping overviews (pyramids) of lower-resolution images in ECW. Assuming
I need four or so layers for adequate performance (each one halving the
side length, and so quartering the area), and assuming ECW size scales
with pixel count, the image above would take:
7.6 MB (base) + 19/4 MB (layer 1) + 19/16 MB (layer 2) + 19/64 MB (layer 3)
+ 19/256 MB (layer 4).
That adds up to 13.9 MB, a far cry from the roughly 135 MB TIFF.
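
For anyone who wants to check the arithmetic or try a different number of
layers, here is a minimal sketch of the estimate. It assumes, as the
estimate above does, that an ECW overview shrinks in proportion to its
pixel count (a factor of 4 per layer); real files may not scale that
cleanly. The function name is made up for illustration.

```python
def pyramid_size_mb(base_mb, overview_full_res_mb, levels):
    """Estimate total storage for a base image plus reduced overviews.

    base_mb: size of the full-resolution file (MrSID here).
    overview_full_res_mb: size the overview format (ECW here) needs at
        full resolution; layer k is assumed to take 1/4**k of that.
    levels: number of overview layers.
    """
    overviews = [overview_full_res_mb / 4 ** k for k in range(1, levels + 1)]
    return base_mb + sum(overviews), overviews


total_mb, layers_mb = pyramid_size_mb(7.6, 19.0, 4)

for k, size in enumerate(layers_mb, start=1):
    print(f"layer {k}: {size:.2f} MB")
print(f"total: {total_mb:.1f} MB")
```

Because each layer is a quarter of the previous one, the overviews can
never add up to more than 19/3 MB (about 6.3 MB) no matter how many layers
are added, so the total stays under 14 MB under this assumption.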
This also gives me the ability to keep a lossless (MrSID) file in case I
ever want to convert to GeoTIFFs in the future.
Does this make sense? Are there better ways to get performance out of
MrSIDs? Should some of the higher (and smaller) layers stay GeoTIFF?
I'm sure that someone will say "Hard Drives are Cheap", and I agree, but
it seems that this data is growing at outrageous rates (uncompressed), and
that the time spent reading the entire GeoTIFF becomes a bottleneck
anyway.
I'd really appreciate any advice,
Thanks
Bill