[Gdal-dev] Advice on Raster Formats

Bill Binko bill at binko.net
Tue Aug 30 02:30:03 EDT 2005


Hello everyone,

I need some advice on how to store and process the raster images I'm
working with.  The images generally arrive as either ECW (or JPEG 2000,
my choice) or MrSID.  They are aerial images (DOQQs, generally at
1-meter resolution).

I need to be able to serve these up efficiently using MapServer (through
GDAL, of course).  The output will be a 24-bit JPEG (with overlay data
unfortunately compressed as well) that will be sent over the wire to the
client.

My problem is that both the ECW and the MrSIDs take a very long time to 
decompress, but GeoTIFF takes an ungodly amount of disk space.  Here are 
some numbers I just grabbed at random:

I took the same data in GeoTIFF, ECW, and MrSID formats and decompressed
each file to MEM, which I thought would be a good stand-in for reading
it into MapServer.

$ time gdal_translate -of MEM Q2918se.tiff /dev/null
Input file size is 6337, 7082
0...10...20...30...40...50...60...70...80...90...100 - done.
0.72user 0.52system 0:02.79elapsed 44%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+37327minor)pagefaults 0swaps

$ time gdal_translate -of MEM Q2918se.sid /dev/null
Input file size is 6337, 7082
0...10...20...30...40...50...60...70...80...90...100 - done.
55.80user 7.78system 1:12.57elapsed 87%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (3major+631879minor)pagefaults 0swaps

$ time gdal_translate -of MEM Q2918se.ecw /dev/null
Input file size is 6337, 7082
0...10...20...30...40...50...60...70...80...90...100 - done.
25.92user 1.43system 0:29.80elapsed 91%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+38181minor)pagefaults 0swaps

As you can see, MrSID takes more than twice as long as ECW (about 73
seconds elapsed versus 30), and GeoTIFF just flies (under 3 seconds).
The flip side (disk space) can be seen here in a directory listing:

 19226192 Aug 30 02:05 Q2918se.ecw
  7601858 Aug 30 02:21 Q2918se.sid
134983605 Aug 30 02:00 Q2918se.tiff
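
For reference, the trade-off in the figures above can be worked out with a
short Python sketch. The elapsed times and byte counts are copied from the
output above; the dictionary layout and the ratios() helper are only
illustrative names, not part of GDAL.

```python
# Elapsed wall-clock seconds and file sizes in bytes, copied from the
# timing runs and the directory listing above.
elapsed_s = {"tiff": 2.79, "sid": 72.57, "ecw": 29.80}
size_bytes = {"tiff": 134983605, "sid": 7601858, "ecw": 19226192}

def ratios(fmt):
    """Return (slowdown vs. GeoTIFF, size reduction vs. GeoTIFF)."""
    slowdown = elapsed_s[fmt] / elapsed_s["tiff"]
    reduction = size_bytes["tiff"] / size_bytes[fmt]
    return slowdown, reduction

for fmt in ("sid", "ecw"):
    slowdown, reduction = ratios(fmt)
    print(f"{fmt}: {slowdown:.1f}x slower to read, {reduction:.1f}x smaller on disk")
```

On these single runs, MrSID comes out roughly 26 times slower than GeoTIFF
and about 18 times smaller, while ECW is roughly 11 times slower and 7
times smaller.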

What I'm considering is keeping the full-scale files in MrSID format, and
keeping overviews (pyramids) of lower-resolution images in ECW.  Assuming
I need four or so layers for adequate performance (each one halving the
side length, and so quartering the area, of the one before), the image
above would take:

7.6 MB (base) + 19/4 MB (layer 1) + 19/16 MB (layer 2) + 19/64 MB
(layer 3) + 19/256 MB (layer 4).

That adds up to 13.9 MB, a far cry from the 135 MB TIFF.
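
The estimate can be checked with a minimal Python sketch. It assumes, as the
sum above does, that an ECW overview shrinks in proportion to its pixel
count, which is only an approximation for a compressed format; the function
name pyramid_total_mb is made up for illustration.

```python
def pyramid_total_mb(base_mb, overview_full_mb, levels):
    """Base file plus `levels` overviews, each a quarter the size of the last.

    overview_full_mb is the size the overview format would need at full
    resolution; level k is assumed to take overview_full_mb / 4**k.
    """
    overviews = [overview_full_mb / 4 ** k for k in range(1, levels + 1)]
    return base_mb + sum(overviews)

# 7.6 MB MrSID base, ECW overviews scaled down from the 19 MB full-size ECW.
total = pyramid_total_mb(7.6, 19.0, 4)
print(f"{total:.1f} MB")
```

Because each layer is a quarter of the previous one, the overviews together
can never exceed a third of the full-size ECW (19/3, about 6.3 MB), so
adding more layers barely changes the total.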

This also gives me the ability to keep a lossless (MrSID) file in case I 
ever want to convert to GeoTIFFs in the future.

Does this make sense?  Are there better ways to get performance out of 
MrSIDs?  Should some of the higher (and smaller) layers stay GeoTIFF?

I'm sure that someone will say "Hard Drives are Cheap", and I agree, but 
it seems that this data is growing at outrageous rates (uncompressed), and 
that the time spent reading the entire GeoTIFF becomes a bottleneck 
anyway.

I'd really appreciate any advice,

Thanks
Bill



