[Gdal-dev] Advice on Raster Formats

Vincent Schut vincent at ecovla.nl
Tue Aug 30 03:44:32 EDT 2005


Bill,

Just sharing my experiences. Assuming your mapserver app will be serving
more or less random extents at more or less random scales, I would
advise using compressed, tiled GeoTIFFs with pyramids (you can use
gdaladdo for that). Run some tests on your data to see which compression
algorithm and tile size suit you best, keeping your app, disk size, and
decompression time in mind. I expect that will be the best compromise
for you between file size and serving time. Use tiles because then it is
easier to read only a small (zoomed-in) part of your file. If you use
the default striped TIFF, entire rows will always be read, even when
only a small part is needed.
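
To put a rough number on the tiles-versus-strips point, here is a small
back-of-the-envelope sketch. It only counts the uncompressed bytes in the
blocks that overlap a request window. The image size is the one from your
gdal_translate output; the window, the 256x256 tile size, the one-row strip
height, and the 3 bands of 8 bits are all my assumptions for illustration,
and the function names are made up for this example.

```python
def blocks_touched(win_x, win_y, win_w, win_h, block_w, block_h):
    """Count the blocks (strips or tiles) that intersect a pixel window."""
    first_col = win_x // block_w
    last_col = (win_x + win_w - 1) // block_w
    first_row = win_y // block_h
    last_row = (win_y + win_h - 1) // block_h
    return (last_col - first_col + 1) * (last_row - first_row + 1)


def bytes_decoded(window, block_w, block_h, bands=3, bytes_per_sample=1):
    """Uncompressed bytes in all blocks overlapping the window.

    Every block is counted at full size, since a block has to be
    decoded as a whole before a part of it can be used.
    """
    n = blocks_touched(*window, block_w, block_h)
    return n * block_w * block_h * bands * bytes_per_sample


WIDTH, HEIGHT = 6337, 7082        # image size reported in the thread
window = (3000, 3000, 512, 512)   # hypothetical zoomed-in request (x, y, w, h)

# Assumption: strips are one row high and span the full image width.
striped = bytes_decoded(window, WIDTH, 1)
# Assumption: 256x256 tiles.
tiled = bytes_decoded(window, 256, 256)

print(f"striped: {striped / 1e6:.1f} MB, tiled: {tiled / 1e6:.1f} MB")
```

With these assumed numbers the striped layout touches about 9.7 MB and the
tiled layout about 1.8 MB for the same 512x512 window. The gap depends
heavily on the window and block sizes, so treat it as an illustration and
not as a benchmark.
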
Another hint: use lots of RAM. Not only are disks cheap, memory is cheap
too, and on a busy mapserver machine lots of RAM can be used as disk
cache, eliminating the need to reread frequently requested files each
time. You could also experiment with enlarging the GDAL cache. I suppose
that holds the *decompressed* data, whereas the disk cache holds the
file data itself. Probably someone else on this list (if not Frank W.)
can tell you how to set the GDAL cache size.
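
As far as I know, the GDAL block cache size can be set through the
GDAL_CACHEMAX configuration option, which GDAL also picks up from the
environment, with small values read as megabytes. A minimal sketch of
setting it from a script follows. The helper name and the value 256 are
only illustrative, and how the variable reaches your mapserver process
depends on your setup, so please verify against the GDAL documentation.

```python
import os


def set_gdal_cache_mb(megabytes):
    """Request a GDAL block cache size via the GDAL_CACHEMAX variable.

    Assumption: this runs before the process first uses GDAL, so the
    value is in place when GDAL reads its configuration.
    """
    if megabytes <= 0:
        raise ValueError("cache size must be positive")
    os.environ["GDAL_CACHEMAX"] = str(int(megabytes))
    return os.environ["GDAL_CACHEMAX"]


# 256 is an arbitrary illustrative value, not a recommendation.
set_gdal_cache_mb(256)
```

Whether a bigger GDAL cache actually helps will depend on how often the
same blocks are requested again, so it is worth timing with and without it.
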

Cheers,
Vincent.

Bill Binko wrote:
> Hello everyone,
> 
> I need some advice on how to store and process raster images I'm working
> with.  The images have generally come in as either ECW (or JPEG 2000 -- my
> choice) or MrSID format.  They are aerial images (DOQQs generally at
> 1-meter accuracy).
> 
> I need to be able to serve these up efficiently using mapserver (through 
> GDAL, of course).  The output will be a 24 bit JPEG (with overlay data 
> unfortunately compressed as well) that will be sent over the wire to the 
> client.
> 
> My problem is that both the ECW and the MrSIDs take a very long time to 
> decompress, but GeoTIFF takes an ungodly amount of disk space.  Here are 
> some numbers I just grabbed at random:
> 
> I took the same data in GeoTIFF, ECW, and MrSID formats and decompressed 
> them (to MEM which I thought would be a good comparison to reading into 
> mapserver).
> 
> $ time gdal_translate -of MEM Q2918se.tiff /dev/null
> Input file size is 6337, 7082
> 0...10...20...30...40...50...60...70...80...90...100 - done.
> 0.72user 0.52system 0:02.79elapsed 44%CPU (0avgtext+0avgdata 
> 0maxresident)k 0inputs+0outputs (0major+37327minor)pagefaults 0swaps
> 
> $ time gdal_translate -of MEM Q2918se.sid /dev/null
> Input file size is 6337, 7082
> 0...10...20...30...40...50...60...70...80...90...100 - done.
> 55.80user 7.78system 1:12.57elapsed 87%CPU (0avgtext+0avgdata 
> 0maxresident)k 0inputs+0outputs (3major+631879minor)pagefaults 0swaps
> 
> $ time gdal_translate -of MEM Q2918se.ecw /dev/null
> Input file size is 6337, 7082
> 0...10...20...30...40...50...60...70...80...90...100 - done.
> 25.92user 1.43system 0:29.80elapsed 91%CPU (0avgtext+0avgdata 
> 0maxresident)k 0inputs+0outputs (0major+38181minor)pagefaults 0swaps
> 
> As you can see, MrSID is more than twice as slow as ECW, and GeoTIFF just
> flies.  The flip side (disk space) can be seen here in a directory
> listing:
> 
>  19226192 Aug 30 02:05 Q2918se.ecw
>   7601858 Aug 30 02:21 Q2918se.sid
> 134983605 Aug 30 02:00 Q2918se.tiff
> 
> What I'm considering is keeping the full-scale files in MrSID format, and
> keeping overviews (pyramids) of lower resolution images in ECW.  Assuming
> I need four layers (dividing each side by 2 -- and the area by 4) or so
> for adequate performance, the image above would take:
> 
> 7.6 (base) + 19/4MB (layer1) + 19/16MB (layer2)+ 19/64MB (layer3) + 
> 19/256MB (layer4).
> 
> That adds up to 13.9MB, a far cry from the 130MB TIFF.
> 
> This also gives me the ability to keep a lossless (MrSID) file in case I 
> ever want to convert to GeoTIFFs in the future.
> 
> Does this make sense?  Are there better ways to get performance out of 
> MrSIDs?  Should some of the higher (and smaller) layers stay GeoTIFF?
> 
> I'm sure that someone will say "Hard Drives are Cheap", and I agree, but 
> it seems that this data is growing at outrageous rates (uncompressed), and 
> that the time spent reading the entire GeoTIFF becomes a bottleneck 
> anyway.
> 
> I'd really appreciate any advice,
> 
> Thanks
> Bill