[mapserver-users] Ed's Rules for the Best Raster Performance

Ed McNierney ed at mcnierney.com
Tue Sep 16 08:45:40 PDT 2008


Jeff -

I'm not convinced, either, but I have never seen a real-world test that has shown otherwise.  There haven't been many such tests, but I have done them myself and several others have done them as well and posted the results on this list.  There may be tradeoffs which require a different implementation - that's life in the real world - but the data (the real, measured data, not theoretical speculation) has always been consistent.

If you want to shrink the file size in this thought experiment that's fine, but realize that you are thereby increasing the number of files that need to be opened for a random image request.  And each new open file incurs a relatively high cost (directory/disk seek overhead, etc.); those thousands or millions of JPEGs aren't just hard to keep track of - they hurt performance.  I have been the keeper of tens of millions of such files, and have seen some of those issues myself.
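
The file-count effect can be sketched with a few lines of Python. This is only an illustration under stated assumptions, not a measurement: it assumes square tiles stored one per file, a square request at 1:1 resolution, and a request origin that is equally likely to fall on any pixel. The names tiles_spanned and files_touched are made up for this sketch.

```python
def tiles_spanned(offset, length, tile):
    """Tiles of width `tile` overlapped by the pixel range [offset, offset + length)."""
    first = offset // tile
    last = (offset + length - 1) // tile
    return last - first + 1


def files_touched(request, tile):
    """Return (min, max, mean) number of tile files hit by a square request.

    Assumption: the request origin is uniformly distributed over pixel
    offsets, independently in x and y, so the 2-D mean is the square of
    the 1-D mean.
    """
    per_axis = [tiles_spanned(o, request, tile) for o in range(tile)]
    lo, hi = min(per_axis), max(per_axis)
    mean_axis = sum(per_axis) / tile
    return lo * lo, hi * hi, mean_axis * mean_axis


if __name__ == "__main__":
    # Smaller tiles mean more files opened for the same 500x500 request.
    for tile in (2000, 1000, 500, 256):
        lo, hi, mean = files_touched(500, tile)
        print(f"tile {tile:>4}: min {lo}, max {hi}, mean {mean:.2f} files")
```

The sketch counts files only. It says nothing about the per-file open and seek cost, which is the part that has to be measured on real hardware.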

The example I gave (and my other examples) is, however, primarily intended to help people think about all aspects of the problem.  File access performance in an application environment is a complex issue with many variables, and any implementation should be prototyped and tested.  All I really care about is that you don't assume it's simple and that you think through all the consequences of an implementation plan.

I will also admit to being very guilty of not designing for "low-moderate load" situations, as I always like my Web sites to be able to survive the situation in which they accidentally turn out to be popular!

    - Ed


On 9/16/08 11:21 AM, "Jeff Hoffmann" <jeff.hoffmann at gmail.com> wrote:

Ed McNierney wrote:
>
> And remember that not all formats are created equal. In order to
> decompress ANY portion of a JPEG image, you must read the WHOLE file.
> If I have a 4,000x4,000 pixel 24-bit TIFF image that's 48 megabytes,
> and I want to read a 256x256 piece of it, I may only need to read one
> megabyte or less of that file. But if I convert it to a JPEG and
> compress it to only 10% of the TIFF's size, I'll have a 4.8 megabyte
> JPEG but I will need to read the whole 4.8 megabytes (and expand it
> into that RAM you're trying to conserve) in order to get that 256x256
> piece!
I have a feeling I'm throwing myself into a religious war, but here
goes. I think the problem that you have in your estimates is that you're
using large (well, sort of large) jpegs. When you're using properly
sized jpegs on modern servers at low-moderate load, you can pretty much
disregard the processor time and memory issues, and just compare on the
basis of the slowest component, disk access. 4000x4000 is big & the
performance isn't going to be good (for the reasons you mention), but he
never claimed to be using images that big. What he claimed is that he's
using 1000x1000 jpegs. The 1000x1000 size is pretty critical because
it's that sweet spot where the decompress time is small, the memory
demands manageable but the images are large enough that you keep the
number of tiles down to a minimum for most uses. Those jpegs might be in
the 200k size range, compared to a 256x256 block = 64k (x3 bands = 192k?)
so he's reading a full 1000x1000 image in the disk space of one 256x256
block. If you're serving up a 500x500 finished image, you're using at
least 4 blocks in the geotiff, maybe 9, compared to 1-4 with the 1000x1000
jpeg. You could easily be spending 2x the time reading the disk with
geotiff as you would be with jpegs. I haven't sat down and done any side
by side tests, but I can see how they would be competitive for certain
uses when you look at it that way. Of course there are other issues like
lossy compression on top of lossy compression, plus you've got to worry
about keeping track of thousands (millions?) of jpegs, but they're
probably manageable tradeoffs. Oh, and you don't really get the option
to have nodata areas with jpegs, either. There are probably other
drawbacks, too, but I'm not convinced that performance is one of them.
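
The byte counts behind the figures above can be checked with a short Python sketch. It is arithmetic only and rests on assumptions: uncompressed 24-bit (3-band) GeoTIFF blocks, the rough "200k" JPEG size quoted above taken as 200,000 bytes, and the 4-9 block and 1-4 tile counts taken as given. The names raw_bytes and read_range are made up for this sketch.

```python
def raw_bytes(width, height, bands=3):
    """Uncompressed size in bytes of an 8-bit-per-band image."""
    return width * height * bands


def read_range(unit_bytes, min_units, max_units):
    """Best-case and worst-case bytes read for a request."""
    return unit_bytes * min_units, unit_bytes * max_units


# One 256x256 three-band block: 196,608 bytes (192 KiB).
BLOCK = raw_bytes(256, 256)

# Assumption: "200k size range" read as 200,000 bytes per 1000x1000 JPEG.
JPEG_TILE = 200 * 1000

# 500x500 request: 4 to 9 GeoTIFF blocks versus 1 to 4 JPEG tiles.
tiff_lo, tiff_hi = read_range(BLOCK, 4, 9)
jpeg_lo, jpeg_hi = read_range(JPEG_TILE, 1, 4)

if __name__ == "__main__":
    print("4000x4000 24-bit TIFF:", raw_bytes(4000, 4000), "bytes")
    print("GeoTIFF read:", tiff_lo, "to", tiff_hi, "bytes")
    print("JPEG read:   ", jpeg_lo, "to", jpeg_hi, "bytes")
    print("worst-case ratio: %.2f" % (tiff_hi / jpeg_hi))
```

This compares bytes read and nothing else. Decompression time, file-open overhead and seek cost are left out, so it cannot settle the question on its own.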

jeff
