<HTML>

<HEAD>

<TITLE>Re: [mapserver-users] Ed's Rules for the Best Raster Performance</TITLE>

</HEAD>

<BODY>

<FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>Jeff -<BR>

<BR>

I’m not convinced, either, but I have never seen a real-world test that has shown otherwise.  There haven’t been many such tests, but I have done them myself and several others have done them as well and posted the results on this list.  There may be tradeoffs which require a different implementation – that’s life in the real world – but the data (the real, measured data, not theoretical speculation) has always been consistent.<BR>

<BR>

If you want to shrink the file size in this thought experiment that’s fine, but realize that you are thereby increasing the number of files that need to be opened for a random image request.  And each new open file incurs a relatively high cost (directory/disk seek overhead, etc.); those thousands or millions of JPEGs aren’t just hard to keep track of – they hurt performance.  I have been the keeper of tens of millions of such files, and have seen some of those issues myself.<BR>
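
<BR>

To put rough numbers on that tradeoff, here is a minimal sketch of how many files a single request can span as the tiles shrink. It assumes square tiles on a regular grid, one file per tile, and a square request window at an arbitrary pixel offset; the function name files_touched is purely illustrative and not part of MapServer or GDAL.<BR>

</SPAN></FONT><PRE>
```python
def files_touched(request_px, tile_px, offset_px=0):
    """Count the tile files that a square request window intersects.

    Assumption: one file per square tile on a regular grid. offset_px is
    the window's top-left offset from a tile corner, applied to both axes.
    """
    first = offset_px // tile_px                     # index of first tile hit
    last = (offset_px + request_px - 1) // tile_px   # index of last tile hit
    per_axis = last - first + 1
    return per_axis * per_axis


if __name__ == "__main__":
    # Compare best case (window aligned to a tile corner) with worst case
    # (every possible offset inside a tile) for a 500x500 request.
    for tile in (1000, 500, 256, 100):
        best = files_touched(500, tile, offset_px=0)
        worst = max(files_touched(500, tile, offset_px=o) for o in range(tile))
        print(f"{tile}x{tile} tiles: {best} to {worst} files per request")
```
</PRE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>
Under those assumptions a 500x500 request touches 1 to 4 tiles at 1000x1000 and 4 to 9 tiles at 256x256, and the count keeps climbing as the tiles get smaller. When each tile is its own file, every one of those is a separate open.<BR>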

<BR>

The example I gave (and my other examples) are, however, primarily intended to help people think about all the aspects of the problem.  File access performance in an application environment is a complex issue with many variables and any implementation should be prototyped and tested.  All I really care about is that you don’t think it’s simple and you try to think through all the consequences of an implementation plan.<BR>
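
<BR>

As a starting point for that kind of prototype, here is a minimal sketch that times reading the same bytes from many small files versus from one large file. Everything in it is an assumption for illustration: the file names, the sizes, and the helper names are hypothetical. Because the files have just been written they will normally be served from the operating system's cache, so this mostly shows per-open overhead rather than real disk seeks. A real test should use your own imagery, your own storage, and a cold cache.<BR>

</SPAN></FONT><PRE>
```python
import os
import tempfile
import time


def read_many_small(directory, count):
    """Open and fully read `count` separate files (one open per file)."""
    total = 0
    for i in range(count):
        with open(os.path.join(directory, f"tile_{i}.bin"), "rb") as f:
            total += len(f.read())
    return total


def read_one_large(path, count, size):
    """Open one file once, then do `count` seek-and-read calls of `size` bytes."""
    total = 0
    with open(path, "rb") as f:
        for i in range(count):
            f.seek(i * size)
            total += len(f.read(size))
    return total


def run_trial(count=200, size=4096):
    """Build throwaway test data, then time both access patterns.

    Returns (bytes_from_small_files, bytes_from_large_file,
             seconds_for_small_files, seconds_for_large_file).
    """
    with tempfile.TemporaryDirectory() as d:
        payload = os.urandom(size)
        for i in range(count):
            with open(os.path.join(d, f"tile_{i}.bin"), "wb") as f:
                f.write(payload)
        big = os.path.join(d, "mosaic.bin")
        with open(big, "wb") as f:
            f.write(payload * count)

        t0 = time.perf_counter()
        small_bytes = read_many_small(d, count)
        t1 = time.perf_counter()
        large_bytes = read_one_large(big, count, size)
        t2 = time.perf_counter()
    return small_bytes, large_bytes, t1 - t0, t2 - t1


if __name__ == "__main__":
    small_bytes, large_bytes, t_small, t_large = run_trial()
    print(f"many small files: {small_bytes} bytes in {t_small:.4f} s")
    print(f"one large file:   {large_bytes} bytes in {t_large:.4f} s")
```
</PRE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>
The timings will vary from machine to machine, which is the point: measure on the system you plan to deploy on rather than trusting anyone's estimate, including mine.<BR>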

<BR>

I will also admit to being very guilty of not designing for “low-moderate load” situations, as I always like my Web sites to be able to survive the situation in which they accidentally turn out to be popular!<BR>

<BR>

    - Ed<BR>

<BR>

<BR>

On 9/16/08 11:21 AM, "Jeff Hoffmann" &lt;<a href="mailto:jeff.hoffmann@gmail.com">jeff.hoffmann@gmail.com</a>&gt; wrote:<BR>

<BR>

</SPAN></FONT><BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>Ed McNierney wrote:<BR>

><BR>

> And remember that not all formats are created equal. In order to<BR>

> decompress ANY portion of a JPEG image, you must read the WHOLE file.<BR>

> If I have a 4,000x4,000 pixel 24-bit TIFF image that’s 48 megabytes,<BR>

> and I want to read a 256x256 piece of it, I may only need to read one<BR>

> megabyte or less of that file. But if I convert it to a JPEG and<BR>

> compress it to only 10% of the TIFF’s size, I’ll have a 4.8 megabyte<BR>

> JPEG but I will need to read the whole 4.8 megabytes (and expand it<BR>

> into that RAM you’re trying to conserve) in order to get that 256x256<BR>

> piece!<BR>

I have a feeling that I'm throwing myself into a religious war, but here<BR>

goes. I think the problem that you have in your estimates is that you're<BR>

using large (well, sort of large) jpegs. When you're using properly<BR>

sized jpegs on modern servers at low-moderate load, you can pretty much<BR>

disregard the processor time and memory issues, and just compare on the<BR>

basis of the slowest component, disk access. 4000x4000 is big &amp; the<BR>

performance isn't going to be good (for the reasons you mention), but he<BR>

never claimed to be using images that big. What he claimed is that he's<BR>

using 1000x1000 jpegs. The 1000x1000 size is pretty critical because<BR>

it's that sweet spot where the decompress time is small, the memory<BR>

demands manageable but the images are large enough that you keep the<BR>

number of tiles down to a minimum for most uses. Those jpegs might be in<BR>

the 200k size range, compared to a 256x256 block = 64k (x3 bands =192k?)<BR>

so he's reading a full 1000x1000 image in the disk space of 1 256x256<BR>

block. If you're serving up 500x500 finished image, you're using at<BR>

least 4 blocks in the geotiff, maybe 9, compared to 1-4 with the 1000x1000<BR>

jpeg. You could easily be spending 2x the time reading the disk with<BR>

geotiff as you would be with jpegs. I haven't sat down and done any side<BR>

by side tests, but I can see how they would be competitive for certain<BR>

uses when you look at it that way. Of course there are other issues like<BR>

lossy compression on top of lossy compression, plus you've got to worry<BR>

about keeping track of thousands (millions?) of jpegs, but they're<BR>

probably manageable tradeoffs. Oh, and you don't really get the option<BR>

to have nodata areas with jpegs, either. There are probably other<BR>

drawbacks, too, but I'm not convinced that performance is one of them.<BR>

<BR>

jeff<BR>

<BR>

</SPAN></FONT></BLOCKQUOTE>

</BODY>

</HTML>