[mapserver-users] Ed's Rules for the Best Raster Performance

Jeff Hoffmann jeff.hoffmann at gmail.com
Tue Sep 16 17:02:42 EDT 2008


Ed McNierney wrote:
> If you want to shrink the file size in this thought experiment that’s 
> fine, but realize that you are thereby increasing the number of files 
> that need to be opened for a random image request. And each new open 
> file incurs a relatively high cost (directory/disk seek overhead, 
> etc.); those thousands or millions of JPEGs aren’t just hard to keep 
> track of – they hurt performance. I have been the keeper of tens of 
> millions of such files, and have seen some of those issues myself.
That's certainly a consideration, but you could also counter that by 
using jpeg compressed geotiffs. You'd want to make sure to tile them, 
otherwise you'd have that same big jpeg performance problem -- I think a 
tiled geotiff effectively stores each tile as an individual jpeg wrapped in 
one big file. No clue on what the actual performance of that would be, but it's 
something to consider if you've got filesystem performance problems.
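
For anyone who wants to try the tiled, jpeg-compressed geotiff idea, here is a minimal sketch that only builds a gdal_translate command line using the GTiff creation options TILED, BLOCKXSIZE, BLOCKYSIZE, COMPRESS and JPEG_QUALITY. The file names, the 256-pixel block size and the quality of 75 are assumptions picked for illustration, not tested recommendations, and the helper name is hypothetical.

```python
import shlex


def tiled_jpeg_geotiff_cmd(src, dst, block=256, quality=75):
    """Build (but do not run) a gdal_translate command for a tiled,
    JPEG-compressed GeoTIFF. Defaults are illustrative assumptions."""
    # The TIFF spec requires tile width and height to be multiples of 16.
    if block % 16 != 0:
        raise ValueError("tile size must be a multiple of 16")
    return [
        "gdal_translate",
        "-of", "GTiff",
        "-co", "TILED=YES",               # internal tiles instead of strips
        "-co", f"BLOCKXSIZE={block}",
        "-co", f"BLOCKYSIZE={block}",
        "-co", "COMPRESS=JPEG",           # each tile is JPEG-compressed on its own
        "-co", f"JPEG_QUALITY={quality}",
        src,
        dst,
    ]


# Hypothetical file names, for illustration only.
cmd = tiled_jpeg_geotiff_cmd("county_tile.tif", "county_tile_jpeg.tif")
print(shlex.join(cmd))
```

The command is printed rather than executed so it can be checked before being pointed at real data; whether the result performs well enough still has to be measured on your own imagery.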

> The example I gave (and my other examples) are, however, primarily 
> intended to help people think about all the aspects of the problem. 
> File access performance in an application environment is a complex 
> issue with many variables and any implementation should be prototyped 
> and tested. All I really care about is that you don’t think it’s 
> simple and you try to think through all the consequences of an 
> implementation plan.
One of the reasons why I replied to this originally is that I think it's 
good to keep options open so people can evaluate them for their specific 
circumstances. What I was hearing you say was "if you make bad choices, 
it'll perform badly" & I'm just trying to throw out some other choices 
that would perform better and would probably be worth a try for a lot of 
people. It's pretty common for me to get imagery in 5000x5000 or 
10000x10000 geotiff tiles. I just got imagery for one county like that 
that weighs in at close to 1TB; if I were to decide I can't afford that 
kind of disk space for whatever reason, I'd investigate some compressed 
options. If I don't know any different, I might just compress that tile 
into one large jpeg (like in your example), discover the performance is 
terrible, discard it & file away in my mind that jpegs perform terribly. 
I might not understand that a 5000x5000 jpeg is going to use 75MB of 
memory and take an order of magnitude longer to decompress than that 
1000x1000 jpeg that only takes up 3MB in memory and decompresses nearly 
instantly while giving you that same 500x500 chunk of image. There are 
nice things about jpegs, like you don't need commercial libraries like 
you would with ecw, mrsid, jp2, you don't have to worry about licensing 
issues, size constraints, compiler environment, all that, which makes it 
a pretty attractive compressed format if you can get it to perform well, 
but if you don't know to break them up into smallish chunks I don't 
think getting to that performance level is really possible (for exactly 
the reasons you describe).
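
The memory figures above can be checked with a little arithmetic. This is a rough sketch assuming 8-bit, 3-band (RGB) imagery and a decoder that has to decompress the whole jpeg to serve any part of it; the function name is made up for illustration.

```python
def decoded_bytes(width, height, bands=3, bytes_per_sample=1):
    """Bytes needed to hold a fully decompressed image in memory
    (assumes 8-bit samples and 3 bands unless told otherwise)."""
    return width * height * bands * bytes_per_sample


request_pixels = 500 * 500  # the 500x500 chunk actually wanted

for side in (5000, 1000):
    size = decoded_bytes(side, side)
    used = request_pixels / (side * side)
    print(f"{side}x{side}: {size / 1_000_000:.0f} MB decoded, "
          f"{used:.0%} of the pixels used")

# How many times more data the big jpeg has to decompress.
ratio = decoded_bytes(5000, 5000) / decoded_bytes(1000, 1000)
print(f"ratio: {ratio:.0f}x")
```

Under those assumptions the 5000x5000 jpeg decodes to 75 MB and the 1000x1000 one to 3 MB, a factor of 25, which is consistent with the order-of-magnitude difference in decompression time described above.
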
> I will also admit to being very guilty of not designing for 
> “low-moderate load” situations, as I always like my Web sites to be 
> able to survive the situation in which they accidentally turn out to 
> be popular!
I had second thoughts about saying this, because one man's "low" load 
might be "high" for someone else especially if you're talking to someone 
who has run a pretty high profile site, but I'd wager you're the 
exception and there are a lot of smaller fish out there. I'd think that 
Jim is probably more in line with an average user, a moderately sized 
city/county that would probably come nowhere near maxing out even modest 
hardware with those jpegs of his. It's probably those smaller fish where 
compression is more important, maybe they're fighting for space on a 
department-level server or can't get budget approval to upgrade their 
drives. I'd hate for those folks to have to settle for a slow (cpu 
intensive) wavelet-based compression when a properly configured jpeg 
layer might be the compromise they're looking for.

jeff

