[mapserver-users] Ed's Rules for the Best Raster Performance

Doug_Newcomb at fws.gov Doug_Newcomb at fws.gov
Wed Sep 17 08:26:41 EDT 2008


> One other thing I just noticed: to effectively use the tiled tiffs I need
> BigTiff support which looks like it is a real new thing (needs libtiff 4.0
> which is still in beta according to the remotesensing.org ftp site.)
If you compile gdal 1.5.2 from source with the internal libtiff option
(--with-libtiff=internal), you get bigtiff support without installing
libtiff 4.0 yourself; see http://www.gdal.org/formats_list.html. I have not
tested this as an image source for mapserver yet, but I have created a
couple of bigtiff images.

Doug


Doug Newcomb
USFWS
Raleigh, NC
919-856-4520 ext. 14 doug_newcomb at fws.gov
---------------------------------------------------------------------------------------------------------

The opinions I express are my own and are not representative of the
official policy of the U.S.Fish and Wildlife Service or Dept. of Interior.
Life is too short for undocumented, proprietary data formats.


From: "Jim Klassen" <Jim.Klassen at ci.stpaul.mn.us>
Sent by: mapserver-users-bounces at lists.osgeo.org
Date: 09/16/2008 04:57 PM
To: <jeff.hoffmann at gmail.com>, <ed at mcnierney.com>
cc: mapserver-users at lists.osgeo.org
Subject: Re: [mapserver-users] Ed's Rules for the Best Raster Performance

One other thing I just noticed: to effectively use the tiled tiffs I need
BigTiff support which looks like it is a real new thing (needs libtiff 4.0
which is still in beta according to the remotesensing.org ftp site.)
Anyway, I'm still going to give this a try and check the performance
difference.  For us, with existing hardware, disk space would be an issue
for uncompressed images so I will be trying JPEG in TIFF. All I can say
about the JPEG->JPEG re-compression artifacts is that, with the existing
setup, we haven't had any complaints.
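
As a rough illustration of why BigTiff comes up here: a classic TIFF uses
32-bit offsets, so a single file cannot grow past 4 GiB. The sketch below
estimates whether a mosaic would cross that line. The function names, the
3-band 8-bit default and the compression ratio are all assumptions for
illustration (real JPEG ratios vary with the imagery and quality setting),
and overviews and file headers are ignored.

```python
# Classic TIFF stores offsets in 32 bits, which caps one file at 4 GiB.
CLASSIC_TIFF_LIMIT = 2 ** 32


def estimated_tiff_bytes(width, height, bands=3, bytes_per_sample=1,
                         compression_ratio=1.0):
    """Rough file size estimate (hypothetical helper, not a GDAL call).

    compression_ratio is an assumed average, e.g. 10.0 for 10:1 JPEG.
    """
    raw = width * height * bands * bytes_per_sample
    return raw / compression_ratio


def needs_bigtiff(width, height, **kwargs):
    """True if the estimate reaches the classic TIFF size limit."""
    return estimated_tiff_bytes(width, height, **kwargs) >= CLASSIC_TIFF_LIMIT


# A 100000 x 100000 RGB mosaic: too big uncompressed, fits at an assumed 10:1.
print(needs_bigtiff(100000, 100000))
print(needs_bigtiff(100000, 100000, compression_ratio=10.0))
```

An estimate like this only says when it is worth checking; the safe route
is still to test with the real data.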

The "large" number of files (in our case < 500k) doesn't seem to
affect the operational performance of our server in a meaningful way. (I
don't remember the exact numbers, but I have measured file access time in
directories with 10 files and 100k files and the difference was much
less than the total mapserver run time.) It is a pain though when making
copies or running backups. As Bob said, for typical image request sizes
of less than 1000px, the "tiles/overviews" design pretty much limits the
number of files mapserver has to touch to 4, so access time stays fairly
constant across different views.
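
The "at most 4 files" figure can be checked with a little arithmetic. This
is only a sketch: it assumes a regular grid of square tiles and that the
overview level is chosen so the request, measured in that level's pixels,
is no larger than one tile. The 1000px tile size and the function name are
assumptions for illustration, not a description of the actual setup.

```python
def tiles_touched(x_off, y_off, width, height, tile_size):
    """Count tiles on a regular grid that a pixel window intersects."""
    first_col = x_off // tile_size
    last_col = (x_off + width - 1) // tile_size
    first_row = y_off // tile_size
    last_row = (y_off + height - 1) // tile_size
    return (last_col - first_col + 1) * (last_row - first_row + 1)


# A window no larger than one tile straddles at most 2 columns and 2 rows.
print(tiles_touched(0, 0, 1000, 1000, 1000))      # aligned with one tile
print(tiles_touched(500, 500, 1000, 1000, 1000))  # straddles a tile corner
```

A window aligned with the grid touches one tile; a window straddling a tile
corner touches four, which is the worst case under these assumptions.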

Also, I forgot to mention that the disk subsystem here isn't exactly your
average desktop PC. (8*10K SCSI drives in RAID 10 with 1GB dedicated to the
RAID controller.) Similar requests on a much older machine we have around
here with slower disk/processor take about 800ms. I don't have any info to
attribute this to disk vs. cpu or both.

One of the main reasons we decided on JPEG here early on was the licensing
headaches surrounding mrsid/ecw/jp2. JPEG was easy and supported well by
just about everything. Actually, for that matter, TIFFs are a lot harder
than JPEG/PNG for most (non-GIS) applications to use directly. We may be
past this being an issue, but once upon a time, before our use of
mapserver, the JPEG tiles were being accessed directly from the webserver
by various client applications (that didn't all understand tiff). Instead
of using world files to determine the extents, the tiles were accessed by a
predictable naming convention relating to the extent. When we started using
mapserver we retained the existing tiling scheme (adding world files and
tileindexes so mapserver could position the tiles) and it seemed to work
well, so we haven't given it much thought since.
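
To make the "naming convention plus world file" idea concrete, here is a
minimal sketch. The naming scheme tile_<minx>_<miny>.jpg (lower-left corner
in map units), the square tile size and the function name are all
hypothetical; the real convention is not described above. The six-line
world file layout (pixel size, two rotation terms, negative pixel size,
then the centre of the upper-left pixel) is the usual one.

```python
def world_file_for_tile(name, tile_ground_size, tile_pixels):
    """Build world file text for a tile named 'tile_<minx>_<miny>.jpg'.

    The naming scheme is an assumption made for this example only.
    """
    stem = name.rsplit(".", 1)[0]
    _, minx, miny = stem.split("_")
    minx, miny = float(minx), float(miny)
    pixel = tile_ground_size / tile_pixels
    # World files reference the centre of the upper-left pixel.
    ulx = minx + pixel / 2
    uly = miny + tile_ground_size - pixel / 2
    values = [pixel, 0.0, 0.0, -pixel, ulx, uly]
    return "\n".join(f"{v:.6f}" for v in values) + "\n"


# A 1000 m square tile at 1 m per pixel (made-up coordinates).
print(world_file_for_tile("tile_500000_4900000.jpg", 1000.0, 1000))
```

The text would be saved next to the tile (typically with a .jgw or .wld
extension) so the tile can be positioned.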

Thanks for all the interest and discussion around this.

Jim K

>>> Jeff Hoffmann <jeff.hoffmann at gmail.com> 09/16/08 4:03 PM >>>
Ed McNierney wrote:
> If you want to shrink the file size in this thought experiment that’s
> fine, but realize that you are thereby increasing the number of files
> that need to be opened for a random image request. And each new open
> file incurs a relatively high cost (directory/disk seek overhead,
> etc.); those thousands or millions of JPEGs aren’t just hard to keep
> track of – they hurt performance. I have been the keeper of tens of
> millions of such files, and have seen some of those issues myself.
That's certainly a consideration, but you could also counter that by
using jpeg compressed geotiffs. You'd want to make sure to tile them,
otherwise you'd have that same big jpeg performance problem -- I think
a tiled tiff effectively treats them as individual jpegs wrapped in one
big file. No clue on what the actual performance of that would be, but it's
something to consider if you've got filesystem performance problems.
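
For anyone who wants to try this, the sketch below only builds a
gdal_translate argument list; it does not run anything. TILED, COMPRESS,
BLOCKXSIZE, BLOCKYSIZE and BIGTIFF are GTiff creation options as I
understand them, but whether a given GDAL build accepts them (JPEG-in-TIFF
and BigTIFF in particular) depends on the version and how it was compiled.
The function name, file names and the 512px block size are assumptions.

```python
def jpeg_tiled_geotiff_cmd(src, dst, block=512, bigtiff=False):
    """Return an argument list for a tiled, JPEG-compressed GeoTIFF."""
    cmd = [
        "gdal_translate", "-of", "GTiff",
        "-co", "TILED=YES",
        "-co", "COMPRESS=JPEG",
        "-co", f"BLOCKXSIZE={block}",
        "-co", f"BLOCKYSIZE={block}",
    ]
    if bigtiff:
        # Only useful if the GDAL build has BigTIFF support.
        cmd += ["-co", "BIGTIFF=YES"]
    return cmd + [src, dst]


# Hypothetical file names; pass the list to subprocess.run() to execute it.
print(" ".join(jpeg_tiled_geotiff_cmd("county.tif", "county_jpeg.tif")))
```

Each internal block is compressed on its own, so a small request only has
to decode the blocks it overlaps.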

> The example I gave (and my other examples) are, however, primarily
> intended to help people think about all the aspects of the problem.
> File access performance in an application environment is a complex
> issue with many variables and any implementation should be prototyped
> and tested. All I really care about is that you don’t think it’s
> simple and you try to think through all the consequences of an
> implementation plan.
One of the reasons why I replied to this originally is that I think it's
good to keep options open so people can evaluate them for their specific
circumstances. What I was hearing you say was "if you make bad choices,
it'll perform badly" & I'm just trying to throw out some other choices
that would perform better and probably make it worth a try for a lot of
people. It's pretty common for me to get imagery in 5000x5000 or
10000x10000 geotiff tiles. I just got imagery for one county like that
that weighs in at close to 1TB; if I were to decide I can't afford that
kind of disk space for whatever reason, I'd investigate some compressed
options. If I don't know any different, I might just compress that tile
into one large jpeg (like in your example), discover the performance is
terrible, discard it & file away in my mind that jpegs perform terribly.
I might not understand that a 5000x5000 jpeg is going to use 75MB of
memory and take an order of magnitude longer to decompress than that
1000x1000 jpeg that only takes up 3MB in memory and decompresses nearly
instantly while giving you that same 500x500 chunk of image. There are
nice things about jpegs, like you don't need commercial libraries like
you would with ecw, mrsid, jp2, you don't have to worry about licensing
issues, size constraints, compiler environment, all that, which makes it
a pretty attractive compressed format if you can get it to perform well,
but if you don't know to break them up into smallish chunks I don't
think getting to that performance level is really possible (for exactly
the reasons you describe).
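
The memory figures above follow from simple arithmetic, sketched here under
the assumption of 8-bit, 3-band (RGB) imagery decoded in full, with MB
taken as 10^6 bytes. The function name is just for illustration.

```python
def decoded_size_bytes(width, height, bands=3, bytes_per_sample=1):
    """Size of a fully decoded raster held in memory."""
    return width * height * bands * bytes_per_sample


big = decoded_size_bytes(5000, 5000)    # one large jpeg per delivered tile
small = decoded_size_bytes(1000, 1000)  # the same imagery cut into smaller jpegs

print(big / 1e6, "MB")      # 75.0 MB
print(small / 1e6, "MB")    # 3.0 MB
print(big // small, "times as many pixels to decode")  # 25 times
```

A plain jpeg has to be decoded as a whole to get at any part of it, so the
25x difference in pixels is paid on every request that touches the file.
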
> I will also admit to being very guilty of not designing for
> “low-moderate load” situations, as I always like my Web sites to be
> able to survive the situation in which they accidentally turn out to
> be popular!
I had second thoughts about saying this, because one man's "low" load
might be "high" for someone else especially if you're talking to someone
who has run a pretty high profile site, but I'd wager you're the
exception and there are a lot of smaller fish out there. I'd think that
Jim is probably more in line with an average user, a moderately sized
city/county that would probably come nowhere near maxing out even modest
hardware with those jpegs of his. It's probably those smaller fish where
compression is more important, maybe they're fighting for space on a
department-level server or can't get budget approval to upgrade their
drives. I'd hate for those folks to have to settle for a slow (cpu
intensive) wavelet-based compression when a properly configured jpeg
layer might be the compromise they're looking for.

jeff

_______________________________________________
mapserver-users mailing list
mapserver-users at lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users
