System Configuration

Ed McNierney ed at TOPOZONE.COM
Thu Nov 8 13:58:56 PST 2007


Bruce -

 

Thanks; that's a good start.  5,000 requests at a time is a lot.  If
you estimate that the average user will look at a map for 10 seconds
before requesting another one, you're talking about 50,000 simultaneous
human users.
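
A back-of-the-envelope sketch of that arithmetic follows.  It is only an
illustration: the function name is made up, the 10 seconds is treated as
the interval between one user's successive requests, and the one-second
service time is an assumption (the 50,000 figure works out if each
request keeps the server busy for about a second).

```python
def implied_human_users(concurrent_requests, request_interval_s, service_time_s=1.0):
    """Estimate how many people produce a given number of in-flight requests.

    Each user has a request in flight for service_time_s out of every
    request_interval_s seconds, so only that fraction of the user
    population is hitting the server at any instant.
    service_time_s=1.0 is an assumed value, not a measurement.
    """
    return concurrent_requests * request_interval_s / service_time_s


if __name__ == "__main__":
    # 5,000 in-flight requests, one request per user every 10 seconds
    print(implied_human_users(5000, 10.0))
```

If the real service time is shorter than a second, the same 5,000
in-flight requests imply proportionally more human users.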

 

Thanks for posting the map file - that helps.  There is one huge thing
you should be aware of and think about right away - the difference
between mode=browse and mode=map.  If you are using browse mode with an
HTML template (as it appears you are) you should reconsider that
decision.  When you use MapServer in browse mode, it needs to generate
the map image, legend image, etc. and then write them all to a local
disk before returning an HTML template to the client with the URLs of
those images embedded in it.  Writing to disk is the slowest thing you
can do on your server.  Having 5,000 simultaneous disk writes, with
simultaneous reads to the same disk, is very nearly pure evil.  That is
a huge drag on performance.  You really can't sustain nearly as many
users on a given machine in browse mode as you could in map mode.  What
kind of disk subsystem is being used for the temp files?

 

Map mode (mode=map) is a simple HTTP request for an image, and the
image, after being built, is sent directly to the client with no disk
I/O involved.  Much nicer.  Much faster.
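
As a rough illustration, a map-mode request is just a URL with
mode=map in the query string.  The sketch below builds one using the
standard MapServer CGI parameters (map, mode, layers, mapext, mapsize);
the host name, mapfile path, layer names and extent are hypothetical
placeholders, and the 419 x 403 size is taken from the description of
the application further down the thread.

```python
from urllib.parse import urlencode


def map_mode_url(base_url, mapfile, layers, extent, size=(419, 403)):
    """Build a MapServer CGI URL that returns the image directly (mode=map).

    base_url, mapfile, layers and extent are caller-supplied; the values
    used in the example below are placeholders, not real endpoints.
    """
    params = {
        "map": mapfile,
        "mode": "map",
        "layers": " ".join(layers),                   # space-separated layer names
        "mapext": " ".join(str(v) for v in extent),   # minx miny maxx maxy
        "mapsize": f"{size[0]} {size[1]}",            # width height in pixels
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(map_mode_url("http://example.com/cgi-bin/mapserv",
                       "/maps/demo.map",
                       ["parcels", "labels"],
                       (0, 0, 100, 100)))
```

The resulting URL can be used directly as the src of an img tag in a
page your own application generates, so MapServer never has to write
the image to disk.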

 

If you must use browse mode, then I'd suggest you create a RAM disk to
hold the temp images, with appropriate monitoring to clean out old files
promptly.  But can you pre-build a set of legends, etc. so you can just
embed those and not crank them out every time?  You can also get some
benefit from storing your input shapefiles on a RAM disk, but don't
devote any RAM to that purpose if you're still writing temp files to
hard disk.
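
The "clean out old files promptly" part could be as simple as a small
script run every minute or so.  This is a minimal sketch, not a tested
production tool; the function name, the directory and the age threshold
are all assumptions you would adapt to your own setup.

```python
import os
import time


def purge_old_files(directory, max_age_s, now=None):
    """Delete regular files in `directory` older than max_age_s seconds.

    Returns the names of the files that were removed.  Subdirectories
    are left alone.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in os.scandir(directory):
        if entry.is_file() and now - entry.stat().st_mtime > max_age_s:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

For example, purge_old_files("/mnt/ramdisk/ms_tmp", 300) would drop
temp images more than five minutes old (that path is hypothetical).
Pick an age comfortably longer than the time a browser needs to fetch
the images named in the returned HTML.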

 

Let me first toss off a few generic suggestions that probably won't help
much but are worth doing:

 

1. Get rid of unused fonts in your font file.  MapServer will
reach out and touch each one for each map request.

2. If your shapefiles don't change often, preprocess them to
create a separate shapefile for each CLASS, so you don't have to filter
as much.

3. If you must use CLASSes, organize them so the most
commonly-used class comes first in each LAYER.

4. If you have a certain number of combinations of shapefiles used
in a given request, create multiple map files with only those layers.
For example, if your 12 layers are one set of 4 base layers that are
always used, and then 8 more layers only one of which is displayed at a
time, create 8 map files, each with only 5 layers instead of 12 - the
four base layers and one specific overlay layer.  Then use your
application code to figure out which map file to select.
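
The application-side selection in suggestion 4 can be a simple lookup.
The sketch below assumes one pre-built map file per overlay; the
overlay names, directory and file-naming scheme are hypothetical.

```python
# Hypothetical overlay names.  One pre-built map file exists per overlay,
# each holding the four base layers plus that single overlay layer.
OVERLAYS = {"zoning", "parcels", "floodplain"}


def mapfile_for(overlay, mapfile_dir="/var/maps"):
    """Return the path of the slimmed-down map file for one overlay.

    Checking against a fixed set of names also keeps arbitrary
    user-supplied strings from ending up in a file path.
    """
    if overlay not in OVERLAYS:
        raise ValueError(f"unknown overlay: {overlay!r}")
    return f"{mapfile_dir}/base_plus_{overlay}.map"
```

Each request then loads a 5-layer map file instead of parsing all 12
layers and skipping most of them.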

 

And make sure you're using shptree to generate spatial indexes for all
your shapefiles!
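
One way to make sure no shapefile is missed is to script it.  This
sketch looks for .shp files that have no .qix index next to them and
runs shptree on each one; it assumes shptree is on the PATH, and the
function names and directory layout are made up for illustration.  It
does not detect indexes that are older than their shapefile.

```python
import subprocess
from pathlib import Path


def shapefiles_missing_index(data_dir):
    """List .shp files under data_dir that have no .qix file beside them."""
    return sorted(
        shp for shp in Path(data_dir).rglob("*.shp")
        if not shp.with_suffix(".qix").exists()
    )


def build_indexes(data_dir, run=subprocess.run):
    """Run `shptree <file>` for every shapefile lacking an index.

    `run` is injectable so the function can be exercised without
    actually invoking shptree.
    """
    for shp in shapefiles_missing_index(data_dir):
        run(["shptree", str(shp)], check=True)
```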

 

This looks like it should be an application capable of running pretty
darn quickly.

 

-          Ed

 

Ed McNierney

Chief Mapmaker

Demand Media / TopoZone.com

73 Princeton Street, Suite 305

North Chelmsford, MA  01863

ed at topozone.com

Phone: +1 (978) 251-4242

Fax: +1 (978) 251-1396

From: UMN MapServer Users List [mailto:MAPSERVER-USERS at LISTS.UMN.EDU] On
Behalf Of Bruce Cheney
Sent: Thursday, November 08, 2007 12:17 PM
To: MAPSERVER-USERS at LISTS.UMN.EDU
Subject: Re: [UMN_MAPSERVER-USERS] System Configuration

 

Many good questions. I will see if I can catch them all. A bit on the
nature of the application. 

*	We are only using vector data (we assumed that the raster would
slow it down).
*	We are serving many different maps stored separately with the
same composition of layers. Each map has 6 layers (4 polygon and 2 point
layers for labeling). 
*	Each of the different map sets has differing quantities of
features.  For the most significant layers they average around 10,000
features but may be as high as 100,000. 
*	A majority of the requests are to display a small area of one of
the maps so the rendering focuses in to a few features.  The user will
query a database which will allow for viewing the map that is zoomed to
the area of interest.
*	No layer reprojecting (we assumed this would also slow it down).
*	The output map is PNG with dimensions 419 X 403.
*	We are using PHP_mapscript to generate the requests.  The
parameters for the map generation come from a database and the user
requested location.  So there are a few lines of code to find the
location on the map and generate the images.
*	The mapfile contains about 12 layers.  Several layers to display
the primary polygon layer thematically and a couple extra to show the
polygons with outlines. 
*	Data is stored in Shapefiles

 

I made attempts to stream-line the use of extra features to ensure the
speed.  I certainly may be using items that hurt instead of help.  Here
is the mapfile.  Now that I look at the mapfile there may be a couple
items that I originally had intended to use but are now just relics and
time wasters.

<<postforwebforum.map>> 
Now as for the users.  We are assuming 5000 simultaneous - all at the
same instant. This would assume a substantially larger group of users
accessing the site at the same time.   We assume this to be the peak
stress for Stage 1 of the app.  






Bruce - 

My channeling sensors went off when Frank rang <g>. 

It is certainly true that a bit of experience and contemplation can help
you discover optimization opportunities that aren't immediately
self-evident.  Can you describe the nature of your map application?  Are
you using raster data, vector data, or both?  What size is your data, in
numbers of features and/or files?  What kind of disk subsystem is being
used?  Is there layer reprojection going on?

Generalizations are rarely helpful (except for this one).  It's like
being told the average man is 5' 7" tall - it tells you nothing about
how tall I am.  MapServer performance depends on a number of factors,
but the best place to start is a detailed understanding of what exactly
you're trying to do with MapServer.

It would be most helpful to us if you could post your map file and a
sample URL request, preferably one that is externally (publicly)
visible.  And can you define what you mean by "simultaneous" users?  Do
you mean 5,000 map requests all being generated at exactly the same
time?  Or do you mean 5,000 human users asking for a new map every X
seconds or so?  And if the latter, what value are you using for X?

     - Ed 

Ed McNierney 
Chief Mapmaker 
Demand Media / TopoZone.com 
73 Princeton Street, Suite 305 
North Chelmsford, MA  01863 
Phone: 978-251-4242, Fax: 978-251-1396 
ed at topozone.com 

 

-----Original Message----- 
From: UMN MapServer Users List [mailto:MAPSERVER-USERS at LISTS.UMN.EDU] On
Behalf Of Frank Warmerdam
Sent: Wednesday, November 07, 2007 8:26 PM 
To: MAPSERVER-USERS at LISTS.UMN.EDU 
Subject: Re: [UMN_MAPSERVER-USERS] System Configuration 

Bruce Cheney wrote: 
> We have been given a requirement to support 5000 simultaneous users.
> What we are finding is that MapServer bogs down around 400
> simultaneous users on a test machine.  It looks like it is likely
> slowing because of the threading issue.  We haven't tested on a
> production machine but are estimating that it should support double
> what our test machine could handle (double the processor and RAM).  So
> at least 800 simultaneous users.  Divide that out with the 5000 and we
> need a minimum of 6-7 web servers supporting MapServer.  We will
> certainly scale this as is needed but I do need some idea going in as
> to what is going to be required.

Bruce, 

I'm curious how many map requests per minute you expect 800 simultaneous
users to generate. 

> Does this sound like results that others expect or is this quantity
> above what others have tested?  Also, does anyone know of a solution in
> the works to make MapServer thread safe and/or up the overall
> speed?  I am not complaining about the speed, just wondering what is in
> the works.

In various aspects MapServer is already thread safe though there are
also known "unsafe" components, and some components are wrapped by big
locks that significantly reduce the value of multiple threads.

Progress occurs by fits and starts, largely based on support from user
organizations depending on multi-threading.  For instance, in 5.0 I
implemented locking around OGR for a client of mine in Australia.

(This is a subtle way of suggesting you hire someone to make this happen
if it is what you want!) 

All this aside, by default MapServer is *massively multi-threaded*. 
I say this since the default operation is to start a new cgi instance
for each request - each is essentially an independent thread.

Of course, the downside of whole-process cgi style multithreading is
that very little context is preserved from request to request.  Map
files, data file headers, etc all need to be reparsed for each request.

My point here is that you need to think carefully about the application
flow to take much advantage of multiple threading within a single
process.

Also, if I may channel Ed, if you want to squeeze more performance out
of MapServer, you really need to start by figuring out what it is
spending its time doing.  Where is it spending its time?

  o waiting for disk?  (perhaps you are reading more data than you need?)
  o rendering?  (perhaps your data is overdense, or you are using
    expensive rendering options?)
  o parsing mapfiles?  (perhaps your mapfile has too many unused layers?)
  o etc.
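
One low-tech way to answer that question from a MapScript-style
application is to wrap each stage of a request in a timer and compare
the totals.  The helper below is a generic sketch, not part of
MapServer; the class and the stage names in the usage note are
hypothetical, and it only measures wall-clock time around whatever code
you choose to wrap.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time per named stage across many requests."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the wrapped code raises.
            self.totals[name] += time.perf_counter() - start

    def report(self):
        """Return (stage, seconds) pairs, slowest stage first."""
        return sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)
```

For example, wrapping the mapfile load in timer.stage("parse_mapfile")
and the draw call in timer.stage("render"), then looking at
timer.report() after a few hundred requests, shows which of the
candidates above deserves attention first.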

Best regards, 
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | President OSGeo, http://osgeo.org
