Is MapServer Thread-safe?

Dan Little danlittle at YAHOO.COM
Tue Jan 22 09:22:03 EST 2008


(steps out on a limb, presents neck to chopping block)

After writing the email below: this probably sounds (via email) much more negative and fire-starting than it is intended to be.  I've tried to edit it to not sound that way, so please take the following statements with a grain of salt.  My general conclusion is, "Threads are nice, processes are probably better (for some reasons), and it's probably a better idea for Mapserver to work on additional features and bug fixes before thread safety."

Frankly, thread-safety-ness seems like a nice goal and all, but really it can convolute the code and can make for poor performance.

Very general:
http://badtux.org/home/eric/editorial/threads.php

Good academic discussion of thread programming problems:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf

Also, I do not see thread-safety-ness serving the "mass appeal" of Mapserver users and developers.  Here is a (somewhat) limited set of Mapserver applications that I've found (in moderate-to-high demand environments):

1) WFS/WMS servers, Mapserver image serving, and imagemap generation.  From my experience these are the actions that the vast majority of Mapservers perform on the most regular basis.  Mapserver is instantiated as a CGI script, reads the mapfile, processes the CGI request, and returns an appropriate result based on the request.  After the result is returned, Mapserver ends.  You might be saying, "Yes, but this creates the ever-so-scary heavyweight process." To which I say the following:
    a) There is less risk of memory leaks with a heavyweight process.  Since the heavyweight process can typically be run in a sequential fashion, there is no need for complicated semaphores, and generally there is no need to write code that will try to grab the same part of memory simultaneously.
    b) Heavyweight processes can use more memory, yes, but given the Mapserver binary's relative size, even with every option under the sun compiled into the binary, a system with 1 GB of RAM and a Pentium IV can still adequately serve nearly 300 users without a significant bottleneck (assuming all other parts of the system are adequate for those demands, e.g., the network to which the machine is connected).  This "baseless" claim comes from the experience we had at the City of Saint Paul serving nearly that many users off of a relatively pathetic single-CPU mid-range Dell server.
    c) "The operating system is better at process management than you are."  This is something I've had to learn the HARD WAY!  It's also something I spent a lot of time coding before I learned I should not have, because entire teams of developers had already done a much better job than I was doing.  Having to consider resource management atop actual application development, while trying to do any form of optimization, can leave any developer chasing their tail.  And even once that code is "perfected" there is still a fight with the operating system over threads.  Thread clean-up is a very difficult thing for the operating system to perform (especially Windows, which really only "emulates" POSIX thread management).  The OS has no deterministic way of knowing when the code is done with the thread, and most will keep the thread around for a while after it thinks it's done (even if the code tells the OS to free the thread) because it doesn't want to upset the execution of the code.  When a process calls exit() or reaches the end of the code block, the OS can rightly say those pages of memory can be freed.
    d) "But the data will be unloaded from memory and my files are BIG."  Not true on any modern operating system.  Operating systems manage disk cache for a reason.  Every modern OS, all the UNIX/Linux derivatives and NT4+, implements a very good method for caching frequently used files.  Even if the file is large, it will be kept in memory if it is being accessed frequently enough.
    In conclusion, these users would not see a major benefit from people slaving hours away at thread-safety, when in fact, greater optimization would "help" them the most.
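
To make the process-per-request point in (a) and (c) concrete, here is a minimal Python sketch.  It is not Mapserver code: the WORKER script and the handle_request() helper are hypothetical stand-ins for a CGI worker, and the layer names are invented.  Each request runs in a fresh process that keeps its state private (so no locks are needed) and then exits, leaving memory clean-up to the operating system.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for a CGI worker (an assumption, not Mapserver itself).
# It builds private state, prints its PID and a result, then exits; the OS
# reclaims all of its memory when the process ends.
WORKER = r"""
import os, sys
cache = {}                              # private to this process; no lock needed
cache[sys.argv[1]] = len(sys.argv[1])   # pretend "work" on the request
print(os.getpid(), cache[sys.argv[1]])
"""


def handle_request(layer_name):
    """Serve one request in a fresh process and return (pid, result)."""
    completed = subprocess.run(
        [sys.executable, "-c", WORKER, layer_name],
        capture_output=True,
        text=True,
        check=True,
    )
    pid_text, result_text = completed.stdout.split()
    return int(pid_text), int(result_text)


parent_pid = os.getpid()
# Requests are handled one after another, each in its own short-lived process.
results = [handle_request(name) for name in ("roads", "parcels", "rivers")]
```

Nothing in the worker has to be thread-safe, because no two requests ever share an address space.  The cost is one process start-up per request, which is the trade-off discussed in (b).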

2) Mapscripters!  Mapscript is fantastic: it is a very well documented API into Mapserver, which is a coalescing of a number of very useful libraries.  I'm quite sure a lot of folks here have spent some time working with Mapscript, and a number of them in high-demand environments.  I understand the .Net family of languages breaks this rule, but most of the "Mapscript languages" really do not require thread safety to run efficiently or to handle a great number of users.  Arguably most of the scripting languages in a CGI environment perform the following steps: load the interpreter (php/python/perl/etc.), parse the script, load/link the appropriate modules/libraries, execute the code, and clean up the process.  The vast majority of these are done using a "heavyweight" process (fork()'d!).
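
As a rough illustration of that CGI lifecycle, here is a small Python sketch of what one such script does per request.  The run_cgi() function and the "layer" parameter are assumptions made for the example; this does not show how Mapserver or Mapscript actually parse requests.  Under CGI the web server passes the query string in the QUERY_STRING environment variable, the script writes headers and a body, and then the process ends.

```python
from urllib.parse import parse_qs


def run_cgi(environ):
    """Handle one CGI-style request and return the full response text.

    `environ` is a mapping such as os.environ; the web server is assumed
    to have set QUERY_STRING before starting this process.
    """
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # "layer" is a hypothetical parameter name used only for illustration.
    layer = params.get("layer", ["(none)"])[0]
    body = f"requested layer: {layer}\n"
    # A CGI response is headers, a blank line, then the body.
    return "Content-Type: text/plain\r\n\r\n" + body


# Simulate a single request; a real deployment would pass os.environ instead.
response = run_cgi({"QUERY_STRING": "layer=roads&mode=map"})
```

Everything the script allocates lives only for the duration of that one request, which is why this style of deployment gets by without thread safety.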


----- Original Message ----
From: Umberto Nicoletti <umberto.nicoletti at GMAIL.COM>
To: MAPSERVER-DEV at LISTS.UMN.EDU
Sent: Tuesday, January 22, 2008 7:34:56 AM
Subject: Re: [UMN_MAPSERVER-DEV] Is MapServer Thread-safe?


>
> We're tending to summarize these in the thread safety faq. However
> it's slightly outdated and I'm not totally sure about the actual state
> of the components enumerated there.
>

I update that FAQ entry whenever I find out that a new component has
been made thread safe (and I have some spare time ;-)).
Anyway, I think it pretty accurately describes the current state,
except maybe for the new and not yet released SQLServer 2008 plugin.

Umberto





