[mapguide-internals] RE: std::string not thread safe on Linux

Trevor Wekel trevor_wekel at otxsystems.com
Wed Aug 4 18:33:05 EDT 2010


Hi Bruce,

Option 3 may also give us more consistent internationalization support (date formats, numeric formats, etc.) for Windows and Linux, since ICU is an actively maintained library that follows the evolving Unicode standard.

I am currently investigating option 1.  After some digging, I found out that the internal refcount for std::string on Linux uses atomic operations.  This should make "read only" copies of the string mostly thread safe.  However, the refcount is not getting maintained correctly.  I suspect that the following pattern in the code might be causing us grief.

"CREFSTRING someFunction(someParams);"

Switching it to "STRING someFunction(someParams);" will add an in-flight refcount to the internal string object.  This may help with the double frees.  I'm hoping... It is a much easier fix than ripping out std::string on Linux.
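
To make the difference concrete, here is a minimal sketch.  It assumes CREFSTRING is an alias for a const reference to STRING and that STRING is std::wstring; the Settings class and its accessors are hypothetical and only stand in for any object that owns a string.

```cpp
#include <cassert>
#include <string>

// Assumed aliases for illustration; the real MapGuide definitions may differ.
typedef std::wstring STRING;
typedef const STRING& CREFSTRING;

// Hypothetical owner of a string member.
class Settings
{
public:
    explicit Settings(const STRING& name) : m_name(name) {}

    // Returns a reference: no copy is made, so the caller reads m_name's
    // buffer directly and depends on it staying alive and unmodified.
    CREFSTRING GetNameRef() const { return m_name; }

    // Returns by value: the caller gets its own string object. With a
    // reference counted implementation this takes an extra reference on the
    // shared buffer; with a deep copying one it duplicates the characters.
    STRING GetNameCopy() const { return m_name; }

    void SetName(const STRING& name) { m_name = name; }

private:
    STRING m_name;
};

// A by-value result keeps its contents when the owner changes afterwards.
void DemoReturnByValue()
{
    Settings s(L"tile");
    STRING held = s.GetNameCopy();
    s.SetName(L"map");
    assert(held == L"tile");
    assert(s.GetNameRef() == L"map");
}
```

The sketch only shows the lifetime difference on a single thread; whether the by-value form cures the double frees is exactly what still has to be tested.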

I should know if this corrects the problem sometime tomorrow.

Moving from std::string to versa_string should be a drop-in replacement since they have the same method signatures.  ICU only shares a few signatures with std::string, so it will be much more work to implement.


Regards,
Trevor


-----Original Message-----
From: mapguide-internals-bounces at lists.osgeo.org [mailto:mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Bruce Dechant
Sent: August 4, 2010 4:23 PM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: std::string not thread safe on Linux

Trevor,

Nice find!

I would go with option 2, but is option 3 really needed if 2 works?
I'm just thinking of the extra work needed to implement ICU - maybe the work is trivial.

Thanks,
Bruce

-----Original Message-----
From: mapguide-internals-bounces at lists.osgeo.org [mailto:mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Trevor Wekel
Sent: Wednesday, August 04, 2010 3:41 PM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] std::string not thread safe on Linux

Hi everyone,

After some in-depth digging, I have determined that there is a fundamental difference between the implementation of std::string on Linux/GCC and std::string on Windows/Visual Studio 2008.  std::string on Linux uses an internally reference counted data structure to perform shallow copies of the string data during copy construction or operator=().  On Windows, copy construction and operator=() perform deep copies, so no information is shared between std::string instances.
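
A small check along these lines can show which behaviour a given compiler and library provide.  This is only a sketch: CopyReport and InspectCopy are hypothetical names, and the result for sharedBeforeWrite depends on the standard library in use, so no particular value is claimed for it here.

```cpp
#include <cassert>
#include <string>

// What was observed about a copy of a string (hypothetical helper type).
struct CopyReport
{
    bool sharedBeforeWrite;  // copy and original report the same buffer
    bool sharedAfterWrite;   // still the same buffer after writing to the copy
    bool originalIntact;     // original kept its contents
};

CopyReport InspectCopy(const std::string& original)
{
    assert(!original.empty());

    // Snapshot built from a raw character range, so it never shares a buffer.
    const std::string snapshot(original.data(), original.size());

    std::string copy = original;
    // Access through a const reference so that merely looking at the buffer
    // does not force a reference counted implementation to unshare it.
    const std::string& constCopy = copy;

    CopyReport report;
    report.sharedBeforeWrite = (constCopy.data() == original.data());

    // Writing to the copy must give it a private buffer in either model.
    copy[0] = (copy[0] == 'x') ? 'y' : 'x';
    report.sharedAfterWrite = (constCopy.data() == original.data());
    report.originalIntact = (original == snapshot);
    return report;
}
```

A reference counted implementation reports sharedBeforeWrite as true and a deep copying one reports false; in both cases the write leaves the original untouched.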

The following GCC bug was logged years ago and was suspended due to performance implications: "Lack of Posix compliant thread safety in std::basic_string" (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21334).  In other words, std::string on Linux is absolutely not thread safe.

What does this mean for MapGuide?  Primarily, this issue comes into play with logging (access, trace, error, etc.).  The log writing is performed on a separate thread and we use std::string to propagate information to that thread.  On Linux/GCC, the reference counted structure can be modified by both the logging thread and the worker thread simultaneously, causing unexpected behaviour.  The unexpected behaviour takes the form of "glibc double free" errors and, in some cases, a crash of the MapGuide Server process.
 
This is easily reproducible on servers processing very high operation rates with logging turned on.  For example, I can reproduce the "double free" in under five minutes when serving GETTILEIMAGE requests to 40+ simultaneous users on an 8 core box.  

How do we fix this?  We have a few options:

1.  Identify all areas where strings can propagate from one thread to another and recode them to avoid the propagation of std::string between threads.

2.  Replace std::string on Linux with something else that is thread safe.  On Linux/GCC there is another "string" implementation called versa_string defined in ext/vstring.h.  As far as I know, this implementation performs a deep copy like VS 2008 and should be safer.

3.  Drop std::string altogether and use ICU (http://site.icu-project.org/) on both Windows and Linux.  The documentation seems to suggest that UnicodeString from ICU can be copied in a thread safe manner.  UnicodeString also uses an internally reference counted object so it should improve performance over the deep copied std::string.  As a side effect, this may boost performance on Windows.
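
For option 1, one possible shape of the recoding is to force a private copy at the point where a string crosses to the logging thread.  The sketch below is an assumption, not existing MapGuide code: LogQueue is a hypothetical class, and it uses std::mutex from C++11 where the real server would use its own locking primitives.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical hand-off queue between worker threads and the logging thread.
class LogQueue
{
public:
    // Called by worker threads. Building the stored string from a raw
    // character range allocates a fresh buffer, so nothing reference
    // counted is shared with the caller's string.
    void Push(const std::string& message)
    {
        std::string detached(message.data(), message.size());
        std::lock_guard<std::mutex> lock(m_mutex);
        m_messages.push_back(std::move(detached));
    }

    // Called by the logging thread. Returns false when the queue is empty.
    bool TryPop(std::string& out)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_messages.empty())
            return false;
        out.swap(m_messages.front());
        m_messages.pop_front();
        return true;
    }

private:
    std::mutex m_mutex;
    std::deque<std::string> m_messages;
};

// Single-threaded walk-through of the hand-off.
void DemoLogQueue()
{
    LogQueue queue;
    std::string source(64, 'm');
    queue.Push(source);

    std::string received;
    assert(queue.TryPop(received));
    assert(received == source);
}
```

The hard part of option 1 is not this hand-off itself but finding every place where a string crosses threads.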

I think option 1 will be difficult to achieve due to threading interactions and object caches in MapGuide.  Option 2 should be possible with some #ifdefs (I hope).  And option 3 might be an appropriate course of action for MapGuide 2.3.
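
A rough sketch of what the #ifdef for option 2 could look like follows.  It assumes that STRING is the project-wide string alias, that it is currently std::wstring, and that "versa_string" refers to the __gnu_cxx::__versa_string template from ext/vstring.h with its default base; MakeGreeting and DemoStringAlias are hypothetical and only show that ordinary string code compiles against either alias.

```cpp
#include <cassert>
#include <string>

#if defined(__GLIBCXX__)
// libstdc++ (GCC): switch to the extension string type.
#include <ext/vstring.h>
typedef __gnu_cxx::__versa_string<wchar_t> STRING;
#else
// Other standard libraries: keep using std::wstring.
typedef std::wstring STRING;
#endif

// Ordinary string code written only against the alias.
STRING MakeGreeting(const STRING& name)
{
    STRING result(L"Hello, ");
    result += name;
    return result;
}

void DemoStringAlias()
{
    STRING greeting = MakeGreeting(L"MapGuide");
    assert(greeting == L"Hello, MapGuide");
    assert(greeting.size() == 15);
}
```

Code that hands a STRING to an interface expecting std::wstring would still need an explicit conversion, since the two are distinct types.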


So, what do we do?  I do not believe we can release MapGuide 2.2 until this issue is resolved.


Regards,
Trevor 

_______________________________________________
mapguide-internals mailing list
mapguide-internals at lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapguide-internals


