[mapguide-internals] std::string not thread safe on Linux

Traian Stanev traian.stanev at autodesk.com
Thu Aug 5 15:31:57 EDT 2010


Hi Trevor,

Instead of:

    CREFSTRING function(params);

Is it not better to change to:

    const wchar_t* function(params);

?

In the change you describe (returning STRING) could you still potentially end up making an assignment of string object from one thread to another? I mean the STRING object which you return is constructed by function() and then copied by the calling code in the assignment? I admit I am not exactly clear on where the thread boundary is in this case though.

Regardless of all this, you would be making two string allocations instead of one (one by function() and one by the caller of function), while returning const wchar_t* would mean you are still only allocating one string.


Traian


-----Original Message-----
From: mapguide-internals-bounces at lists.osgeo.org [mailto:mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Trevor Wekel
Sent: Thursday, August 05, 2010 3:26 PM
To: MapGuide Internals Mail List
Subject: RE: [mapguide-internals] std::string not thread safe on Linux

Hi everyone,

I started running through the implementation for option 2 with versa_string.  We have references to std::wstring (wide char std::string) all over the place and they all have to be replaced with versa_string.  After changing more than 50 files in Common alone, I decided to take a harder look at option 1.

So I did some "hacking" to try and decouple std::string instances which could be shared by multiple threads:

I replaced "CREFSTRING function(params);" with "STRING function(params);"  Decoupling is difficult when you pass object references around.

For selected objects (MgUserInformation for now), I forced a deep copy of string by calling c_str() during gets and sets.  For example:

MgUserInformation::MgUserInformation(CREFSTRING session)
{
  m_sessionId = session;
  // is replaced with
  m_sessionId = session.c_str();
}

STRING MgUserInformation::GetSessionId()
{
  return m_sessionId;
  // is replaced with
  return (STRING)m_sessionId.c_str();
}

I suspect there are other places in the code which will require the same workaround.  My load test is only executing Http GETRESOURCECONTENT and GETTILEIMAGE ops.  I also wonder if this will be maintainable moving forward...

Now for some good news.  So far, these workarounds seem to correct the behaviour.  Half an hour of five minute runs with different user loads did not throw any "double frees" or seg faults.  I have just started a one hour load test with 200 concurrent users to see what happens.

If that works, I will start a 24 hour test.  Not sure I will have enough disk space though.  MapGuide generates roughly 2GB of access log each hour under these test conditions (2500+ ops/second).
 
Regards,
Trevor


-----Original Message-----
From: mapguide-internals-bounces at lists.osgeo.org [mailto:mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Haris Kurtagic
Sent: August 5, 2010 1:49 AM
To: MapGuide Internals Mail List
Subject: Re: [mapguide-internals] std::string not thread safe on Linux

I agree with Traian, that would be good regardless of string threading issue.

I am not sure if i understood correctly option 3, but I think moving
away from stl in general for MG strings wouldn't be good. We would
need to do string copying to use lot of standard libraries.

Haris

On Thu, Aug 5, 2010 at 1:38 AM, Traian Stanev
<traian.stanev at autodesk.com> wrote:
>
> But also, MapGuide should probably not be copying STL strings across thread boundaries (in fact it should avoid copying STL strings in all cases for performance reasons), so I'm not sure whether biting the bullet and fixing it as option 1 is not the best way to go...
>
> Traian
>
>
> -----Original Message-----
> From: mapguide-internals-bounces at lists.osgeo.org [mailto:mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Haris Kurtagic
> Sent: Wednesday, August 04, 2010 6:37 PM
> To: MapGuide Internals Mail List
> Subject: Re: [mapguide-internals] std::string not thread safe on Linux
>
> Huh, I would think that they are thread-safe, some time ago std
> libraries had issues with refcounting. Most of them drop ref counting
> or using atomic operations.
> Trevor can you point me to std library source code of implementation
> which you think is used in your compilation ?
>
> Haris
>
> On Wed, Aug 4, 2010 at 11:41 PM, Trevor Wekel
> <trevor_wekel at otxsystems.com> wrote:
>> Hi everyone,
>>
>> After some in-depth digging, I have determined that there is a fundamental difference between the implementation of std::string on Linux/GCC and std::string on Windows/Visual Studio 2008.  std::string on Linux uses an internally reference counted data structure to perform shallow copies of the string data during assignment or operator=().  On Windows, assignment and operator=() are deep copied so no information is shared between std::string instances.
>>
>> The following GCC bug was logged years ago and was suspended due to performance implications "Lack of Posix compliant thread safety in std::basic_string" http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21334.  In other words, std::string on Linux is absolutely not thread safe.
>>
>> What does this mean to MapGuide?  Primarily, this issue comes into play with logging (access, trace, error, etc).  The log writing is performed on a separate thread and we use std::string to propagate information to that thread.   On Linux/GCC, the reference counted structure can be modified by both the logging thread and the worker thread simultaneously causing unexpected behaviour.  Unexpected behaviour takes the form of "glibc double free" and in some cases, a crash of the MapGuide Server process.
>>
>> This is easily reproducible on servers processing very high operation rates with logging turned on.  For example, I can reproduce the "double free" in under five minutes when serving GETTILEIMAGE requests to 40+ simultaneous users on an 8 core box.
>>
>> How do we fix this?  We have a few options:
>>
>> 1.  Identify all areas where strings can propagate from one thread to another and recode them to avoid the propagation of std::string between threads.
>>
>> 2.  Replace std::string on Linux with something else that is thread safe.  On Linux/GCC there is another "string" implementation called versa_string defined in ext/vstring.h.  As far as I know, this implementation performs a deep copy like VS 2008 and should be safer.
>>
>> 3. Drop std::string altogether and use ICU on both Windows and Linux http://site.icu-project.org/.  The documentation seems to suggest that UnicodeString from ICU can be copied in a thread safe manner.  UnicodeString also uses an internally reference counted object so it should improve performance over the deep copied std::string.  As a side effect, this may boost performance on Windows.
>>
>> I think option 1 will be difficult to achieve due to threading interactions an object caches in MapGuide.  Option 2 should be possible with some #ifdefs (I hope).  And option 3 might be an appropriate course of action for MapGuide 2.3.
>>
>>
>> So, what do we do?  I don't not believe we can release MapGuide 2.2 until this issue is resolved.
>>
>>
>> Regards,
>> Trevor
>>
>>
>> _______________________________________________
>> mapguide-internals mailing list
>> mapguide-internals at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
>>
>>
> _______________________________________________
> mapguide-internals mailing list
> mapguide-internals at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> _______________________________________________
> mapguide-internals mailing list
> mapguide-internals at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
>
_______________________________________________
mapguide-internals mailing list
mapguide-internals at lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapguide-internals


_______________________________________________
mapguide-internals mailing list
mapguide-internals at lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapguide-internals


More information about the mapguide-internals mailing list