[mapguide-internals] RE: Death by mutex

Trevor Wekel trevor_wekel at otxsystems.com
Thu Jan 13 19:10:15 EST 2011


Hi Bruce,

I have attached a patch to http://trac.osgeo.org/mapguide/ticket/1584.  It seems to correct the issue.  I will run some longer tests against the patch just to be sure.

Regards,
Trevor

-----Original Message-----
From: mapguide-internals-bounces at lists.osgeo.org [mailto:mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Bruce Dechant
Sent: January 13, 2011 10:17 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] RE: Death by mutex

Trevor,

Nice find. Issues like this are always fun to find :)
Please keep us posted and I would be happy to help with any code review for this.

Thanks,
Bruce

-----Original Message-----
From: mapguide-internals-bounces at lists.osgeo.org [mailto:mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Trevor Wekel
Sent: Thursday, January 13, 2011 10:08 AM
To: MapGuide Internals Mail List
Subject: [mapguide-internals] Death by mutex

Hi everyone,

I have found a serious interaction issue in the MapGuide Server.  I did a quick check in Subversion and this issue is present in MGOS 2.1, MGOS 2.2, MGE 2010 and MGE 2011.

Basically, if you update any feature source while the MapGuide Server is under load serving pooled feature sources (ODBC, SQL Server, Oracle, etc.) then there is a risk of a deadlock occurring between the feature service cache and the Fdo connection manager.

What does this mean?  Creating or editing feature sources on a live site can take it down.  How did I find it?  Would you believe a 10,000 line The Grinder script?


For those with code access, here's a pseudo stack:

Case 1: A thread has the connection manager lock and wants the feature service cache lock
MgFdoConnectionManager::Open(MgResourceIdentifier* resourceIdentifier)
  acquires MgFdoConnectionManager mutex
  if a pooled connection:
    MgFdoConnectionManager::FindFdoConnection(MgResourceIdentifier* resourceIdentifier)
      MgCacheManager::GetFeatureSourceCacheItem
        MgFeatureServiceCache::GetFeatureSource
          acquires MgFeatureServiceCache mutex

Case 2: A thread has the feature service cache lock and wants the connection manager lock
MgApplicationRepositoryManager::NotifyResourceChanged
  MgCacheManager::NotifyResourceChanged(MgResourceIdentifier* resource)
    if a feature source:
      acquires MgFeatureServiceCache mutex
      MgFdoConnectionManager::RemoveCachedFdoConnection
        acquires MgFdoConnectionManager mutex


And this is the code we need to change.  If we acquire the connection manager mutex first, we should be able to avoid the deadlock. I have not done a complete code review yet so there may be other cases where this could occur.

Server\src\Common\Manager\CacheManager.cpp
void MgCacheManager::NotifyResourceChanged(CREFSTRING resource) {
    if (STRING::npos != resource.rfind(MgResourceType::FeatureSource))
    {
        // The mutex usage and the method call order here are important
        // because they ensure all the caches are in sync.
        ACE_MT(ACE_GUARD(ACE_Recursive_Thread_Mutex, ace_mon, m_featureServiceCache.m_mutex));

        m_fdoConnectionManager->RemoveCachedFdoConnection(resource);
        m_featureServiceCache.RemoveEntry(resource);
    }
}
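
To illustrate the lock ordering idea, here is a minimal standalone sketch. It is not the patch attached to the ticket and it does not use the MapGuide or ACE classes. The ConnectionManager and FeatureServiceCache structs, the free functions, and the resource name are all hypothetical stand-ins, and std::recursive_mutex is used in place of ACE_Recursive_Thread_Mutex. The only point is that both code paths take the connection manager mutex first and the feature service cache mutex second, so neither thread can hold one lock while waiting on the other in the opposite order.

```cpp
#include <mutex>
#include <set>
#include <string>
#include <thread>

// Hypothetical stand-ins for the two components; only the locking is modeled.
struct ConnectionManager {
    std::recursive_mutex mutex;
    std::set<std::string> pooled;   // resources with a cached connection
};

struct FeatureServiceCache {
    std::recursive_mutex mutex;
    std::set<std::string> entries;  // resources with a cache entry
};

// Case 1 path: connection manager lock first, then feature service cache lock.
void Open(ConnectionManager& cm, FeatureServiceCache& cache, const std::string& resource) {
    std::lock_guard<std::recursive_mutex> cmLock(cm.mutex);
    std::lock_guard<std::recursive_mutex> cacheLock(cache.mutex);
    cache.entries.insert(resource);
    cm.pooled.insert(resource);
}

// Case 2 path, reordered: it now takes the connection manager lock BEFORE the
// feature service cache lock, matching the order used by Open().
void NotifyResourceChanged(ConnectionManager& cm, FeatureServiceCache& cache, const std::string& resource) {
    std::lock_guard<std::recursive_mutex> cmLock(cm.mutex);
    std::lock_guard<std::recursive_mutex> cacheLock(cache.mutex);
    cm.pooled.erase(resource);
    cache.entries.erase(resource);
}

// Runs both paths concurrently. Returns true if both threads finished and the
// two containers still agree with each other.
bool RunStress(int iterations) {
    ConnectionManager cm;
    FeatureServiceCache cache;
    const std::string resource = "Library://Data/Parcels.FeatureSource";  // made-up name

    std::thread opener([&] {
        for (int i = 0; i < iterations; ++i) Open(cm, cache, resource);
    });
    std::thread editor([&] {
        for (int i = 0; i < iterations; ++i) NotifyResourceChanged(cm, cache, resource);
    });
    opener.join();
    editor.join();
    return cm.pooled == cache.entries;
}
```

Calling RunStress(10000) from a test harness should return rather than hang. If NotifyResourceChanged were changed back to lock the cache first, the same run could deadlock intermittently, which is the behaviour described in Case 1 and Case 2 above. Other code paths that take these two locks would need the same ordering for the guarantee to hold.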


Regards,
Trevor
_______________________________________________
mapguide-internals mailing list
mapguide-internals at lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapguide-internals


