[gdal-dev] RFC 47 and Threading

Tomaka, Jacek jacek.tomaka at hexagongeospatial.com
Sun Nov 30 20:58:45 PST 2014


Blake, Even,
It looks like https://gist.github.com/flippmoke/16cd26a0247422f25840 improves multi threaded access even in GDAL 1.11.
But as mentioned before I believe it is not thread safe ;) At least that is what purists would say.
Regards.
Jacek Tomaka

From: Tomaka, Jacek
Sent: Sunday, 30 November 2014 6:00 PM
To: 'Blake Thompson'; Even Rouault
Cc: gdal-dev at lists.osgeo.org
Subject: RE: [gdal-dev] RFC 47 and Threading

Blake,
I tested your proposed change. It is better but sleeps are still visible. BTW:  isn’t it an example of double checked locking?
Even,
I am using Intel VTune Amplifier 2013’s Locks and Waits analysis. Please find attached screenshots. The bottommost four threads are performing GDAL workload. Light green waits can be seen 99% of them is in quanta of 0.125s.    Having seen that I set up a breakpoint on sleep in CPLAcquireMutex to identify which mutex it was.

Regards.
Jacek Tomaka

From: Blake Thompson [mailto:flippmoke at gmail.com]
Sent: Saturday, 29 November 2014 2:12 AM
To: Even Rouault
Cc: Tomaka, Jacek; gdal-dev at lists.osgeo.org<mailto:gdal-dev at lists.osgeo.org>
Subject: Re: [gdal-dev] RFC 47 and Threading

Jacek and Even,
I already made an improvement on the pthread version of this:

https://github.com/flippmoke/gdal/blob/trunk/gdal/port/cpl_multiproc.cpp#L1135
I am sure we could do something similar to it to make an improvement in the non pthread version.
Adding at line:

https://github.com/flippmoke/gdal/blob/trunk/gdal/port/cpl_multiproc.cpp#L165
We could do something like this:
https://gist.github.com/flippmoke/16cd26a0247422f25840
Thanks,

Blake

On Fri, Nov 28, 2014 at 11:49 AM, Even Rouault <even.rouault at spatialys.com<mailto:even.rouault at spatialys.com>> wrote:
Jacek,

Did you identify which call sites triggered hCOAMutex to be taken ? Ideally we
should try to minimize them.
For the sake of my curiosity, which method did you try to evaluate time spend
in mutex sleeping ?
Another improvement would be that instead of dynamically created mutexes, we
could use statically created mutexes ( PTHREAD_RECURSIVE_MUTEX_INITIALIZER ).
That would avoid that hCOAMutex to become the "BGL" (Big GDAL Lock :
http://en.wikipedia.org/wiki/Giant_lock)
Building GDAL with MUTEX_NONE and using it in a multi-threaded case is a
potential risky game. This might work for some particular workloads & drivers,
but not in the general case. For example even the GTiff driver needs a mutex to
initialize libtiff/libgeotiff. See GTiffOneTimeInit() and there might be more
subtle stuff elsewhere.
But yes, I agree that an open flag to indicate per-dataset cache is an
interesting track to explore.

Even
> Blake, Even,
> I played a bit with pull request #39. I enabled per dataset cache by setting
> GDAL_DATASET_CACHING to YES. Awesome work!
> Our application code opens file in GDAL and holds GDALDataset per thread.
> Different threads can request same blocks but because cache should be per
> dataset i expected to see very little contention at the expense of some of
> the blocks being read multiple times.
>
> To my surprise i saw that nothing changed in terms of time spent sleeping.
> After some investigation i realized that the contention point is static mutex
> hCOAMutex. Having noticed that i saw MUTEX_NONE ifdef there so i decided to
> define it to effectively get rid of mutexes in GDAL.
> Guess what happened after recompilation. Because we only ever access
> GDALDataset from single dataset and the new GDALBlockManager is now per
> GDALDataset everything so far* (with the exception of PAMDataset destruction
> because it does not like null mutexes) works fine and scales so much better
> with the number of threads.
>
> *Tested reading only using GTiff driver 4 threads reading tiles smaller than
> blocks, sometimes two threads can read data from the same block .
>
> Do you see the same problem with hCOAMutex?
> It would be good if i could say: give me per dataset cache, open this dataset
> with no locking. Unless the global cache provides its operations with less
> sleeping.
>
> Regards.
> Jacek Tomaka
> ________________________________
> Od: gdal-dev-bounces at lists.osgeo.org<mailto:gdal-dev-bounces at lists.osgeo.org> [gdal-dev-bounces at lists.osgeo.org<mailto:gdal-dev-bounces at lists.osgeo.org>] w
> imieniu Blake Thompson [flippmoke at gmail.com<mailto:flippmoke at gmail.com>]
> Wys³ano: 29 sierpnia 2014 02:35
> Do: Even Rouault
> DW: gdal-dev at lists.osgeo.org<mailto:gdal-dev at lists.osgeo.org>
> Temat: Re: [gdal-dev] RFC 47 and Threading
>
> Even and Andre,
>
> > I want to start off by saying a big thanks to Blake for taking his time
> > to tackle what can only be a very difficult problem.
> >  From what I can observe, the current discussion seems to be around the
> > boundary of who should be responsible for ensuring thread safety around
> > the block cache. The core of GDAL versus the individual drivers.
>
> The core will necessarily have to know about thread-safety because the block
> cache is there. The discussion is more whether the drivers must also
> necessary
> be thread aware, or if core mechanisms are sufficient to hide this detail to
> the
> drivers. And potentially offering to drivers a mechanism to deal themselves
> with
> thread-safety if they can have a more optimized implementation than the
> default
> one.
>
> Agreed, most specifically how to hide the detail of the protection of the
> pointer to the block cache's data that is passed through IReadBlock,
> IWriteBlock, and IRasterIO.
>
>
> > While I
> > see why such a conversation is important, as far as I am concerned, the
> > most important part should be how it affects users of GDAL at the
> > interface level.
>
> Agreed.
>
> > That is, if an application that is threaded is trying
> > to use GDAL, how does it ensure thread-safety? What you have to keep in
> > mind is that having some parts of the library not thread-safe basically
> > just pushes the mutexing/locking to the calling applications.
>
> Not necessarily. That's what I suggested in my previous email : if the costs
> of
> the mutex are not too expensive for non-threaded usage, then the API could
> systematically return thread-safe versions, that are potentially wrapped by
> GDALDatasetThreadSafe
>
> So in the scope of my RFC, I am not certain how we could have a thread safe
> cache and a non-threadsafe cache in any simple manner. I know that you are
> specifically talking about the possiblity of thread safe datasets, which I
> feel is necessarily part of this discussion but wanted to separate the two.
> If the thread-safe cache is too expensive I feel like that is a major issue
> however, and I am doing my best to avoid any performance hits for this
> change.
>
>
> >
> > Also, while it is important to document thread-safety limitations, might
> > I suggest adding thread-safe related capabilities (TestCapability),
> > especially if all drivers do not end-up having the same thread-safety
> > constraints.
>
> That might be a solution, although not ideal from a usability point of view
> (if
> we come with something more complex that non-thread-safe vs thread-safe, that
> might be difficult to understand by users of GDAL API), and from a
> GDAL-developer point of view as well (need to assess thread-safety for each
> driver).
> There can have subtelties : imagine that the VRT driver is made to be thread
> safe, but uses sources of drivers that might be not thread safe...
>
> >
> > I personally do not see a GDALDatasetThreadSafe wrapper as adding much
> > complexity. For instance, if you were to add a capability that indicates
> > if a driver is inherently thread-safe, you could add a new open method
> > to open a dataset in a thread-safe way with something like the following
> > pseudocode:
> >
> > DataSet OpenThreadSafe(GDALOpenInfo openInfo)
> > {
> >      DataSet dataSet = Open(openInfo);
> >
> >      if (!dataSet.TestCapability(THREAD_SAFE))
> >      {
> >           dataSet = wrapWithThreadSafeWrapper(dataSet);
> >      }
> >
> >      return dataSet;
> > }
>
> Yes, that's similar to what I imagined with my idea of GDAL_OF_THREADSAFE
> open
> flag.
>
> On the topic of the thread safe wrapper, I have spent more time thinking
> about it and I think this is probably the best solution to the problem of
> making all Datasets read safe, and I am willing to champion another RFC to
> implement this. However, the scope of this is even larger because it should
> be required to work with the OGR datasets as well.
>
> Thanks,
>
> Blake
>
--
Spatialys - Geospatial professional services
http://www.spatialys.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20141201/5bca6e1a/attachment-0001.html>


More information about the gdal-dev mailing list