[gdal-dev] RFC 47 and Threading

Even Rouault even.rouault at spatialys.com
Fri Nov 28 09:49:28 PST 2014


Jacek,

Did you identify which call sites triggered hCOAMutex to be taken ? Ideally we
should try to minimize them.
For the sake of my curiosity, which method did you try to evaluate time spend
in mutex sleeping ?
Another improvement would be that instead of dynamically created mutexes, we
could use statically created mutexes ( PTHREAD_RECURSIVE_MUTEX_INITIALIZER ).
That would avoid that hCOAMutex to become the "BGL" (Big GDAL Lock :
http://en.wikipedia.org/wiki/Giant_lock)
Building GDAL with MUTEX_NONE and using it in a multi-threaded case is a
potential risky game. This might work for some particular workloads & drivers,
but not in the general case. For example even the GTiff driver needs a mutex to
initialize libtiff/libgeotiff. See GTiffOneTimeInit() and there might be more
subtle stuff elsewhere.
But yes, I agree that an open flag to indicate per-dataset cache is an
interesting track to explore.

Even

> Blake, Even,
> I played a bit with pull request #39. I enabled per dataset cache by setting
> GDAL_DATASET_CACHING to YES. Awesome work!
> Our application code opens file in GDAL and holds GDALDataset per thread.
> Different threads can request same blocks but because cache should be per
> dataset i expected to see very little contention at the expense of some of
> the blocks being read multiple times.
>
> To my surprise i saw that nothing changed in terms of time spent sleeping.
> After some investigation i realized that the contention point is static mutex
> hCOAMutex. Having noticed that i saw MUTEX_NONE ifdef there so i decided to
> define it to effectively get rid of mutexes in GDAL.
> Guess what happened after recompilation. Because we only ever access
> GDALDataset from single dataset and the new GDALBlockManager is now per
> GDALDataset everything so far* (with the exception of PAMDataset destruction
> because it does not like null mutexes) works fine and scales so much better
> with the number of threads.
>
> *Tested reading only using GTiff driver 4 threads reading tiles smaller than
> blocks, sometimes two threads can read data from the same block .
>
> Do you see the same problem with hCOAMutex?
> It would be good if i could say: give me per dataset cache, open this dataset
> with no locking. Unless the global cache provides its operations with less
> sleeping.
>
> Regards.
> Jacek Tomaka
> ________________________________
> Od: gdal-dev-bounces at lists.osgeo.org [gdal-dev-bounces at lists.osgeo.org] w
> imieniu Blake Thompson [flippmoke at gmail.com]
> Wys³ano: 29 sierpnia 2014 02:35
> Do: Even Rouault
> DW: gdal-dev at lists.osgeo.org
> Temat: Re: [gdal-dev] RFC 47 and Threading
>
> Even and Andre,
>
> > I want to start off by saying a big thanks to Blake for taking his time
> > to tackle what can only be a very difficult problem.
> >  From what I can observe, the current discussion seems to be around the
> > boundary of who should be responsible for ensuring thread safety around
> > the block cache. The core of GDAL versus the individual drivers.
>
> The core will necessarily have to know about thread-safety because the block
> cache is there. The discussion is more whether the drivers must also
> necessary
> be thread aware, or if core mechanisms are sufficient to hide this detail to
> the
> drivers. And potentially offering to drivers a mechanism to deal themselves
> with
> thread-safety if they can have a more optimized implementation than the
> default
> one.
>
> Agreed, most specifically how to hide the detail of the protection of the
> pointer to the block cache's data that is passed through IReadBlock,
> IWriteBlock, and IRasterIO.
>
>
> > While I
> > see why such a conversation is important, as far as I am concerned, the
> > most important part should be how it affects users of GDAL at the
> > interface level.
>
> Agreed.
>
> > That is, if an application that is threaded is trying
> > to use GDAL, how does it ensure thread-safety? What you have to keep in
> > mind is that having some parts of the library not thread-safe basically
> > just pushes the mutexing/locking to the calling applications.
>
> Not necessarily. That's what I suggested in my previous email : if the costs
> of
> the mutex are not too expensive for non-threaded usage, then the API could
> systematically return thread-safe versions, that are potentially wrapped by
> GDALDatasetThreadSafe
>
> So in the scope of my RFC, I am not certain how we could have a thread safe
> cache and a non-threadsafe cache in any simple manner. I know that you are
> specifically talking about the possiblity of thread safe datasets, which I
> feel is necessarily part of this discussion but wanted to separate the two.
> If the thread-safe cache is too expensive I feel like that is a major issue
> however, and I am doing my best to avoid any performance hits for this
> change.
>
>
> >
> > Also, while it is important to document thread-safety limitations, might
> > I suggest adding thread-safe related capabilities (TestCapability),
> > especially if all drivers do not end-up having the same thread-safety
> > constraints.
>
> That might be a solution, although not ideal from a usability point of view
> (if
> we come with something more complex that non-thread-safe vs thread-safe, that
> might be difficult to understand by users of GDAL API), and from a
> GDAL-developer point of view as well (need to assess thread-safety for each
> driver).
> There can have subtelties : imagine that the VRT driver is made to be thread
> safe, but uses sources of drivers that might be not thread safe...
>
> >
> > I personally do not see a GDALDatasetThreadSafe wrapper as adding much
> > complexity. For instance, if you were to add a capability that indicates
> > if a driver is inherently thread-safe, you could add a new open method
> > to open a dataset in a thread-safe way with something like the following
> > pseudocode:
> >
> > DataSet OpenThreadSafe(GDALOpenInfo openInfo)
> > {
> >      DataSet dataSet = Open(openInfo);
> >
> >      if (!dataSet.TestCapability(THREAD_SAFE))
> >      {
> >           dataSet = wrapWithThreadSafeWrapper(dataSet);
> >      }
> >
> >      return dataSet;
> > }
>
> Yes, that's similar to what I imagined with my idea of GDAL_OF_THREADSAFE
> open
> flag.
>
> On the topic of the thread safe wrapper, I have spent more time thinking
> about it and I think this is probably the best solution to the problem of
> making all Datasets read safe, and I am willing to champion another RFC to
> implement this. However, the scope of this is even larger because it should
> be required to work with the OGR datasets as well.
>
> Thanks,
>
> Blake
>


-- 
Spatialys - Geospatial professional services
http://www.spatialys.com


More information about the gdal-dev mailing list