[gdal-dev] RFC 47 and Threading

Andre Vautour andre.vautour at caris.com
Mon Aug 25 05:14:21 PDT 2014


I want to start off by saying a big thanks to Blake for taking his time 
to tackle what can only be a very difficult problem.

 From what I can observe, the current discussion seems to be around the 
boundary of who should be responsible for ensuring thread safety around 
the block cache. The core of GDAL versus the individual drivers. While I 
see why such a conversation is important, as far as I am concerned, the 
most important part should be how it affects users of GDAL at the 
interface level. That is, if an application that is threaded is trying 
to use GDAL, how does it ensure thread-safety? What you have to keep in 
mind is that having some parts of the library not thread-safe basically 
just pushes the mutexing/locking to the calling applications.

Also, while it is important to document thread-safety limitations, might 
I suggest adding thread-safe related capabilities (TestCapability), 
especially if all drivers do not end-up having the same thread-safety 
constraints.

I personally do not see a GDALDatasetThreadSafe wrapper as adding much 
complexity. For instance, if you were to add a capability that indicates 
if a driver is inherently thread-safe, you could add a new open method 
to open a dataset in a thread-safe way with something like the following 
pseudocode:

DataSet OpenThreadSafe(GDALOpenInfo openInfo)
{
     DataSet dataSet = Open(openInfo);

     if (!dataSet.TestCapability(THREAD_SAFE))
     {
          dataSet = wrapWithThreadSafeWrapper(dataSet);
     }

     return dataSet;
}

Kind Regards,
André

On 2014-08-22 17:12, Even Rouault wrote:
> Le vendredi 22 août 2014 21:50:56, Blake Thompson a écrit :
>> Even,
>>
>> On Fri, Aug 22, 2014 at 1:33 PM, Even Rouault <even.rouault at spatialys.com>
>>
>> wrote:
>>> Note: after re-reading, I realize that I misread your above sentence as
>>> "it will work to translate multiple datasets in a parallel way with a
>>> *per- dataset* cache").
>>>
>>> So, even if you didn't write it, I'm afraid that people will assume that
>>> calling CreateCopy() on the same source dataset handle would be
>>> thread-safe (imagine that one thread translates to format F1, while the
>>> other one to format F2), whereas in the current state of the RFC it is
>>> not. Because CreateCopy() will call GetGeoTransform(),
>>> GetProjectionRef() etc which are generally thread unsafe. For example
>>> GTiff has a lazy loading approach for those 2 methods, and it is not the
>>> only one.
>> If we claim thread-safety, we should likely offer full thread-safety (at
>>
>>> least
>>> in reading scenarios), not partial one. Otherwise I'm afraid no one but
>>> the few people that have taken part to that discussion or read the RFC
>>> will know
>>> the limits.
>> Perhaps the goal of the RFC should be to allow CreateCopy to be threadsafe?
>> I have no tried yet to venture into other parts of thread safety on the
>> dataset (already bit off a very large chunk).
>>
>>> I'm wondering if it wouldn't be worth having GDALDatasetThreadSafe and
>>> GDALRasterBandThreadSafe classes (or whatever name is appropriate), that
>>> would
>>> follow the decorator pattern, i.e. they will own a thread unsafe "real"
>>> dataset/band and override the methods to lock them. Similarly to what I
>>> have
>>> done in OGR with ogr/ogrsf_frmts/generic/ogrmutexeddatasource.cpp and
>>> ogrmutexedlayer.cpp, needed for the FGDB driver (the one that depends on
>>> the
>>> ESRI SDK).
>>>
>>> My idea would be to have an open flag GDAL_OF_THREADSAFE for
>>> GDALOpenEx(). When set, GDALOpen() would do the usual job and get a (in
>>> most cases) unsafe
>>> dataset object. Then it would query a virtual method of the dataset to
>>> return
>>> a thread-safe version of it (GetThreadSafe()).
>>> - If not defined by the driver, the base implementation of
>>> GetThreadSafe() would return the dataset wrapped in
>>> GDALDatasetThreadSafe
>>> - If the implementation of the dataset is already thread-safe, it's
>>> GetThreadSafe() would return just self
>>> - For the in-between situations, it could for example override
>>> GDALDatasetThreadSafe/GDALRasterBandThreadSafe base implementation to
>>> specialize the methods that don't need locking.
>>>
>>> This idea might not work (and I've not though how it would combine with
>>> my previous IReadBlock_thread_safe/IReadBlock approach). I have just
>>> written it
>>> as it came to my mind. The main motivation is to make it easy to have
>>> thread-
>>> safe versions by default, without the driver having to care about that,
>>> while
>>> being flexible to make drivers that need finer control to do it.
>> This approach worries me, there is already a lot of inheritance of the
>> GDALDataset, its really hard to learn already.
> Well, the learning phase would be only be people who want to tweak thread
> safety of their drivers, i.e. not a lot of people I bet. The user wouldn't
> have to pay a lot (just specifying GDAL_OF_THREADSAFE, or nothing at all if
> the price of thread safety is not too big)
>
>> Adding another layer would
>> make it even harder to maintain in my opinion. It also provides one more
>> config option in an already overwhelming number of config options. I think
>> we might be better off if we made a long term plan on how we wanted to make
>> GDAL thread safe. Perhaps make a list of parts needing thread safety and
>> progress towards it over a period of time. As part of it maintain a webpage
>> that describes what is thread safe currently, so people are not confused on
>> what is and is not thread safe.
> People are already confused... and have a hard-time really understanding the
> implication of the few lines at
> http://trac.osgeo.org/gdal/wiki/FAQMiscellaneous#IstheGDALlibrarythread-safe
>
> I'm afraid more documentation of a more complex situation will not make it
> better. It is not reasonable to have to document that "GetProjectionRef() is
> thread-safe in drivers X and Y, whereas GetGeoTransform() is in Y and Z".
> Because that's the situation we will have for sure if we don't adopt a global
> default mechanism (unless someone spends weeks in adding locks in every
> virtual method of every driver).
>
> The external locking I suggested is not completely a new approach. I could
> mention  Collections.synchronizedList() used in Java to make their List
> implementations thread safe... Sorry for mentionning Java ;-)
>
> The ideal would be to have a solution were by default we have a guarantee
> thread-safety and people (GDAL driver developers) who want to tune it can do
> it if they are brave enough.
>
>>> If, through benchmark, we determine that the cost of the thread safe
>>> version
>>> is neglectable, then GDAL_OF_THREADSAFE might be useless, and GDALOpen()
>>> would
>>> always return the thread-safe version.
>> I wish there were lots of benchmarks in the test suite (if there are, I
>> don't know where they are),
> They don't yet exist. Most code is functionnal code in Python, so not
> appropriate for performance testing, especially regarding threading.
> The autotest/cpp should likely be enhanced to have such tests (the only
> performance benchmark I can think of is testperfcopywords that was created to
> justify the templated approach of GDALCopyWords)
>
> To be mentionned, there's also apps/multireadtest.cpp (not compiled by
> default). It just ensures that opening the same file in N dataset objects and
> reading them in N threads works.
>
>> so that we can even roughly measure changes to
>> the internals of the library. (I realize this is something that could be
>> difficult to do well).
>>
>> Thanks,
>>
>> Blake

-- 
*André Vautour*, B.Sc.
Application Developer
CARIS <http://www.caris.com>
115 Waggoners Lane
Fredericton, New Brunswick
Canada    E3B 2L4
Tel: +1.506.458.8533     Fax: +1.506.459.3849
www.caris.com <http://www.caris.com>

*Connect with CARIS*
Twitter <http://www.twitter.com/CARIS_GIS> LinkedIn 
<http://www.linkedin.com/groups?mostPopular=&gid=3217878> FaceBook 
<http://www.facebook.com/pages/CARIS-The-Marine-GIS-Experts/123907500987669?v=app_4949752878> 
YouTube <http://www.youtube.com/user/CARISGIS>

Download your free copy of CARIS Easy View today!
www.caris.com/easyview <http://www.caris.com/easyview>

_________________________________________________________________________
This email and any files transmitted with it are confidential and 
intended only for the addressee(s). If you are not the intended 
recipient(s) please notify us by email reply. You should not use, 
disclose, distribute or copy this communication if received in error.

Any views or opinions expressed in this email are solely those of the 
author and do not necessarily represent those of the company. No binding 
contract will result from this email until such time as a written 
document is signed on behalf of the company.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20140825/d058810c/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: caris.png
Type: image/png
Size: 3859 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20140825/d058810c/attachment-0005.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: twitter.png
Type: image/png
Size: 1208 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20140825/d058810c/attachment-0006.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: linkedin.png
Type: image/png
Size: 1262 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20140825/d058810c/attachment-0007.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: facebook.png
Type: image/png
Size: 811 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20140825/d058810c/attachment-0008.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: youtube.png
Type: image/png
Size: 1299 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20140825/d058810c/attachment-0009.png>


More information about the gdal-dev mailing list