<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Andrew,</p>
<p>what would be the purpose of thread-safe access: (1) just making it
thread-safe, with no particular requirement on efficiency, or (2)
true concurrent access, ideally with close to linear scalability in
the number of threads?</p>
<p>If (1), then we could add a GDALMutexedDataset class, similar to
<a class="moz-txt-link-freetext" href="https://github.com/OSGeo/gdal/blob/master/ogr/ogrsf_frmts/generic/ogrmutexeddatasource.h">https://github.com/OSGeo/gdal/blob/master/ogr/ogrsf_frmts/generic/ogrmutexeddatasource.h</a>
which exists on the vector side (used only by the FileGDB driver,
because the underlying SDK is not even re-entrant). It uses the
decorator pattern around all public API entry points to call the
underlying dataset under a mutex. One could imagine a
GDAL_OF_THREADSAFE open flag that GDALOpen() would use to return
such an instance. It shouldn't be too hard to implement, but it is
probably not that useful IMHO: I anticipate most users would have
higher expectations than a mutex-based implementation.<br>
</p>
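<p>To make option (1) concrete, here is a minimal, self-contained
sketch of the decorator idea. It assumes nothing about the real GDAL
classes: RasterSource, CountingSource, MutexedRasterSource and
CountMutexedReads are hypothetical names made up for this
illustration, and a real GDALMutexedDataset would have to wrap every
public entry point of the dataset and its bands, not a single
method.</p>
<pre>
```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a dataset's public API (not a GDAL class).
class RasterSource {
public:
    virtual ~RasterSource() = default;
    virtual int ReadPixel(int x, int y) = 0;
};

// A source with unsynchronized mutable state, as most drivers have.
class CountingSource : public RasterSource {
public:
    int ReadPixel(int x, int y) override {
        ++reads_;  // would be a data race if called concurrently without a lock
        return x + y;
    }
    long reads() const { return reads_; }
private:
    long reads_ = 0;
};

// Decorator: forwards each public entry point under a single mutex.
class MutexedRasterSource : public RasterSource {
public:
    explicit MutexedRasterSource(RasterSource& inner) : inner_(inner) {}
    int ReadPixel(int x, int y) override {
        std::lock_guard<std::mutex> lock(mutex_);
        return inner_.ReadPixel(x, y);
    }
private:
    RasterSource& inner_;
    std::mutex mutex_;
};

// Hammers the decorated source from several threads and returns how
// many reads the wrapped source observed.
long CountMutexedReads(int threadCount, int readsPerThread) {
    CountingSource source;
    MutexedRasterSource safe(source);
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t) {
        workers.emplace_back([&safe, readsPerThread] {
            for (int i = 0; i < readsPerThread; ++i) {
                safe.ReadPixel(i, i);
            }
        });
    }
    for (auto& worker : workers) worker.join();
    return source.reads();  // safe to read: all workers have been joined
}
```
</pre>
<p>Every call is serialized on one mutex, so this is correct but
cannot scale with the number of threads, which is exactly the
limitation of (1).</p>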
<p>If (2), it seems to me that it would require a huge effort, and
the programming language we use (C++) offers hardly any safety
net to make sure we don't make mistakes, the main ones being
forgetting to lock things that should be locked, and deadlocks.
If we go down that route, I'm not even sure how we could reliably
identify all parts of the code that must be modified.<br>
</p>
<p>Neither the GDAL raster core nor any driver is designed to be
thread-safe. In the core, at least gcore/gdalarraybandblockcache.cpp
and gcore/gdalhashsetbandblockcache.cpp, which interact with the
block cache, would have to be made thread-safe, and "just" adding a
lock would defeat the aim of linear scalability. The change in
GDALDataset::RasterIO() I made in
<a class="moz-txt-link-freetext" href="https://github.com/OSGeo/gdal/commit/7f3a0e582eb189744bc7cb8e4a751135edaecaf5">https://github.com/OSGeo/gdal/commit/7f3a0e582eb189744bc7cb8e4a751135edaecaf5</a>
isn't thread-safe either (though it would be easy to make it
thread-safe).</p>
<p>Once the GDAL raster core is ready, the main challenge is making
the drivers themselves thread-safe. Raster drivers may read directly
from a VSILFILE* handle, which isn't thread-safe when using the
standard Seek() + Read() pair (a few VSIVirtualFileSystem
implementations have a PRead() implementation, which is
thread-safe, but not all). Or they rely on an instance of a
"reader" returned by a third-party library (libtiff, libjpeg,
libpng, sqlite3, etc.), which in most cases also uses a VSILFILE*.
None of those are thread-safe, except sqlite3, which can be made
thread-safe by passing a flag at sqlite3_open() time; that
basically applies strategy (1) by protecting all calls with a
mutex. Perhaps using thread-specific instances of VSILFILE* and of
third-party "reader" objects could be a way of solving this. But
realistically, doing a pass over all GDAL drivers would be a
multi-person-month to multi-person-year effort. A realistic plan
should allow combining (1) and (2): (2) for a few select drivers,
and (1) as a fallback for the many drivers that wouldn't be
updated.<br>
</p>
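<p>A rough sketch of why Seek() + Read() on a shared handle is the
problem, and why positional reads or thread-specific handles avoid
it. This does not use the VSI API: ByteStore, CursorHandle and
ReadWithPerThreadHandles are hypothetical names, and an in-memory
buffer stands in for a file.</p>
<pre>
```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical immutable byte store standing in for a file's contents.
class ByteStore {
public:
    explicit ByteStore(std::vector<unsigned char> bytes) : bytes_(std::move(bytes)) {}
    // Positional read (PRead-like): no cursor is involved, so concurrent
    // calls share no mutable state.
    std::size_t ReadAt(std::size_t offset, unsigned char* out, std::size_t count) const {
        if (offset >= bytes_.size()) return 0;
        const std::size_t n = std::min(count, bytes_.size() - offset);
        std::memcpy(out, bytes_.data() + offset, n);
        return n;
    }
private:
    std::vector<unsigned char> bytes_;
};

// Cursor-based handle: Seek() and Read() both mutate offset_, so a single
// instance must not be shared between threads without a lock.
class CursorHandle {
public:
    explicit CursorHandle(const ByteStore& store) : store_(store) {}
    void Seek(std::size_t offset) { offset_ = offset; }
    std::size_t Read(unsigned char* out, std::size_t count) {
        const std::size_t n = store_.ReadAt(offset_, out, count);
        offset_ += n;
        return n;
    }
private:
    const ByteStore& store_;
    std::size_t offset_ = 0;
};

// Each worker creates its own handle over the shared store. Returns the
// number of workers whose reads all returned the expected bytes.
int ReadWithPerThreadHandles(int threadCount) {
    std::vector<unsigned char> bytes(256);
    for (std::size_t i = 0; i < bytes.size(); ++i) bytes[i] = static_cast<unsigned char>(i);
    const ByteStore store(std::move(bytes));

    std::vector<int> ok(threadCount, 0);  // one slot per worker, no sharing
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t) {
        workers.emplace_back([&store, &ok, t] {
            CursorHandle handle(store);  // thread-specific cursor
            unsigned char buf[4] = {};
            bool good = true;
            for (int iter = 0; iter < 1000 && good; ++iter) {
                const std::size_t offset = static_cast<std::size_t>((t * 16 + iter) % 252);
                handle.Seek(offset);
                good = handle.Read(buf, 4) == 4 &&
                       buf[0] == static_cast<unsigned char>(offset) &&
                       buf[3] == static_cast<unsigned char>(offset + 3);
            }
            ok[t] = good ? 1 : 0;
        });
    }
    for (auto& worker : workers) worker.join();
    int total = 0;
    for (int v : ok) total += v;
    return total;
}
```
</pre>
<p>The cost of this approach is one handle (and, in a real driver, one
third-party "reader" object) per thread, which is part of why the
per-driver work is significant.</p>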
<p>Even<br>
</p>
<div class="moz-cite-prefix">On 03/06/2024 at 15:44, Andrew Bell via
gdal-dev wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CACJ51z3+bGgGxD02NtdsKP7wJRp47RPXR9tZZtt7JZ9c2yth-w@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">Hi,
<div><br>
</div>
<div>I am aware that there isn't thread-safe raster access with
the current GDAL interface for various reasons. Given the
state of processors, I was wondering if it would be valuable
to take a look at providing the ability to do Raster I/O (at
least reads) in a thread-safe way. This could be done through
a new set of API calls or perhaps by modifications to what
currently exists -- I don't know what makes sense at this
point. I would be happy to spend some time looking at this if
there is interest, but I would also like to learn from
existing experience as to what kinds of things that I'm surely
not considering would have to be dealt with.<br>
<br>
Thanks,<br>
<div><br>
</div>
<span class="gmail_signature_prefix">-- </span><br>
<div dir="ltr" class="gmail_signature"
data-smartmail="gmail_signature">Andrew Bell<br>
<a href="mailto:andrew.bell.ia@gmail.com" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">andrew.bell.ia@gmail.com</a></div>
</div>
</div>
<br>
</blockquote>
<pre class="moz-signature" cols="72">--
<a class="moz-txt-link-freetext" href="http://www.spatialys.com">http://www.spatialys.com</a>
My software is free, but my time generally not.</pre>
</body>
</html>