[gdal-dev] Call for review on RFC 101: Raster dataset read-only thread-safety

Sat Sep 7 07:49:16 PDT 2024

>
> For read-only purposes I would have thought we only had to briefly 
> seize a mutex when introducing new blocks into the cache, and I 
> thought there was already some sort of lock around that.

When talking about the block cache, one needs to distinguish:

- the global least-recently-used (LRU) list of GDALRasterBlock* 
pointers, used by all datasets, that has indeed a lock since it is used 
already in a multi-threaded context

- GDALRasterBlock which have an atomic counter aimed at coordinating 
this LRU list with the below mentionned per-band data structures, to 
avoid in particular eviction from the LRU due to the max size of the 
block cache being reached and dataset closing to badly interfere

- per-band data structures ( either as an array of array, or a hash-set 
(for very large datasets)) that own GDALRasterBlock* instances, Those 
don't do any locking around access to those structures. Well, actually, 
careful observers would object that I'm lying here. There is a "secret" 
mutex per GDALDataset* instance, used currently only for datasets opened 
in update mode and taken in the non-virtual RasterIO() method, that 
addresses complex cases where the LRU logic would try to evict from a 
thread T2 a "dirty" block, causing IWriteBlock() to be called, while a 
thread T1 would IReadBlock() on the same dataset. That mutex actually 
leads to "interesting" issues with deadlocks with the mutex of the 
global LRU, which lead to some subtelties in the GetLockedBlockRef() 
implementation (cf lines 1960-1983 of gdalrasterband.cpp) Which would 
lead to another interesting discussion on the fact that for scenarios 
where multi-threaded writing of *different dataset* the current 
architecture of a global LRU is inefficient and would need to probably 
be replaced by a per-dataset LRU.

I agree that the per-band data structures could be made thread-safe with 
careful use of a mutex. One tricky point would remain if 2 threads try 
to access at the same time a block that hasn't been loaded yet in the 
cache (although an algorithm fully optimized for threading should really 
try to avoid accessing the same block from multiple threads)

2 possible strategies:

- each thread actually IReadBlock() (assuming that its implementation is 
thread-safe) the block and at the time they call AdoptBlock(), insertion 
in the data structure is locked, and only the first "version" 
of GDALRasterBlock is kept, and the one that arrives later is deleted. 
But one should make sure there's no risk of use-after-free.

- a more complex mechanism is used so that it is "noted somewhere" that 
block (xoff, yoff) is in loading and that the thread that arrives second 
is paused upon completion of the loading. That solution is more 
respectful of total CPU power consumption, but maybe less efficient in 
term of throughput.

>
>>      - no duplication of file handles per-thread
>
>     I just don't see how to do that. Even a mutex around file
>     operations would not be sufficient since we'd want both
>     Seek()+Read() to be atomic per thread.
>
> I do agree that we would need a mutex containing both the seek and 
> read for a block read.  Also, on reflection I can see that this could 
> add a lot of condition as the mutex would be held for very long 
> periods of time and it would intefere with the ability to do parallel 
> reads on filesystems where that is possible (I have to get out of a 
> "file system is a single spinning disk" mindset).  So perhaps I'm not 
> so attached to the avoiding per-thread file descriptors though I'm not 
> sure per thread file descriptors would always be the right answer.

One indeed needs to have in mind also the case of /vsicurl/ and friends 
where read operations are very slow, so locking isn't acceptable. That 
said /vsicurl/ in recent GDAL versions has a PRead() implementation 
(which must be used with care, because currently it doesn't do any 
caching, and if you ask to read 1 byte, it will issue a HTTP GET Range 
request for 1 byte, contrary to standard Read() that reads by chunks of 
16 kB and cache them)

> ...
>
>     libtiff itself has a lot of state variables, in its core or in its
>     codecs (e.g. the deflate codec has to maintain a zlib handle as a
>     state variable if doing partial reads of a tile/strip).  Even
>     getting the location of a TIFF tile is not thread-safe when
>     defered byteount[]/offset[] array loading is enabled (which is
>     necessary when dealing with extremely large remote COGs) since it
>     involves file accesses.
>
> I can see the point though it is not clear this can't be addressed, 
> and that it wouldn't be a useful thing in libtiff.

Taking a step back, I can't think of any popular image library which 
fills a similar role to libtiff (that is a libpng, libjpeg, openjpeg, 
libwebp, libjxl, etc etc.) that is thread-safe. For most of them 
(libpng, libjpeg, libwebp), partly because they implement scanline based 
streamed reading logic, where the PNG/JPEG/etc. handle holds obviously a 
mutable state, by API design. libjxl returns a fully decoded image, but 
has somehow an incremental decoding API. openjpeg could be the one that 
is the closest to libtiff situation, where one - in theory - could 
decode different regions in parallel, but there's mutable state involved 
underneath (including a state machine).

I would expect great (and, for once, justified) resistance & scepticism 
from libtiff community members to add thread-safety into it. A 
thread-safe TIFF reader would be more a separate library, or at least a 
clearly identified separate component within libtiff with a specific 
namespace. (*)

Even

(*) It could be argued that coding it in C wouldn't be the best idea. 
I've spotted a TIFF Rust library 
(https://github.com/image-rs/image-tiff) but it is functionally limited 
(at least in terms of supported codecs. didn't check thourougly) and 
also associates the TIFF "object" with a file reader with read+seek 
capabilities, and thus methods like read_chunk() 
(https://github.com/image-rs/image-tiff/blob/6dc7a266d30291db1e706c8133357931f9e2a053/src/decoder/mod.rs#L1012) 
take a mutable self reference, making it incompatible of thread-safety. 
So probably a C (or C++) library written and designed with thread-safety 
in mind from the start would probably the most reasonable option for a 
GDAL use case.

-- 

http://www.spatialys.com
My software is free, but my time generally not.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20240907/6cba0655/attachment-0001.htm>