[gdal-dev] Race condition between forked processes with opened Tiff dataset on Linux

Jiri Drbalek jiri.drbalek at gmail.com
Thu Dec 14 08:49:49 PST 2017


Hello.

If a Linux process with an open TIFF dataset is forked, the forked
processes cannot read from the dataset concurrently. They inherit the
same file descriptor, so the file offset and other attributes of the
open TIFF file are shared between them.
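
To illustrate the shared offset outside of GDAL, here is a minimal sketch
in Python using a plain temporary file. The function name is made up for
this example, and it assumes a POSIX system where os.fork is available.
The child reads 4 bytes through the inherited descriptor, and the parent
then finds that its own offset has moved.

```python
import os
import tempfile


def child_read_moves_parent_offset():
    """Return the parent's file offset after a forked child reads 4 bytes."""
    with tempfile.TemporaryFile() as f:
        f.write(b"0123456789")
        f.flush()
        fd = f.fileno()
        os.lseek(fd, 0, os.SEEK_SET)  # start both processes at offset 0

        pid = os.fork()
        if pid == 0:
            # Child: read through the inherited descriptor, then exit
            # immediately without running Python cleanup handlers.
            os.read(fd, 4)
            os._exit(0)

        os.waitpid(pid, 0)
        # Parent: it never read anything itself, but the offset is shared
        # with the child, so it reports the child's position.
        return os.lseek(fd, 0, os.SEEK_CUR)


if __name__ == "__main__":
    print(child_read_moves_parent_offset())
```

With two processes seeking and reading at the same time, each one can
end up reading from a position the other one has just set.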

One solution would be to serialize calls to GDAL, but this obviously
defeats the purpose of multiprocessing.

Another solution would be to open the dataset separately in each process,
but this is also undesirable. An open TIFF allocates memory for the list
of tile or strip offsets and sizes. This metadata can take hundreds of
megabytes for a large TIFF file, and even more when several such files
are open. Forking therefore saves a lot of memory, because this metadata
is shared with the parent process.
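
As a rough back-of-the-envelope check of the "hundreds of megabytes"
figure, the sketch below estimates the size of the tile index. The image
and tile dimensions are hypothetical, and the 8 bytes per offset plus
8 bytes per size is an assumption (64-bit entries, as in BigTIFF); the
actual in-memory layout in GDAL/libtiff may differ.

```python
import math


def tile_index_bytes(width, height, tile_w, tile_h, bytes_per_entry=16):
    """Estimate memory for per-tile offsets and sizes of one image plane.

    bytes_per_entry=16 assumes one 8-byte offset and one 8-byte byte
    count per tile (an assumption, not taken from GDAL itself).
    """
    tiles_across = math.ceil(width / tile_w)
    tiles_down = math.ceil(height / tile_h)
    return tiles_across * tiles_down * bytes_per_entry


if __name__ == "__main__":
    # Hypothetical 1,000,000 x 1,000,000 pixel image with 256 x 256 tiles.
    size = tile_index_bytes(1_000_000, 1_000_000, 256, 256)
    print(f"{size / 2**20:.1f} MiB")
```

Under these assumptions the index alone is a couple of hundred megabytes
per process if every process opens the file on its own.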

I've made a patch which optionally closes the underlying TIFF file once a
dataset is opened. One can then fork safely, because the underlying file
is lazily opened again in each subprocess.
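
The idea of closing before the fork and reopening lazily can be sketched
in Python as follows. This is not the patch itself: the LazyFile class and
its methods are hypothetical names for this example, and it works on a
plain file rather than a GDAL dataset. Each process opens its own
descriptor on first use, so each one gets a private file offset.

```python
import os


class LazyFile:
    """Hypothetical wrapper that opens its file on first use per process."""

    def __init__(self, path):
        self.path = path
        self._fd = None
        self._pid = None

    def _ensure_open(self):
        # Open if there is no descriptor yet, or if the stored descriptor
        # was inherited from another process (PID changed after a fork).
        if self._fd is None or self._pid != os.getpid():
            self._fd = os.open(self.path, os.O_RDONLY)
            self._pid = os.getpid()
        return self._fd

    def read_at(self, offset, size):
        """Seek and read using this process's own descriptor."""
        fd = self._ensure_open()
        os.lseek(fd, offset, os.SEEK_SET)
        return os.read(fd, size)

    def close(self):
        """Close the descriptor; the next read_at() reopens the file."""
        if self._fd is not None and self._pid == os.getpid():
            os.close(self._fd)
        self._fd = None
        self._pid = None
```

Calling close() before os.fork() mirrors what the patch does: no
descriptor exists at fork time, so nothing is shared, while any metadata
already held in memory still is.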

What do you think about this problem and the proposed solutions? Is there
a more elegant solution?

Here are two variants of the patch:
https://github.com/mapycz/gdal/commit/2bc9227ab656ab7587a4fa7f6d9b6cc1e4b761af
https://github.com/mapycz/gdal/commit/92823746743966459f3a2b3940e3713bad31733a

Thank you for any help.

Jiri