[gdal-dev] Help requested: Concurrent read of a GeoTiff

Even Rouault even.rouault at spatialys.com
Tue Mar 21 00:44:14 PDT 2017


On mardi 21 mars 2017 05:51:18 CET Grégory Bataille wrote:
> Hey Even,
> 
> reaching out again, because I just don't know...
> 
> so here is gdal2tiles_parallel strategy (as far as I understand it)
> - in each process, open the input file, compute the autowarped vrt (in the
> case of the dataset I use at least) loop over tiles to generate and work
> only depending on the process number (so that this work is made in
> parallel). We therefore end up with, in each process
>     - the input file opened as a dataset (but not touched after the
> autowarped vrt is generated)
>     - a different autowarped vrt in memory (for each process) but pointing
> to the same source (as far as I can see)
> 
> 
> My strategy:
> - open the input file.
> - generate the autowarped vrt and save it to disk
> - compute all the tile details and store them in a data structure.
> Then in each process:

I'm confused. You're talking about processes, but do you mean real operating system processes 
or threads? I guess the latter.

> - take one tile detail
> - open the vrt (same file for each process, pointing to the same source
> TIFF)
> - read the vrt and write the tile
> And that's the read that is failing as mentioned in my first email.
> 
> I can't see much difference. To try and be complete, I tried to (or think I
> tried to):
> - open the input file in each process
> - generate a different vrt file for each thread
> 
> I even copy/pasted the code that gdal2tiles_parallel uses in their
> processes, but with no success, always the same error. The
> VRT_SHARED_SOURCE option
> does not seem to change anything (but looking at the wiki, it's not clear
> whether it applies to autowarped vrt too)

VRT_SHARED_SOURCE indeed only applies to "regular" mosaicing VRTs, so not to warped 
VRTs.

A warped VRT opens the source dataset as a shared dataset in
https://github.com/OSGeo/gdal/blob/trunk/gdal/alg/gdalwarper.cpp#L1675
So you could have an issue if you open the VRT twice in the same thread, and then use each 
handle in a different thread, since both VRT handles would point to the same shared dataset. 
But it doesn't look like that's what you are doing.

As a workaround to investigate, you could try removing the "GDAL_OF_SHARED |" flag in the 
line I mentioned and see if that makes a difference (but we could potentially - not sure - 
have issues at closing time, since different call sites might assume that the dataset 
was opened with shared semantics).

What you could also do is run the script through your preferred debugger, set a breakpoint 
in GDALOpenEx() and look at the call sites. If you are using 2 threads, then for things 
to work correctly, GDALOpenEx() should be called twice on the TIFF dataset and return a 
different handle each time.
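
The "one handle per thread" rule above can be sketched in Python as follows. This is 
only an illustration under stated assumptions, not code from gdal2tiles or GDAL: 
PerThreadHandles, open_in_threads and FakeDataset are hypothetical names, and FakeDataset 
stands in for a real dataset so the sketch runs without GDAL installed. With the GDAL 
Python bindings, the opener passed in would presumably be gdal.Open, called from inside 
each worker thread.

```python
import threading


class PerThreadHandles:
    """Hypothetical helper: caches one handle per (thread, path).

    `opener` is any callable taking a path and returning a handle.
    With GDAL installed it could be osgeo.gdal.Open (assumption).
    """

    def __init__(self, opener):
        self._opener = opener
        self._local = threading.local()  # separate storage for each thread

    def get(self, path):
        cache = getattr(self._local, "cache", None)
        if cache is None:
            cache = self._local.cache = {}
        if path not in cache:
            # The open call happens in the thread that will use the handle.
            cache[path] = self._opener(path)
        return cache[path]


class FakeDataset:
    """Stand-in for a dataset handle, used only for this illustration."""

    def __init__(self, path):
        self.path = path


def open_in_threads(handles, path, n_threads=2):
    """Fetch a handle for `path` from n_threads worker threads."""
    results = []
    lock = threading.Lock()

    def worker():
        handle = handles.get(path)
        with lock:
            results.append(handle)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    got = open_in_threads(PerThreadHandles(FakeDataset), "warped.vrt")
    # Each thread should have received its own handle object.
    print(got[0] is not got[1])
```

The handle is opened lazily inside the worker rather than in the main thread and passed 
around, which matches the concern above about opening in one thread and using the handle 
in another.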

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com