[gdal-dev] Help requested: Concurrent read of a GeoTiff

Grégory Bataille gregory.bataille at gmail.com
Tue Mar 28 04:51:50 PDT 2017


Hey,

just to let you know, I found the issue.
I think it comes from the fact that I keep some handles open on the
original TIFF in the parent process. Even setting the variable to None
does not change anything, but I guess that's because the garbage collector
is asynchronous.
So if I run my initial code in a child process, return the data through a
pipe, and then spawn a pool of child processes doing tiles one at a time, I
don't get this concurrency issue.

Now I "just" need to write it properly which is going to take a while :)
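
A rough sketch of the structure described above, for illustration only. It
is not the code from this thread: tile_windows, plan_tiles, run_in_child,
render_tile and render_all are hypothetical names, the "tile details" are
reduced to plain pixel windows, and the worker only reads the window
instead of writing a tile. It assumes the GDAL Python bindings (osgeo.gdal)
are installed and that the diagnosis above is right, i.e. that the parent
must never hold a handle on the TIFF or the VRT.

```python
import multiprocessing as mp


def tile_windows(width, height, tile_size=256):
    """Split a width x height raster into (xoff, yoff, xsize, ysize) windows."""
    windows = []
    for yoff in range(0, height, tile_size):
        for xoff in range(0, width, tile_size):
            windows.append((xoff, yoff,
                            min(tile_size, width - xoff),
                            min(tile_size, height - yoff)))
    return windows


def plan_tiles(vrt_path, tile_size=256):
    """Planning step: open the dataset, read its size, return plain tuples."""
    from osgeo import gdal  # imported here so the parent never touches GDAL
    ds = gdal.Open(vrt_path)
    if ds is None:
        raise RuntimeError("could not open " + vrt_path)
    windows = tile_windows(ds.RasterXSize, ds.RasterYSize, tile_size)
    ds = None  # drop the handle; it is gone for sure once the child exits
    return windows


def _send_result(conn, func, args):
    """Child entry point: run func and push its result through the pipe."""
    try:
        conn.send(("ok", func(*args)))
    except Exception as exc:  # report the failure instead of leaving the parent waiting
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_in_child(func, *args):
    """Run func(*args) in a short-lived child process and return its result."""
    recv_end, send_end = mp.Pipe(duplex=False)
    proc = mp.Process(target=_send_result, args=(send_end, func, args))
    proc.start()
    send_end.close()  # the parent keeps only the receiving end
    status, payload = recv_end.recv()
    proc.join()
    if status != "ok":
        raise RuntimeError(payload)
    return payload


def render_tile(job):
    """Pool worker: open the VRT in this process and read one window."""
    from osgeo import gdal
    vrt_path, (xoff, yoff, xsize, ysize) = job
    ds = gdal.Open(vrt_path)
    if ds is None:
        raise RuntimeError("could not open " + vrt_path)
    data = ds.ReadRaster(xoff, yoff, xsize, ysize)
    ds = None
    return (xoff, yoff, len(data))


def render_all(vrt_path, processes=4):
    """Plan in one child, then handle tiles one at a time in a pool."""
    windows = run_in_child(plan_tiles, vrt_path)
    jobs = [(vrt_path, window) for window in windows]
    with mp.Pool(processes=processes) as pool:
        return pool.map(render_tile, jobs, chunksize=1)
```

On platforms where multiprocessing starts children with "spawn", render_all
has to be called from under an `if __name__ == "__main__":` guard. Only
plain tuples cross the pipe and the pool, never a dataset object.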

Cheers


---
Gregory Bataille

On Tue, Mar 21, 2017 at 9:38 AM, Grégory Bataille <
gregory.bataille at gmail.com> wrote:

> hum... debugging in the C layers, that'll be interesting...
> Ok, I'll continue to dig in, thanks
>
>
> ---
> Gregory Bataille
>
> On Tue, Mar 21, 2017 at 8:44 AM, Even Rouault <even.rouault at spatialys.com>
> wrote:
>
>> On mardi 21 mars 2017 05:51:18 CET Grégory Bataille wrote:
>>
>> > Hey Even,
>>
>> >
>>
>> > reaching out again, because I just don't know...
>>
>> >
>>
>> > so here is the gdal2tiles_parallel strategy (as far as I understand it)
>> > - in each process, open the input file, compute the autowarped vrt (in
>> >   the case of the dataset I use at least), then loop over the tiles to
>> >   generate, working only on those that match the process number (so
>> >   that this work is done in parallel). We therefore end up with, in
>> >   each process:
>> >   - the input file opened as a dataset (but not touched after the
>> >     autowarped vrt is generated)
>> >   - a different autowarped vrt in memory (for each process) but pointing
>> >     to the same source (as far as I can see)
>>
>> >
>>
>> >
>>
>> > My strategy:
>> > - open the input file.
>> > - generate the autowarped vrt and save it to disk
>> > - compute all the tile details and store them in a data structure.
>> > Then in each process:
>>
>>
>>
>> I'm confused. You're talking about processes, but is it a real operating
>> system process or a thread? I guess the latter.
>>
>>
>>
>> > - take one tile detail
>> > - open the vrt (same file for each process, pointing to the same source
>> >   TIFF)
>> > - read the vrt and write the tile
>> > And it is that read that is failing, as mentioned in my first email.
>>
>> >
>>
>> > I can't see much difference. To try and be complete, I tried (or think
>> > I tried) to:
>> > - open the input file in each process
>> > - generate a different vrt file for each thread
>> >
>> > I even copy/pasted the code that gdal2tiles_parallel uses in its
>> > processes, but with no success: always the same error. The
>> > VRT_SHARED_SOURCE option does not seem to change anything (but looking
>> > at the wiki, it's not clear whether it applies to autowarped vrt too)
>>
>>
>>
>> VRT_SHARED_SOURCE indeed only works for "regular" mosaicing VRTs, so not
>> for warped VRTs.
>>
>>
>>
>> Warped VRT opens the source dataset as a shared dataset in
>>
>> https://github.com/OSGeo/gdal/blob/trunk/gdal/alg/gdalwarper.cpp#L1675
>>
>> So you could have an issue if you open the VRT twice in the same thread,
>> and then use each handle in a different thread, since both VRT handles
>> would point to the same shared dataset. But it doesn't look like that's
>> what you are doing.
>>
>>
>>
>> As a workaround to investigate, you could try removing the
>> GDAL_OF_SHARED | flag in the line I mentioned and see if that makes a
>> difference (but we could potentially - not sure - have issues at closing
>> time, since different call sites might assume that the dataset was
>> opened with shared semantics)
>>
>>
>>
>> What you could do too is run the script through your preferred debugger,
>> set a breakpoint in GDALOpenEx, and look at the calling sites. If you
>> are using 2 threads, then for things to work correctly, GDALOpenEx()
>> should be called twice on the TIFF dataset and return a different handle
>> each time.
>>
>>
>>
>> Even
>>
>>
>>
>> --
>>
>> Spatialys - Geospatial professional services
>>
>> http://www.spatialys.com
>>
>
>