[gdal-dev] Help requested: Concurrent read of a GeoTiff

Grégory Bataille gregory.bataille at gmail.com
Mon Mar 20 21:51:18 PDT 2017


Hey Even,

Reaching out again, because I just don't know what else to try...

Here is the gdal2tiles_parallel strategy (as far as I understand it):
- in each process, open the input file, compute the autowarped VRT (in the
case of the dataset I use, at least), then loop over the tiles to generate
and only work on those assigned to the current process number (so that the
work is done in parallel). We therefore end up with, in each process:
    - the input file opened as a dataset (but not touched after the
autowarped VRT is generated)
    - a different autowarped VRT in memory (one per process), all pointing
to the same source (as far as I can see)
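
To make that description concrete, here is a rough sketch of what one such process might look like. It is not the actual gdal2tiles_parallel code: the function names, the (xoff, yoff, xsize, ysize) tile windows, the `dst_wkt` parameter and the "every N-th tile" assignment are all assumptions made for illustration.

```python
def tiles_for_process(tiles, process_index, process_count):
    """Return the tiles one process handles: every process_count-th tile,
    starting at process_index (assumed assignment scheme)."""
    return tiles[process_index::process_count]


def worker(input_path, dst_wkt, tiles, process_index, process_count):
    """Hypothetical per-process body: own dataset, own in-memory warped VRT."""
    # Imported here so that each process sets up GDAL on its own.
    from osgeo import gdal

    # Each process opens the input file itself...
    src_ds = gdal.Open(input_path, gdal.GA_ReadOnly)
    # ...and builds its own autowarped VRT in memory from it.
    vrt_ds = gdal.AutoCreateWarpedVRT(src_ds, None, dst_wkt)

    for xoff, yoff, xsize, ysize in tiles_for_process(
            tiles, process_index, process_count):
        data = vrt_ds.ReadRaster(xoff, yoff, xsize, ysize)
        # Writing `data` out as a tile is omitted from this sketch.
```

With 3 processes and 10 tiles, process 1 would handle tiles 1, 4 and 7, and no tile is handled twice.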


My strategy:
- open the input file.
- generate the autowarped vrt and save it to disk
- compute all the tile details and store them in a data structure.
Then in each process:
- take one tile detail
- open the vrt (same file for each process, pointing to the same source
TIFF)
- read the vrt and write the tile
It is this read that fails, as mentioned in my first email.
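
Roughly, the steps above look like the following sketch. It is a simplification and not my real code: `compute_tile_details`, `render_tile` and `run` are hypothetical names, the tile details are reduced to plain pixel windows, and the VRT is assumed to be already saved to disk at `vrt_path`.

```python
from multiprocessing import Pool


def compute_tile_details(width, height, tile_size=256):
    """List (xoff, yoff, xsize, ysize) windows covering a width x height raster."""
    details = []
    for yoff in range(0, height, tile_size):
        for xoff in range(0, width, tile_size):
            details.append((xoff, yoff,
                            min(tile_size, width - xoff),
                            min(tile_size, height - yoff)))
    return details


def render_tile(job):
    """Worker: open the shared VRT file, read one window, return its byte count."""
    vrt_path, (xoff, yoff, xsize, ysize) = job
    # Imported in the worker so each process sets up GDAL on its own.
    from osgeo import gdal

    ds = gdal.Open(vrt_path, gdal.GA_ReadOnly)  # one dataset object per job
    data = ds.ReadRaster(xoff, yoff, xsize, ysize)
    ds = None  # close the dataset; writing the tile is omitted here
    return len(data)


def run(vrt_path, width, height, processes=4):
    """Compute all tile details up front, then hand them out to the pool."""
    jobs = [(vrt_path, d) for d in compute_tile_details(width, height)]
    with Pool(processes) as pool:
        return pool.map(render_tile, jobs)
```

Every job opens its own dataset object, so no dataset is shared between workers, yet all of them end up reading the same source TIFF through the same VRT file.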

I can't see much difference between the two. To be thorough, I also tried
(or at least think I tried) to:
- open the input file in each process
- generate a different vrt file for each thread

I even copy/pasted the code that gdal2tiles_parallel runs in its
processes, but with no success: always the same error. The
VRT_SHARED_SOURCE option does not seem to change anything (but looking at
the wiki, it's not clear whether it applies to autowarped VRTs too).
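
For reference, this is one way the option could be set in every worker before any VRT is opened, using a pool initializer. It is only a sketch: `init_worker`, `make_pool` and the `set_option` hook are hypothetical names, and it says nothing about whether the option has any effect on autowarped VRTs.

```python
from multiprocessing import Pool

# Config options to apply in every worker before it opens a dataset.
WORKER_OPTIONS = {"VRT_SHARED_SOURCE": "0"}


def init_worker(options, set_option=None):
    """Apply GDAL config options inside a worker process.

    `set_option` is a hypothetical hook so the function can be exercised
    without GDAL; by default it is gdal.SetConfigOption.
    """
    if set_option is None:
        from osgeo import gdal
        set_option = gdal.SetConfigOption
    for key, value in options.items():
        set_option(key, value)


def make_pool(processes):
    """Create a pool whose workers run init_worker once at start-up."""
    return Pool(processes, initializer=init_worker,
                initargs=(WORKER_OPTIONS,))
```

The initializer runs once per worker process, so the option is in place regardless of which job first opens the VRT in that process.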

If you have any idea, I'll take it, I have been stuck for a long time
without making any progress :(

Cheers



---
Gregory Bataille

On Fri, Mar 17, 2017 at 6:15 AM, Grégory Bataille <
gregory.bataille at gmail.com> wrote:

> no luck.
> I tried this config.
> I also tried (with this config) to do a copy of the vrt file (with shell
> utility, not gdal) in each thread before opening it (still goes to the same
> TIFF file though), but no luck.
>
> But at least now that you confirm that it's related to concurrent reads, I
> have something precise to search for. I'll try to investigate how the
> gdal2tiles_parallel does it because it did not look different than what I
> was doing, but likely I missed something.
>
> Thanks
>
>
> ---
> Gregory Bataille
>
> On Thu, Mar 16, 2017 at 7:58 PM, Even Rouault <even.rouault at spatialys.com>
> wrote:
>
>> On Thursday 16 March 2017 18:45:03 CET Grégory Bataille wrote:
>> > Does that mean a different vrt file or simply reopen the file to create a
>> > gdal object in each process. Because I'm reopening the vrt file in each
>> > thread
>>
>> Ah, that must be the issue described at
>> http://gdal.org/gdal_vrttut.html#gdal_vrttut_mt
>>
>> Try defining
>> gdal.SetConfigOption('VRT_SHARED_SOURCE', '0')
>> before opening the VRTs
>>
>> > On Thu, 16 Mar 2017 at 17:23, Even Rouault <even.rouault at spatialys.com>
>> > wrote:
>> > > On Thursday 16 March 2017 17:16:20 CET Grégory Bataille wrote:
>> > > > Hello all,
>> > > >
>> > > > Reaching out to the community because I have failed for the past few
>> > > > days.
>> > > >
>> > > > *short version*, I'm trying to multithread the gdal2tiles utility, and
>> > > > I'm getting this
>> > > >
>> > > > Generating Base Tiles:
>> > > > ERROR 1: LZWDecode:Wrong length of decoded string: data probably corrupted at scanline 256
>> > > > ERROR 1: TIFFReadEncodedTile() failed.
>> > > > ERROR 1: /Users/gbataille/Downloads/Project_58704_transparent_mosaic_group1.tif, band 1: IReadBlock failed at X offset 1, Y offset 0
>> > > > ERROR 1: GetBlockRef failed at X block offset 1, Y block offset 0
>> > > > ERROR 1: gba.vrt, band 1: IReadBlock failed at X offset 0, Y offset 0
>> > > >
>> > > > Any idea?
>> > >
>> > > Yes, you need one dataset object per thread. Dataset objects cannot be
>> > > used simultaneously from several threads.
>> > >
>> > > Even
>> > >
>> > > --
>> > > Spatialys - Geospatial professional services
>> > > http://www.spatialys.com
>>
>> --
>> Spatialys - Geospatial professional services
>> http://www.spatialys.com
>>
>
>