[gdal-dev] Python Bindings and Closed Datasets

Patrick Young patrick.mckendree.young at gmail.com
Thu Oct 31 22:27:20 PDT 2019


After some more investigation, I've managed to put together the script
below that manifests the issue on my machine (8 core, Ubuntu 19.04, GDAL
2.4.0, Python 3.7).  Where I first observed this was in using 8 band WV2
imagery, so that's what the complicated input image is modeled after.
Here's an example output:

/tmp/output-15.tif generated an unexpected checksum (5888, 11776, 50432,
23552, 62208, 35328, 55347, 28467)
/tmp/output-26.tif generated an unexpected checksum (55040, 44544, 50432,
23552, 62208, 35328, 55347, 28467)
/tmp/output-32.tif generated an unexpected checksum (55040, 11776, 50432,
23552, 62208, 35328, 55347, 28467)
/tmp/output-56.tif generated an unexpected checksum (38656, 44544, 34048,
23552, 13056, 2560, 55347, 28467)
/tmp/output-59.tif generated an unexpected checksum (55040, 44544, 50432,
23552, 62208, 35328, 55347, 28467)
generated 5 checksums during concurrent warping of same tile


The number different checksums you get back varies from run to run.  Rather
than calling gdal.Warp, I tried shelling out directly to the gdalwarp
executable with the same arguments and didn't see the issue, so that is
interesting.  Also, a threadpool of size 1 seems to not have the same
issue.  I'll keep poking at it and see if i can distill it down to
something more precise, but here's the script I've got at the moment:

import os
from concurrent.futures import ThreadPoolExecutor, as_completed

from osgeo import gdal, osr

def warp_tile(f_in, f_out, warp_opts):
    gdal_warp_opts = gdal.WarpOptions(**warp_opts,
creationOptions=["TILED=YES", "COMPRESS=DEFLATE"])
    warp_ds = gdal.Warp(f_out, f_in, options=gdal_warp_opts)

def create_input(f_in):
    dvr = gdal.GetDriverByName('GTiff')
    ds = dvr.Create(f_in, 8456, 7558, 8, gdal.GDT_UInt16)

    ds.SetGeoTransform( (533949.5611715792, 2.12, 0.0, 4183240.981122436,
0.0, -2.12) )

    srs = osr.SpatialReference()
    srs.SetUTM(10, 1)
    srs.SetWellKnownGeogCS('WGS84')
    ds.SetProjection(srs.ExportToWkt())

    for b in range(8):
        ds.GetRasterBand(b+1).Fill(b+1)

if __name__ == "__main__":
    gdal.UseExceptions()

    f_in = '/tmp/input.tif'
    create_input(f_in)

    warp_opts = {'outputBounds': (539750.0, 4179750.0, 550250.0,
4190250.0), 'width': 5376, 'height': 5376, 'dstSRS': 'EPSG:32610',
'resampleAlg': 'bilinear'}
    with ThreadPoolExecutor(max_workers=8) as executor:
        job_d = {}
        for i in range(100):
            f_out = f'/tmp/output-{i}.tif'
            job_d[executor.submit(warp_tile, f_in, f_out, warp_opts)] =
f_out

        check_sums = set()
        for future in as_completed(job_d):
            f_out = job_d[future]
            try:
                future.result()
            except Exception as e:
                print(f'failure warping {f_out}, {e}')
            else:
                ds = gdal.Open(f_out)
                cs = tuple(ds.GetRasterBand(b+1).Checksum() for b in
range(ds.RasterCount))
                if cs != (38656, 11776, 50432, 23552, 62208, 35328, 55347,
28467):
                    print(f'{f_out} generated an unexpected checksum {cs}')
                ds = None
                check_sums.add(cs)
            finally:
                os.remove(f_out)
        os.remove(f_in)
        print(f'generated {len(check_sums)} checksums during concurrent
warping of same tile')

On Wed, Oct 30, 2019 at 1:32 PM Even Rouault <even.rouault at spatialys.com>
wrote:

> I've, unsuccessfully, tried to reproduce your issue with the following
> script:
>
> -----
>
> from osgeo import gdal
> import concurrent.futures
>
> def worker(in_f, out_f):
>     gdal.Unlink(out_f)
>     gdal.Warp(out_f, in_f, options = '-co COMPRESS=DEFLATE -co TILED=YES
> -ts 2048 2048')
>
> jobs = []
>
> for i in range(1000):
>     jobs.append( ['byte.tif', '/tmp/byte%d.tif' % i] )
>
> with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
>     future_map = {executor.submit(worker, in_f, out_f): out_f for in_f,
> out_f in jobs}
>     for future in concurrent.futures.as_completed(future_map):
>         out_f = future_map[future]
>         try:
>             future.result()
>         except Exception as exc:
>             print('%r generated an exception: %s' % (out_f, exc))
>
>         print('Finished ' + out_f)
>         ds = gdal.Open(out_f)
>         if ds.GetRasterBand(1).Checksum() != 58226:
>             print('wrong checksum for ' + out_f)
>
> ------
>
> where the input byte.tif file is
>
> https://github.com/OSGeo/gdal/blob/master/autotest/gcore/data/byte.tif?raw=true
>
>
> --
> Spatialys - Geospatial professional services
> http://www.spatialys.com
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20191031/1e8cba67/attachment.html>


More information about the gdal-dev mailing list