[gdal-dev] Hidden regression in GDAL 1.11.0 - gdal_rasterize with postgis silently fails to render many polygons.
Graeme B. Bell
grb at skogoglandskap.no
Mon Feb 9 09:30:09 PST 2015
Hi Even, thanks very much for this.
I setup the DB with createdb autotest; echo "create extension postgis;" | psql autotest
Then I did your test. Annoyingly, it worked fine!
Then I redid my own tests. They still didn't work.
So I varied all the parameters between the two examples until I found the one that triggers it.
GOT IT! :-)
It's the tables parameter that triggers it. I have used 'tables' since 2012 to avoid doing a scan of the DB because we have a LOT of tables here (e.g. >1 TB of postgis data).
gdal_rasterize -sql "select * from world_diamonds where st_intersects(wkb_geometry, ST_GeometryFromText('POLYGON((-180 -90,-180 90,180 90,180 -90,-180 -90))',4326))" "pg:dbname=autotest" out.tif -burn 255 -ts 4000 2000 -a_srs EPSG:4326 -te -180 -90 180 90 --debug ON
PG: DBName="autotest"
PG: PostgreSQL version string : 'PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16), 64-bit'
PG: PostGIS version string : '2.1 USE_GEOS=1 USE_PROJ=1 USE_STATS=1'
OGR: OGROpen(pg:dbname=autotest/0x9f65f0) succeeded as PostgreSQL.
GDAL: QuietDelete(out.tif) invoking Delete()
GDAL: GDALOpen(out.tif, this=0xa292e0) succeeds as GTiff.
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALClose(out.tif, this=0xa292e0)
GDAL: GDALDriver::Create(GTiff,out.tif,4000,2000,1,Float64,(nil))
GDAL: Rasterizer operating on 7 swaths of 312 scanlines.
0...10...20...30...40...50...60...70...80...90...100 - done.
PG: 64800 features read on layer 'sql_statement'. <-------------
GDAL: GDALClose(out.tif, this=0xa283d0)
whereas
gdal_rasterize -sql "select * from world_diamonds where st_intersects(wkb_geometry, ST_GeometryFromText('POLYGON((-180 -90,-180 90,180 90,180 -90,-180 -90))',4326))" "pg:dbname=autotest tables='world_diamonds'" out.tif -burn 255 -ts 4000 2000 -a_srs EPSG:4326 -te -180 -90 180 90 --debug ON
PG: DBName="autotest"
PG: PostgreSQL version string : 'PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16), 64-bit'
PG: PostGIS version string : '2.1 USE_GEOS=1 USE_PROJ=1 USE_STATS=1'
OGR: OGROpen(pg:dbname=autotest tables='world_diamonds'/0x1668a10) succeeded as PostgreSQL.
GDAL: QuietDelete(out.tif) invoking Delete()
GDAL: GDALOpen(out.tif, this=0x1688b40) succeeds as GTiff.
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALClose(out.tif, this=0x1688b40)
GDAL: GDALDriver::Create(GTiff,out.tif,36,18,1,Float64,(nil))
PG: Primary key name (FID): ogc_fid
PG: Using column 'ogc_fid' as FID for table 'world_diamonds'
GDAL: Rasterizer operating on 1 swaths of 18 scanlines.
0...10...20...30...40...50...60...70...80...90...100 - done.
PG: 500 features read on layer 'sql_statement'. <-------------
GDAL: GDALClose(out.tif, this=0x1686720)
Graeme Bell
On 09 Feb 2015, at 18:01, Even Rouault <even.rouault at spatialys.com> wrote:
> Graeme,
>
> I tried to reproduce your issue as close as possible but couldn't replicate the bug.
>
> I create a postgis layer with 360x180 polygons with the following Python script
>
> ~~~~
>
> from osgeo import ogr
> from osgeo import osr
>
> ds = ogr.Open('pg:dbname=autotest')
> srs = osr.SpatialReference()
> srs.ImportFromEPSG(4326)
> lyr = ds.CreateLayer('world_diamonds', geom_type = ogr.wkbPolygon, srs = srs, options = ['OVERWRITE=YES'])
> lyr.StartTransaction()
> for y in range(180):
> for x in range(360):
> f = ogr.Feature(lyr.GetLayerDefn())
> x0 = x - 180 + 0.5
> y0 = 90 - y - 1
> f.SetGeometry(ogr.CreateGeometryFromWkt('POLYGON((%f %f,%f %f,%f %f,%f %f,%f %f))' % (x0,y0,x0-0.5,y0+0.5,x0,y0+1,x0+0.5,y0+0.5,x0,y0)))
> lyr.CreateFeature(f)
> lyr.CommitTransaction()
>
> ~~~~
>
> So continuous diamond shaped polygons (at the beginning, I created straight
> polygons, but for some reason, the rasterizing algorithm doesn't manage to
> draw such polygons, but that's a totally unrelated issue)
>
> And then:
>
> gdal_rasterize -sql "select * from world_diamonds where st_intersects(wkb_geometry, ST_GeometryFromText('POLYGON((-180 -90,-180 90,180 90,180 -90,-180 -90))',4326))" \
> "pg:dbname=autotest" out.tif -burn 255 -ts 4000 2000 -a_srs EPSG:4326 -te -180 -90 180 90
>
> And the output looks good with both gdal 1.11 branch (haven't tested with 1.11.0 but
> there was no significant change in the PG driver) and trunk.
>
> Could you try the above ?
>
>
> Compiled in DEBUG mode, I can see many
> "PG: PQexecParams(FETCH 500 in OGRPGLayerReader0x2316c50) = PGRES_TUPLES_OK, ntuples = 500"
> traces as expected. I don't understand why in your use case it fails at the first FETCH.
>
> Could you try compiling GDAL with ./configure --enable-debug
> so as to have more verbose traces in the PG driver ?
>
>
> Even
>
>
> Le lundi 09 février 2015 16:37:20, Graeme B. Bell a écrit :
>> TLDR;
>>
>> gdal_rasterize has begun silently producing wrong output at least since
>> GDAL 1.11.0. Whenever >500 polygons are present in the database query
>> result (and perhaps under other circumstances), only 500 will be drawn.
>>
>>
>> Hi,
>>
>> I am rasterizing many maps and areas of maps in SRID 25833 using
>> gdal_rasterize with a postgis database.
>>
>> When I use GDAL 1.11, the output is missing much of the data, despite
>> gdal_rasterize reporting completing successfully. With GDAL 1.09 and GDAL
>> 1.10, the output is correct and complete and identical.
>>
>> I'm posting here to gather more information about the bug and also to raise
>> an alarm for anyone using postgis/gdal_rasterize (1.11) in their data
>> workflow over the last year.
>>
>> Background of bug
>> --------
>>
>> Versions:
>>
>> POSTGIS="2.1.5 r13152" GEOS="3.4.2-CAPI-1.8.2 r3921" PROJ="Rel. 4.8.0, 6
>> March 2012" GDAL="GDAL 1.11.0, released 2014/04/16" (from EPEL RPM)
>>
>> A. The source table in postgis is registered in geometry_columns correctly.
>>
>> B. I run this command in Centos linux.
>>
>> gdal_rasterize -te 1060000 7865000 1120000 7940000 -a jord -init 255 -tr
>> 100 100 -a_nodata 255 -a_srs 'EPSG:25833' -co 'TILED=YES' -co
>> 'BIGTIFF=IF_SAFER' -co 'COMPRESS=PACKBITS' -co 'PREDICTOR=1' -co
>> 'ZLEVEL=1' -ot Byte PG:"host=XXXX'' dbname='XXXX' user='XXXX'
>> password='XXXX' tables='XXXX'" -sql "SELECT xxxx FROM xxxx WHERE
>> st_intersects(geo, ST_GeometryFromText('POLYGON((1060000 7865000,1060000
>> 7940000,1120000 7940000, 1120000 7865000, 1060000 7865000))',25833)) "
>> xxxx.tif
>>
>> C. The SQL query (-sql), when run directly on postgres with COUNT(),
>> produces this output.
>>
>> count
>> -------
>> 3959
>> (1 row)
>>
>> D. GDAL 1.9.2 and GDAL 1.10.0 produce a TIF file of 12238 bytes which is
>> visually complete.
>>
>> E. GDAL 1.11.0 produces a TIF file of 10920 bytes which is missing many
>> polygons.
>>
>> F. Console output from gdal_rasterize in all 3 cases is as follows;
>> 0...10...20...30...40...50...60...70...80...90...100 - done.
>>
>> G. PG server log output for GDAL 1.9.2 and GDAL 1.10.0
>>
>> (nothing...)
>>
>> PG server log output for GDAL 1.11.0
>>
>> < 2015-02-09 13:52:11.720 CET >STATEMENT: FETCH 500 in
>> OGRPGLayerReader0xd5ce50 < 2015-02-09 13:52:11.720 CET >ERROR: cursor
>> "ogrpglayerreader0xd5ce50" does not exist < 2015-02-09 13:52:11.720 CET
>>> STATEMENT: CLOSE OGRPGLayerReader0xd5ce50 < 2015-02-09 13:52:18.657 CET
>>> ERROR: cursor "ogrpglayerreader0x1fd8e50" does not exist
>>
>> H. Debug mode output for GDAL 1.11.0
>>
>> OGR: OGROpen(...) succeeded as PostgreSQL.
>> GDAL: QuietDelete(xxxx.tif) invoking Delete()
>> GDAL: GDALOpen(xxxx.tif, this=0x259a8f0) succeeds as GTiff.
>> GDAL: GDALDefaultOverviews::OverviewScan()
>> GDAL: GDALClose(xxxx.tif, this=0x259a8f0)
>> GDAL: GDALDriver::Create(GTiff,xxxx.tif,600,750,1,Byte,0x25744e0)
>> PG: Primary key name (FID): gid
>> PG: Using column 'gid' as FID for table 'xxxx_25833'
>> GDAL: Rasterizer operating on 1 swaths of 750 scanlines.
>> 0...10...20...30...40...50...60...70...80...90...100 - done.
>> PG: 500 features read on layer 'sql_statement'.
>> GDAL: GDALClose(xxxx.tif, this=0x2598930)
>>
>> I. Debug mode output for GDAL 1.10.0
>>
>> OGR: OGROpen(xxxx) succeeded as PostgreSQL.
>> GDAL: QuietDelete(xxxx) invoking Delete()
>> GDAL: GDALOpen(xxxx) succeeds as GTiff.
>> GDAL: GDALDefaultOverviews::OverviewScan()
>> GDAL: GDALClose(xxxx.tif, this=0xeb5030)
>> GDAL: GDALDriver::Create(GTiff,xxxx.tif,600,750,1,Byte,0xe8ede0)
>> PG: Primary key name (FID): gid
>> PG: Using column 'gid' as FID for table 'dmk_25833'
>> PG: Layer 'xxxx_25833' geometry type: GEOMETRY:Unknown (any), Dim=2
>> 0...10...20...30...40...50...60...70...80...90...100 - done.
>> PG: 3959 features read on layer 'sql_statement'.
>> GDAL: GDALClose(xxxx.tif, this=0xeb2fd0)
>>
>>
>> As well as local unix sockets I've tested TCP connections to a remote PG
>> database from GDAL1.10 and GDAL1.11 clients, and I've tested with other
>> data layers. Every time, I'm getting the same problem e.g. 500 features
>> out 8000.
>>
>> Has anyone else experienced anything similar?
>>
>> I've spent some time looking through the source but I wasn't able to
>> determine exactly where the problem is occuring. It seems to somehow be
>> related to the select page size (500 rows) being used in place of the
>> total feature count; or perhaps some logic mistake causing the reads to
>> finish after 1 page is complete.
>>
>> Graeme Bell
>> _______________________________________________
>> gdal-dev mailing list
>> gdal-dev at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>
> --
> Spatialys - Geospatial professional services
> http://www.spatialys.com
More information about the gdal-dev
mailing list