[postgis-tickets] [PostGIS] #5176: no data check in raster2pgsql and also ST_BandIsNoData are slow

PostGIS trac at osgeo.org
Wed Jun 29 19:40:19 PDT 2022


#5176: no data check in raster2pgsql and also ST_BandIsNoData are slow
--------------------+-----------------------------
 Reporter:  robe    |      Owner:  robe
     Type:  defect  |     Status:  new
 Priority:  medium  |  Milestone:  PostGIS Fund Me
Component:  raster  |    Version:  master
 Keywords:          |
--------------------+-----------------------------
 I suspect it's too late to do anything about this in 3.3 and maybe nothing
 can be done.

 In PostGIS 3.2 we added the -k switch to raster2pgsql to allow skipping the
 no data check.  From PostGIS 3.2 on, the no data check is done automatically
 unless -k is specified.

 I am finding that this check takes quite a long time for large files.
 By large I mean a TIFF of 300 MB or more.

 For example, I've been trying to write to disk the raster2pgsql output for a
 500 MB file, for which gdalinfo shows this:


 {{{
 Size is 637200, 270000
 Coordinate System is:
 :
     ID["EPSG",4326]]
 Data axis to CRS axis mapping: 2,1
 Origin = (-125.000000000000000,49.000000000000000)
 Pixel Size = (0.000092592592593,-0.000092592592593)
 Metadata:
   AREA_OR_POINT=Area
 Image Structure Metadata:
   COMPRESSION=LZW
   INTERLEAVE=BAND
 Corner Coordinates:
 Upper Left  (-125.0000000,  49.0000000) (125d 0' 0.00"W, 49d 0' 0.00"N)
 Lower Left  (-125.0000000,  24.0000000) (125d 0' 0.00"W, 24d 0' 0.00"N)
 Upper Right ( -66.0000000,  49.0000000) ( 66d 0' 0.00"W, 49d 0' 0.00"N)
 Lower Right ( -66.0000000,  24.0000000) ( 66d 0' 0.00"W, 24d 0' 0.00"N)
 Center      ( -95.5000000,  36.5000000) ( 95d30' 0.00"W, 36d30' 0.00"N)
 Band 1 Block=637200x1 Type=Byte, ColorInterp=Gray
   NoData Value=0
 }}}
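
 For a sense of scale, here is a rough sketch of the uncompressed size implied
 by the gdalinfo output above.  It assumes one byte per pixel (Type=Byte) and a
 single band, and it says nothing about how the no data check is actually
 implemented; it only shows how many pixel values a full scan would have to
 visit.

```shell
# Dimensions and pixel type are taken from the gdalinfo output above.
width=637200
height=270000
bytes_per_pixel=1   # Type=Byte, single band

pixels=$((width * height))
raw_bytes=$((pixels * bytes_per_pixel))
raw_gib=$((raw_bytes / 1024 / 1024 / 1024))   # integer division, rounds down

echo "pixels:    $pixels"
echo "raw bytes: $raw_bytes"
echo "raw GiB:   $raw_gib"
```

 So the 500 MB LZW-compressed file expands to roughly 172 billion pixel
 values, which may go some way toward explaining the run times below.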

 My call looks something like the one below.  My original test, started 10 hrs
 ago, hasn't output anything yet.  I assume it is still stuck in the no data
 check loop.

 {{{
 raster2pgsql -I -t auto -Y 15000 -e file.tif >> test2.sql
 }}}


 This one, with the -k option, started writing to disk within 5 minutes;
 after 2 hrs it's at 8GB:
 {{{
 raster2pgsql -k -I -t auto -Y 15000 -e file.tif >> test2.sql
 }}}


 I was thinking that for these kinds of cases, perhaps loading all the junk
 in and purging the no data rows afterwards would be faster, but sadly
 ST_BandIsNoData is no quicker:

 http://postgis.net/docs/RT_ST_BandIsNoData.html

 Doing something like below to force the check is equally slow.


 {{{
 SELECT ST_BandIsNoData(rast, true)
 FROM sometable;
 }}}
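
 For reference, the load-then-purge idea mentioned above might look roughly
 like the sketch below.  The table and column names (sometable, rast) come
 from the query above; the file name purge_nodata.sql, the choice of band 1,
 and the database name in the commented psql call are illustrative
 assumptions.  Because it relies on the same forced ST_BandIsNoData check, it
 would be expected to be just as slow.

```shell
# Write a small SQL script that deletes tiles whose band 1 is entirely no data.
# purge_nodata.sql is a hypothetical file name used only for this sketch.
cat > purge_nodata.sql <<'SQL'
-- forceChecking = true makes ST_BandIsNoData inspect the pixel values
-- instead of trusting the band's isnodata flag.
DELETE FROM sometable
WHERE ST_BandIsNoData(rast, 1, true);
SQL

# Run it against a database loaded with "raster2pgsql -k ...", for example:
#   psql -d yourdb -f purge_nodata.sql
```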
-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/5176>
PostGIS <http://trac.osgeo.org/postgis/>