[gdal-dev] gdal_translate (3.1.0dev) "never" finishes on large jpeg cogs... REALLLLLY long time to unload.

Ritchie, Andrew C aritchie at usgs.gov
Tue Apr 21 12:47:20 PDT 2020


Just wanted to follow up on this because I didn't want to leave the impression that the issue was resolved, or limited to jpeg compression. It seems to be an issue with writing the internal mask band.

When I create a mask band in a large lzw-compressed or jpeg-compressed tif using the COG driver it dramatically increases processing time over writing RGBA (hours instead of minutes), so the issue is not jpeg compression, it's the creation of the mask band. Steps to reproduce:

  1.  Take a decent-sized RGBA LZW tiff
  2.  Generate a LZW COG with -b 1 -b 2 -b 3 -b 4 -config GDAL_TIFF_INTERNAL_MASK YES and time it
  3.  Generate a LZW COG with -b 1 -b 2 -b 3 -mask 4 -config GDAL_TIFF_INTERNAL_MASK YES and time it
  4.  Compare times. When I do this with a source file that is a 102600x91100 RGBA COG, step 2 takes about 2 minutes and step 3 takes about 120 minutes
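
The steps above could be scripted roughly as follows. This is only a sketch: the function name translate_variant, the GDAL_TRANSLATE override variable and the cog_*.tif output names are made up for illustration, and only the switches listed in steps 2 and 3 plus -of COG and -co COMPRESS=LZW are passed. Elapsed time is measured in whole seconds with date.

```shell
#!/bin/sh
# Sketch: time the two COG conversions described in the steps above.
# translate_variant, GDAL_TRANSLATE and the cog_*.tif names are illustrative,
# not part of GDAL or of the original report.

# Run one conversion. "alpha" keeps band 4 as a regular band,
# "mask" turns band 4 into the internal mask band.
translate_variant() {
    variant=$1; src=$2; dst=$3
    case "$variant" in
        alpha) set -- -b 1 -b 2 -b 3 -b 4 ;;
        mask)  set -- -b 1 -b 2 -b 3 -mask 4 ;;
        *) echo "unknown variant: $variant" >&2; return 1 ;;
    esac
    # Same switch spelling as the commands quoted in this thread.
    "${GDAL_TRANSLATE:-gdal_translate}" "$src" "$dst" "$@" \
        -of COG -co COMPRESS=LZW -config GDAL_TIFF_INTERNAL_MASK YES
}

# Only run the conversions when an input file is given as the first argument.
if [ $# -gt 0 ]; then
    for v in alpha mask; do
        start=$(date +%s)
        translate_variant "$v" "$1" "cog_$v.tif"
        end=$(date +%s)
        echo "$v: $((end - start)) seconds"
    done
fi
```

Saved under any name, it would be run as: sh <script> <infile>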

I also noticed that the -co BIGTIFF=NO option appears to be ignored by the COG driver. I can share a file if that's helpful (I cannot provide a link on the listserv).

Is there a faster way, such as generating an external nodata mask and then adding it? From reading the GTiff and COG format notes on internal masks it wasn't clear, and I didn't see a way to specify copying masks in the COG driver.

From: Ritchie, Andrew C
Sent: Wednesday, April 15, 2020 12:26 PM
To: Even Rouault <even.rouault at spatialys.com>; gdal-dev at lists.osgeo.org
Subject: RE: [EXTERNAL] Re: [gdal-dev] gdal_translate (3.1.0dev) "never" finishes on large jpeg cogs... REALLLLLY long time to unload.

Hi Even,

Thanks for the quick response! The source dataset is a LZW cog with RGBA, and I confirmed (I think) that the issue was the mask layer by playing with the switches I used to generate the LZW cog - I didn't even have to do a JPEG COG. I can cause the same, or very similar behavior, by changing from:

-b 1 -b 2 -b 3 -b 4
to:
-b 1 -b 2 -b 3 -mask 4

with GDAL_TIFF_INTERNAL_MASK YES.

With the -b 4 switch (or omitting all -b and -mask switches), I get LZW cogs in 2 minutes. With -mask 4 I get hung up at 20% with directory thrashing messages in debug for at least 30 minutes, and I'm guessing I'll get the same behavior at the "done" message if I care to wait.

Below are the two configurations that show such a difference in performance for me. I didn't play around with CACHEMAX or MAX_DATASET_POOL_SIZE, was trying to keep it simple.

2 minute TIFFs:
gdal_translate <infile> <outfile> -b 1 -b 2 -b 3 -b 4 -of COG -co COMPRESS=LZW -co PREDICTOR=2 -co NUM_THREADS=ALL_CPUS -co RESAMPLING=AVERAGE -config GDAL_TIFF_INTERNAL_MASK YES -config GDAL_TIF_OVR_BLOCKSIZE 128

A couple orders of magnitude longer:
gdal_translate <infile> <outfile> -b 1 -b 2 -b 3 -mask 4 -of COG -co COMPRESS=LZW -co PREDICTOR=2 -co NUM_THREADS=ALL_CPUS -co RESAMPLING=AVERAGE -config GDAL_TIFF_INTERNAL_MASK YES -config GDAL_TIF_OVR_BLOCKSIZE 128

From: Even Rouault <even.rouault at spatialys.com>
Sent: Wednesday, April 15, 2020 4:38 AM
To: gdal-dev at lists.osgeo.org
Cc: Ritchie, Andrew C <aritchie at usgs.gov>
Subject: [EXTERNAL] Re: [gdal-dev] gdal_translate (3.1.0dev) "never" finishes on large jpeg cogs... REALLLLLY long time to unload.


Andrew,

Does your source raster have an alpha band? That could explain the difference, since it isn't possible to directly create a YCbCrA JPEG-compressed file, so a mask band must be created internally. However, I wouldn't anticipate such a huge difference in performance between compression schemes. I would suggest not setting GDAL_CACHEMAX at all and leaving it at its 5% default (increasing it is not always a good idea), in case the performance issue lies in de-allocating cached blocks.
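
Both points above can be checked quickly from a shell. The following is a rough sketch, not a definitive recipe: has_alpha_band is a made-up helper name, and it assumes gdalinfo prints a "ColorInterp=Alpha" entry on the band line for an alpha band.

```shell
#!/bin/sh
# Sketch: two quick checks before re-running the conversion.
# has_alpha_band is an illustrative helper, not a GDAL command.

# Succeeds if the gdalinfo text on stdin reports a band whose
# color interpretation is Alpha.
has_alpha_band() {
    grep -q 'ColorInterp=Alpha'
}

# Only call gdalinfo when a file name is given as the first argument.
if [ $# -gt 0 ]; then
    if gdalinfo "$1" | has_alpha_band; then
        echo "$1: alpha band present"
    else
        echo "$1: no alpha band reported"
    fi
    # Report a cache override so it can be removed before the next run.
    if [ -n "${GDAL_CACHEMAX:-}" ]; then
        echo "GDAL_CACHEMAX is set to $GDAL_CACHEMAX; consider unsetting it"
    fi
fi
```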

Even

--

Spatialys - Geospatial professional services

http://www.spatialys.com