[gdal-dev] gdal_translate (3.1.0dev) "never" finishes on large jpeg cogs... REALLLLLY long time to unload.

Jeremy Palmer palmerjnz at gmail.com
Wed Apr 22 00:22:23 PDT 2020


Hi Andy,

On Wed, Apr 22, 2020 at 8:33 AM Ritchie, Andrew C <aritchie at usgs.gov> wrote:

>
>
> Sorry I should’ve run more tests to clarify the situation re BIGTIFFs. It
> looks like gdal_translate honors -co BIGTIFF=NO for the raster but not the
> mask.
>

What's the output size of your COG when it successfully completes?
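
If it helps, a quick way to report that size and see which side of the 4 GiB classic TIFF limit it falls on is sketched below. The file name outfile.tif and the helper name cog_size are placeholders of mine, not anything GDAL provides.

```shell
# Classic (non-BigTIFF) files use 32-bit offsets, so they top out at 4 GiB.
LIMIT=4294967296

# Print a file's size in bytes and whether it exceeds the classic TIFF limit.
# cog_size is a hypothetical helper name used only for this sketch.
cog_size() {
  size=$(wc -c < "$1") || return 1
  size=$((size + 0))   # normalise any padding wc adds around the number
  if [ "$size" -ge "$LIMIT" ]; then
    echo "$1: $size bytes (needs BigTIFF)"
  else
    echo "$1: $size bytes (fits in classic TIFF)"
  fi
}

# outfile.tif is a placeholder for the finished COG.
if [ -f outfile.tif ]; then
  cog_size outfile.tif
fi
```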


>
>
> Incidentally, when I kill the process with ctrl-C (on a windoze machine)
> GDAL fails to exit gracefully (2 of 2 times this run) with the following as
> the final debug message
>
>
>
> GDAL: Flushing dirty blocks: 0GTIFF: Waiting for worker job to finish
> handling block 0
>

In my experience, the progress reporting in GDAL is not very good, and GDAL
can spend a lot of time in the "flushing dirty blocks" stage. It might be
that you can't interrupt GDAL at this point. I would wait a little longer.
Even will be able to comment further on this.

>
>
> My cmd:
>
> gdal_translate <infile.tif> <outfile.tif> -b 1 -b 2 -b 3 -mask 4 -of cog
> -co COMPRESS=LZW -co PREDICTOR=2 -co NUM_THREADS=ALL_CPUs -co
> RESAMPLING=AVERAGE -co BIGTIFF=NO --config GDAL_TIF_OVR_BLOCKSIZE 128 --debug
> ON
>

Seems ok to me. When we process RGB aerial photos into COGs for web mapping
and want a good balance between storage size and quality, we go for
something like:

gdalbuildvrt \
  -addalpha -hidenodata \
  $PWD/$TIF_FOLDER.vrt \
  $PWD/$TIF_FOLDER/*.tif

gdal_translate \
  -of COG \
  -co COMPRESS=WebP \
  -co NUM_THREADS=ALL_CPUS \
  -co BIGTIFF=YES \
  -co TILING_SCHEME=GoogleMapsCompatible \
  --config BIGTIFF_OVERVIEW YES \
  -co ALIGNED_LEVELS=3 \
  -co ADD_ALPHA=YES \
  -co BLOCKSIZE=512 \
  -co RESAMPLING=CUBIC \
  $PWD/$TIF_FOLDER.vrt $PWD/$TIF_FOLDER.webp.google.aligned.cog.tif


>
> Jeremy – to clarify, I have confirmed that if I wait long enough, the COG
> will finish, so generating in the background is feasible if slow. I was
> just surprised that including a transparency mask increases the processing
> time so much. It’s necessary to reduce the file size using jpeg or webp
> compression and still provide transparency I guess, but it’s a huge
> performance penalty to pay. I don’t have enough programming experience (or
> time) to do profiling and figure out what the bottleneck is, and don’t get
> me wrong – I ❤ gdal x 10^10, but I thought this was worth mentioning
> because of the increase in time (which is so long I initially thought it
> was actually a hang).
>

First, I would consider using WebP if you think your users can handle that.
It's way better than JPEG+Mask. Note that I'm surprised that adding the mask
to the TIFF adds heaps of additional time. Can you generate your dataset
with and without the mask to see the time difference? As mentioned before,
most of the processing time is taken up by overview generation (especially
when compared to the data compression stage, which can use all of your CPU
cores). Hopefully, some upcoming GDAL changes will improve this situation.
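
A rough way to get that comparison is to time the two runs back to back. This is only a sketch: the options are copied from your command (minus BIGTIFF and the debug settings), infile.tif and the output names are placeholders, and timed is a small helper I made up for the example. It only reports whole seconds, which should be plenty here.

```shell
# Placeholder input; point IN at your own RGB + band-4 mask source.
IN="${IN:-infile.tif}"

# Options shared by both runs, taken from the command quoted earlier.
COMMON="-b 1 -b 2 -b 3 -of COG -co COMPRESS=LZW -co PREDICTOR=2 -co NUM_THREADS=ALL_CPUS -co RESAMPLING=AVERAGE"

# Run a command, store its wall-clock time in ELAPSED (seconds) and report it.
# timed is a hypothetical helper, not part of GDAL.
timed() {
  start=$(date +%s)
  status=0
  "$@" || status=$?
  ELAPSED=$(( $(date +%s) - start ))
  echo "${ELAPSED}s: $*" >&2
  return "$status"
}

# Only run the conversions when GDAL is installed and the input exists.
if command -v gdal_translate >/dev/null 2>&1 && [ -f "$IN" ]; then
  # $COMMON is left unquoted on purpose so it splits into separate options.
  timed gdal_translate "$IN" no_mask.tif $COMMON
  timed gdal_translate "$IN" with_mask.tif $COMMON -mask 4
fi
```

If the two timings are close, the mask itself is probably not the bottleneck and the overviews are the more likely suspect.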


>
>
> As far as the steps to generate a COG – I output tiled tiffs, then create
> a VRT, then create a RGBA LZW cog, preview, and generate a JPEG COG. I only
> added the RGBA LZW cog because of the issues I was having generating the
> JPG cog – it’s actually a good point to delete the tiles in my workflow
> because I can go back to the LZW cog again and again if I need to since
> it’s lossless.
>

What was the issue you were having with JPEG compression? Just time to
process? I would try the above command to see if that gives a good result
(remove the warping to the Google Maps projection, i.e. the
TILING_SCHEME=GoogleMapsCompatible option, if you don't need it, as it adds
a lot to processing times).
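
For reference, the gdal_translate step above without the reprojection might look like the sketch below. I have dropped ALIGNED_LEVELS as well, on the assumption that it only has meaning alongside a tiling scheme. The output file name and the default value for TIF_FOLDER are placeholders.

```shell
# Placeholder folder name; set TIF_FOLDER to match your own layout.
TIF_FOLDER="${TIF_FOLDER:-tiles}"
VRT="$PWD/$TIF_FOLDER.vrt"
OUT="$PWD/$TIF_FOLDER.webp.cog.tif"

# Same options as the earlier command, minus TILING_SCHEME and ALIGNED_LEVELS,
# so the output stays in the source projection.
COG_OPTS="-of COG -co COMPRESS=WebP -co NUM_THREADS=ALL_CPUS -co BIGTIFF=YES -co ADD_ALPHA=YES -co BLOCKSIZE=512 -co RESAMPLING=CUBIC --config BIGTIFF_OVERVIEW YES"

# Run only when GDAL is installed and the VRT exists; otherwise show the command.
if command -v gdal_translate >/dev/null 2>&1 && [ -f "$VRT" ]; then
  # $COG_OPTS is left unquoted on purpose so it splits into separate options.
  gdal_translate $COG_OPTS "$VRT" "$OUT"
else
  echo "gdal_translate $COG_OPTS $VRT $OUT"
fi
```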

Cheers,
Jeremy
