[gdal-dev] [Tiff] Unnecessary bigger JPEG-compressed TIFF

Even Rouault even.rouault at spatialys.com
Mon Dec 15 14:49:08 PST 2014


On Monday 15 December 2014 23:09:17, Joris Van Damme (AWare Systems) wrote:
> Even,
> 
> > Hmm, http://www.remotesensing.org/libtiff/TIFFTechNote2.html mentions "An
> > image segment may not redefine any table defined in JPEGTables.", so my
> > understanding is that a strip/tile that wants to have custom tables
> > should use different table numbers.
> 
> Yes, that is correct. My choice of words was unfortunate. From your
> next comments, however, I can see you understood that my bad choice of
> words hardly matters: using different table numbers, rather than
> overriding the same number, doesn't change anything.

Agreed

> 
> >> for whatever strips or tiles it wishes to update, but still leave the
> >> tag as is to apply to any other strip or tile that may not override
> >> it.
> > 
> > Yes, I agree the JPEG-in-TIFF specification allows this, but my
> > understanding is that libtiff has never really supported this. It would
> > require comparing the tables generated with the current quality setting
> > with the ones in the JpegTables tag, and if they do not match, installing
> > new tables and referencing them in the libjpeg component definition.
> 
> I'm not sure I understand correctly. Are you talking about what is
> required in the encoder changing a separate strip or tile this way, or
> in the decoder handling the resulting file? In either case, I don't
> think I agree, but I may be missing something.

I was only talking about the encoder. I believe the decoder should work OK
already. Obviously it does, since for 4 years the encoder has produced files
with redefined quantization tables in strips/tiles, and actually with the same
table number as in the JpegTables tag...
I'm not sure, however, whether it works with conformant files where the
redefined tables have different numbers (perhaps it does, I just don't have a
test example handy).
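
To check whether a given file is conformant in that sense, one could list the
table numbers defined in the JpegTables data and in a strip/tile, and look for
overlaps. A minimal Python sketch follows, assuming 8-bit or 16-bit DQT and
standard DHT segment layouts and no fill bytes between segments. The names
table_ids, redefined_tables and dqt_segment are hypothetical helpers, not
libtiff or libjpeg API:

```python
import struct

SOI = b"\xff\xd8"


def dqt_segment(table_id, values=bytes([1] * 64)):
    """Build one 8-bit DQT segment (hypothetical helper for experiments)."""
    body = bytes([table_id & 0x0F]) + values
    return b"\xff\xdb" + struct.pack(">H", 2 + len(body)) + body


def table_ids(stream):
    """Return (quant_ids, huff_ids) defined before the first scan.

    huff_ids holds (table_class, table_id) pairs; class 0 is DC, 1 is AC.
    """
    if stream[:2] != SOI:
        raise ValueError("stream does not start with SOI")
    quant, huff = set(), set()
    pos = 2
    while pos + 4 <= len(stream):
        if stream[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = stream[pos + 1]
        if marker in (0xD9, 0xDA):  # EOI or SOS: no more table definitions
            break
        (length,) = struct.unpack(">H", stream[pos + 2:pos + 4])
        payload = stream[pos + 4:pos + 2 + length]
        i = 0
        if marker == 0xDB:  # DQT: one or more quantization tables
            while i < len(payload):
                precision, tq = payload[i] >> 4, payload[i] & 0x0F
                quant.add(tq)
                i += 1 + (128 if precision else 64)
        elif marker == 0xC4:  # DHT: one or more Huffman tables
            while i < len(payload):
                huff.add((payload[i] >> 4, payload[i] & 0x0F))
                i += 17 + sum(payload[i + 1:i + 17])
        pos += 2 + length
    return quant, huff


def redefined_tables(jpeg_tables, strip):
    """Table numbers that the strip defines again (should be empty)."""
    q_tag, h_tag = table_ids(jpeg_tables)
    q_strip, h_strip = table_ids(strip)
    return q_tag & q_strip, h_tag & h_strip
```

A non-empty result for any strip/tile would indicate the "same table number"
case described above.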

> 
> In the encoder, it should not be required to compare any tables at
> all, in order to use this updating scheme. The only reason why an
> encoder that proceeds this way, would even read the JpegTables data,
> would be so as to make sure it does indeed use different table
> numbers.
> 
> This is exactly the reason why I regard this "overriding with
> different table numbers" the most intuitive way to update single
> strips or tiles in a JPEG-in-TIFF file.
> 
> In the decoder; strictly speaking, what is required is to load the
> JpegTables data into the JPEG subcodec before decoding the first strip
> or tile data with the same said subcodec. At least, that's how it
> would work with LibJpeg, there's some level of implementation detail
> involved in the fact that fully decoding a tables-only stream is the
> standard way to "infuse" LibJpeg with those tables.
> 
> In a more robust implementation of the decoder, one may want to
> protect against a strip/tile redefining tables and mistakenly using
> the same table numbers. That would be reasonably simple, the way to do
> this would be to load the JpegTables data into the JPEG subcodec
> before decoding *any* strip or tile, instead of merely doing it before
> decoding *the first* strip or tile. That's how my codec does it. This
> more robust implementation is actually a requirement anyway for a
> decoder that wishes to support decoding different strips or tiles in
> different thread contexts, for better multicore usage, like mine.
> 
> If full decoding of a tables-only JPEG stream is the only way to
> infuse the JPEG subcodec with tables, I can see there may be a need to
> quantify the possible overhead involved in this before deciding upon
> this "more robust" implementation. In my own JPEG codec, I decided to
> make the tables a separate class for this exact reason, with the
> caller level being able to simply "transplant and reuse" decoded
> tables into other JPEG decoder objects. Maybe this feature got added
> to LibJpeg, I don't know anything about LibJpeg beyond version 6b.
> 
> So I'm unsure why any table comparison would be required with the
> scheme of strip/tile updating, anywhere.
> 
> Unless you're  specifically talking about space-saving in strip/tile
> updating, by comparing the planned "local overriding" tables with the
> default tables in the JpegTables tag, and not overriding at all if
> equal? I wasn't thinking along those lines in my implementation. With
> tables being implementation specific, possibly image-specific,
> possibly specific to other factors like LibJpeg quality inside the
> same single implementation, etc, I chose to regard matching tables as
> astronomically improbable. Besides, my reasoning is, if table size
> does contribute even remotely significantly to the file size, then
> RowsPerStrip or TileWidth+TileHeight is very badly chosen. 

Well, I think the intent of TIFFTAG_JPEGTABLESMODE = 3 (the default mode, i.e.
JPEGTABLESMODE_QUANT | JPEGTABLESMODE_HUFF) is to avoid writing quantization
and Huffman tables in tiles/strips when it is not necessary. And more generally,
achieving the smallest binary size seems to be a reasonable objective.
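
To put a rough number on it, here is a back-of-the-envelope Python sketch of
the table bytes that stay in each tile/strip depending on the mode. It assumes
8-bit baseline JPEG, two quantization tables, the four standard Huffman tables
(12 DC and 162 AC symbols each), and one segment per table. The function name
per_tile_table_bytes is hypothetical:

```python
JPEGTABLESMODE_QUANT = 1  # quantization tables go into the JpegTables tag
JPEGTABLESMODE_HUFF = 2   # Huffman tables go into the JpegTables tag

MARKER_AND_LENGTH = 4                             # 2-byte marker + 2-byte length
DQT_BYTES = MARKER_AND_LENGTH + 1 + 64            # id byte + 64 8-bit entries
DHT_DC_BYTES = MARKER_AND_LENGTH + 1 + 16 + 12    # id + 16 counts + 12 symbols
DHT_AC_BYTES = MARKER_AND_LENGTH + 1 + 16 + 162   # id + 16 counts + 162 symbols


def per_tile_table_bytes(mode, n_quant=2, n_huff_pairs=2):
    """Table bytes repeated in every tile/strip for a given tables mode."""
    size = 0
    if not mode & JPEGTABLESMODE_QUANT:
        size += n_quant * DQT_BYTES
    if not mode & JPEGTABLESMODE_HUFF:
        size += n_huff_pairs * (DHT_DC_BYTES + DHT_AC_BYTES)
    return size


for mode in range(4):
    print(mode, per_tile_table_bytes(mode))
```

Under those assumptions the cost is a few hundred bytes per tile/strip, so how
much it matters depends on how small the tiles/strips are.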

> I can see
> how it may be justified to regard matching tables as a legit and way more
> probable concept, though, in your part of the world. Working with
> LibTiff and LibJpeg you sort of have the luxury that you can assume 98% of
> the planet shares your implementation, so the words
> "implementation-specific" almost lose their meaning. If next you're
> able to work around the "libjpeg-implementation-specific
> quality-specific" issue as well... Still, there's the "image-specific"
> issue, at least for Huffman tables, when Huffman optimization is
> applied, isn't there? Or does LibTiff+LibJpeg not support Huffman
> optimization in this context?

AFAICS, in JPEGTABLESMODE_HUFF mode, Huffman optimization is disabled.

> 
> > libtiff + libjpeg... tricky
> 
> Absolutely. There's a humongous amount of detail to keep in your head
> wherever LibTiff and LibJpeg mix. I'm sorry if I added to it.

One of the issues encountered in the past, which led to the regression I
fixed (tried to fix...), occurred when switching back and forth between
directories that might have different current JPEG tables. Each time you do
this, the JPEG state is reset, so you have to reload the tables in case the
quality settings were different between the directories, and in the current
state of tif_jpeg.c the easiest solution is to call jpeg_set_quality().
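
As a toy illustration of why the tables must be rebuilt on each switch (this is
not the tif_jpeg.c code), the Python sketch below derives one row of a
quantization table from a quality value, using what I understand to be the
libjpeg scaling formula. ToyEncoder, scaled_row and the per-directory quality
mapping are hypothetical, and only the first row of the standard luminance
table is used:

```python
BASE_ROW = [16, 11, 10, 16, 24, 40, 51, 61]  # first row, standard luminance table


def quality_scaling(quality):
    """Map a 1..100 quality to a percentage scale factor."""
    quality = max(1, min(100, quality))
    return 5000 // quality if quality < 50 else 200 - 2 * quality


def scaled_row(quality):
    """Scale the base row and clamp entries to the baseline range 1..255."""
    scale = quality_scaling(quality)
    return [max(1, min(255, (v * scale + 50) // 100)) for v in BASE_ROW]


class ToyEncoder:
    """Keeps one set of 'current' tables, like a single codec state."""

    def __init__(self, quality_by_directory):
        self.quality_by_directory = quality_by_directory
        self.tables = None

    def set_directory(self, index):
        self.tables = None  # switching directories resets the state
        # Rebuild from the quality of the directory we switched to.
        self.tables = scaled_row(self.quality_by_directory[index])


enc = ToyEncoder({0: 50, 1: 75})
enc.set_directory(0)
print(enc.tables)
enc.set_directory(1)
print(enc.tables)
```

If the rebuild step were skipped, the second directory would be encoded with
tables that do not match its own quality setting.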

I'm not saying doing something better is impossible, it is just that I don't
feel like pursuing that right now ;-) Even trivial changes in that code have a
high chance of causing issues in some cases, and non-trivial changes even more
so, especially since I don't think the test suite actually exercises that much.

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

