[gdal-dev] GTiff bit shuffle compression feature request
Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC]
jesse.r.meyer at nasa.gov
Fri Dec 8 09:06:59 PST 2023
Hi,
When horizontal differencing is used to reduce the numerical range of band data, the upper bytes of the resulting stream are typically 0, which suits LZ’s byte-based compression model. The least significant bytes, however, can still have many of their more significant bits equal to 0, and unless whole bytes repeat, LZ compressors can’t do much with that pattern. For data with temporal and/or spatial coherence, ‘shuffling’ is another effective strategy to losslessly rearrange the data stream into a form favorable to LZ-style compressors, and it builds on the gains already provided by the PREDICTOR functionality.
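To make the starting point concrete, here is a minimal Python sketch of horizontal differencing on one row of 16-bit samples. The sample values and the function name horizontal_difference are made up for illustration, and the modulo-2^bits wraparound is an assumption about how the differences are stored; this is not GDAL's or libtiff's code.

```python
def horizontal_difference(row, bits=16):
    """Replace each sample with its difference from the previous one.

    Differences wrap modulo 2**bits (an assumption of this sketch), so a
    decoder can undo the step with a running sum. Expects a non-empty row.
    """
    mask = (1 << bits) - 1
    return [row[0] & mask] + [(cur - prev) & mask for prev, cur in zip(row, row[1:])]


# Made-up, smoothly varying 16-bit samples.
row = [1000, 1003, 1007, 1012, 1013, 1019]
diffs = horizontal_difference(row)
print(diffs)  # [1000, 3, 4, 5, 1, 6]

# Print the upper and lower byte of each difference after the first sample.
for d in diffs[1:]:
    print(format(d >> 8, "08b"), format(d & 0xFF, "08b"))
```

After the first sample, every upper byte here is zero, while the lower bytes differ from one another yet all have their top five bits clear. That second pattern is the one a byte-oriented compressor cannot exploit directly.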
The idea is to rearrange the bit stream so that the Nth “shuffled” byte contains the Nth bit from each byte in the sequence. The sequence length is usually determined by the bit length of the data type.
For example (for brevity, assume bytes are 4 bits long):
Byte 1, Byte 2, Byte 3, Byte 4
0001, 0011, 0111, 0001
They all share a top bit of 0 and a bottom bit of 1.
“Shuffled”:
0000, 0010, 0110, 1111
The algorithm is pretty simple to implement, and can be SIMD accelerated for high performance.
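As a rough illustration, the 4-bit example above can be reproduced with a plain bit transpose. This is only a sketch: the names bitshuffle and bitunshuffle are hypothetical, it assumes a square block (as many values as bits per value), and it makes no attempt at the SIMD acceleration a real implementation would want.

```python
def bitshuffle(values, width):
    """Transpose bits: output word n holds bit n (counted from the top) of every input.

    Assumes len(values) == width, so the output words have the same width
    as the inputs, matching the 4-bit example.
    """
    if len(values) != width:
        raise ValueError("this sketch expects a square block: len(values) == width")
    out = []
    for n in range(width):
        shift = width - 1 - n  # position of the nth bit, counted from the top
        word = 0
        for v in values:
            word = (word << 1) | ((v >> shift) & 1)
        out.append(word)
    return out


def bitunshuffle(words, width):
    # Transposing a square bit matrix twice gives back the original.
    return bitshuffle(words, width)


block = [0b0001, 0b0011, 0b0111, 0b0001]
shuffled = bitshuffle(block, 4)
print([format(w, "04b") for w in shuffled])  # ['0000', '0010', '0110', '1111']
assert bitunshuffle(shuffled, 4) == block
```

The shared top 0 bits and bottom 1 bits end up gathered into whole words (0000 and 1111), which is the kind of byte-level repetition an LZ-style compressor can pick up.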
While we are specifically users of the GTiff format, such a strategy could be applied generically to most raster and even vector formats.
Best,
Jesse