[GRASS5] thoughts about runlength encoding

Glynn Clements glynn.clements at virgin.net
Thu Aug 12 05:58:45 EDT 2004


Andrea Antonello wrote:

> just a few thoughts about run-length encoding.
> Perhaps someone can tell me if what I'm saying is right.
> 
> When talking about rle in Grass, we are talking about integer (CELL) file 
> compression. 
> As far as I know, Grass's rle is not a standard one.

There isn't any "standard" form of RLE.

> While standard rle starts compressing equal values after two (or three) of 
> them have passed, Grass's starts to compress from the first one.

Some RLE schemes have a "pass-through" option, where you can have
blocks of raw data interspersed with runs. E.g. the count byte might
be signed, with a positive value indicating that the next byte is
repeated n times, and a negative value indicating that abs(n) bytes of
raw data follow.
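
As a rough illustration of such a pass-through scheme (this is not
GRASS' format), a decoder might look like the sketch below. The function
name and the handling of a zero count are assumptions made for the
example, not taken from any particular format.

```python
def decode_signed_count_rle(data):
    """Decode a stream of blocks, each led by a signed one-byte count.

    Hypothetical scheme for illustration only:
      count > 0 : the next byte is repeated `count` times
      count < 0 : abs(count) bytes of raw data follow
    """
    out = bytearray()
    i = 0
    while i < len(data):
        # Interpret the count byte as a signed 8-bit integer.
        n = int.from_bytes(data[i:i + 1], "big", signed=True)
        i += 1
        if n > 0:
            # Run: one value byte, repeated n times.
            out += data[i:i + 1] * n
            i += 1
        elif n < 0:
            # Pass-through: copy abs(n) raw bytes unchanged.
            out += data[i:i + abs(n)]
            i += abs(n)
        # n == 0 is treated as a no-op (an assumption of this sketch).
    return bytes(out)
```

Here a count of 3 followed by one byte yields that byte three times,
while a count byte of 0xFE (-2) copies the next two bytes verbatim.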

GRASS' RLE doesn't have this feature. Each row is either compressed or
raw; if the length of the row (as determined by examining the
difference between the row offsets) is less than the length of an
uncompressed row, then it's compressed, otherwise it's raw.

Except that, if cellhd.compressed is negative (pre-3.0 compression),
rows are always compressed, even if they would be longer than an
uncompressed row.
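
The rule above can be sketched as follows. This is only an illustration:
the function and parameter names are hypothetical, the offset table is
assumed to hold one more entry than there are rows, and the sketch does
not address whether the per-row size byte of the newer format counts
towards the stored length.

```python
def row_is_compressed(row_offsets, row, ncols, nbytes, compressed_flag):
    """Decide whether a stored row is RLE-compressed or raw.

    row_offsets     : file offsets of each row, plus one final end offset
                      (assumed layout for this sketch)
    ncols, nbytes   : cells per row and bytes per cell
    compressed_flag : the value of cellhd.compressed
    """
    # Length of the row as stored, from the difference of the offsets.
    stored_len = row_offsets[row + 1] - row_offsets[row]
    # Length the row would have if written uncompressed.
    raw_len = ncols * nbytes
    if compressed_flag < 0:
        # Pre-3.0 compression: rows are always compressed.
        return True
    # Otherwise a row is compressed only if it came out shorter than raw.
    return stored_len < raw_len
```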

> When a row is compressed, every value gets its own counter, so the row is 
> built from a sequence of couples.
> Knowing the compressed size, we can simply divide by two to get the number of 
> couples.

Actually, you need to divide by n+1, where n is the number of bytes
per cell. Each run consists of a one-byte count followed by an n-byte
value. For pre-3.0 compression, n is always the nbytes field from the
fileinfo structure (which is cellhd.format + 1); for the newer form
(cellhd.compressed == 1), n is stored as an extra byte at the
beginning of each line.
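
A minimal sketch of expanding such a row, under stated assumptions: the
function name is made up, values are read as big-endian unsigned
integers (the byte order and sign handling are not covered above), and
the input is assumed to be a row already known to be compressed.

```python
def decode_grass_rle_row(data, compressed_flag, fileinfo_nbytes):
    """Expand a compressed row into a list of cell values.

    Each run is a one-byte count followed by an n-byte value, so the
    number of runs is len(body) // (n + 1).
    """
    if compressed_flag == 1:
        # Newer form: n is stored as an extra byte at the start of the row.
        n = data[0]
        body = data[1:]
    else:
        # Pre-3.0 form: n comes from the fileinfo structure.
        n = fileinfo_nbytes
        body = data
    run_size = n + 1
    if len(body) % run_size != 0:
        raise ValueError("row length is not a multiple of n + 1")
    cells = []
    for start in range(0, len(body), run_size):
        count = body[start]
        # Assumption: big-endian, unsigned; real files may encode sign differently.
        value = int.from_bytes(body[start + 1:start + run_size], "big")
        cells.extend([value] * count)
    return cells
```

For instance, with one byte per cell, the bytes 1, 3, 7, 2, 9 in the
newer form describe two runs: three cells of value 7, then two of value 9.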

> Once that is done, it's rather easy to get the uncompressed row.
> Is this right? This would mean that there can't be "hybrid" rows (i.e. couples and 
> singles) as in the standard rle. 

Correct.

BTW, if you're interested in the format of raster files, you should
probably look at the {get,put}_row2.c files which I posted recently
(in the "Raster lib and CELL files > 2GB" thread). Hopefully, these
should be somewhat easier to read than the original versions.

In the long run, I'm hoping to completely re-write the raster I/O
code. I'm not planning to support RLE compression (other than to allow
old files to be converted to the new format), but to use zlib for both
integer and FP formats.

However, a complete re-write is a long way off. In the meantime, I'm
considering implementing some of the less radical changes as an
intermediate measure. Primarily, I intend to add support for 64-bit
offsets on 32-bit platforms (so that raster files aren't limited to
2GB). I'm also thinking about supporting the use of zlib for integer
maps, as well as the possibility of eliminating the null file.

-- 
Glynn Clements <glynn.clements at virgin.net>



