[GRASS-dev] configure: testing arch endian

Mon May 8 00:06:00 EDT 2006

Brad Douglas wrote:

> I would like to add a macro into aclocal.m4/configure.in to test for
> architecture byte order.  This would then define something like
> G_BIGENDIAN 0/1 in include/config.h.in.
> 
> I think this would be better than testing for it constantly with
> G_is_little_endian().  This would also keep it from being reimplemented
> in libraries that are not interdependent (e.g. lib/gis,
> lib/vector/dglib).
> 
> I noticed that endianness is not even considered in lib/image.  I'll fix
> that if I get the all-clear for the above (or some revision of it).
> 
> Comments?  Caveats?

Personally, I would suggest just writing code which doesn't rely upon
the system's byte order; i.e. explicitly convert byte-arrays to words
using left-shift+OR and vice-versa using right-shift+AND.

Apart from endianness issues, this can also prevent alignment
problems. Explicit [de]serialisation doesn't require the byte stream
to be word-aligned (on platforms other than x86, this matters).
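
Something along these lines (a rough sketch, not code from GRASS; the
function names are made up, and I'm assuming C99's <stdint.h> is
available for a type which is exactly 32 bits wide):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode 4 bytes, stored least-significant byte first, into a word.
 * Works on any host byte order and for any alignment of p. */
uint32_t get_u32_le(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Encode a word as 4 bytes, least-significant byte first. */
void put_u32_le(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}
```

Neither function ever casts the byte pointer to a word pointer, so the
data can start at any offset within the buffer.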

Looking at the places which currently use G_is_little_endian():

	lib/ogsf/gsd_img.c
	lib/ogsf/gsd_img_ppm.c
	lib/ogsf/gsd_img_tif.c
	raster/r.in.mat/main.c
	raster/r.out.bin/main.c
	raster/r.out.mat/main.c

The first three indicate a design flaw in gsd_getimage(), namely that
it's assuming that "unsigned long" is 4 bytes. That's the root of the
recently-reported "NVIZ image dump crashes on 64-bit systems" bug.

The solution there is to simply treat the buffer as an array of
unsigned char; there's no need to serialise words or to know the
system's byte order.
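
As an illustration only (this is not the actual gsd_getimage() code;
the function name is hypothetical, and I'm assuming that the buffer
holds 4 bytes per pixel in R, G, B, A order), extracting the RGB
triples for a PPM writer needs nothing but byte indexing:

```c
#include <assert.h>
#include <stddef.h>

/* Copy npixels RGBA pixels (4 bytes each) into an RGB buffer
 * (3 bytes each), dropping the alpha byte. No word-sized type is
 * involved, so sizeof(long) and the host byte order are irrelevant. */
void rgba_to_rgb(const unsigned char *rgba, size_t npixels,
                 unsigned char *rgb)
{
    size_t i;

    assert(rgba != NULL && rgb != NULL);
    for (i = 0; i < npixels; i++) {
        rgb[3 * i + 0] = rgba[4 * i + 0];   /* red */
        rgb[3 * i + 1] = rgba[4 * i + 1];   /* green */
        rgb[3 * i + 2] = rgba[4 * i + 2];   /* blue */
    }
}
```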

r.in.mat and r.out.mat are littered with "sizeof(long) == 4"
assumptions. Also, AFAICT, r.out.mat always writes the output in the
system's byte-order, and r.in.mat just assumes that the file is in the
system's byte-order (it checks, but doesn't do anything in the event
of a mismatch).

Both of those programs need to be substantially re-written.

Finally, the only reason that r.out.bin needs to know the endianness
is due to the brain-damaged -s flag. Rather than allowing the user to
specify directly whether the file is written in big- or little-endian
order, the user has a choice of "the same order as this system" (no -s
flag) or "the opposite order to this system" (-s flag given).

In practical terms, this means that the user has to figure out the
system's byte order in order to determine whether or not to use the -s
flag.

If you change the semantics of that flag so that the absence of the -s
switch means little-endian while the presence of the flag means
big-endian, r.out.bin doesn't need to know the host's byte order, and
a given r.out.bin command achieves the same result (file in big-endian
format or file in little-endian format) regardless of the system's
byte order.
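
A sketch of what the output side could look like with those semantics
(hypothetical helper, not the existing r.out.bin code; the big_endian
argument would simply be the state of the -s flag, and <stdint.h> is
again assumed):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v into p[0..3] in the byte order requested by the caller.
 * The host's own byte order is never consulted. */
void put_u32(unsigned char *p, uint32_t v, int big_endian)
{
    int i;

    assert(p != NULL);
    for (i = 0; i < 4; i++) {
        /* big-endian: most significant byte goes first */
        int shift = big_endian ? 8 * (3 - i) : 8 * i;

        p[i] = (unsigned char)((v >> shift) & 0xff);
    }
}
```

With this, put_u32(buf, 0x01020304, 0) stores the bytes 04 03 02 01 and
put_u32(buf, 0x01020304, 1) stores 01 02 03 04, on any host.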

As for backwards compatibility, making the default byte order (no -s
flag) little-endian means that anyone using x86 (i.e. most users) will
be unaffected.

IOW, I've yet to come across a situation which actually has a
legitimate reason to know the system's byte order.

BTW, when it comes to floating-point values, the situation isn't as
simple as big- or little-endian. On some systems, FP values may use a
different byte order to integers, or double-precision FP values may
have the 32-bit halves in a different order than the order of bytes
within a word.

-- 
Glynn Clements <glynn at gclements.plus.com>