[GRASS-dev] configure: testing arch endian
Glynn Clements
glynn at gclements.plus.com
Mon May 8 00:06:00 EDT 2006
Brad Douglas wrote:
> I would like to add a macro into aclocal.m4/configure.in to test for
> architecture byte order. This would then define something like
> G_BIGENDIAN 0/1 in include/config.h.in.
>
> I think this would be better than testing for it constantly with
> G_is_little_endian(). This would also keep it from being reimplemented
> in libraries that are not interdependent (eg. lib/gis,
> lib/vector/dglib).
>
> I noticed that endian is not even considered in lib/image. I'll fix
> that if I get the all-clear for the above (or some revision of it).
>
> Comments? Caveats?
Personally, I would suggest just writing code which doesn't rely upon
the system's byte order; i.e. explicitly convert byte-arrays to words
using left-shift+OR and vice-versa using right-shift+AND.
Apart from endianness issues, this can also prevent alignment
problems. Explicit [de]serialisation doesn't require the byte stream
to be word-aligned (on platforms other than x86, this matters).
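As a minimal sketch of the shift+OR / shift+AND approach (the function
names here are made up for illustration and are not existing GRASS
functions), 32-bit values can be converted to and from a byte array
like this:

```c
#include <stdint.h>

/* Decode a 32-bit unsigned value stored little-endian in buf[0..3].
 * Works on any host byte order and any alignment of buf. */
static uint32_t get_u32_le(const unsigned char *buf)
{
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}

/* Decode a 32-bit unsigned value stored big-endian in buf[0..3]. */
static uint32_t get_u32_be(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24)
         | ((uint32_t)buf[1] << 16)
         | ((uint32_t)buf[2] << 8)
         | (uint32_t)buf[3];
}

/* Encode v into buf[0..3], least significant byte first. */
static void put_u32_le(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v & 0xff);
    buf[1] = (unsigned char)((v >> 8) & 0xff);
    buf[2] = (unsigned char)((v >> 16) & 0xff);
    buf[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Encode v into buf[0..3], most significant byte first. */
static void put_u32_be(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)((v >> 24) & 0xff);
    buf[1] = (unsigned char)((v >> 16) & 0xff);
    buf[2] = (unsigned char)((v >> 8) & 0xff);
    buf[3] = (unsigned char)(v & 0xff);
}
```

None of these ever casts the byte pointer to a word pointer, so buf can
point at any offset within a file buffer.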

Looking at the places which currently use G_is_little_endian():

	lib/ogsf/gsd_img.c
	lib/ogsf/gsd_img_ppm.c
	lib/ogsf/gsd_img_tif.c
	raster/r.in.mat/main.c
	raster/r.out.bin/main.c
	raster/r.out.mat/main.c

The first three indicate a design flaw in gsd_getimage(), namely that
it's assuming that "unsigned long" is 4 bytes. That's the root of the
recently-reported "NVIZ image dump crashes on 64-bit systems" bug.
The solution there is to simply treat the buffer as an array of
unsigned char; there's no need to serialise words or to know the
system's byte order.
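A rough sketch of the idea, assuming the pixels sit in memory as four
bytes each in R, G, B, A order (which is what, for example,
glReadPixels() with GL_RGBA and GL_UNSIGNED_BYTE produces). The helper
name rgba_to_rgb() is hypothetical, not something in lib/ogsf:

```c
#include <stddef.h>

/* Hypothetical helper: copy the R, G, B components of each pixel from
 * an RGBA byte buffer into a packed RGB byte buffer (e.g. for a PPM
 * writer), dropping alpha. Every access is a single byte, so neither
 * the host's byte order nor sizeof(long) matters. */
static void rgba_to_rgb(const unsigned char *rgba, unsigned char *rgb,
                        size_t npixels)
{
    size_t i;

    for (i = 0; i < npixels; i++) {
        rgb[i * 3 + 0] = rgba[i * 4 + 0];   /* red   */
        rgb[i * 3 + 1] = rgba[i * 4 + 1];   /* green */
        rgb[i * 3 + 2] = rgba[i * 4 + 2];   /* blue  */
    }
}
```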
r.in.mat and r.out.mat are littered with "sizeof(long) == 4"
assumptions. Also, AFAICT, r.out.mat always writes the output in the
system's byte-order, and r.in.mat just assumes that the file is in the
system's byte-order (it checks, but doesn't do anything in the event
of a mismatch).
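For illustration only, here is one way a reader could decode a signed
32-bit field without depending on either sizeof(long) or the host's
byte order. The function get_i32() and its big_endian parameter are
assumptions for this sketch; how the file's byte order is determined
from its header is left out:

```c
#include <stdint.h>

/* Hypothetical helper: decode a signed 32-bit two's-complement value
 * from buf[0..3]. big_endian describes the byte order of the FILE
 * (nonzero = most significant byte first), not of the host. */
static int32_t get_i32(const unsigned char *buf, int big_endian)
{
    uint32_t v = 0;
    int i;

    /* Shift the bytes in, most significant first. */
    for (i = 0; i < 4; i++) {
        int idx = big_endian ? i : 3 - i;
        v = (v << 8) | buf[idx];
    }

    /* Convert to signed without relying on implementation-defined
     * behaviour for out-of-range unsigned-to-signed conversion. */
    if (v & 0x80000000u)
        return -(int32_t)(~v) - 1;
    return (int32_t)v;
}
```

The same call handles a file in either byte order, so a mismatch with
the host never needs to be detected in the first place.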
Both of those programs need to be substantially re-written.
Finally, the only reason that r.out.bin needs to know the endianness
is due to the brain-damaged -s flag. Rather than allowing the user to
specify directly whether the file is written in big- or little-endian
order, the user has a choice of "the same order as this system" (no -s
flag) or "the opposite order to this system" (-s flag given).
In practical terms, this means that the user has to figure out the
system's byte order in order to determine whether or not to use the -s
flag.
If you change the semantics of that flag so that the absence of the -s
switch means little-endian while the presence of the flag means
big-endian, r.out.bin doesn't need to know the host's byte order, and
a given r.out.bin command achieves the same result (file in big-endian
format or file in little-endian format) regardless of the system's
byte order.
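A sketch of what the output side could look like under those
semantics. The names encode_u32(), write_u32() and big_endian are
invented here, with big_endian standing for "the -s flag was given";
this is not the current r.out.bin code:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: store v in buf[0..3] in the requested byte
 * order. big_endian == 0 gives little-endian output (the proposed
 * default); nonzero gives big-endian output. */
static void encode_u32(unsigned char *buf, uint32_t v, int big_endian)
{
    int i;

    for (i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        buf[i] = (unsigned char)((v >> shift) & 0xff);
    }
}

/* Write one 32-bit value to fp in the requested byte order.
 * Returns 0 on success, -1 if the write failed. */
static int write_u32(FILE *fp, uint32_t v, int big_endian)
{
    unsigned char buf[4];

    encode_u32(buf, v, big_endian);
    return fwrite(buf, 1, sizeof buf, fp) == sizeof buf ? 0 : -1;
}
```

Nothing here queries the host's byte order, so the same command line
yields the same file on every platform.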
As for backwards compatibility, making the default byte order (no -s
flag) little-endian means that anyone using x86 (i.e. most users) will
be unaffected.
IOW, I've yet to come across a situation which actually has a
legitimate reason to know the system's byte order.
BTW, when it comes to floating-point values, the situation isn't as
simple as big- or little-endian. On some systems, FP values may use a
different byte order to integers, or double-precision FP values may
have the 32-bit halves in a different order than the order of bytes
within a word.
--
Glynn Clements <glynn at gclements.plus.com>