[GRASS5] Portability issues

Glynn Clements glynn.clements at virgin.net
Thu Oct 3 11:51:37 EDT 2002


I've been looking into portability issues, including an analysis of
direct calls to non-ANSI functions from GRASS programs and libraries.
For programs, the statistics are:

symbol      |count
------------+-----
system      |  134
unlink      |   97
access      |   77
close       |   71
sleep       |   57
read        |   54
pclose      |   50
popen       |   50
isatty      |   48
open        |   46
write       |   40
fork        |   35
wait        |   34
snprintf    |   30
_exit       |   29
dup         |   26
pipe        |   26
creat       |   25
lseek       |   24
execl       |   23
execlp      |   19
fileno      |   19
opendir     |   18
closedir    |   17
fdopen      |   16
getpid      |   16
readdir     |   16
execvp      |   14
kill        |   14
stat        |   10
umask       |    8
mkdir       |    3
alarm       |    2
chdir       |    2
drand48     |    2
getpwuid    |    2
gettimeofday|    2
getuid      |    2
ioctl       |    2
lrand48     |    2
srand48     |    2
strcasecmp  |    2
strdup      |    2
swab        |    2
chmod       |    1
getopt      |    1
index       |    1
putenv      |    1
select      |    1
sigaction   |    1
sigemptyset |    1
strncasecmp |    1
sync        |    1
truncate    |    1
ttyname     |    1
tzset       |    1
usleep      |    1
waitpid     |    1

Some of these could simply be replaced with ANSI functions, while
others suggest that new functions should be added to e.g. libgis to
improve portability.

Some comments on specific functions:

+ open close creat read write lseek truncate

Many of these could probably be replaced with the ANSI stdio
equivalents.
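
As a rough illustration (not taken from the GRASS source), a file copy
written against the Unix calls maps onto stdio fairly directly. The
name copy_file is hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: copy src to dst using only ANSI stdio.
 * Returns 0 on success, -1 on failure. */
int copy_file(const char *src, const char *dst)
{
    FILE *in, *out;
    char buf[4096];
    size_t n;
    int status = 0;

    in = fopen(src, "rb");          /* instead of open(src, O_RDONLY) */
    if (!in)
        return -1;

    out = fopen(dst, "wb");         /* instead of creat(dst, mode) */
    if (!out) {
        fclose(in);
        return -1;
    }

    /* fread/fwrite instead of read/write */
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            status = -1;
            break;
        }
    }
    if (ferror(in))
        status = -1;

    fclose(in);                     /* instead of close() */
    if (fclose(out) != 0)           /* buffered data may fail to flush */
        status = -1;

    return status;
}
```

lseek() would similarly become fseek()/ftell(). truncate() has no
direct stdio equivalent, other than truncating to zero length by
re-opening the file with "wb".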

+ mkdir chdir opendir closedir readdir

The ANSI libraries don't deal with directories. However, any system on
which GRASS runs will have equivalent functionality; we just need to
provide a portable interface.
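
A minimal sketch of what such an interface might look like for
creating and removing a directory. The names portable_mkdir and
portable_rmdir are hypothetical, and the _WIN32 test is only one
possible way to select the platform:

```c
#ifdef _WIN32
#include <direct.h>     /* _mkdir, _rmdir */
#else
#include <sys/types.h>
#include <sys/stat.h>   /* mkdir */
#include <unistd.h>     /* rmdir */
#endif

/* Hypothetical wrapper: create a directory; returns 0 on success. */
int portable_mkdir(const char *path)
{
#ifdef _WIN32
    return _mkdir(path);
#else
    return mkdir(path, 0777);   /* actual mode is subject to the umask */
#endif
}

/* Hypothetical wrapper: remove an empty directory; returns 0 on success. */
int portable_rmdir(const char *path)
{
#ifdef _WIN32
    return _rmdir(path);
#else
    return rmdir(path);
#endif
}
```

Directory enumeration (opendir/readdir/closedir) would need a similar
wrapper, but the Win32 equivalent has a different shape, so it is not
attempted here.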

+ drand48 lrand48 srand48

These can be replaced with rand/srand. Presently, all programs which
use them (s.random, r.random, r.mapcalc) can fall back to rand/srand,
but [rs].random attempt to guess whether the *rand48 functions are
available based upon platform macros rather than HAVE_DRAND48.
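
For illustration, keying the choice on HAVE_DRAND48 might look
something like the following. The function names are hypothetical, and
HAVE_DRAND48 is assumed to be defined by the configure script when the
*rand48 functions are present:

```c
#include <stdlib.h>

/* Hypothetical wrapper: seed whichever generator is in use. */
void seed_random(long seed)
{
#ifdef HAVE_DRAND48
    srand48(seed);
#else
    srand((unsigned int) seed);
#endif
}

/* Hypothetical wrapper: uniform value in the range [0, 1). */
double uniform_random(void)
{
#ifdef HAVE_DRAND48
    return drand48();
#else
    /* Dividing by RAND_MAX + 1.0 keeps the result strictly below 1. */
    return rand() / (RAND_MAX + 1.0);
#endif
}
```

Note that rand() often has a much shorter period and poorer low-order
bits than drand48(), so the fallback is a compromise rather than an
exact substitute.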

+ strcasecmp strncasecmp strdup swab index

Simple string processing functions which could easily be replaced with
generic versions. Actually, libgis already provides G_store and
G_strcasecmp, although the latter is a hand-coded implementation which
only works for ASCII characters (I don't know whether this is
intentional; there are valid arguments both for and against honouring
the locale settings).
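
To show how little code a generic version needs, here is a sketch of
an ASCII-only case-insensitive comparison. It is not the actual
G_strcasecmp source; the name ascii_strcasecmp is hypothetical:

```c
/* Fold an ASCII upper-case letter to lower case; leave anything else alone. */
static int ascii_lower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Hypothetical generic replacement for strcasecmp(), ignoring the locale.
 * Returns <0, 0 or >0 in the same manner as strcmp(). */
int ascii_strcasecmp(const char *a, const char *b)
{
    for (;;) {
        int ca = ascii_lower((unsigned char) *a++);
        int cb = ascii_lower((unsigned char) *b++);

        if (ca != cb)
            return ca - cb;
        if (ca == 0)
            return 0;
    }
}
```

A locale-aware variant would use tolower() from <ctype.h> in place of
ascii_lower().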

+ snprintf

C9X defines this, so in a couple of decades it won't be a problem. For
now, a wide variety of solutions are possible, all with their own
advantages and disadvantages.

+ unlink

This can just be changed to remove(), which is ANSI.

+ sleep usleep

Suitable functionality should be available on all platforms; we just
need a portable interface.
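
One possible shape for such an interface, as a sketch only. The name
portable_sleep is hypothetical, and the _WIN32 test is just one way of
selecting the platform:

```c
#ifdef _WIN32
#include <windows.h>    /* Sleep() takes milliseconds */
#else
#include <unistd.h>     /* sleep() takes seconds */
#endif

/* Hypothetical wrapper: suspend the caller for the given number of
 * seconds. Returns the number of seconds left unslept (always 0 on
 * Win32, where Sleep() reports nothing). */
unsigned int portable_sleep(unsigned int seconds)
{
#ifdef _WIN32
    Sleep((DWORD) seconds * 1000);
    return 0;
#else
    return sleep(seconds);
#endif
}
```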

+ sigaction sigemptyset

Only used by r.mapcalc. signal() can be used instead, although
signal() has problems of its own (BSD-vs-SysV signal semantics,
general lack of flexibility).
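
A sketch of handling a signal with signal() alone. The choice of
SIGINT and all of the names are purely illustrative and are not taken
from r.mapcalc:

```c
#include <signal.h>

/* sig_atomic_t is the only object type ANSI C lets a handler modify. */
static volatile sig_atomic_t interrupted = 0;

static void on_interrupt(int sig)
{
    interrupted = 1;
    /* Under SysV semantics the disposition reverts to SIG_DFL when the
     * signal is delivered, so re-install the handler. This is harmless
     * under BSD semantics. */
    signal(sig, on_interrupt);
}

/* Hypothetical helper: returns 0 on success, -1 on failure. */
int install_interrupt_handler(void)
{
    return signal(SIGINT, on_interrupt) == SIG_ERR ? -1 : 0;
}

/* Hypothetical helper: non-zero once the signal has been seen. */
int was_interrupted(void)
{
    return interrupted != 0;
}
```

The re-install inside the handler leaves a short window under SysV
semantics in which a second signal gets the default action, which is
one of the problems sigaction() was designed to remove.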

+ access stat umask chmod

Closely related to the Unix permission model, although many of the
callers of access() and stat() only use information for which portable
interfaces could be provided (e.g. whether a file exists, or its
size).
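
For example, the two queries mentioned above could be approximated
with stdio alone. The names file_exists and file_size are
hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: non-zero if the file can be opened for reading.
 * This is slightly stricter than access(path, F_OK), since a file
 * which exists but is unreadable is reported as absent. */
int file_exists(const char *path)
{
    FILE *fp = fopen(path, "rb");

    if (!fp)
        return 0;
    fclose(fp);
    return 1;
}

/* Hypothetical helper: size in bytes, or -1 on failure. */
long file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long size;

    if (!fp)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }
    size = ftell(fp);
    fclose(fp);
    return size;
}
```

Strictly, ANSI C doesn't require binary streams to support SEEK_END
meaningfully, and a long limits the size which can be reported, so a
real implementation might still wrap stat() where it is available.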

+ dup pipe select fileno fdopen

Specific to the core Unix I/O API. These would need to be analysed on
a case-by-case basis. However, a significant number of these calls
are from db.*, *.db or p.* programs, which suggests that a lot of it
may be localised in libdbmi and paint/Interface/applib (these are
static libraries, so their dependencies become their clients'
dependencies as far as "nm" is concerned).

+ isatty ioctl ttyname

Unix terminal I/O.

The ioctl() calls are all terminal-related, and not widely used.

ttyname() is only used by mon.start, and is probably no longer
relevant. The Tek4105 driver isn't present in GRASS5, and even if it
were resurrected, it's unlikely to be used on non-Unix systems.

That just leaves isatty(); Cygwin manages to implement this, so there
must be some way to determine if input is being read from a console.

+ fork execl execlp execvp getpid kill wait waitpid

Unix process management. Providing a portable interface for spawning
processes would be quite involved, but also quite useful, particularly
in conjunction with the next point.
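
The Unix side of such an interface might be sketched as follows. The
name spawn_and_wait is hypothetical; a Win32 implementation would need
different code behind the same prototype:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wrapper: run argv[0] with the given NULL-terminated
 * argument vector and wait for it to finish. No shell is involved, so
 * arguments containing spaces need no quoting.
 * Returns the exit status, 127 if the program could not be executed,
 * or -1 on any other failure. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);             /* only reached if execvp() failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call would look like spawn_and_wait(args), where args is e.g.
{ "g.remove", "rast=my map", NULL }; the program and argument here are
only an example.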

+ system popen pclose

These suffer from the same issues as the previous point, with the
additional complication that the command is passed to the shell, i.e.
whichever shell happens to be /bin/sh on the system in question (the
original AT&T Bourne shell, ash, bash v1, bash v2 and zsh are all
plausible).

Many of the problems with spaces in filenames can be attributed to the
use of these functions. While using single quotes should solve those, it
won't prevent a web interface from being abused to execute commands on
the server.
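
Part of the difficulty is that wrapping a string in single quotes is
insufficient if the string itself contains a single quote. A sketch of
the escaping needed for a Bourne-style shell follows; the name
shell_quote is hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc()ed copy of arg wrapped in
 * single quotes, with each embedded ' rewritten as '\'' (close the
 * quotes, emit an escaped quote, re-open the quotes).
 * The caller frees the result. Returns NULL if out of memory. */
char *shell_quote(const char *arg)
{
    size_t len = 2;             /* opening and closing quote */
    const char *p;
    char *out, *q;

    for (p = arg; *p; p++)
        len += (*p == '\'') ? 4 : 1;

    out = malloc(len + 1);
    if (!out)
        return NULL;

    q = out;
    *q++ = '\'';
    for (p = arg; *p; p++) {
        if (*p == '\'') {
            memcpy(q, "'\\''", 4);
            q += 4;
        }
        else
            *q++ = *p;
    }
    *q++ = '\'';
    *q = '\0';

    return out;
}
```

This only addresses Bourne-style quoting rules. Bypassing the shell
entirely, by passing an argument vector to one of the exec functions,
avoids the question altogether.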

+ _exit

Most of the programs which use this are the same programs which use
pipe() and dup(), which points to libdbmi and paint/Interface/applib. 
I don't know whether it's really necessary to use _exit() rather than
exit().

+ alarm

Used by i.class and v.digit, presumably to implement a timeout. In the
worst case, we could just provide a stub function which does nothing
(i.e. no timeout).

+ getopt

Used by s.sweep; could use G_parser(), or could just parse argv[]
manually.

+ sync

Used by v.apply.census; almost certainly gratuitous.

+ tzset

Used by r.spread; gratuitous.

+ gettimeofday

Used by XDRIVER and NVIZ; equivalent functionality is likely to exist
elsewhere.

+ getuid

Used by g.help and clean_temp. g.help only uses it (in conjunction
with getpwuid(); see below) to determine the user's home directory; it
should use getenv("HOME") instead.
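
That replacement is essentially a one-liner; a sketch, with the
hypothetical name home_directory:

```c
#include <stdlib.h>

/* Hypothetical helper: the user's home directory according to the
 * environment, or NULL if HOME is unset or empty. The returned pointer
 * belongs to the environment and must not be freed. */
const char *home_directory(void)
{
    const char *home = getenv("HOME");

    return (home && *home) ? home : NULL;
}
```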

+ getpwuid

Used by g.help and set_data. set_data uses it to determine the
username of the owner of the GISDBASE directory for printing a
diagnostic message; removing it wouldn't be a great loss.

+ putenv

Used by XDRIVER; it probably doesn't need to be portable.

In addition to the functions which are called directly from programs,
the following non-ANSI functions are called from the libraries:

+ setpriority setreuid setuid geteuid

Used by src/libes/gis/set_prior.c. A brief examination of the code
suggests that this is a hack which could be readily replaced by stubs.

geteuid() is also used by G__mapset_permissions(), which is mostly
ill-conceived anyhow.

+ socket bind listen accept connect

Specific to unix_socks.c, which should only be used by the monitor
interface. Win32 provides the above functions, but the Unix versions
return a file descriptor which can be used with other Unix API
functions, whereas Win32's SOCKET type is specific to the WinSock
API.

+ cuserid

Used by libdbmi; I'm not sure whether there's a reason it uses
cuserid() instead of getlogin().

+ gethostname

Used by G__machine_name(), which is sensible enough. Presumably
equivalent functionality is available on any networked system (and, on
non-networked systems, you don't really need a per-machine
identifier).

+ getlogin

Used by G_done_msg().

+ link

Used by close_new (G_{close,unopen}_cell) to rename the temporary
file; it should probably use rename() instead.
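
A sketch of the rename() approach, not based on the actual close_new
code. The name move_into_place is hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: move a finished temporary file to its final
 * name. Returns 0 on success, -1 on failure. */
int move_into_place(const char *tmp, const char *final)
{
    /* ANSI C leaves rename() onto an existing file implementation-
     * defined, so remove any old copy first. The result is ignored,
     * since the file may simply not exist. */
    remove(final);

    return rename(tmp, final) == 0 ? 0 : -1;
}
```

On POSIX systems rename() replaces an existing target atomically, so
the remove() call could be dropped there; it is only present to cover
platforms where rename() refuses to overwrite.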

+ rewinddir

Used by libdbmi; should be dealt with in conjunction with
opendir/closedir/readdir above.

+ setpgrp

Used by G_fork(), and the main() function from the driver library;
should be dealt with in conjunction with other process spawning
functions (fork, execl etc).

+ tempnam

Used by the gmath library, but not directly. It appears to come from
libg2c (the gcc F77-C interface), so not an issue.

-- 
Glynn Clements <glynn.clements at virgin.net>



