[GRASS-dev] Improving the G_calloc: unable to allocate xx bytes of memory message?

Mon Oct 27 10:42:02 PDT 2014

Markus Neteler wrote:

> GRASS 7.1.svn (nc_spm_08_grass7):~ > v.to.rast roadsmajor out=test
> use=attr attrcol=PROPYEAR memory=3000000 --o
> WARNING: No areas selected from vector map <roadsmajor>
> Current region rows: 135000, cols: 150000
> ERROR: G_calloc: unable to allocate 18446744072484715136 * 4 bytes of
>        memory at vector/v.to.rast/raster.c:84

$ printf '%x' 18446744072484715136
ffffffffb6fe7a80

Those leading "f"s are a giveaway. It's almost certain that a
calculation was done with 32-bit arithmetic, which overflowed to
produce a negative result (0xb6fe7a80 = -1224836480), which was then
converted to 64 bits with sign extension, then cast to unsigned.
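
To make that chain of conversions concrete, here is a minimal sketch.
The function names are made up for illustration and are not taken
from v.to.rast. It assumes a two's complement platform. The multiply
is done in uint32_t because signed overflow is undefined behaviour in
C, but the resulting bit pattern is the same.

```c
#include <stdint.h>

/* Multiply in 32 bits only, as code using plain "int" would.
 * The unsigned multiply wraps modulo 2^32; converting the result back
 * to int32_t is implementation-defined, but gives the two's
 * complement value on the usual platforms. */
static int32_t cells_32bit(int32_t rows, int32_t cols)
{
    uint32_t wrapped = (uint32_t)rows * (uint32_t)cols;

    return (int32_t)wrapped;
}

/* Widen to 64 bits with sign extension, then reinterpret the result
 * as unsigned, as happens when a negative int ends up in a 64-bit
 * size_t parameter. */
static uint64_t widen_to_unsigned(int32_t n)
{
    return (uint64_t)(int64_t)n;
}
```

With the region from the report, cells_32bit(135000, 150000) gives
-1224836480 (the true product is 20250000000), and passing that to
widen_to_unsigned() gives 18446744072484715136, i.e.
0xffffffffb6fe7a80.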

> ... only, it does not return to command line (have to use CTRL-C).

It's possible that memory corruption has already occurred, resulting
in e.g. an infinite loop.

A great deal of GRASS code still uses "int" where it should be using
"long", "long long", "size_t" or "ssize_t". Fixing this is far from
straightforward as:

1. It's likely to be a lot of work.

2. I don't know of any easy way to detect such cases.

3. We still haven't decided upon the appropriate type. All of the
options have drawbacks; to recap:

	long		- only 32 bits on 64-bit versions of Windows
	long long	- not specified by C89
	size_t		- unsigned
	ssize_t		- POSIX; not specified by any version of ISO C

Also: Windows' write() function uses "int" for the count argument and
the return value, so it can't handle more than 2 GiB at a time.
Likewise for read().
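
Until a type is agreed upon, one stop-gap is to do the size
calculation in size_t with explicit overflow checks before calling
the allocator. The following is only a sketch: checked_alloc_size()
is a hypothetical helper, not an existing GRASS function, and it
assumes the row and column counts arrive as "int".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of GRASS): compute rows * cols * size
 * in size_t. Returns 1 and stores the product in *out on success.
 * Returns 0 and leaves *out untouched if either count is negative or
 * the product would not fit in size_t. */
static int checked_alloc_size(int rows, int cols, size_t size, size_t *out)
{
    size_t r, c, n;

    if (rows < 0 || cols < 0)
        return 0;
    r = (size_t)rows;
    c = (size_t)cols;

    /* r * c overflows exactly when r > SIZE_MAX / c */
    if (c != 0 && r > SIZE_MAX / c)
        return 0;
    n = r * c;

    if (size != 0 && n > SIZE_MAX / size)
        return 0;
    *out = n * size;
    return 1;
}
```

The caller can then report "region too large" itself instead of
passing a wrapped value on to G_calloc(). SIZE_MAX comes from
<stdint.h>, which is C99; a C89 build could use (size_t)-1 instead.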

> >> Yet a bit unhelpful :-) It comes from lib/segment/format.c but I
> >> didn't manage to improve that.
> >
> > Specifically:
> >
> >         if (write(fd, buf, n) != n) {
> >             G_warning("segment zero_fill(): Unable to write (%s)", strerror(errno));
> >
> > ENOENT ("No such file or directory") isn't a valid error code for
> > write(). But errno is only set if write() returns -1, so it's likely
> > that write() returned a short count rather than failing and the errno
> > value is left over from a previous system call (successful calls don't
> > reset errno to zero). In all probability, the next call to write()
> > would have failed due to ENOSPC, EDQUOT, or EFBIG (or raised SIGXFSZ
> > if RLIMIT_FSIZE is being exceeded).
> 
> I see, and I now dimly remember a discussion on this list about
> clearing the previous error state.

The main issue here is that errno is being examined when write() has
reported a short count (a non-negative return value less than the
value of the third argument) rather than an error (returning -1).

The code should either distinguish these cases, or use something like:

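#include <unistd.h>

/* Write exactly "count" bytes, retrying after short writes.
 * Returns count on success, -1 on failure. */
ssize_t writen(int fd, const void *buf, size_t count)
{
    size_t left = count;
    const char *ptr = buf;
    while (left > 0) {
        ssize_t n = write(fd, ptr, left);
        /* -1 is an error; 0 means no progress, also treated as one */
        if (n <= 0)
            return -1;
        /* short count: advance and try again with the remainder */
        ptr += n;
        left -= n;
    }
    return count;
}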

This never returns a short count: it either writes (and returns) the
exact number of bytes requested, or returns -1. Note that errno won't
be changed if write() returns 0, but that should only occur when the
third argument is 0; if write() can't write *anything*, that's an
error condition.
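
For the other option, distinguishing the two cases at the call site,
a minimal sketch might look like this. write_checked() is a
hypothetical name, and the messages go to stderr only to keep the
example self-contained; real code would presumably use G_warning().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: a single write() call that reports a failure
 * and a short count differently. Returns 0 if all "count" bytes were
 * written, -1 otherwise. */
static int write_checked(int fd, const void *buf, size_t count)
{
    ssize_t n = write(fd, buf, count);

    if (n < 0) {
        /* Genuine error: errno was set by this call. */
        fprintf(stderr, "write failed: %s\n", strerror(errno));
        return -1;
    }
    if ((size_t)n != count) {
        /* Short count: errno is stale here, so don't print it. */
        fprintf(stderr, "short write: %ld of %lu bytes\n",
                (long)n, (unsigned long)count);
        return -1;
    }
    return 0;
}
```

Unlike writen(), this doesn't retry, so the underlying cause of a
short count (typically ENOSPC or similar) only shows up if the caller
attempts another write().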

-- 
Glynn Clements <glynn at gclements.plus.com>