[GRASS-dev] r.mapcal rand() strangeness

Tue Feb 26 00:56:07 EST 2008

Maciej Sieczka wrote:

> >> The following silently forces 2147483647 on anything bigger:
> >>
> >> $ r.mapcalc 'map=9999999999999999999'
> >> $ r.info -rt map
> >> min=2147483647
> >> max=2147483647
> >> datatype=CELL
> 
> > That is down to the atoi() function used to parse integers.
> 
> Sorry if I got it wrong - is r.mapcalc limited to 32 bit integers? If 
> so, does it have to?

CELL maps are limited to 32-bit integers. There doesn't seem much
point in extending r.mapcalc to handle anything larger; too much work
for too little reward.

> > I could change it to use strtol(), which sets errno to ERANGE on
> > overflow.
> 
> Would that make r.mapcalc accept integers bigger than int32?

No; it would mean that out-of-range integers would produce an error
instead of being silently truncated to INT_MAX.

> >> Trying the same by adding a decimal place separator, results in 
> >> something bizzare to me:
> >>
> >> $ r.mapcalc 'map=99999999999999999999999.0'
> >> $ r.info -rt map
> >> min=99999999999999991611392.000000
> >> max=99999999999999991611392.000000
> >> datatype=DCELL
> > 
> > The presence of a dot causes the value to be parsed as a
> > floating-point value using atof().
> > 
> > A double has a precision of ~16 decimal digits, which matches what you
> > see above.
> 
> So processing any number with more than 16 decimal diggits in r.mapcalc 
> must yield such "strange" values?

The precision of a "double" corresponds to roughly 16 significant
decimal digits (it's exactly 52 binary digits).

The problem arises when something (in this case, r.info) tries to
print numbers where the number of digits to the left of the decimal
point exceeds the precision.

If you use exponential notation, you can limit the number of digits to
match the precision, so the problem doesn't arise.

> And how many diggits after the decimal 
> separator are safe?

Floating point numbers have a fixed relative error rather than a fixed
absolute error. The issue is the number of significant digits. The
position of the decimal point doesn't matter.

> >> At the edges of Float32 range other strange things happen:
> >>
> >> Shouldn't the below be rather 
> >> min,max=340282000000000000000000000000000000000.0?:
> >>
> >> $ r.mapcalc 'map=3.40282e38'
> >> $ r.info -rt map
> >> min=340282000000000014192072600942972764160.000000
> >> max=340282000000000014192072600942972764160.000000
> >> datatype=DCELL
> >>
> >> The the one below should be -340282000000000000000000000000000000000.0 I 
> >> guess:
> >>
> >> $ r.mapcalc 'map=-3.40282e38'
> >> $ r.info -rt map
> >> min=-340282000000000014192072600942972764160.000000
> >> max=-340282000000000014192072600942972764160.000000
> >> datatype=DCEL
> 
> > Again, you're running into the limits of the precision of the double
> > type.
> > 
> > Also, bear in mind that the conversion from a binary FP value to a
> > string of decimal digits is being performed by r.info; r.mapcalc has
> > no control over that part of the process.
> 
> Does that mean that actually in the raster the value is something else 
> than what r.info reports?

What is in the raster is a binary floating point value:

	http://en.wikipedia.org/wiki/IEEE_floating-point

E.g. 3.40282e38 is stored as

	(9007189542424620/(2^53))*(2^128)
=	9007189542424620*(2^(128-53))
=	340282000000000014192072600942972764160

> > Essentially, the %f format behaves badly for large numbers. If you use
> > %e or %g, it will use exponential notation, with a specific relative
> > precision rather than a specific absolute precision.
> 
> >> At bigger Float64 values r.mapcalc crashes:
> >>
> >> $r.mapcalc 'map=3.40282e110'
> >>
> >> *** stack smashing detected ***: r.mapcalc terminated
> >> Aborted (core dumped)
> 
> > Right. It's sprintf()ing the value into a 64-byte buffer using %f.
> > 
> > I'll change it to use %g:
> 
> That fixes it. Thanks. I have noticed that r.null has the same problem 
> though. More modules could?

Quite possibly.

> To reproduce, please run the attached script 
>   (having set region big enough) and proceed utnil the last, Float64. 
> step. r.null with crash with "*** stack smashing detected ***".

That's a different problem. It's crashing in parse_d_mask_rule(), at
line 230:

    else if (sscanf (vallist,"%[^ -\t]-%lf", junk, &a) == 2)

The "%[^ -\t]" matches the entire string, which is 311 bytes long, but
the "junk" buffer is only 128 bytes. The fact that the value is
supposed to be a floating-point number doesn't actually matter.

If you use 1.79769e+308, it won't crash. Most sane people would use
exponential form rather than a >300-digit string (especially when only
~16 digits are significant).

The simplest fix is to add a maximum field width, e.g. "%100[^ -\t]".

BTW, that pattern is bogus. The dash (minus) is acting as a range
specifier, i.e. everything from space to tab inclusive. But space (32)
comes after tab (9). To match anything except a space, a dash or a
tab, the dash should come last.

-- 
Glynn Clements <glynn at gclements.plus.com>