[postgis-devel] [wktraster] Core tests failure for r5841

Mateusz Loskot mateusz at loskot.net
Wed Jan 12 17:12:58 PST 2011


On 12/01/11 18:47, Pierre Racine wrote:
> Jorge, Regina,
> 
> I get a fresh build of PostGIS raster and the regress test in 
> rt_addband.sql fail on my machine. It was modified by Mateusz which 
> was obtaining different results.
> 
> In the first case:
> 
> SELECT St_Value(ST_AddBand(ST_MakeEmptyRaster(1000, 1000, 10, 10, 2, 
> 2, 0, 0, -1), 1, '16BSI', -32769, NULL), 3, 3);
> 
> Mateusz gets 32767 and I get -32768
> 
> In the second case:
> 
> SELECT St_Value(ST_AddBand(ST_MakeEmptyRaster(1000, 1000, 10, 10, 2, 
> 2, 0, 0, -1), 1, '16BSI', 210000.46, NULL), 3, 3);
> 
> Mateusz gets 13392 and I get -32768
> 
> I suspect this is a Windows/Linux difference. Do the test
> works/fails for you? What results do you get?

Pierre,

I'm glad you raised this problem once again.

First, this is not related to Windows vs Linux.
The problem is that the conversions used for value clamping invoke
undefined behaviour (UB) as defined by the C programming language.

======
C Standard/6.3.1.4

When a finite value of real floating type is converted to an integer
type other than _Bool, the fractional part is discarded (i.e., the value
is truncated toward zero). If the value of the integral part cannot be
represented by the integer type, the behavior is undefined.
======
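
To make the rule above concrete, here is a minimal sketch of a conversion
that refuses to perform the cast unless the truncated value is
representable. The helper name double_to_int16_checked is hypothetical and
is not part of PostGIS; it only illustrates the range test that 6.3.1.4
implies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not PostGIS code): stores the truncated value in
 * *out and returns true only when the cast is well defined per 6.3.1.4. */
bool double_to_int16_checked(double value, int16_t *out)
{
    /* Truncation is toward zero, so every value strictly between
     * -32769 and 32768 has an integral part inside [-32768, 32767].
     * NaN and infinities fail this test, because comparisons with NaN
     * are always false and infinities are outside the interval. */
    if (!(value > -32769.0 && value < 32768.0))
        return false;

    *out = (int16_t)value; /* integral part is representable: defined */
    return true;
}
```

With this check, both values from the failing regression test (-32769 and
210000.46) are rejected instead of being converted.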

Second, in the C programming language, signed integer overflow
causes undefined behaviour. Note that 16BSI converts to a signed integer.

Undefined behaviour means *anything* can happen; PostGIS could even
crash. Luckily, we are only seeing funny results.

The problem is in casting floating-point values to integers, especially
to small ones like 16-bit integers:

http://trac.osgeo.org/postgis/browser/trunk/raster/rt_pg/rt_pg.c?rev=6608#L1735

In short, this is UB if the value of a is out of range for int16_t:
float a;
int16_t b = (int16_t)a;
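
One way to avoid the UB is to compare against the target range while the
value is still floating-point, and cast only afterwards. The sketch below
assumes saturation is the desired policy and maps NaN to 0; both choices,
and the name clamp_to_int16, are assumptions made for illustration and
not what rt_pg.c currently does.

```c
#include <stdint.h>

/* Hypothetical helper (not PostGIS code): saturate a double to the
 * int16_t range before casting, so the cast itself is always defined. */
int16_t clamp_to_int16(double value)
{
    if (value != value)              /* NaN: arbitrary choice, return 0 */
        return 0;
    if (value <= (double)INT16_MIN)  /* too small, or -infinity */
        return INT16_MIN;
    if (value >= (double)INT16_MAX)  /* too large, or +infinity */
        return INT16_MAX;
    return (int16_t)value;           /* in range: truncates toward zero */
}
```

With this policy, -32769 becomes -32768 and 210000.46 becomes 32767 on
every platform and at every optimisation level. Whether saturation is the
right semantics for ST_AddBand is a separate decision.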

Here is a good summary of floating-point to integer conversions
and their results:

http://msdn.microsoft.com/en-us/library/d3d6fhea.aspx

Here is a sample program I tested on different architectures and with
several compilation variants, asking the compiler to rearrange the data
handling internally (optimise), which shows the problem leaking out:

(here it is compiled online http://codepad.org/lOQXoJhh)

/***************************************************************/
#include <stdio.h>
#include <stdint.h>

int main()
{
    double a;

    /* UB and truncation: values outside the 16-bit range are forced
       into 16 bits */

    a = -32769;

    printf("%g -> %d\n", a, (int16_t)a);
    printf("%g -> %d\n", a, (int16_t)(int32_t)a);
    printf("%g -> %d\n", a, (int16_t)(int64_t)a);

    a = 210000.46;
    printf("%g -> %d\n", a, (int16_t)a);
    printf("%g -> %d\n", a, (int16_t)(int32_t)a);
    printf("%g -> %d\n", a, (int16_t)(int64_t)a);

    a = -2147483649.0;
    printf("%g -> %d\n", a, (int16_t)a);
    printf("%g -> %d\n", a, (int16_t)(int32_t)a);
    printf("%g -> %d\n", a, (int16_t)(int64_t)a);

    return 0;
}

/* Linux 64-bit */
/*
mloskot@dog:~/tmp$ gcc -O0 trunc.c
mloskot@dog:~/tmp$ ./a.out
-32769 -> 32767
-32769 -> 32767
-32769 -> 32767
210000 -> 13392
210000 -> 13392
210000 -> 13392
-2.14748e+09 -> 0
-2.14748e+09 -> 0
-2.14748e+09 -> -1

mloskot@dog:~/tmp$ gcc -O3 trunc.c
mloskot@dog:~/tmp$ ./a.out
-32769 -> -32768
-32769 -> 32767
-32769 -> 32767
210000 -> 32767
210000 -> 13392
210000 -> 13392
-2.14748e+09 -> -32768
-2.14748e+09 -> 0
-2.14748e+09 -> -1
*/

/* Linux 32-bit */
/*
mloskot@vb-ubuntu910:~/tmp$ g++ -O0 trunc.c
mloskot@vb-ubuntu910:~/tmp$ ./a.out
-32769 -> -32768
-32769 -> 32767
-32769 -> 32767
210000 -> -32768
210000 -> 13392
210000 -> 13392
-2.14748e+09 -> -32768
-2.14748e+09 -> 0
-2.14748e+09 -> -1

mloskot@vb-ubuntu910:~/tmp$ g++ -O3 trunc.c
mloskot@vb-ubuntu910:~/tmp$ ./a.out
-32769 -> -32768
-32769 -> 32767
-32769 -> 32767
210000 -> 32767
210000 -> 13392
210000 -> 13392
-2.14748e+09 -> -32768
-2.14748e+09 -> 0
-2.14748e+09 -> -1

*/
/***************************************************************/
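
The values 32767, 13392 and -1 that appear in the stable rows of the
output are what you get by keeping only the low 16 bits of the wider
integer. As far as I know, GCC documents out-of-range integer narrowing
as reduction modulo 2^N, which would be consistent with those rows. The
sketch below reproduces that wraparound using only well-defined
operations; wrap_to_int16 is a hypothetical name used for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: reduce a wide integer modulo 2^16 and reinterpret
 * the result as a two's complement 16-bit value, without relying on
 * implementation-defined signed narrowing. */
int16_t wrap_to_int16(int64_t value)
{
    /* Conversion to an unsigned type is defined as modulo 2^16. */
    uint16_t low = (uint16_t)value;

    /* If the sign bit is set, subtract 2^16 to obtain the negative value;
     * the result is in [-32768, -1], so the final cast is in range. */
    if (low & 0x8000u)
        return (int16_t)((int32_t)low - 65536);
    return (int16_t)low;
}
```

For example, wrap_to_int16(-32769) gives 32767, wrap_to_int16(210000)
gives 13392, and wrap_to_int16(-2147483649) gives -1. The rows that
change between -O0 and -O3 are the direct double to int16_t casts, which
is where the UB is.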


The commit (http://trac.osgeo.org/postgis/changeset/6605) does not fix
anything; it only silences a portability bug.

"Something must have gone wrong on Mat`s one" means that either the GCC
team has screwed up their compiler and it generates rubbish, or we have a
problem in PostGIS. Which one do you vote for?

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org
Member of ACCU, http://accu.org


