[GRASS-dev] [grass-code I][507] r.in.xyz: a segfault with particular dataset

Glynn Clements glynn at gclements.plus.com
Tue Oct 9 19:40:09 EDT 2007


grass-dev at grass.itc.it wrote:

> code I item #507, was opened at 2007-10-09 17:11
> Status: Open
> Priority: 3
> Submitted By: Maciej Sieczka (msieczka)
> Assigned to: Hamish Bowman (hamish)
> Summary: r.in.xyz: a segfault with particular dataset 
> Issue type: module bug
> Issue status: None
> GRASS version: CVS HEAD
> GRASS component: raster
> Operating system: Linux
> Operating system version: Ubuntu Dapper 64bit
> GRASS CVS checkout date, if applies (YYMMDD): 071009
> 
> 
> Initial Comment:

> r.in.xyz has been segfaulting with a particular dataset. It's quite
> big (23 MB bzipped) and I don't know how to provide it but let me
> know if ther's a need+way to.
> 
> Details:
> 
> $r.in.xyz -g input=tin_nieciag.xyz output=tin_nie ciag.xyz method=min type=DCELL fs="|" x=1 y=2 z=3 percent=100
> 
> n=5725102.149034 s=5705608.286234 e=587119.577053 w=568715.830892 b=69.431216 t= 224.696106
> 
> $ g.region n=5725102.149034 s=5705608.286234 e=587 119.577053 w=568715.830892 res=1 -a

If res == 1, then:

cols = 587119.577053 - 568715.830892 = 18403.74616099999 = 18404
rows = 5725102.149034 - 5705608.286234 = 19493.86280000024 = 19494

At 8 bytes per cell, the total size of the map is:

 18404 * 19494 * 8 = 2870140608 bytes

This is larger than 2GiB, causing the arithmetic (which uses "int") to
overflow.

> $ gdb r.in.xyz
> $ run input=tin_nieciag.xyz output=tin_nie ciag.xyz method=min type=DCELL fs="|" x=1 y=2 z=3 percent=100
> 
> Starting program: /usr/local/grass-6.3.cvs/bin/r.in.xyz input=tin_nieciag.xyz output=tin_nieciag.xyz method=min type=DCELL fs="|" x=1 y=2 z=3 percent=100
> Scanning data ...
>   67%
> Program received signal SIGSEGV, Segmentation fault.
> 0x00002aaaaabf3c09 in G_is_d_null_value (dcellVal=0x2aaa2cd4e960) at null_val.c:502
> 502             if(((unsigned char *) dcellVal)[i] !=
> (gdb) bt

> #2  0x000000000040439c in update_min (array=0x2aaaab3b4010, cols=18405, row=14767, col=4767, map_type=2, value=155.11966910999999) at support.c:74

int update_min(void *array, int cols, int row, int col, RASTER_MAP_TYPE map_type, double value)
{
    void *ptr;
    DCELL old_val;

    ptr = array;
    ptr = G_incr_void_ptr(ptr, ((row*cols)+col)*G_raster_size(map_type));

    if( G_is_null_value(ptr, map_type) )

There are two problems here:

1. row, col, and cols are all "int"s (typically 32-bit signed
integers), so (row*cols)+col is performed at that width, which will
overflow at 2GiB.

2. G_incr_void_ptr() takes the increment as an "int", which is limited
to 2GiB.

For this to work, we need to use a 64-bit type on 64-bit systems. We
also need to use an unsigned type, as a 32-bit system can allocate
memory up to 4GiB, not 2GiB (unlike off_t, size_t is unsigned).

Which types are 64-bit on 64-bit Linux systems? ptrdiff_t has to be;
size_t probably will be.

[I was going to point out that malloc() is limited to a size_t, so the
array couldn't be larger than that, but the array is allocated with
calloc(), which could theoretically produce a >4GiB array with only a
32-bit size_t.]

I suggest:

1. Changing G_incr_void_ptr() to use size_t.
2. Changing G_raster_size() to return size_t (that's what sizeof
returns, which is the reason for size_t's existence).
3. Casting "cols" to a size_t, so that the index calculation is
performed at size_t width.

If size_t isn't enough to index the array (i.e. you have a 32-bit
size_t on a platform which can calloc() more than 4GiB), you lose
regardless.

-- 
Glynn Clements <glynn at gclements.plus.com>




More information about the grass-dev mailing list