[GRASS-dev] [grass-code I][507] r.in.xyz: a segfault with
particular dataset
Glynn Clements
glynn at gclements.plus.com
Tue Oct 9 19:40:09 EDT 2007
grass-dev at grass.itc.it wrote:
> code I item #507, was opened at 2007-10-09 17:11
> Status: Open
> Priority: 3
> Submitted By: Maciej Sieczka (msieczka)
> Assigned to: Hamish Bowman (hamish)
> Summary: r.in.xyz: a segfault with particular dataset
> Issue type: module bug
> Issue status: None
> GRASS version: CVS HEAD
> GRASS component: raster
> Operating system: Linux
> Operating system version: Ubuntu Dapper 64bit
> GRASS CVS checkout date, if applies (YYMMDD): 071009
>
>
> Initial Comment:
> r.in.xyz has been segfaulting with a particular dataset. It's quite
> big (23 MB bzipped) and I don't know how to provide it but let me
> know if ther's a need+way to.
>
> Details:
>
> $r.in.xyz -g input=tin_nieciag.xyz output=tin_nie ciag.xyz method=min type=DCELL fs="|" x=1 y=2 z=3 percent=100
>
> n=5725102.149034 s=5705608.286234 e=587119.577053 w=568715.830892 b=69.431216 t= 224.696106
>
> $ g.region n=5725102.149034 s=5705608.286234 e=587 119.577053 w=568715.830892 res=1 -a
If res == 1, then:
cols = 587119.577053 - 568715.830892 = 18403.74616099999 = 18404
rows = 5725102.149034 - 5705608.286234 = 19493.86280000024 = 19494
At 8 bytes per cell, the total size of the map is:
18404 * 19494 * 8 = 2870140608 bytes
This is larger than 2GiB, causing the arithmetic (which uses "int") to
overflow.
> $ gdb r.in.xyz
> $ run input=tin_nieciag.xyz output=tin_nie ciag.xyz method=min type=DCELL fs="|" x=1 y=2 z=3 percent=100
>
> Starting program: /usr/local/grass-6.3.cvs/bin/r.in.xyz input=tin_nieciag.xyz output=tin_nieciag.xyz method=min type=DCELL fs="|" x=1 y=2 z=3 percent=100
> Scanning data ...
> 67%
> Program received signal SIGSEGV, Segmentation fault.
> 0x00002aaaaabf3c09 in G_is_d_null_value (dcellVal=0x2aaa2cd4e960) at null_val.c:502
> 502 if(((unsigned char *) dcellVal)[i] !=
> (gdb) bt
> #2 0x000000000040439c in update_min (array=0x2aaaab3b4010, cols=18405, row=14767, col=4767, map_type=2, value=155.11966910999999) at support.c:74
int update_min(void *array, int cols, int row, int col, RASTER_MAP_TYPE map_type, double value)
{
void *ptr;
DCELL old_val;
ptr = array;
ptr = G_incr_void_ptr(ptr, ((row*cols)+col)*G_raster_size(map_type));
if( G_is_null_value(ptr, map_type) )
There are two problems here:
1. row, col, and cols are all "int"s (typically 32-bit signed
integers), so (row*cols)+col is performed at that width, which will
overflow at 2GiB.
2. G_incr_void_ptr() takes the increment as an "int", which is limited
to 2GiB.
For this to work, we need to use a 64-bit type on 64-bit systems. We
also need to use an unsigned type, as a 32-bit system can allocate
memory up to 4GiB, not 2GiB (unlike off_t, size_t is unsigned).
Which types are 64-bit on 64-bit Linux systems? ptrdiff_t has to be;
size_t probably will be.
[I was going to point out that malloc() is limited to a size_t, so the
array couldn't be larger than that, but the array is allocated with
calloc(), which could theoretically produce a >4GiB array with only a
32-bit size_t.]
I suggest:
1. Changing G_incr_void_ptr() to use size_t.
2. Changing G_raster_size() to return size_t (that's what sizeof
returns, which is the reason for size_t's existence).
3. Casting "cols" to a size_t, so that the index calculation is
performed at size_t width.
If size_t isn't enough to index the array (i.e. you have a 32-bit
size_t on a platform which can calloc() more than 4GiB), you lose
regardless.
--
Glynn Clements <glynn at gclements.plus.com>
More information about the grass-dev
mailing list