[Tiff] Re: [gdal-dev] Problem with libTiff on Solaris 10

Even Rouault even.rouault at mines-paris.org
Fri May 30 15:19:36 EDT 2008


Yes, it could of course be a NULL pointer issue, but libtiff 4.0.0 
alpha/beta works well on i386, x86_64, PPC, etc. (those are the platforms 
tested by the GDAL buildbot).

I've done a quick 'grep "*(int32*)" *.c' and 'grep "*(uint32*)" *.c' on the 
libtiff 3.8.2 sources and the libtiff 4.0.0 sources, and found many more of 
those casts in the latter than in the former, which is a hint that this may 
be the problem.
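
To make the difference concrete, here is a minimal sketch of the memcpy-based 
alternative to those casts. The helper names store_u32 and load_u32 are made 
up for illustration and are not libtiff functions; in the quoted code, 
_TIFFmemcpy plays this role for the tdir_offset field.

```c
#include <stdint.h>
#include <string.h>

/* The pattern found by the grep looks like
 *     *(uint32_t *)dst = value;
 * which can trap on SPARC when dst is not 4-byte aligned. */

/* Hypothetical helper (not part of libtiff): store a 32-bit value at an
 * address that may be unaligned. memcpy makes no alignment assumption
 * about its arguments, so no aligned-access trap can occur. */
static void store_u32(uint8_t *dst, uint32_t value)
{
    memcpy(dst, &value, sizeof value);
}

/* Matching load from a possibly unaligned address. */
static uint32_t load_u32(const uint8_t *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

Compilers generally turn such a fixed-size memcpy into a plain store on 
platforms that tolerate unaligned accesses, so the cost on i386 should be 
small, but I have not measured it.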

Anyway, we could be sure of the cause if Dan could print the value of the 
'n' pointer when the debugger stops on the crash. I also think that SIGBUS 
is more typical of alignment problems. A NULL pointer would have given a 
SIGSEGV, I think.
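
The arithmetic already suggests what that value will look like. Below is a 
small sketch of the layout written by the quoted loop; the names dir_size 
and count_field_offset are mine, for illustration only. It assumes that 
_TIFFmalloc returns memory aligned on at least 4 bytes, as malloc normally 
does.

```c
#include <stddef.h>

/* Layout written by the quoted loop for a classic (non-BigTIFF) directory:
 * a 2-byte entry count, then per entry tag(2) + type(2) + count(4) +
 * offset(4), then a 4-byte next-directory offset. */
enum { DIR_HEADER = 2, DIR_ENTRY = 12, DIR_TRAILER = 4 };

/* Hypothetical helper: total size in bytes of a directory with ndir entries. */
static size_t dir_size(size_t ndir)
{
    return DIR_HEADER + ndir * DIR_ENTRY + DIR_TRAILER;
}

/* Hypothetical helper: byte offset, from the start of dirmem, of the 32-bit
 * count field of entry m (the field written through the uint32 cast). */
static size_t count_field_offset(size_t m)
{
    return DIR_HEADER + m * DIR_ENTRY + 4; /* skip the tag and the type */
}
```

With these numbers, the 186 bytes Dan mentions correspond to 15 entries 
(2 + 15*12 + 4), and the count field of entry m sits at offset 6 + 12*m, 
which is always 2 modulo 4. So if dirmem itself is 4-byte aligned, 'n' can 
never be 4-byte aligned at that store.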


On a related topic, does anyone know of a Valgrind option or patch that 
could help us detect this? I looked a bit but couldn't find one. I imagine 
that it would be "easy" for Valgrind to detect and report such unaligned 
memory accesses. That would let people who don't have the "chance" of access 
to a SPARC platform anticipate such problems even on i386 hardware.
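
Failing a tool, a low-tech approach would be an explicit check at the cast 
sites in a debug build. Here is a minimal sketch; is_aligned, check_aligned 
and CHECK_ALIGNED are hypothetical names that do not exist in libtiff, and 
the macro assumes that the required alignment of a type equals its size, 
which holds for uint16/uint32 on the platforms discussed here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical debug helper: returns 1 if p is a multiple of `alignment`
 * bytes, 0 otherwise. Behaves the same on i386 and on SPARC. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Report a suspicious cast site on stderr instead of trapping. Returns the
 * result of the check so that callers can also assert on it. */
static int check_aligned(const void *p, size_t alignment,
                         const char *file, int line)
{
    if (is_aligned(p, alignment))
        return 1;
    fprintf(stderr, "%s:%d: pointer %p is not %lu-byte aligned\n",
            file, line, p, (unsigned long)alignment);
    return 0;
}

/* Assumption: alignment requirement == sizeof(type). */
#define CHECK_ALIGNED(p, type) \
    check_aligned((p), sizeof(type), __FILE__, __LINE__)
```

A call such as CHECK_ALIGNED(n, uint32) placed just before 
"*(uint32*)n=..." would then flag the problem on any architecture, at the 
price of instrumenting each cast by hand.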

On Friday 30 May 2008 20:40:43, Andy Cave wrote:
> Hi Even,
>
> Yes, IIRC from when I worked/coded on them, SPARCs require aligned
> accesses, or you get a trap.
>
> But without knowing the code, also, o could be a null pointer (is it
> asserted as not being null on entry to the function or where it's
> set/calculated/extracted?).
>
> Regards,
>
> Andy.
>
> ----- Original Message -----
> From: "Even Rouault" <even.rouault at mines-paris.org>
> To: <gdal-dev at lists.osgeo.org>
> Cc: "Dan Greve" <grevedan at hotmail.com>; "TIFF mailing list"
> <tiff at lists.maptools.org>; "Gong,Shawn (Contractor)"
> <Shawn.Gong at drdc-rddc.gc.ca>
> Sent: Friday, May 30, 2008 7:18 PM
> Subject: [Tiff] Re: [gdal-dev] Problem with libTiff on Solaris 10
>
>
> Hi,
>
> Hmm, I'm not a SPARC specialist at all, but doesn't this architecture
> require properly aligned memory accesses?
>
> I say that because the crash on "*(uint32*)n=(uint32)o->tdir_count" could
> easily be explained if the memory address pointed to by 'n' is not aligned
> on a 4-byte boundary.
> In the next few lines of the code extract Dan Greve has quoted, I
> see "_TIFFmemcpy(n,&o->tdir_offset,4)", which is a safer way of dealing
> with these alignment problems.
>
> It looks like a libtiff problem, so I am adding the libtiff mailing list
> in CC too.
>
> Best regards,
> Even
>
> ---------------------------------------------------------------------------
>
> RE: [gdal-dev] Problem with libTiff on Solaris 10
> From: "Gong, Shawn (Contractor)" <Shawn.Gong at drdc-rddc.gc.ca>
> To: "Dan Greve" <grevedan at hotmail.com>, gdal-dev at lists.osgeo.org
> Date: Today 18:54:44
>
> Dan and list,
>
> This error looks very similar to what I have experienced on our Sun
> machine, trying to build and run gdal 1.5.1.
> Our Sun system is SPARC Solaris 9 64-bit.
>
> Error message when the Python code calls "gtiff_driver.CreateCopy( fname,
> vrtds, ...)":
> ERROR 1: /home/sgong/../working_dir/test2.tif:No space to read TIFF
> directory
> ERROR 1: TIFFReadDirectory:Failed to read directory at offset 8 Bus error
> (core dumped)
>
>
> thanks,
> Shawn
>
> ________________________________________
> From: gdal-dev-bounces at lists.osgeo.org
> [mailto:gdal-dev-bounces at lists.osgeo.org] On Behalf Of Dan Greve
> Sent: Friday, May 30, 2008 11:51 AM
> To: gdal-dev at lists.osgeo.org
> Subject: [gdal-dev] Problem with libTiff on Solaris 10
>
> Greetings,
>
> I am using the latest stable build, gdal-1.5.1 on a SPARC Solaris 10 64-bit
> system. It is configured with the default drivers, but is
> using --with-jpeg=internal --with-libtiff=internal --with-geotiff=internal.
> Using the 32-bit compiler options and libraries, I can successfully build
> libgdal.so and the utility programs. However, when trying to do a
> gdal_translate from one simple uncompressed 3-band geotiff into another
> simple uncompressed 3-band geotiff, I get a "bus error" segfault. Also, if
> the output tiff already exists I get two errors. "No space to read tiff
> directory" and "TiffReadDirectory: failed to read directory at offset 8". A
> quick google revealed two similar errors, here and here. Diving into the
> debugger, the problem seems to lie in tif_dirwrite.c, starting at
> line 755. I believe it's due to _TIFFmalloc not allocating memory properly.
> The crash actually occurs on line 781, *(uint32*)n=(uint32)o->tdir_count;
>
> o->tdir_count is valid, but storing any value through n causes a segfault,
> as though _TIFFmalloc did not allocate the whole requested 186 bytes. Any
> ideas?
>
> <snip file=tif_dirwrite.c>
> dirmem=_TIFFmalloc(dirsize);
> if (dirmem==NULL)
> {
>     TIFFErrorExt(tif->tif_clientdata,module,"Out of memory");
>     goto bad;
> }
> if (!(tif->tif_flags&TIFF_BIGTIFF))
> {
>     uint8* n;
>     TIFFDirEntry* o;
>     n=dirmem;
>     *(uint16*)n=ndir;
>     if (tif->tif_flags&TIFF_SWAB)
>         TIFFSwabShort((uint16*)n);
>     n+=2;
>     o=dir;
>     for (m=0; m<ndir; m++)
>     {
>         *(uint16*)n=o->tdir_tag;
>         if (tif->tif_flags&TIFF_SWAB)
>             TIFFSwabShort((uint16*)n);
>         n+=2;
>         *(uint16*)n=o->tdir_type;
>         if (tif->tif_flags&TIFF_SWAB)
>             TIFFSwabShort((uint16*)n);
>         n+=2;
>         *(uint32*)n=(uint32)o->tdir_count;
>         if (tif->tif_flags&TIFF_SWAB)
>             TIFFSwabLong((uint32*)n);
>         n+=4;
>         _TIFFmemcpy(n,&o->tdir_offset,4);
>         n+=4;
>         o++;
>     }
>     *(uint32*)n = (uint32)tif->tif_nextdiroff;
> }
> </snip>
>
> -- Dan Greve
> -- Software Engineer
> -- Northrop Grumman Corp.
>
> _______________________________________________
> Tiff mailing list: Tiff at lists.maptools.org
> http://lists.maptools.org/mailman/listinfo/tiff
> http://www.remotesensing.org/libtiff/



