[gdal-dev] Re: OpenCL, GDAL, and You

Fri Dec 17 18:31:05 EST 2010

Le samedi 18 décembre 2010 00:09:44, Seth Price a écrit :
> > I can't help for MacOS build system.
> 
> Good thing I'm here. :) Do you want an account on my machine for
> testing?

well, we can try, but do a full backup before I break things due to my Mac 
illiteracy ;-)

> 
> > It's me. You can find the culprit of any commit by doing a svn blame
> > or looking
> > at the log ;-)
> 
> Most of that clampToDst() code was brought over from
> GWKSetPixelValue(). I'm not sure if it's a good idea to remove it. The
> original .cpp file should be the same as the new OpenCL code/. My goal
> is to make the code execution identical so I don't need to do any post-
> processing of pixels when I copy them out of OpenCL memory.

If you actually look at the GWKSetPixelValue() code, you'll see that the 
GDT_Float32 has a very special implementation (actually, that's the integer 
cases that have a very special implementation), which doesn't do the clamping 
done for the integer case, so it's legitimate to find the same distinction in 
the OpenCL code.

> 
> > Which error did you get ? You are just changing some whitespace
> > change that
> > seem to be equivalent to the trunk code. Or do I miss something ?
> 
> I was getting a weird build options error that seems to be caused by
> an extra space. So you're right that it's a whitespace change, but
> it's non-trivial.

ok, I've just applied your change. It didn't hurt on my side.

> ~Seth
> 
> On Dec 17, 2010, at 3:54 PM, Even Rouault wrote:
> > Le vendredi 17 décembre 2010 23:41:53, Seth Price a écrit :
> >> I'm working on the trunk OpenCL build on my Mac now.
> >> 
> >> ****** First, on my mac I get an error at the end of make:
> >> [...] ./ogr/.libs/ogr_srs_xml.o ./ogr/.libs/ograssemblepolygon.o ./
> >> ogr/.libs/ogr2gmlgeometry.o ./ogr/.libs/gml2ogrgeometry.o ./
> >> ogr/.libs/
> >> ogr_expat.o   /opt/local/lib/libsqlite3.dylib -L/opt/local/lib -L/
> >> usr/
> >> local/lib /usr/local/lib/libexpat.dylib /usr/local/lib/
> >> libjpeg.dylib /
> >> usr/local/lib/libtiff.dylib /usr/local/lib/libpng12.dylib -lpthread -
> >> ldl /opt/local/lib/libcurl.dylib /opt/local/lib/libidn.dylib -lssl -
> >> lcrypto -lz -lOpenCL    -install_name  /usr/local/lib/libgdal.
> >> 1.dylib -
> >> compatibility_version 16 -current_version 16.0 -Wl,-single_module
> >> ld: library not found for -lOpenCL
> >> collect2: ld returned 1 exit status
> >> make[1]: *** [libgdal.la] Error 1
> >> make: *** [check-lib] Error 2
> >> seth:gdal-svn-trunk-2010.12.17 sprice$
> >> 
> >> To fix this, I changed "OPENCL_LIB      =       -lOpenCL" to
> >> "OPENCL_LIB      =       -framework OpenCL" in GDALMmake.opt.
> > 
> > I can't help for MacOS build system.
> > 
> >> *****  Why is USE_CLAMP_TO_DST_FLOAT in there? I would think that is
> >> required even if it isn't on ATI. I'm not sure who inserted its use,
> >> but I'm just wondering if the reasoning is documented somewhere.
> > 
> > It's me. You can find the culprit of any commit by doing a svn blame
> > or looking
> > at the log ;-)
> > 
> > The reasoning was that I observed with my ATI card that float
> > buffers didn't get
> > to be multiplied by FLT_MAX. The values weren't normalized between 0
> > and 1,
> > but had directly the expected value. (in the float case, I also
> > dropped the
> > rounding to nearest integer which is clearly not desired, and not in
> > the CPU
> > implementation)
> > 
> > This is perhaps also needed for NVidia, but I have no way to check,
> > so I just
> > put the condition you saw. If you can check this is unneeded, the
> > condition
> > can go away.
> > 
> >> ***** I had to make this change to alg/gdalwarpkernel_opencl.c to get
> >> it to build without a build option error.
> >> 
> >> @@ -1168,7 +1168,7 @@
> >> 
> >>      //Assemble the compiler arg string for speed. All invariants
> >> 
> >> should be defined here.
> >> 
> >>      sprintf(buffer, "-cl-fast-relaxed-math -Werror -D FALSE=0 -D
> >> 
> >> TRUE=1 "
> >> -            "%s "
> >> +            "%s"
> >> 
> >>              "-D iSrcWidth=%d -D iSrcHeight=%d -D iDstWidth=%d -D
> >> 
> >> iDstHeight=%d "
> >> 
> >>              "-D useUnifiedSrcDensity=%d -D useUnifiedSrcValid=%d "
> >>              "-D useDstDensity=%d -D useDstValid=%d -D useImag=%d "
> >> 
> >> @@ -1176,9 +1176,9 @@
> >> 
> >>              "-D nXRadius=%d -D nYRadius=%d -D nFiltInitX=%d -D
> >> 
> >> nFiltInitY=%d "
> >> 
> >>              "-D PI=%015.15lff -D outType=%s -D dstMinVal=
> >> 
> >> %015.15lff -
> >> D dstMaxVal=%015.15lff "
> >> 
> >>              "-D useDstNoDataReal=%d -D vecf=%s %s -D doCubicSpline=
> >> 
> >> %d "
> >> -            "-D useUseBandSrcValid=%d -D iCoordMult=%d",
> >> +            "-D useUseBandSrcValid=%d -D iCoordMult=%d ",
> >> 
> >>              /* FIXME: Is it really a ATI specific thing ? */
> >> 
> >> -            (warper->imageFormat == CL_FLOAT && warper->bIsATI) ?
> >> "-D
> >> USE_CLAMP_TO_DST_FLOAT=1" : "",
> >> +            (warper->imageFormat == CL_FLOAT && warper->bIsATI) ?
> >> "-D
> >> USE_CLAMP_TO_DST_FLOAT=1 " : "",
> >> 
> >>              warper->srcWidth, warper->srcHeight, warper->dstWidth,
> >> 
> >> warper->dstHeight,
> >> 
> >>              warper->useUnifiedSrcDensity, warper-
> >> >
> >> >useUnifiedSrcValid,
> >> >
> >>              warper->useDstDensity, warper->useDstValid, warper-
> >>> 
> >>> imagWorkCL != NULL,
> > 
> > Which error did you get ? You are just changing some whitespace
> > change that
> > seem to be equivalent to the trunk code. Or do I miss something ?
> > 
> >> **** After doing all of the above to make things compile, I don't get
> >> the bug described below. I'm working off of the latest trunk daily.
> > 
> > Yeah, that's probably an issue with the ATI SDK.
> > 
> >> ~Seth
> >> 
> >> On Dec 8, 2010, at 1:12 PM, Even Rouault wrote:
> >>> Seth,
> >>> 
> >>> Thanks for your help.
> >>> 
> >>>> It's more than a little strange that none of those image sizes
> >>>> work.
> >>>> Perhaps it's a problem with the image format? Can you verify that
> >>>> the
> >>>> given format should work?
> >>> 
> >>> The image format was CL_UNORM_INT8 (for GDT_Byte)
> >>> 
> >>>> Looking at the spec, it might also be a problem with the 'sz'
> >>>> argument. What value is that passing?
> >>> 
> >>> It's 1.
> >>> 
> >>> I managed to found the following workaround that enables gdalwarp to
> >>> complete
> >>> (see http://trac.osgeo.org/gdal/changeset/21220  that basically
> >>> passes a dummy
> >>> buffer instead of a NULL pointer).
> >>> 
> >>> However the visual result of the warping is really poor. I see 4
> >>> "ghost"
> >>> images shifted.
> >>> 
> >>> For better understanding I've attached the source image
> >>> (small_world_b1.tif)
> >>> and the result of bilinear resampling (but I get similar weird
> >>> visual effects
> >>> with cubic, cubic spline or lanczos)
> >>> 
> >>> gdalwarp  -rb small_world_b1.tif out_bilinear.tif
> >>> 
> >>> Best regards,
> >>> 
> >>> Even
> >>> <small_world_b1.tif><out_bilinear.tif>