[GRASS-dev] Parallel computing for r.sun
Hamish
hamish_b at yahoo.com
Tue Jan 29 18:49:58 PST 2013
Yann wrote:
> anyone has a timeline for merging the OpenCL code into trunk?
Hi,
that's been on my todo list for way too long.
The first step is to get OpenCL support built into grass7's
./configure, next to pthreads and OpenMP which are already there.
I welcome help with that; my copious free time hasn't been
very good lately.
The removal of tertiary calls from the main r.sun loop has
already been done in trunk.
I'll try to write more after work, but a lot is explained
on these pages:
http://grasswiki.osgeo.org/wiki/Category:Parallelization
Besides the r.sun work already done, AFAIAC the top candidates
for parallelization in GRASS are v.surf.rst and v.surf.bspline.
Currently there is some support directly in the LU decomposition,
but that creates 1000s of threads. The cost of creating and
destroying those is coming down a lot, but I think it will
probably be a lot more efficient to only create dozens of
threads. For v.surf.bspline, see the code comment at the start
of the loop where the OpenMP support could go; for v.surf.rst,
perhaps multithread the various boxes of the quadtree? The idea
is to more closely match the number of threads/processes to the
number of CPU or GPU cores available: for CPUs that means
dozens, for GPUs that means hundreds.
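
To illustrate the thread-count idea above, here is a minimal
Python sketch, not GRASS code: the work is split into one
contiguous chunk per CPU instead of one task per item. The names
split_into_chunks, process_span and run_coarse are hypothetical,
and process_span is only a stand-in for real per-region work such
as one box of the quadtree. Python threads are used just to show
the structure; in C the same shape could be an OpenMP loop over
the chunks.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(n_items, n_workers):
    """Split range(n_items) into at most n_workers contiguous (start, stop) spans."""
    n_workers = max(1, min(n_workers, n_items))
    base, extra = divmod(n_items, n_workers)
    spans = []
    start = 0
    for i in range(n_workers):
        # The first `extra` spans get one additional item each.
        stop = start + base + (1 if i < extra else 0)
        spans.append((start, stop))
        start = stop
    return spans


def process_span(span):
    """Hypothetical stand-in for the real work done on one chunk."""
    start, stop = span
    return sum(i * i for i in range(start, stop))


def run_coarse(n_items, n_workers=None):
    """Process n_items with one worker per chunk, not one worker per item."""
    if n_workers is None:
        # Assumption: one worker per CPU reported by the OS.
        n_workers = os.cpu_count() or 1
    spans = split_into_chunks(n_items, n_workers)
    with ThreadPoolExecutor(max_workers=len(spans)) as pool:
        partials = list(pool.map(process_span, spans))
    return spans, sum(partials)
```

run_coarse(1000, 4) hands out four spans of 250 items each, so
only four workers are ever created no matter how many items
there are.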
Each module will be different, so each one requires its own
approach. For that reason I'm happy for pthreads, OpenMP,
and OpenCL to all be supported.
Various Python and Bourne shell scripts (quite new, so not
backported to 6.4.svn yet) have been parallelized; the easy
win is to run the three R,G,B bands in parallel. It only
scales to 3 CPUs, but it is nearly perfectly efficient, and a 3x
speedup is as good as any. See the v.surf.icw script in both
g6.sh and g7.py addons for a complete example.
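
The three-bands pattern can be sketched as follows. This is a
minimal illustration under stated assumptions, not the code from
the addon scripts: band_command is a hypothetical stand-in that
just runs a tiny Python one-liner where a real script would
launch the per-band module, and run_bands_in_parallel is likewise
an invented name.

```python
import subprocess
import sys


def band_command(band):
    """Hypothetical stand-in for the command that processes one band."""
    # A real script would build the module invocation here;
    # this one only prints the band name and exits.
    return [sys.executable, "-c", "print('done ' + %r)" % band]


def run_bands_in_parallel(bands=("red", "green", "blue")):
    """Start one process per band, then wait for all of them."""
    # Launch every process first so that they run concurrently.
    procs = {
        band: subprocess.Popen(
            band_command(band), stdout=subprocess.PIPE, text=True
        )
        for band in bands
    }
    # Only now block on each one, collecting exit status and output.
    results = {}
    for band, proc in procs.items():
        out, _ = proc.communicate()
        results[band] = (proc.returncode, out.strip())
    return results
```

The point is only the ordering: all three processes are started
before any of them is waited on, which is the same effect as
backgrounding three commands and then calling wait in a shell
script.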
The good news is that the OpenCL APIs are slowly making their
way into the mainstream driver releases; even Intel is on board.
Before, you pretty much needed to tailor your Linux distro to
match the vendor's SDK release target if you wanted to use it,
and then the SDK didn't match the available driver version,
and other such pain.
I'm not sure which modules in GRASS besides r.sun are well
suited to GPU acceleration. r.sun as a ray-tracing exercise
was an obvious one, as that's what GPUs are generally designed
to do these days. Many, if not most, of GRASS's modules are
I/O limited, and I/O to the video card especially has
traditionally been really slow. (That's getting better too, but
it is still a little way off.)
Another thing to consider is that GPU math on consumer-grade
video cards has traditionally been limited to single-precision
floats. You had to buy the expensive "science grade" one if you
wanted to calculate in double precision.
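
To make the precision caveat concrete, here is a small Python
sketch that emulates single precision on the CPU by rounding
through a 4-byte float. It is an illustration only and does not
touch a GPU; to_float32 and accumulate are hypothetical helper
names.

```python
import struct


def to_float32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step, n, single):
    """Add `step` n times, rounding to single precision after each add if requested."""
    if single:
        step = to_float32(step)
    total = 0.0
    for _ in range(n):
        total += step
        if single:
            total = to_float32(total)
    return total
```

Near 5,000,000 single-precision values are spaced 0.5 apart, so
to_float32(5000000.123) comes back as 5000000.0; a coordinate in
metres of that size would lose its sub-metre part. Likewise,
accumulate(0.1, 100000, single=True) drifts much further from
10000 than the double-precision sum does.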
great leaps can be made, but there are some caveats to consider.
best,
Hamish