[GRASS-dev] Interested in parallelization of GRASS

Glynn Clements glynn at gclements.plus.com
Sun Mar 27 14:05:13 EDT 2011


Hamish wrote:

> While e.g. pthreads in r.mapcalc in grass7 may give me a nice 10%
> speedup by separating reading and writing into two threads [or is
> that splitting into IO and maths?] (* I haven't actually benchmarked
> it, 10% is just a guess),

The parallelism in 7.0's r.mapcalc consists of evaluating function
arguments concurrently.

Previously, evaluation of a function (or operator) meant:

	for each argument:
	    evaluate argument
	evaluate function

In 7.0, this changed to:

	for each argument:
	    commence evaluation of argument
	for each argument:
	    wait for evaluation to complete
	evaluate function

For expressions with multiple input maps, the maps are read
concurrently. Evaluation of the top-most operator and writing of the
output map are performed by the main thread. Multiple output maps are
evaluated sequentially, as are rows and the arguments to eval(). So
the amount of parallelism available is limited by the complexity of
the expression.
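
The two evaluation schemes above can be sketched roughly as follows.
This is only an illustration in Python using a thread pool, not the
r.mapcalc code (which is C). The names evaluate_sequential,
evaluate_concurrent, read_row_a, read_row_b and add_rows are
hypothetical stand-ins, and the "reads" just return small lists.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_sequential(func, arg_thunks):
    # Old scheme: evaluate each argument in turn, then the function.
    values = [thunk() for thunk in arg_thunks]
    return func(*values)


def evaluate_concurrent(func, arg_thunks, pool):
    # New scheme, step 1: commence evaluation of every argument.
    futures = [pool.submit(thunk) for thunk in arg_thunks]
    # Step 2: wait for each evaluation to complete, in argument order.
    values = [future.result() for future in futures]
    # Step 3: evaluate the function itself in the calling thread.
    return func(*values)


# Hypothetical leaf arguments standing in for "read one row of a map".
def read_row_a():
    return [1, 2, 3]


def read_row_b():
    return [10, 20, 30]


# Hypothetical top-most operator: cell-by-cell addition of two rows.
def add_rows(a, b):
    return [x + y for x, y in zip(a, b)]


with ThreadPoolExecutor(max_workers=2) as pool:
    result = evaluate_concurrent(add_rows, [read_row_a, read_row_b], pool)
```

Both schemes produce the same value. Only the order in which the
argument work is started differs, so an expression with a single
input map gains nothing from the second form.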

> and that is not a gift to refuse, the
> reality seems that the time-to-completion is still dominated by the
> I/O bottlenecks and saturating the bus-- at which point it doesn't
> matter how many threads or CPUs you have, you're still limited by
> the speed of your bus/drive/RAID array.

I don't know how significant physical disc access really is. There's a
lot of overhead in Rast_{get,put}_row(). Even on a desktop system with
a fast CPU and an average hard drive, I wouldn't be surprised if a
simple get-row/put-row copy operation was CPU bound.
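
One way to check whether an operation is CPU bound is to compare the
CPU time it consumes with the elapsed wall-clock time. The sketch
below is a generic Python illustration under that assumption and does
not call the GRASS raster library. cpu_fraction, busy_loop and sleepy
are hypothetical names, with the sleep standing in for time spent
waiting on a device.

```python
import time


def cpu_fraction(work):
    # Ratio of CPU time to wall-clock time for one call of work().
    # Near 1 suggests CPU bound; near 0 suggests mostly waiting.
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0


def busy_loop():
    # Pure computation, no waiting.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


def sleepy():
    # Stand-in for waiting on a disc: consumes almost no CPU time.
    time.sleep(0.2)


cpu_bound_ratio = cpu_fraction(busy_loop)
io_like_ratio = cpu_fraction(sleepy)
```

Wrapping a real get-row/put-row copy in the same way would show which
of the two cases it resembles on a given machine.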

> So I try to think of modules which are CPU bound.. the first
> task is to replace inefficient algorithms with better ones (e.g.
> Glynn's r.cost work,

Huh? I haven't done anything related to r.cost.

If you're thinking of r.grow.distance, that can't be used as a
substitute for r.cost except in the case of constant cost, as it
relies upon "distance" being monotonic with respect to x and y.

-- 
Glynn Clements <glynn at gclements.plus.com>