[GRASS-dev] multi-threaded r.mapcalc

Markus Neteler neteler at osgeo.org
Fri Jan 9 09:49:16 EST 2009


On Sat, Nov 22, 2008 at 1:34 AM, Glynn Clements
<glynn at gclements.plus.com> wrote:
>
> I have added experimental multi-threading support to r.mapcalc in 7.0
> (r34440). It isn't enabled by default; to enable it, use
> "make USE_PTHREAD=1". It only uses a handful of pthread functions, so
> there's a reasonable chance of it working on MacOSX and/or Windows
> (with the pthread-win32 library).
>
> You can change the number of worker threads by setting the WORKERS
> environment variable.

Out of curiosity: is this approach completely different from the
parallelization implemented in r.li.*?

> The parallelism is limited to evaluation of argument expressions to
> functions and operators.

This is not entirely clear to me:
"evaluation of argument expressions" - isn't this done only once,
when r.mapcalc is invoked with a formula? What would be a complex
example that illustrates the benefits of parallelism in the evaluation
part?
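
To make the question concrete, here is a minimal sketch of what
"parallel evaluation of argument expressions" could look like. It is
written in Python rather than C and is not the r.mapcalc code: the
expression tree layout, the MAPS data and the names evaluate() and
run() are all hypothetical. It assumes the expression is parsed once
and then evaluated for every row, so that the arguments of a function
or operator can be handed to worker threads. Only the WORKERS
environment variable is taken from the quoted message; the default of 2
is an assumption. Python threads will not speed up pure-Python
arithmetic, so this shows the structure only.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "raster maps": two rows of three cells each.
MAPS = {
    "a": [[1, 2, 3], [4, 5, 6]],
    "b": [[10, 20, 30], [40, 50, 60]],
}
NCOLS = 3


def evaluate(node, row, pool=None):
    """Return the cell values of expression `node` for one raster row.

    A node is ("map", name), ("const", value) or
    ("call", function, [argument expressions]).
    """
    kind = node[0]
    if kind == "map":
        return list(MAPS[node[1]][row])
    if kind == "const":
        return [node[1]] * NCOLS
    _, func, args = node
    if pool is None:
        arg_rows = [evaluate(arg, row) for arg in args]
    else:
        # Only this level is handed to the pool. Each argument is then
        # evaluated sequentially inside its worker, which avoids nested
        # waits on a bounded pool.
        futures = [pool.submit(evaluate, arg, row) for arg in args]
        arg_rows = [future.result() for future in futures]
    return [func(*cells) for cells in zip(*arg_rows)]


def add(x, y):
    return x + y


def mul(x, y):
    return x * y


# Stands for the formula (a * 2) + (b * a).
EXPR = ("call", add, [
    ("call", mul, [("map", "a"), ("const", 2)]),
    ("call", mul, [("map", "b"), ("map", "a")]),
])


def run(expr, nrows=2):
    # Worker count from the WORKERS environment variable (default assumed).
    workers = max(1, int(os.environ.get("WORKERS", "2")))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Rows are still processed one after another.
        return [evaluate(expr, row, pool) for row in range(nrows)]
```

Calling run(EXPR) returns [[12, 44, 96], [168, 260, 372]]. The two
mul() arguments of add() are computed by separate workers for each row,
so any benefit would depend on how expensive each argument is.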

> It doesn't attempt to calculate multiple rows
> concurrently, or multiple output maps.

Could this somehow be adapted from r.li.*?

> Also, only one thread can call get_map_row() at a time, so commands
> which are I/O bound (i.e. which only perform simple calculations, so
> most of the time is spent performing I/O), won't see much of an
> improvement.
>
> This is mainly due to the fact that the libgis raster I/O functions
> aren't re-entrant, due to the use of pre-allocated row buffers (and
> possibly other issues).
>
> [There are also static row buffers in r.mapcalc itself, but those
> could easily be eliminated if there was any point.]
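
The I/O limitation quoted above can be sketched as follows. This is a
Python illustration under stated assumptions, not libgis code:
read_row() is a hypothetical stand-in for a row reader that fills one
shared, pre-allocated buffer, and the lock plays the role of "only one
thread can call get_map_row() at a time". The RASTERS data is made up.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rasters: two maps, two rows of four cells each.
RASTERS = {
    "a": [[1, 2, 3, 4], [5, 6, 7, 8]],
    "b": [[9, 10, 11, 12], [13, 14, 15, 16]],
}

_io_lock = threading.Lock()
_row_buffer = [0] * 4  # shared, pre-allocated: the non-re-entrant part


def read_row(name, row):
    """Read one row through the shared buffer, one thread at a time."""
    with _io_lock:
        for col, value in enumerate(RASTERS[name][row]):
            _row_buffer[col] = value
        # Copy out before releasing the lock, so another reader cannot
        # overwrite the data this caller is about to use.
        return list(_row_buffer)


def row_sum(name, row):
    # The computation runs outside the lock; only the read is serialized.
    return sum(read_row(name, row))


def all_row_sums():
    jobs = [(name, row) for name in sorted(RASTERS) for row in range(2)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda job: row_sum(*job), jobs))
```

all_row_sums() returns [10, 26, 42, 58]. When the per-row work is as
cheap as sum(), the workers spend most of their time queuing for the
lock, which is the I/O-bound case described in the quoted text.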

Long ago (during the K&R to ANSI C reformatting) I suggested splitting
the raster functions out of libgis and calling them Rast_XXX().
Do you think this could become relevant now for GRASS 7?

I still find libgis overloaded with too many different functions.

> Finally, eval() is excluded from parallelisation, as it is used for
> variable assignments, and a variable needs to be assigned before it is
> used. If this is significant, we can have separate parallel and
> sequential versions. If you perform variable assignments other than in
> eval(), you lose.
>
> Mostly, I'm interested in discovering whether this actually has any
> practical benefit. IOW, whether anyone is actually using scripts which
> do enough computation to make use of additional cores.

Do we have an example in the [6.4] scripts/ directory that I can use
for timing tests?

Markus

