[GRASS-dev] [GRASS GIS] #3069: Evaluate branches in r.mapcalc if-function only when needed

GRASS GIS trac at osgeo.org
Mon Jun 20 15:55:09 PDT 2016


#3069: Evaluate branches in r.mapcalc if-function only when needed
-------------------------+-------------------------------------------------
  Reporter:  wenzeslaus  |      Owner:  grass-dev@…
      Type:              |     Status:  new
  enhancement            |
  Priority:  normal      |  Milestone:  7.3.0
 Component:  Raster      |    Version:  svn-trunk
Resolution:              |   Keywords:  r.mapcalc, r3.mapcalc,
                         |  optimization, expression, evaluation, if-
       CPU:              |  statement, if-function
  Unspecified            |   Platform:  Unspecified
-------------------------+-------------------------------------------------

Comment (by glynn):

 Replying to [ticket:3069 wenzeslaus]:

 > The ''r.mapcalc'' expression `x = if(0, rand(0, 2), 3)` takes about the
 same time to evaluate as `x = rand(0, 2) + if(0, 2, 3)`, although the
 true-branch of the if-statement is never used. As far as I understand,
 this is given by fact that `if` is actually a function in ''r.mapcalc'',
 so all the arguments are evaluated before the function is called.

 It's largely a consequence of r.mapcalc using vectorised (SIMD)
 evaluation, similar to e.g. std::valarray in C++, !NumPy, or GPU-oriented
 languages such as GLSL.

 The "value" of any expression is a row buffer, an array containing a value
 for each column. Each function takes row buffers as arguments and
 populates the row buffer for the result.

 The reason for this approach is to reduce the overhead of the interpreter,
 as the interpreter only need to be run per-row rather than per-cell.

 As it stands, conditional evaluation would require first evaluating the
 condition expression, then analysing the row buffer holding the condition
 to determine whether a specific case (zero or non-zero for the 2-argument
 and 3-argument versions, negative, zero or positive for the 4-argument
 version) doesn't occur in any column.

 This could be optimised by analysing the expression to identify sub-
 expressions which are constant for any given row. This includes literals,
 and function calls where all arguments are constant and the function
 itself doesn't "inject" non-constant values (specifically, rand(), col()
 and x()). Ideally, variables would also need to be flagged as holding
 constant-per-row values.

 In the case of variables, there's also the issue that a given variable may
 be used within multiple conditionals where the conditions are different.
 The variable would need to be evaluated for each column where its value is
 needed for at least one usage (i.e. the union of all of the conditions).
 That would require either a) associating a validity mask with each
 variable (indicating the columns for which the value has been calculated),
 b) performing the evaluation independently for each conditional (typically
 resulting in redundant evaluation), or c) simply evaluating the variable
 for all columns in all but the most straightfoward cases.

 Needless to say, this isn't a simple change. It isn't one which I have
 time to work on.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3069#comment:1>
GRASS GIS <https://grass.osgeo.org>



More information about the grass-dev mailing list