[GRASSLIST:40] Re: r.mapcalc evaluation order

Thu May 15 11:21:42 EDT 2003

Ok, very clear answer, thanks. Good to know that this is indeed the 
bottleneck. I'll write a C function in Grass to do my recategorization. 
Shouldn't be much work.
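
As a rough illustration of that recategorization (a Python sketch, not the C I would write for Grass): with the category boundaries sorted, each cell can be placed by binary search instead of by testing every rule in turn. The function name `recategorize` and the boundary values are made up for the example; they are not the real LandScan boundaries.

```python
from bisect import bisect_right

def recategorize(row, boundaries):
    """Map each cell value to the index of the interval it falls in.

    `boundaries` must be sorted ascending. Values below boundaries[0]
    get category 0; values >= boundaries[-1] get len(boundaries).
    """
    return [bisect_right(boundaries, value) for value in row]

# Hypothetical boundaries and one hypothetical row of cell values.
boundaries = [1, 10, 100, 1000]
row = [0, 1, 5, 10, 99, 100, 5000]
categories = recategorize(row, boundaries)
```

With n boundaries this costs roughly log2(n) comparisons per cell, rather than one comparison per "if" rule.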

Given your explanation, I don't think implementing a "lazy" function in 
Grass itself is a good idea. It would change Grass's behavior. With 
r.mapcalc, for example, if two if-expressions overlap, I guess the result 
now is the last one, whereas with lazy evaluation it would be the first. 
This would undoubtedly cause endless problems with existing applications.

Jan

Glynn Clements wrote:
> Jan Hartmann wrote:
> 
> 
>>I am running r.mapcalc on the LandScan world population database 
>>(www.ornl.gov/gist/projects/LandScan). This is a 4-byte, 43,200 x 20,800 
>>raster, and I use mapcalc to recategorize it to a one-byte, 256-valued 
>>raster, according to some different categorizing algorithms (essentially 
>>variations on histogram equalisation). The category boundaries have been 
>>computed separately; the only thing r.mapcalc has to do is to apply them 
>>to the input raster. The mapcalc script consists of only a few dozen 
>>"if" rules, but these take a very long time to execute (days). My 
>>impression is that r.mapcalc evaluates every rule for every cell. Does 
>>anyone know if this is the case, and if so, is there a way to stop 
>>evaluation after a match has been found (something like an "else" 
>>statement)?
> 
> 
> The overall evaluation strategy is that r.mapcalc processes one row at
> a time. Each function (e.g. "if") takes a row of cells (i.e. a
> one-dimensional array) as input, and stores the results in a result
> array. For nested expressions, the array which holds the result from
> an inner term is passed as an input to the enclosing term. Once the
> final result has been computed for a row, the row is written out and
> computation starts on the next row.
> 
> I suspect that your problem stems from the fact that the "if" function
> is strict; i.e. all of the arguments are computed, passed to the
> function, then one of the arguments is returned. Any time spent
> computing the unused argument is effectively wasted.
> 
> This contrasts with general-purpose programming languages, where the
> "if" construct is lazy; i.e. the condition is evaluated first, and
> only one of the alternatives is evaluated.
> 
> r.mapcalc doesn't provide a lazy conditional mechanism. For
> general-purpose programming languages, a lazy conditional is
> absolutely necessary in order to be able to define recursive
> functions. As r.mapcalc doesn't allow you to define your own functions
> (recursive or otherwise), this isn't an issue.
> 
> For r.mapcalc, a lazy conditional would only be an optimisation; it
> wouldn't extend the set of possible computations, but would just make
> some of them faster.
> 
> I appreciate that this particular optimisation might be of substantial
> benefit to you personally, but it would require significant changes to
> r.mapcalc. Currently, *all* functions are strict; the core evaluation
> code would need to be extended to allow for lazy functions generally
> before you could implement any specific lazy function.
> 
> Consequently, such an extension isn't likely to be implemented any
> time soon. In the absence of any other options, you could just add
> your categorisation function to r.mapcalc itself. Adding new functions
> is relatively straightforward; I can provide assistance if you wish to
> take that route.
>
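
To make the strict/lazy distinction described above concrete, here is a minimal Python sketch. It is not r.mapcalc code: `strict_if`, `lazy_if`, `expensive`, and the sample row are all invented for illustration, and the call counter exists only to show how much work each strategy does on one row.

```python
calls = {"count": 0}

def expensive(x):
    """Stand-in for a costly sub-expression; counts how often it runs."""
    calls["count"] += 1
    return x * x

def strict_if(cond_row, then_row, else_row):
    """Strict 'if': every argument row is fully computed before the call."""
    return [t if c else e for c, t, e in zip(cond_row, then_row, else_row)]

def lazy_if(cond_row, then_fn, else_fn, row):
    """Lazy 'if': only the selected branch is evaluated for each cell."""
    return [then_fn(x) if c else else_fn(x) for c, x in zip(cond_row, row)]

row = [3, -1, 4, -2]
cond = [x > 0 for x in row]

# Strict: the 'then' row is computed for all cells, used or not.
calls["count"] = 0
then_row = [expensive(x) for x in row]
else_row = [0 for _ in row]
strict_result = strict_if(cond, then_row, else_row)
strict_calls = calls["count"]

# Lazy: expensive() runs only where the condition holds.
calls["count"] = 0
lazy_result = lazy_if(cond, expensive, lambda x: 0, row)
lazy_calls = calls["count"]
```

Both versions return the same row; only the number of `expensive` calls differs (all four cells in the strict case, the two positive cells in the lazy case), which is the sense in which a lazy conditional is purely an optimisation.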