[gdal-dev] Update on RFC 62: Raster algebra

Tue Sep 13 04:24:41 PDT 2016

13.09.2016, 13:14, Even Rouault kirjoitti:
> Le mardi 13 septembre 2016 10:15:42, Ari Jolma a écrit :
>> There's been some discussion here about doing pixel function
>> calculations in VRT derived bands, so I thought I'd let you know what
>> are my thoughts and what I've done regarding RFC 62.
>>
>> At FOSS4G my conclusion was that my first approach was not good for many
>> reasons, most importantly because it did not scale to several bands in
>> one operation. So I've trying with the following ideas/decisions:
>>
>> The problem is to compute y=f(x1, x2, ...), where y is a new dataset
>> with one band or an existing dataset, into which a new band is added.
>> x1, x2, ... are existing bands. f is an expression. The goal is to be
>> able to create an expression object, with which one can write
>>
>> y->compute(f);
>>
>> The expression object is a tree and its nodes are operators (+,-,...),
>> functions (abs, floor,...), band objects, values, and possibly more
>> complex data structures (for example classifiers). It should be possible
>> to define focal functions (e.g., for terrain analysis).
>>
>> The compute algorithm is something like this:
>>
>> for_all_blocks_of_y {
>>
>>       f->cache_blocks(required_region);
>>
>>       compute_the_block_of_y(f, block_of_y);
>>
>>       f->free_blocks();
>>
>>       write_block(block_of_y);
>>
>> }
>>
>> One block of y is computed cell by cell using a recursive 'get_value'
>> method on the expression. The call would have the (global) cell
>> coordinates as arguments. Non-spatial nodes would ignore them but
>> spatial nodes need them. There would be get_value methods for all GDAL
>> data types.
>>
>> It's rather brute force but it should be easy to parallelize and it is
>> very general and extensible - expression node types can be subclassed
>> easily.
> Let's say f = A + B * C
>
> I can see 2 possible approaches :
> - one where you would have a first step where B * C would be run on a chunk of
> pixels and would produce a 'temp' chunk, and then A + 'temp' would be run. On
> the plus side, you can have vectorized/parallelized/optimized implementations
> of the '*' and '+' operators to operate a number of pixels. One downside of
> this is that it is very memory bandwidth intensive if you have many steps in
> your computations (you have one useless writing and reading of the 'temp'
> array).
> - another one where you would iterate on each pixel of the chunk and evaluate
> A+B*C of it. Less memory memory bandwidth intensive, but would probably be
> slow due to the overhead of evaluating f() in a somehow interpreted way on a
> pixel by pixel basis (unless you can just-in-time compile f).

Probably it is slow because of calling virtual functions recursively in 
a tree. But my first objective is/was to get it working.

Ari

>
>> This is partly working but it's all still on my own computer and not
>> committed to github. One thing I'd like to have quite early is an
>> expression parser to make testing easier etc. I have not done any
>> profiling so I have little idea about where the bottlenecks would be.
>>
>> I plan to present this on our local developer oriented FOSS4G meeting
>> next month so I guess I'll need to work on it more soon.
>>
>> Ari
>>
>>
>> _______________________________________________
>> gdal-dev mailing list
>> gdal-dev at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/gdal-dev