[gdal-dev] Update on RFC 62: Raster algebra

Even Rouault even.rouault at spatialys.com
Tue Sep 13 03:14:31 PDT 2016


Le mardi 13 septembre 2016 10:15:42, Ari Jolma a écrit :
> There's been some discussion here about doing pixel function
> calculations in VRT derived bands, so I thought I'd let you know what
> are my thoughts and what I've done regarding RFC 62.
> 
> At FOSS4G my conclusion was that my first approach was not good for many
> reasons, most importantly because it did not scale to several bands in
> one operation. So I've trying with the following ideas/decisions:
> 
> The problem is to compute y=f(x1, x2, ...), where y is a new dataset
> with one band or an existing dataset, into which a new band is added.
> x1, x2, ... are existing bands. f is an expression. The goal is to be
> able to create an expression object, with which one can write
> 
> y->compute(f);
> 
> The expression object is a tree and its nodes are operators (+,-,...),
> functions (abs, floor,...), band objects, values, and possibly more
> complex data structures (for example classifiers). It should be possible
> to define focal functions (e.g., for terrain analysis).
> 
> The compute algorithm is something like this:
> 
> for_all_blocks_of_y {
> 
>      f->cache_blocks(required_region);
> 
>      compute_the_block_of_y(f, block_of_y);
> 
>      f->free_blocks();
> 
>      write_block(block_of_y);
> 
> }
> 
> One block of y is computed cell by cell using a recursive 'get_value'
> method on the expression. The call would have the (global) cell
> coordinates as arguments. Non-spatial nodes would ignore them but
> spatial nodes need them. There would be get_value methods for all GDAL
> data types.
> 
> It's rather brute force but it should be easy to parallelize and it is
> very general and extensible - expression node types can be subclassed
> easily.

Let's say f = A + B * C

I can see 2 possible approaches :
- one where you would have a first step where B * C would be run on a chunk of 
pixels and would produce a 'temp' chunk, and then A + 'temp' would be run. On 
the plus side, you can have vectorized/parallelized/optimized implementations 
of the '*' and '+' operators to operate a number of pixels. One downside of 
this is that it is very memory bandwidth intensive if you have many steps in 
your computations (you have one useless writing and reading of the 'temp' 
array).
- another one where you would iterate on each pixel of the chunk and evaluate 
A+B*C of it. Less memory memory bandwidth intensive, but would probably be 
slow due to the overhead of evaluating f() in a somehow interpreted way on a 
pixel by pixel basis (unless you can just-in-time compile f).

> 
> This is partly working but it's all still on my own computer and not
> committed to github. One thing I'd like to have quite early is an
> expression parser to make testing easier etc. I have not done any
> profiling so I have little idea about where the bottlenecks would be.
> 
> I plan to present this on our local developer oriented FOSS4G meeting
> next month so I guess I'll need to work on it more soon.
> 
> Ari
> 
> 
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com


More information about the gdal-dev mailing list