[gdal-dev] Map algebra

alex alexhighviz at hotmail.com
Thu Apr 7 08:27:26 PDT 2016


Hi Ari,

Sorry for responding to your message late. I do appreciate your initiative to facilitate map algebra in GDAL. 

What exactly do you mean by Map Algebra? I understand it to cover the following:

1. Local operations: pixel-by-pixel operations on raster bands.
2. Focal operations: operations that calculate an indicator for each pixel based on the values in surrounding pixels
3. Zonal operations: operations that calculate indicators for groups (zones) of pixels.
4. Global operations: operations that calculate indicators over all pixels.

1. Local operations: pixel-by-pixel operations on raster bands.

To support this well, I think you should allow the user to specify a function and a combination of raster bands and constants that feed into the function, e.g: 

my_average = apply(average_function, band1, band2, band3);

but the function should also be able to handle a mix of raster bands and constants, e.g.:

my_average = apply(average_function, band1, band2, 7.25);

Also, I think it would make sense to have overloads for all or most of the operators, so you can write:

my_average = (band1 + band2 + band3) / 3.0;

And, I would like to combine these:

my_average = (2* apply(average_function, band1, band2) + band3) / 3.0;

Is that the kind of functionality you are trying to offer?

If you want to support this, it seems to be best practice to use expression templates (https://en.wikipedia.org/wiki/Expression_templates), because that would allow to create complex compositions of operations without having to create temporary datasets (e.g. for the sum of band1 and band2 in the example above).

The key thing that you need in order to implement this is an iterator over the raster bands. Eric Niebler writes at length about the benefits of using Ranges as a means of composing functions (https://github.com/ericniebler/range-v3 ) .You seem to be applying an iterator that iterates over blocks, rather than over pixels. How can that work when my_average, band1, band2 and band3 have different block sizes? Wouldn't your life be a lot easier if you iterated over pixels in the same order for all bands?. 

If you are interested, I have a solution, albeit in C++11 (https://github.com/ahhz/map_algebra ). The following is a program using it:

#include <blink/map_algebra/map_algebra.h> 

namespace ma = blink::map_algebra;

int my_function(int w, int x, int y, int z)
{
  return w * x + y * z;
}

int main()
{
  auto a = ma::open_read_only<int>("input_1.tif");
  auto b = ma::open_read_only<int>("input_2.tif");
  
  // Example 1: Using operators
  auto w = ma::create_from_model<double>("output_1.tif", a);
  w = a + 3 * b;

  // Example 2: Using assigning operators
  auto x = ma::create_from_model<double>("output_2.tif", a);  x = 1;
  x *= a;
  x += 3 * b;
  
  // Example 3: Map algebra using cell-by-cell functions
  auto y = ma::create_from_model<double>("output_3.tif", a);
  y = ma::apply(my_function, 1, a, 3, b);

  // Example 4: Combination
  auto z = ma::create_from_model<double>("output_4.tif", a);
  z = b + 3 * ma::apply(my_function, 1, a, 2, b);

  return 0;
}

2. Focal operations: operations that calculate an indicator for each pixel based on the values in surrounding pixels 

I think the key challenge here is to allow users to specify the function to be applied on the surrounding pixels. In my opinion a Map Algebra library should support all kinds of moving window analysis, and not just a number of built-in types of analysis. This can be implemented as yet another range that allows iteration over the pixels in the same order as the raster bands, but instead of return pixel values, it returns neighbourhood / window summaries. 

I have a C++11 library for this as well, but it is not yet working smoothly with the map algebra library (but it will!)
https://github.com/ahhz/moving_window 
http://www.sciencedirect.com/science/article/pii/S0303243415300337 

3. Zonal operations: operations that calculate indicators for groups (zones) of pixels.
This seems more trivial than the other two components of map algebra and I haven't given it much attention, but I think should be supported.

4. Global operations
I don't think these need to be explicitly supported, because there is no genericity to exploit.

In summary: 

What I am really trying to say is that developing map algebra functionality on top of GDAL or as part of GDAL seems very worthwhile and feasible. However, I would aim for functionality that is more generically supporting map algebra, rather than implementing specific map algebra tools. The foundation of such functionality in my opinion must be a GDALRange class that facilitates iteration over pixels in a uniform order. If I were you, I would first push a RFC to create a GDALRange, and only once that is done and dusted, consider the map algebra on top of it. 

Kind regards, Alex

(sorry for the self-promotion)

>-----Original Message-----
>From: gdal-dev [mailto:gdal-dev-bounces at lists.osgeo.org] On Behalf Of Ari
>Jolma
>Sent: 01 April 2016 09:39
>To: gdal-dev at lists.osgeo.org
>Subject: Re: [gdal-dev] Map algebra
>
>I've got this into a rather nice working shape. The code needs to be
>reorganized (it is now, horror, all in header files) but that's another
>thing.
>
>I've so far built this around a basic DEM catchment analysis workflow.
>It includes many kinds of steps so it is a pretty good benchmark. In
>pseudo code the workflow is something like
>
>dem = open_from_file
>dem->fill_depressions
>fd = dem->compute_flow_directions
>fd->route_flat_cells
>ua = fd->compute_upstream_area
>c = new_raster // compute the catchment into this
>c->set_border_cells // we're interested in the outlet cells
>c *= ua
>c *= c > 10000 // we're interested in the big catchments
>outlets = c->get_cells // outlets is an array of cells, and a cell is an
>object containing (x,y,value)
>c = 0
>c->mark_catchment(fd, outlets[0], 1) // assuming the outlet cell is now
>in outlets[0]
>ua2 = new_raster // for visualization log of upstream area is nice
>ua2 += ua
>ua2->log
>
>The C code is of course not this nice but it is not very far from this.
>All the basic four method types I mention below are implemented and
>adding new methods is not very difficult. The code makes heavy use of
>templates and macros in an attempt for versatility (GDAL rasters can
>have many datatypes) and easiness to write methods in a few lines of
>code. The whole set of code is now ~2500 lines but it creates rather
>large executables due to templates. The code is C++ but the API is C
>like, I've so far done no changes to existing GDAL code because of this.
>
>As written below, the code works on two levels of x,y loops: the blocks
>within a band, which is generic and cells within a block, which is
>separate for each method. The methods work on line by line (no recursion
>etc.) and thus for example the fill_depressions is not optimal but
>instead very simple. Also, very large rasters can be processed.
>
>I think this may be somewhat similar thing in relation to GDAL as the
>network support, so it could be a thing to add to the library if wished.
>
>I've added RFC 62 "raster algebra" to spark discussion.
>
>Ari
>
>
>20.03.2016, 10:21, Ari Jolma kirjoitti:
>> Folks,
>>
>> I played around a bit with the map algebra idea and came up with
>> something that could work nicely as a plugin. The initial code is at
>> https://github.com/ajolma/gdal/tree/trunk/gdal/map_algebra
>>
>> The idea is based on processing raster bands block by block and
>> calling a given function on each block. Using macros and templates the
>> code should be rather nice and manageable even when supporting
>> multiple data types for raster cells.
>>
>> Further, the idea is to support three kinds of map algebra methods,
>> mostly working on a raster band in-place.
>>
>> 1) Simple ones, which do not take any arguments beyond the raster band
>> itself, examples are functions like log, sin, rand.
>>
>> 2) Those which take one non-spatial argument. The argument can be a
>> scalar like a number, or more complex like a reclassifier.
>>
>> 3) Those which operate on two rasters. Examples are summation of
>> rasters and computing flow directions on a DEM. The latter is a bit
>> more complex since it is a focal method and requires a 3 x 3 matrix of
>> blocks from the other operand raster.
>>
>> Maybe a fourth kind of methods are those which compute something from
>> a raster.
>>
>> This would lead to four functions in the C API and raster band methods
>> in the bindings.
>>
>> Comments are welcome, the initial code base contains a small demo
>> program. Eventually, if this works, I'll make a RFC from this.
>>
>> Ari
>>
>



More information about the gdal-dev mailing list