[gdal-dev] Map algebra

alex alexhighviz at hotmail.com
Thu Apr 7 16:21:34 PDT 2016


> 
> That is one aspect, yes. However, so far I've considered only operations on one
> or two bands at time. Operations on more than two can of course be dissolved
> into a series of operations with max two at a time.

That will often be possible, but not always. For instance how would you dissolve the following function? (without resorting to complete ugliness like c * a  + !c * b)

int conditional(bool c, int a, int b) { return c ? a : b; }

Also, would you expect the user to break down all his/her functions into two-argument or fewer functions?

> 
> I agree that supporting overloading is important and it is more than just
> syntactic sugar since it can delegate some tasks to the compiler, which may
> create temporaries as needed etc.

To be fair, in my project it is just syntactic sugar: "a + b" is just a short way to write "apply(std::plus<>{}, a, b) ";

> 
> >
> > If you want to support this, it seems to be best practice to use expression
> templates (https://en.wikipedia.org/wiki/Expression_templates), because that
> would allow to create complex compositions of operations without having to
> create temporary datasets (e.g. for the sum of band1 and band2 in the example
> above).
> 
> Yes, I agree that avoiding temporaries is a worthy goal, and what I
> understand of expression templates, motivation for them has been that.
> 
> However, expression templates seem to be very C++ centric and do their
> work in compile time. The C API and high level language bindings are
> very important in GDAL. Therefore expression templates might not be the
> best solution in this case. However, perhaps the idea can be exploited
> somehow.

I would think that you can get the principle to work in object-oriented fashion as well, where the template based function "apply" looks like this

auto apply(Function&& function, Args&&... args)

Couldn't the object oriented version for two arguments be something like this:

GDALRange* apply(GDALValue*( *function)(GDALValue*, GDALValue*), GDALArgument * arg1 GDALArgument * arg2)

Assuming that all ranges derive from GDALRange and all values from GDALValue, and both of these derive from GDALArgument. You can now sort out the function composition in runtime instead of compile-time.

I know which approach I would prefer, but I am not convinced it is not possible.

> 
[snip]
>You
> seem to be applying an iterator that iterates over blocks, rather than over pixels.
> How can that work when my_average, band1, band2 and band3 have different
> block sizes? Wouldn't your life be a lot easier if you iterated over pixels in the
> same order for all bands?.
> 
> I agree that it takes a bit of haggling to be able to work with more
> than one band at the same time when one has to maintain a separate cache
> for each band and each band has different block sizes. The situation is
> even worse for focal methods since one focal region can be on four
> blocks of one band in the worst case. 

The worst-case can be a lot worse, consider a focal window with a radius of 500 pixels.

>It is however, in my opinion, an
> efficient approach when it is not feasible to load all raster data into
> memory. 

Iterating pixel-by-pixel does not imply having to load all raster data into memory. It does imply loading multiple blocks into memory, i.e. a whole row of blocks.

>The approach is also strongly supported by the basic design of
> GDAL and the fundamental algorithm is presented in GDAL documentation (I
> gave the link in earlier email).

I didn't see that email and just trawled through the whole thread, can you give it again?

> 
> The solution I already have seems to work well for the one and two band
> case even with 3x3 focal region and as far as I can see it could be
> extended to cases involving more bands. The macros and functions I've
> written make cell (pixel) access possible with only a few lines of code
> but it could be probably improved. That would probably be desirable if
> more than two bands are present.
> 
> >
> > If you are interested, I have a solution, albeit in C++11
> (https://github.com/ahhz/map_algebra ).
> 
> I remember your email and I've taken a look at your code. I have nothing
> against using modern tools but I think GDAL is a bit conservative on
> them and portability is a big issue. However, I believe the map algebra
> implementation I'm writing will be a add-on or plugin for GDAL, and not
> something that will be tightly integrated into it.
> 
[snip]
> >
> > 2. Focal operations: operations that calculate an indicator for each pixel based
> on the values in surrounding pixels
> >
> > I think the key challenge here is to allow users to specify the function to be
> applied on the surrounding pixels. In my opinion a Map Algebra library should
> support all kinds of moving window analysis, and not just a number of built-in
> types of analysis. This can be implemented as yet another range that allows
> iteration over the pixels in the same order as the raster bands, but instead of
> return pixel values, it returns neighbourhood / window summaries.
> 
> The hydrological algorithm would benefit a lot from random access to
> extended neighborhood cells but so far I've written only code that uses
> linear, block by block and pixel by pixel within blocks, access.
> 
> 

I think I need to read up on your fundamental algorithm to understand this. Is linear access line-by-line? That would ease my worries because line-by-line is independent of block size, right?

> 
[snip]
> >
> > 3. Zonal operations: operations that calculate indicators for groups (zones) of
> pixels.
> > This seems more trivial than the other two components of map algebra and I
> haven't given it much attention, but I think should be supported.
> 
> One complication, which I just today worked on, seems to stem from one
> requirement that I have. I want the API be free of templates and thus
> there is a need for generic classes for, e.g., numbers (GDAL supports
> bands where data is various length integers or floats). Thus, for
> example a simple zonal method, which returns which zones share a
> boundary returns a rather complex data structure.

I understand zonal methods to be operating on pixels within zones and not on the relationship between zones. What kind of methods do you have in mind? 

> >
> > 4. Global operations
> > I don't think these need to be explicitly supported, because there is no
> genericity to exploit.
> 
> I find that I often need methods, which return for example the histogram
> of values in a band.
> 

Me too. But I don't feel it urgently needs special library support. 

> >
> > In summary:
> >
> > What I am really trying to say is that developing map algebra functionality on
> top of GDAL or as part of GDAL seems very worthwhile and feasible. However, I
> would aim for functionality that is more generically supporting map algebra,
> rather than implementing specific map algebra tools. The foundation of such
> functionality in my opinion must be a GDALRange class that facilitates iteration
> over pixels in a uniform order. If I were you, I would first push a RFC to create a
> GDALRange, and only once that is done and dusted, consider the map algebra on
> top of it.
> 
> The RFC procedure of GDAL is open to everybody and the deciding body is
> the GDAL PSC, to which I do not belong to. As I wrote, the API I'm
> suggesting is essentially an add-on and thus leaves, even if adopted,
> room for other solutions.

I thought the idea of discussing on this mailing-list was to look for consensus or at least strengthen ideas.

Kind regards, Alex


More information about the gdal-dev mailing list