[gdal-dev] Map algebra

Peter Baumann p.baumann at jacobs-university.de
Sat Apr 2 06:06:08 PDT 2016


see also rasdaman with its n-D raster query language based on Array Algebra,
which effectively is the basis for the forthcoming ISO Array SQL standard:
http://www.rasdaman.org
publications:
http://www.faculty.jacobs-university.de/pbaumann/iu-bremen.de_pbaumann//pubs.php

cheers,
Peter


On 04/02/2016 02:48 PM, Ari Jolma wrote:
> 02.04.2016, 15:15, Even Rouault kirjoitti:
>> A few thoughts and questions :
>> - I guess what would be cool from the perspective of a C++ user is to be
>> really to write the above pseudo code with C++ operator overload when
>> appropriate !  A += B, A.min(B), min_A= A.min(), etc...
>
> Operator overloading is of course something to have. It is great for
> interactive use too.
>
> I would most probably implement that in the Perl bindings.
>
>> - There might be some numpy reinvention here... Or at least, it could perhaps
>> serve as an inspiration. I guess Python users would in priority use numpy for
>> the most basic operations. Of course the GIS specific methods would still be of
>> value.
>
> I know little of numpy and how it is used by the Python bindings. Does it load
> the whole raster band into memory or can it make use of the block read/write?
> If yes, then that is probably preferable for Pythonists at least when it comes
> to standard methods.
>
>> - Saturation arithmetics could be interesting. numpy by default do wrapped
>> arithmetics, you have to cast to a larger type and clip down afterwards,
>> whereas OpenCV does it by default :
>> http://opencvpython.blogspot.fr/2012/06/difference-between-matrix-arithmetic-
>> in.html
>> - Are the methods handling 2 bands able to deal with different block sizes ?
>
> Yes, the band size needs to be the same currently but the caching and cell
> index functions allow working with two bands at the same time even when the
> cell neighborhood is needed. I'm not sure about the optimality of the code
> however, the cache is cleaned of not-needed blocks at each block step.
>
>> - And with bands of different data types ?
>
> Yes. It creates pretty huge switch tables however. They are mostly hidden in
> macros but are a problem. I don't see how I can get the datatype into the
> templates from the bands and I don't like the idea of leaving that to the users.
>
>> - Why the need to reinvent a hashset mechanism ? Isn't std::set good enough ?
>> - Same for gma_array vs std::vector ?
>
> Simply ignorance of C++ I think. I do see that for example C++11 and boost
> have lots of useful things but did not (yet) go that way. Earlier C++
> standards seem to have many tools too. I learnt C++ >20 years ago... I believe
> I also need some small classes like "mapper", which reclassifies data from old
> values to new values and are not readily available.
>
>> - various methods take a void* argument, whereas it seems that could be
>> rewritten as datatype* for better type safety
>
> Noted. The problem is often that the callback to the method specific function
> needs a prototype, which can be used for many methods. Surely minimization of
> void pointers is a goal. Generic objects and introspection might be a partial
> solution.
>
>> - there seems to be a block cache mechanism. What is it for ? How does that
>> work ?
>
> In the block loop (looping through all blocks in a band) at each step the
> cache is checked for block(s) that are needed (if focal distance is > 0 then
> more than one block may be needed) and which are not needed. Then blocks are
> read and written/discarded.
>
>> - there is some overlap/connection with the pixel functions of VRT ( see
>> "Using Derived Bands" of http://gdal.org/gdal_vrttut.html).
>
> ok
>
> Thanks for the comments!
>
> Ari
>
>>
>>
>>> The C code is of course not this nice but it is not very far from this.
>>> All the basic four method types I mention below are implemented and
>>> adding new methods is not very difficult. The code makes heavy use of
>>> templates and macros in an attempt for versatility (GDAL rasters can
>>> have many datatypes) and easiness to write methods in a few lines of
>>> code. The whole set of code is now ~2500 lines but it creates rather
>>> large executables due to templates. The code is C++ but the API is C
>>> like, I've so far done no changes to existing GDAL code because of this.
>>>
>>> As written below, the code works on two levels of x,y loops: the blocks
>>> within a band, which is generic and cells within a block, which is
>>> separate for each method. The methods work on line by line (no recursion
>>> etc.) and thus for example the fill_depressions is not optimal but
>>> instead very simple. Also, very large rasters can be processed.
>>>
>>> I think this may be somewhat similar thing in relation to GDAL as the
>>> network support, so it could be a thing to add to the library if wished.
>>>
>>> I've added RFC 62 "raster algebra" to spark discussion.
>>>
>>> Ari
>>>
>>> 20.03.2016, 10:21, Ari Jolma kirjoitti:
>>>> Folks,
>>>>
>>>> I played around a bit with the map algebra idea and came up with
>>>> something that could work nicely as a plugin. The initial code is at
>>>> https://github.com/ajolma/gdal/tree/trunk/gdal/map_algebra
>>>>
>>>> The idea is based on processing raster bands block by block and
>>>> calling a given function on each block. Using macros and templates the
>>>> code should be rather nice and manageable even when supporting
>>>> multiple data types for raster cells.
>>>>
>>>> Further, the idea is to support three kinds of map algebra methods,
>>>> mostly working on a raster band in-place.
>>>>
>>>> 1) Simple ones, which do not take any arguments beyond the raster band
>>>> itself, examples are functions like log, sin, rand.
>>>>
>>>> 2) Those which take one non-spatial argument. The argument can be a
>>>> scalar like a number, or more complex like a reclassifier.
>>>>
>>>> 3) Those which operate on two rasters. Examples are summation of
>>>> rasters and computing flow directions on a DEM. The latter is a bit
>>>> more complex since it is a focal method and requires a 3 x 3 matrix of
>>>> blocks from the other operand raster.
>>>>
>>>> Maybe a fourth kind of methods are those which compute something from
>>>> a raster.
>>>>
>>>> This would lead to four functions in the C API and raster band methods
>>>> in the bindings.
>>>>
>>>> Comments are welcome, the initial code base contains a small demo
>>>> program. Eventually, if this works, I'll make a RFC from this.
>>>>
>>>> Ari
>>> _______________________________________________
>>> gdal-dev mailing list
>>> gdal-dev at lists.osgeo.org
>>> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev

-- 
Dr. Peter Baumann
 - Professor of Computer Science, Jacobs University Bremen
   www.faculty.jacobs-university.de/pbaumann
   mail: p.baumann at jacobs-university.de
   tel: +49-421-200-3178, fax: +49-421-200-493178
 - Executive Director, rasdaman GmbH Bremen (HRB 26793)
   www.rasdaman.com, mail: baumann at rasdaman.com
   tel: 0800-rasdaman, fax: 0800-rasdafax, mobile: +49-173-5837882
"Si forte in alienas manus oberraverit hec peregrina epistola incertis ventis dimissa, sed Deo commendata, precamur ut ei reddatur cui soli destinata, nec preripiat quisquam non sibi parata." (mail disclaimer, AD 1083)




More information about the gdal-dev mailing list