[gdal-dev] GDAL raster processing library?
James Ramm
jamessramm at gmail.com
Wed Jul 13 00:09:46 PDT 2016
Peter,
I think 'Grid Algebra' would be what Ari Jolma is proposing here:
https://trac.osgeo.org/gdal/wiki/rfc62_raster_algebra
As Even pointed out, there is some overlap, though my proposal is
technically very different.
The key differences I see are:
- Users can submit functions which operate on each sub window of the
raster, rather than an algebraic expression. This potentially allows for
much more complicated algorithms to be used (e.g. I dont think it would be
possible to run a watershed segmentation with a raster algebra
implementation, or to have algorithms which behave differently depending on
the 'location' within the raster or the values of surrounding pixels etc
etc).
- Functions can be chained together for a complex processing toolchain.
Some overlap with VRT here, although again this introduces a little more
flexibility.
On 12 July 2016 at 15:47, Peter Halls <p.halls at york.ac.uk> wrote:
> James,
>
> in reality, are you not requesting an implementation of Tomlin's
> 'Grid Algebra' in GDAL? That defines the whole range of functions from
> whole raster to pixel and has the distinct advantage of being both
> published and extremely well known because of other implementations ...
> which also provide ready-made reference bases for the GDAL implementors ...
>
> Best wishes,
> Peter
>
> On 12 July 2016 at 15:39, James Ramm <jamessramm at gmail.com> wrote:
>
>> Hi Even,
>>
>> The difference I see with pixel functions is that, as far as I
>> understand, the pixel function is applied per pixel, so there is no
>> possibility of e.g. the pixel buffer when have the function apply to
>> 'blocks'.
>> I may be way off, but many of the algorithms we deal with require some
>> kind of neighbourhood search - a polygonise algorithm or flow direction
>> algorithm being good examples.
>> I dont think VRT pixel functions allow this?
>>
>> So in that sense I'd see a VRT being 'just' another potential input data
>> source.
>>
>> Perhaps VRT pixel functions could be expanded to also allow 'window'
>> functions?
>>
>> A downside is it requires creating a VRT even when you only want to apply
>> a such a function to a single dataset. Small effort, but still a bit more
>> than throwing in any GDALDataset to be processed.
>>
>> I see the overlap with raster algebra, although yes technically very
>> different.
>>
>>
>>
>> On 12 July 2016 at 14:55, Even Rouault <even.rouault at spatialys.com>
>> wrote:
>>
>>> James,
>>>
>>> There's some intersection with Ari's proposal :
>>> https://trac.osgeo.org/gdal/wiki/rfc62_raster_algebra . At least
>>> regarding the
>>> overall purposes, since technically this is quite different.
>>>
>>> Actually what you propose is very close to the existing VRT pixel
>>> functions of
>>> derived bands. See "Using Derived Bands" in
>>> http://www.gdal.org/gdal_vrttut.html . In the last days, we've merged
>>> Antonio's work regarding a predefined set of pixel functions.
>>> Perhaps some extension to allow passing user parameters to the pixel func
>>> could be useful. It is possible to use pixel functions from Python as
>>> shown in
>>>
>>> https://github.com/OSGeo/gdal/blob/trunk/autotest/gcore/testnonboundtoswig.py#L303
>>> although this is a bit ugly as it uses ctypes and not SWIG. But should be
>>> possible through SWIG by introducing proper types similarly to what is
>>> done
>>> for the progress functions or error handler functions.
>>>
>>> Even
>>>
>>> Le mardi 12 juillet 2016 14:40:27, jramm a écrit :
>>> > I wonder if there is a use case/interest in a small library within
>>> GDAL to
>>> > facilitate generic raster processing.
>>> >
>>> > My idea would be to have a user-extensible framework to run processing
>>> > functions on whole rasters, with a growing library of common-such
>>> functions
>>> > within GDAL.
>>> >
>>> > Something along the lines of this:
>>> >
>>> > /*******************************************************************/
>>> > typedef CPLErr (*GDALRasterProcessFn)(double *padfInArray, double
>>> > *padfOutArray,
>>> > int nWindowXSize, int nWindowYSize, void *pData, double
>>> > *pdfNoDataValue);
>>> > /**
>>> > * \brief Definition of a raster processing function.
>>> > *
>>> > * A GDALRasterProcessFn accepts an array of data as input, applies
>>> custom
>>> > logic and writes the output to padfOutArray.
>>> > * Such a function can be passed to GDALRunRasterProcess to apply custom
>>> > processing to a GDALDataset in chunks and create
>>> > * a new GDALDataset.
>>> > *
>>> > * @param padfInArray The input array of data.
>>> > *
>>> > * @param padfOutArray The output array of data. On first call (via
>>> > GDALRunRasterProcess) this will be an empty, initialised array,
>>> > * which should be populated with the result of calculations on
>>> > padfInArray. In subsequent calls it will contain the result of the
>>> > * previous window.
>>> > *
>>> > * @param nWindowXSize the actual x size (width) of the read window.
>>> > *
>>> > * @param nWindowYSize the actual y size (height) of the read window.
>>> The
>>> > length of padfInArray == padfOutArray == nWindowXSize * nWindowYSize
>>> > *
>>> > * @param pData Process-specific data. This data is passed straight
>>> through
>>> > to the GDALRasterProcessFn and may contain e.g user defined parameters.
>>> > * The GDALRasterProcessFn definition would define the
>>> structure/type of
>>> > such data.
>>> > *
>>> > * @param pdfNoDataValue The no data value of the dataset
>>> > */
>>> >
>>> > CPLErr GDALRunRasterProcess(GDALRasterProcessFn processFn, GDALDataset
>>> > *poSrcDataset,
>>> > GDALDataset *poDstDataset, void *pData, int
>>> > *pnWindowXSize,
>>> > int *pnWindowYSize, int *pnPixelBuffer);
>>> > /**
>>> > * \brief Apply a raster processing function to each sub-window of a
>>> raster.
>>> > *
>>> > * The input raster dataset is read in chunks of nWindowXSize *
>>> nWindowYSize
>>> > and each chunk is passed to the processing
>>> > * function. The output array from the function is written to the
>>> > destination dataset.
>>> > * An optional 'pixel buffer' can be specified to allow overlaps between
>>> > successive windows. This is useful for
>>> > * some algorithms, e.g. blob extraction, watershed/stream flow
>>> analysis,
>>> > convolution etc.
>>> > * Process specific data can be passed (e.g. configuration parameters).
>>> This
>>> > data is simply passed straight through to the processing
>>> > * function on each call.
>>> > *
>>> > * @param processFn A GDALRasterProcessFn to apply to each sub window
>>> of the
>>> > raster.
>>> > *
>>> > * @param poSrcDataset The source raster dataset from which pixel
>>> values are
>>> > read
>>> > *
>>> > * @param poDstDataset The destination raster dataset to which pixel
>>> values
>>> > are written. Must support RasterIO in write mode.
>>> > *
>>> > * @param pData Process-specific data. This is passed straight through
>>> to
>>> > the GDALRasterProcessFn on each call.
>>> > *
>>> > * @param pnWindowXSize The desired width of each read window. If NULL
>>> it
>>> > defaults to the 'natural' block size of the raster
>>> > *
>>> > * @param pnWindowYSize The desired height of each read window. If NULL
>>> it
>>> > defaults to the 'natural' block size.
>>> > *
>>> > * @param pnPixelBuffer A pixel buffer to apply to the read window. The
>>> read
>>> > window is expanded by pnPixelBuffer pixels in all directions such that
>>> > * each window overlaps by pnPixelBuffer pixels. Ignored if NULL or 0
>>> > *
>>> > * @return a CPLErr enum indicating whether the function succesfully
>>> ran.
>>> > */
>>> >
>>> >
>>> > CPLErr GDALRunRasterProcesses(GDALRasterProcessFn *paProcessFn, int
>>> > nProcesses, GDALDataset *poSrcDataset,
>>> > GDALDataset *poDstDataset, void **paData, int
>>> > *pnWindowXSize,
>>> > int *pnWindowYSize, int *pnPixelBuffer);
>>> >
>>> > /**
>>> > * \brief Apply multiple raster processing functions to each sub-window
>>> of a
>>> > raster
>>> > *
>>> > * For each window, the functions defined by the paProcessFn array are
>>> > called in turn, with the array output of the previous function forming
>>> the
>>> > input * to the next function. This allows processing 'toolchains' to be
>>> > built without having to create intermediate datasets, which can be less
>>> > efficient in time and space.
>>> > *
>>> > *
>>> > * @param paProcessFn An array of GDALRasterProcessFn to apply to each
>>> sub
>>> > window of the raster
>>> > *
>>> > * @param nProcesses The size of paProcessFn
>>> > *
>>> > * @param poSrcDataset The source raster dataset from which pixel
>>> values are
>>> > read
>>> > *
>>> > * @param poDstDataset The destination raster dataset to which pixel
>>> values
>>> > are written. Must support RasterIO in write mode.
>>> > *
>>> > * @param paData an array of process-specific data objects of size
>>> > nProcesses. Each data object will be passed to the corresponding
>>> > GDALRasterProcessFn
>>> > *
>>> > * @param pnWindowXSize The desired width of each read window. If NULL
>>> it
>>> > defaults to the 'natural' block size of the raster
>>> > *
>>> > * @param pnWindowYSize The desired height of each read window. If NULL
>>> it
>>> > defaults to the 'natural' block size.
>>> > *
>>> > * @param pnPixelBuffer A pixel buffer to apply to the read window. The
>>> read
>>> > window is expanded by pnPixelBuffer pixels in all directions such that
>>> > * each window overlaps by pnPixelBuffer pixels. If NULL, it is
>>> ignored.
>>> > *
>>> > * @return a CPLErr enum indicating whether the function succesfully
>>> ran.
>>> > */
>>> > /*******************************************************************/
>>> >
>>> > So GDALRunRasterProcess would take care of I/O and looping through the
>>> > dataset in chunks, which can be configure by the user. Each chunk is
>>> > submitted to the processing function, which does 'stuff' and populates
>>> the
>>> > output array. GDALRunRasterProcess would write it to the output dataset
>>> > object.
>>> >
>>> > An example of a processing function could be something like this:
>>> >
>>> > /*******************************************************************/
>>> > typedef struct ReclassArgs {
>>> > int nReclassArgs;
>>> > double *padfMinBound;
>>> > double *padfMaxBound;
>>> > double *padfBinValue;
>>> >
>>> > } ReclassArgs;
>>> >
>>> >
>>> > CPLErr Reclass(double *padfInArray, double *padfOutArray,
>>> > int nWindowXSize, int nWindowYSize, void *pData,
>>> > double *pdfNoDataValue)
>>> > {
>>> > ReclassArgs *poRargs = static_cast<ReclassArgs *>(pData);
>>> > double pixel;
>>> > for (int i = 0; i < nWindowXSize * nWindowYSize - 1; ++i){
>>> > pixel = padfInArray[i];
>>> > for (int k = 0; k < poRargs->nReclassArgs; ++k){
>>> > if ((pixel > poRargs->padfMinBound[k]) && (pixel <=
>>> > poRargs->padfMaxBound[k])){
>>> > padfOutArray[i] = poRargs->padfBinValue[k];
>>> > break;
>>> > }
>>> > }
>>> > }
>>> > return CE_None;
>>> > }
>>> > /*******************************************************************/
>>> >
>>> > To use it you would:
>>> >
>>> > /*******************************************************************/
>>> > // Open a source dataset
>>> > GDALDataset *poSrc = GDALOpen(...)
>>> >
>>> > // Create some output dataset with same dimensions etc..
>>> > GDALDataset *poDst = ....
>>> >
>>> > // Setup the reclass args
>>> > ReclassArgs *poRargs = ...
>>> >
>>> > // Call the process
>>> > CPLErr retval = GDALRunRasterProcess(Reclass, poSrc, poDst, (void
>>> > *)poRargs, NULL, NULL, NULL);
>>> > /*******************************************************************/
>>> >
>>> > GDAL could feature its own library of common functions...reclass,
>>> truncate,
>>> > blob extraction etc etc and a user of GDAL could write their own and
>>> have
>>> > GDAL process it.
>>> >
>>> > It would be useful to expose enough to python via SWIG so that a
>>> > GDALRasterProcessFn could be defined in python - this would then allow
>>> > rapid development in python, but push the expensive work of looping
>>> over
>>> > dataset chunks to C++.
>>> > I dont know enough about SWIG yet to see whether this would be
>>> feasible.
>>> >
>>> > However, even in C++, it allows the user to only think about their
>>> > algorithm and leave the boilerplate to GDAL.
>>> >
>>> > I imagine all this being implemented in a 'sub library', something like
>>> > gdpl.h (geospatial/gdal dataset processing library), but sitting in the
>>> > GDAL source tree (then I can happily expose all the useful GDAL
>>> goodness
>>> > without being worried about users having to include gdal headers etc,
>>> and
>>> > avoid any work in having to abstract it away)
>>> >
>>> > This is a first stab at defining something like this and is probably
>>> not
>>> > yet comprehensive enough.
>>> >
>>> > Any thoughts?
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > View this message in context:
>>> >
>>> http://osgeo-org.1560.x6.nabble.com/GDAL-raster-processing-library-tp52759
>>> > 48.html Sent from the GDAL - Dev mailing list archive at Nabble.com.
>>> > _______________________________________________
>>> > gdal-dev mailing list
>>> > gdal-dev at lists.osgeo.org
>>> > http://lists.osgeo.org/mailman/listinfo/gdal-dev
>>>
>>> --
>>> Spatialys - Geospatial professional services
>>> http://www.spatialys.com
>>>
>>
>>
>> _______________________________________________
>> gdal-dev mailing list
>> gdal-dev at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>>
>
>
>
> --
>
> ----------------------------------------------------------------------------------------------------------------
> Peter J Halls, PhD Student, Post-war Reconstruction and Development Unit
> (PRDU),
> University of York
>
> Snail mail: c/o Research Centre for the Social Sciences (RCSS),
> University of York,
> Heslington, York YO10 5DD
>
> This message has the status of a private and personal communication
>
> ----------------------------------------------------------------------------------------------------------------
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20160713/dce9faf7/attachment-0001.html>
More information about the gdal-dev
mailing list