[gdal-dev] GDAL raster processing: parallel computing
Ari Jolma
ari.jolma at gmail.com
Wed Aug 17 05:28:56 PDT 2016
Hi,
This relates to RFC 62, raster algebra.
I realized that parallel processing is really an essential element of
this. I don't have a lot of experience with parallel processing and
threads so please let me know if I'm writing silly or ignorant things.
James, in your emails you write that map and reduce functions are
essential. That seems to point to parallel processing. Can you
elaborate a bit more: what is your approach there, and are you using
any specific libraries?
Rutger mentioned Dask and Numba, which seem to be high-level solutions.
Anyway, I thought I'd give it a try with OpenMP and the C++ code I have
written so far. At a very simple level it seems that it might be
enough to add "#pragma omp parallel for" before the for loop that
iterates over the (cached) blocks, and then compile the code with
-fopenmp. Of course this does not work (or rather, it seems to work but
does not make the code use more than one CPU at a time), since a single
GDAL Dataset object should not be used by several threads (GDAL FAQ).
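To make the idea concrete, here is a minimal sketch of that loop. It contains no GDAL calls at all: the Block struct and the classify_blocks function are hypothetical stand-ins for the cached blocks, so it only shows where the pragma goes when the iterations are independent, and it does not address the Dataset restriction.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one cached raster block; this is not a GDAL type.
struct Block {
    std::vector<std::int32_t> cells;
};

// Set every cell to 1 if it equals `target` and to 0 otherwise.
// Each iteration touches only its own block, so the iterations are
// independent and may be distributed over threads.
void classify_blocks(std::vector<Block>& blocks, std::int32_t target) {
    // A signed loop index keeps the loop valid for older OpenMP versions.
    const long n = static_cast<long>(blocks.size());
    assert(n >= 0);

    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < n; ++i) {
        for (auto& cell : blocks[static_cast<std::size_t>(i)].cells) {
            cell = (cell == target) ? 1 : 0;
        }
    }
}
```

Built with -fopenmp the blocks are spread over the available threads; built without it the pragma is ignored and the same loop runs serially.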
There seems to be a solution in the book "Remote Sensing Raster
Programming", which I found with Google; Google Books shows the
relevant page. The book suggests adding "#pragma omp barrier" before
GDALRasterIO. To me it seems that this would cause all the raster data
to accumulate in RAM. I did not try it, though.
It seems that I should somehow make the code spawn a new Dataset object
for each thread. The function for that is GDALOpenShared. Now a simple
question: what if the raster is created in the code? My test application
for this is a simple one: it takes an existing raster and returns a
0/1 raster, where a cell is 1 if the original raster has the value 48
there and 0 otherwise. Is the solution to create the dataset and then
open new connections to it using GDALOpenShared?
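As a rough sketch of the structure I have in mind for the test application, with one reader object per thread: the RasterReader class and the classify_raster function below are hypothetical and only mimic a per-thread connection over an in-memory array. No GDAL function is called, and the output is a plain vector, so the sketch does not answer how a raster created in the code should be shared.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a per-thread dataset connection; not a GDAL class.
// In real code this is where each thread would open its own handle.
class RasterReader {
public:
    explicit RasterReader(const std::vector<std::int32_t>& data) : data_(data) {}

    // Copy `count` cells starting at `offset` into `out`.
    void read(std::size_t offset, std::size_t count,
              std::vector<std::int32_t>& out) const {
        const auto first = data_.begin() + static_cast<std::ptrdiff_t>(offset);
        out.assign(first, first + static_cast<std::ptrdiff_t>(count));
    }

private:
    const std::vector<std::int32_t>& data_;
};

// Return a 0/1 raster: 1 where the input equals `target`, 0 elsewhere.
std::vector<std::uint8_t> classify_raster(const std::vector<std::int32_t>& input,
                                          std::size_t block_size,
                                          std::int32_t target) {
    assert(block_size > 0);
    std::vector<std::uint8_t> output(input.size(), 0);
    const long n_blocks =
        static_cast<long>((input.size() + block_size - 1) / block_size);

    #pragma omp parallel
    {
        RasterReader reader(input);        // one reader per thread, never shared
        std::vector<std::int32_t> buffer;  // per-thread scratch buffer

        #pragma omp for schedule(dynamic)
        for (long b = 0; b < n_blocks; ++b) {
            const std::size_t offset = static_cast<std::size_t>(b) * block_size;
            const std::size_t count = std::min(block_size, input.size() - offset);
            reader.read(offset, count, buffer);
            for (std::size_t i = 0; i < count; ++i) {
                // Blocks are disjoint, so no two threads write the same cell.
                output[offset + i] = (buffer[i] == target) ? 1 : 0;
            }
        }
    }
    return output;
}
```

The point of the layout is that everything a thread reads through lives inside the parallel region, so nothing comparable to a Dataset object is shared between threads; only the disjoint slices of the output are.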
By the way, I'll be at the FOSS4G code sprint Tuesday afternoon and
Saturday morning if anyone wants to discuss this.
Best,
Ari