[gdal-dev] GDAL raster processing: parallel computing

Ari Jolma ari.jolma at gmail.com
Wed Aug 17 05:28:56 PDT 2016


Hi,

This relates to RFC 62, raster algebra.

I realized that parallel processing is really an essential element of 
this. I don't have a lot of experience with parallel processing and 
threads so please let me know if I'm writing silly or ignorant things.

James, in your emails you write that map and reduce functions are
essential. That seems to point to parallel processing. Can you
elaborate a bit more: what is your approach there, and are you using
any specific libraries?

Rutger mentioned Dask and Numba, which seem to be high-level solutions.

Anyway, I thought I'd try OpenMP with the C++ code I have written so
far. At the simplest level it seems that it might be enough to add
"#pragma omp parallel for" before the for loop that iterates over the
(cached) blocks, and then compile the code with -fopenmp. Of course
this does not work (or it seems to work but does not make the code use
more than one CPU at a time), since a single GDAL Dataset object should
not be used by several threads (GDAL FAQ).

There seems to be a solution in the book "Remote Sensing Raster
Programming", which I found with Google; Google Books shows the
relevant page. The book suggests adding "#pragma omp barrier" before
GDALRasterIO. To me it seems that this would cause all the raster data
to accumulate in RAM. I did not try it, though.

It seems that I should somehow make the code open a new Dataset object
for each thread. The function for that is GDALOpenShared. Now a simple
question: what if the raster is created in the code? My test
application is a simple one: it takes an existing raster and returns a
0/1 raster, where a cell has 1 if the original raster has the value 48
there and 0 elsewhere. Is the solution to create the dataset and then
open new connections to it using GDALOpenShared?

By the way, I'll be at the FOSS4G code sprint Tuesday afternoon and 
Saturday morning if anyone wants to discuss this.

Best,

Ari



