[gdal-dev] How to add OpenMP to GDAL?

Even Rouault even.rouault at mines-paris.org
Fri Nov 4 18:08:58 EDT 2011

On Friday 04 November 2011 22:07:07, 宋小璐 wrote:
> Dear sir,
> I've used GDALWarper to develop a program for image resampling, from a
> resolution of 10m up to 2.5m. In this process I found that
> ChunkAndWarpImage( ) in GDALWarpOperation was so slow that it took me
> almost an hour to process one scene of QuickBird imagery. After checking
> the source code, I would like to propose two ways to make this process
> faster.
> 1. Set dfWarpMemoryLimit larger than 64. Since the computer I am using
> has 12GB of RAM, 64M is rather small under this condition. Besides,
> the smaller dfWarpMemoryLimit is, the more chunks will be created. That
> means we have to do lots of RasterIO( ) calls, which take a long time to
> complete. (Thanks for your advice in the declaration before the
> definition, in which you advise that we should try various schemes to
> query physical RAM. I just want to offer some help to anybody who has the
> same confusion as me, in case they find a solution here through Google
> or something else.)

Increasing dfWarpMemoryLimit will generally help performance, but read 
http://trac.osgeo.org/gdal/wiki/UserDocs/GdalWarp because there are 
non-obvious cases where it doesn't.
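
To illustrate why a larger limit means fewer chunks (and hence fewer
RasterIO() calls), here is a rough sketch in C++. It is a simplified model and
not GDAL's actual chunking code: countChunks and bytesPerPixel are
hypothetical names, and the sketch assumes that a region is halved along its
longer side until the estimated buffer for one chunk fits within the limit.

```cpp
#include <cstdint>

// Hypothetical helper, NOT a GDAL function.
// Counts the chunks obtained when a width x height output region is halved
// along its longer side until one chunk's estimated buffer fits in limitBytes.
// bytesPerPixel is an assumed cost per output pixel for all working buffers.
std::uint64_t countChunks(std::uint64_t width, std::uint64_t height,
                          double bytesPerPixel, double limitBytes)
{
    const double needed = static_cast<double>(width) *
                          static_cast<double>(height) * bytesPerPixel;

    // Stop when the chunk fits, or when it cannot be split any further.
    if (needed <= limitBytes || (width <= 1 && height <= 1))
        return 1;

    if (width >= height) {
        // Split the region into a left and a right half.
        const std::uint64_t left = width / 2;
        return countChunks(left, height, bytesPerPixel, limitBytes) +
               countChunks(width - left, height, bytesPerPixel, limitBytes);
    }

    // Otherwise split the region into a top and a bottom half.
    const std::uint64_t top = height / 2;
    return countChunks(width, top, bytesPerPixel, limitBytes) +
           countChunks(width, height - top, bytesPerPixel, limitBytes);
}
```

Under this model, calling countChunks for a 40000 x 40000 output at an assumed
16 bytes per pixel gives far fewer chunks with a 1 GB limit than with 64 MB.
The real figures depend on the data type, the band count and the resampling
kernel.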

> 2. Add parallelization support to GDAL build options, such as OpenMP or
> CUDA. (I've been able to use CUDA along with GDAL.) Then the question comes:
> how can I add OpenMP to GDAL? I know nothing about CMake or gcc. After
> googling I tried the following methods:
> Since OpenMP could speed up code so easily, would you please help me to
> add it into GDAL? I'm using GDAL 1.8.0 and VS2009, Win7 Ultimate.

First, I just want to make you aware that there's an accelerated 
implementation of the warping algorithm that is OpenCL-based, and which was 
integrated in GDAL 1.8.0.

Then, on your attempt to use OpenMP the way you did, I think you have run into 
an issue because the PROJ based transformer (ogrct.cpp) uses a big mutex with 
PROJ < 4.8. So the parallelization is pretty useless if you leave the 
transformer part inside the parallelized loop. But if you use PROJ >= 4.8dev, 
you could potentially have thread-safety issues if the same transformer object 
is used by different threads. You would need to instantiate it as many times as 
you have OpenMP threads. An alternative would be to do as the OpenCL-based 
implementation does and do the reprojection beforehand. It really depends on 
what bottlenecks you have identified in your use cases: reprojection or 
resampling.
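
A minimal sketch of the "one transformer per thread" idea, in C++ with OpenMP.
ToyTransformer and transformAll are hypothetical names used only for
illustration, not GDAL or PROJ classes; with GDAL you would create the real
transformer inside the parallel region in the same place. The toy class keeps
mutable scratch state, which is the kind of thing that makes sharing one
instance between threads unsafe.

```cpp
#include <vector>

// Hypothetical stand-in for a coordinate transformer that is NOT safe to
// share between threads, because each call writes to internal scratch state.
struct ToyTransformer {
    double scale;
    double scratch = 0.0;  // mutable state: sharing it would be a data race

    explicit ToyTransformer(double s) : scale(s) {}

    double transform(double x) {
        scratch = x * scale;
        return scratch;
    }
};

// Transforms every input value, creating one ToyTransformer per thread.
std::vector<double> transformAll(const std::vector<double>& xs, double scale)
{
    std::vector<double> out(xs.size());
    const long n = static_cast<long>(xs.size());

    #pragma omp parallel
    {
        // Declared inside the parallel region, so each thread gets its own.
        ToyTransformer local(scale);

        // Iterations are divided among the threads of the enclosing region.
        #pragma omp for
        for (long i = 0; i < n; ++i)
            out[i] = local.transform(xs[i]);
    }
    return out;
}
```

OpenMP has to be enabled at compile time (/openmp with Visual Studio,
-fopenmp with gcc). Without the flag the pragmas are ignored and the loop
simply runs serially with a single transformer.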

> Thank you!
>  Xiaolu Song
> 2011-11-05
> Xiaolu Song
> Center for Earth Observation and Digital Earth Chinese Academy of Sciences
> No.9 Dengzhuang South Road, Haidian District, Beijing 100094, China
> Tel: +86-010-82178188 +86-18610335605
> Email:talent_sxl at 126.com
> 中国科学院对地观测与数字地球科学中心
> 北京市海淀区邓庄南路9号
