[gdal-dev] does gdal support multiple simultaneous writers to raster

Frank Warmerdam warmerdam at pobox.com
Mon Jan 14 06:32:34 PST 2013


Jan,

While it is desirable to resolve the multithreaded write issue in GDAL, I
don't think it is critical for your use case.

I think your case would be handled more appropriately by multiple
*processes*.  You would still run into issues with write consistency, but of
a different nature.  The easiest approach is to assign a different spatial
region to each worker server and process each region linearly.
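
As a minimal sketch of that partitioning (not part of GDAL; `Strip` and
`split_rows` are hypothetical names), the rows of the output raster could be
divided into contiguous, non-overlapping strips, one per worker process:

```cpp
#include <cassert>
#include <vector>

// One horizontal strip of the output raster, in pixel rows.
struct Strip {
    int y_off;   // first row of the strip
    int y_size;  // number of rows in the strip
};

// Split `raster_height` rows into `workers` contiguous, non-overlapping
// strips. The first (raster_height % workers) strips get one extra row.
std::vector<Strip> split_rows(int raster_height, int workers) {
    assert(raster_height >= 0 && workers > 0);
    std::vector<Strip> strips;
    const int base = raster_height / workers;
    const int extra = raster_height % workers;
    int y = 0;
    for (int i = 0; i < workers; ++i) {
        const int rows = base + (i < extra ? 1 : 0);
        strips.push_back({y, rows});
        y += rows;
    }
    return strips;
}
```

With 10 rows and 3 workers this yields strips starting at rows 0, 4 and 7.
Each worker would then handle only its own strip, so no two processes write
the same pixels.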

Best regards,
On Jan 12, 2013 8:05 AM, "Jan Hartmann" <j.l.h.hartmann at uva.nl> wrote:

>  I would be interested in an implementation. I'm preparing a proposal to
> georeference the complete cadastral map of the Netherlands in 1832 at a 10
> cm/pixel scale using cloud facilities. Gdalwarp is the central piece of
> software, and distributed processing capabilities would be very important.
> Could you please think about the possibilities, and what kind of funding
> would be required, so I can take that into account during the next few months?
>
> Jan
>
>  On 01/12/2013 04:35 PM, Even Rouault wrote:
>
>  e.g. convert multiple datasets to different output datasets in a
> parallel way.
>
>
> As Frank underlined, there's currently an issue with the global block cache
> regarding write support.
>
> Imagine that you have 2 threads A and B.
> Thread A deals with dataset A, and thread B deals with dataset B.
> Thread A is in the middle of writing some tile/line of dataset A.
> Thread B is trying to fill a new entry in the block cache (with new read data,
> or new data to write). But the block cache is full, so the least recently used
> entry must be discarded. If that entry is a dirty block of dataset A, then it
> must be flushed to disk, in the context of thread B, but at that time thread A
> is also writing data... This might be an issue, since drivers are re-entrant
> (they can be invoked by multiple threads, if each thread deals with a different
> dataset) but not thread-safe.
>
> This specific case could be fixed in different ways:
> A) Making drivers thread-safe (or accessing them through a thread-safe layer),
> that is to say, adding a dataset-level mutex
> B) or having a per-dataset block cache instead of a global block cache
> C) or dealing differently with dirty blocks: only flush them if the operation
> that needs to discard the dirty block is initiated by an operation on the same
> dataset as the dirty block.
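
Option C can be illustrated with a toy model. This is only a sketch under
stated assumptions, not GDAL's block cache: `Block` and `BlockCache` are
hypothetical names, and letting the cache grow past its limit when every
candidate is another dataset's dirty block is an assumption, since the
message above does not say what should happen in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical model of one cached raster block.
struct Block {
    int dataset_id;  // dataset that owns the block
    int block_id;    // block index within that dataset
    bool dirty;      // true if the block holds data not yet written to disk
};

// Toy block cache shared by all datasets; front = least recently used.
class BlockCache {
public:
    explicit BlockCache(std::size_t capacity) : capacity_(capacity) {}

    // Add a block on behalf of b.dataset_id. Returns the number of dirty
    // blocks that had to be flushed to make room (0 or 1).
    int insert(const Block& b) {
        int flushed = 0;
        if (blocks_.size() >= capacity_) {
            // Walk from least to most recently used and evict the first
            // block that is clean or owned by the requesting dataset.
            for (auto it = blocks_.begin(); it != blocks_.end(); ++it) {
                const bool foreign_dirty =
                    it->dirty && it->dataset_id != b.dataset_id;
                if (foreign_dirty) continue;  // never flush another dataset's block
                if (it->dirty) ++flushed;     // a real cache would write it out here
                blocks_.erase(it);
                break;
            }
            // Assumption: if nothing could be evicted, the cache
            // temporarily exceeds its capacity.
        }
        blocks_.push_back(b);
        return flushed;
    }

    bool contains(int dataset_id, int block_id) const {
        for (const Block& b : blocks_)
            if (b.dataset_id == dataset_id && b.block_id == block_id) return true;
        return false;
    }

    std::size_t size() const { return blocks_.size(); }

private:
    std::size_t capacity_;
    std::list<Block> blocks_;
};
```

In this model a request made for dataset B skips a dirty block of dataset A
and evicts a clean block instead, so dataset A's driver is only ever entered
by operations on dataset A.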
>
>
>
>  Would those parallel operations not be affected by GDAL caching for both
> read and write, since the cache is set to a limit? Is accessing the currently
> used cache value concurrency-safe, in order to increase or decrease it?
>
>
> Hmm, I see that GDALGetCacheMax() and GDALSetCacheMax() are not thread-safe
> currently. We would need to protect them by the raster block mutex, with a
> leading call to CPLMutexHolderD( &hRBMutex );
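
The fix described above amounts to taking one lock around every access to
the cache limit. A minimal standalone sketch, using `std::mutex` in place of
GDAL's `CPLMutexHolderD( &hRBMutex )` and hypothetical names in a `sketch`
namespace rather than the real GDAL functions:

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

namespace sketch {

std::mutex cache_mutex;                     // plays the role of hRBMutex
std::int64_t cache_max = 40 * 1024 * 1024;  // arbitrary starting limit, in bytes

// Read the limit under the lock.
std::int64_t get_cache_max() {
    std::lock_guard<std::mutex> lock(cache_mutex);
    return cache_max;
}

// Replace the limit under the lock.
void set_cache_max(std::int64_t bytes) {
    assert(bytes >= 0);
    std::lock_guard<std::mutex> lock(cache_mutex);
    cache_max = bytes;
}

// Increase or decrease the limit. The lock is held across the whole
// read-modify-write so concurrent adjustments cannot lose an update.
std::int64_t adjust_cache_max(std::int64_t delta) {
    std::lock_guard<std::mutex> lock(cache_mutex);
    cache_max += delta;
    return cache_max;
}

}  // namespace sketch
```

Note that calling a getter followed by a setter from application code would
still race even if each call is locked individually; that is why the sketch
has a separate adjust function.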
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>