<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<font face="Times New Roman, Times, serif">You probably know this,
but there is an option to let gdalwarp use more cores: -wo
NUM_THREADS=ALL_CPUS. It gives some improvement, but nothing
staggering. Splitting up operations over individual tiles would
really speed things up. Even if I use only one VM, I can define
32 cores, and it would certainly be interesting to experiment with
tools like MPI to integrate multiple VMs into one computing
cluster.<br>
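<br>
A minimal sketch of the per-tile idea in Python, under stated assumptions: the file names, target SRS, resolution and extents are placeholders, and both helper functions are hypothetical. It only builds gdalwarp command lines (using -te for the tile extent, -tr so that all tiles share one pixel size, and the -wo NUM_THREADS=ALL_CPUS option mentioned above); it does not run them.<br>
<pre>
```python
def tile_extents(xmin, ymin, xmax, ymax, nx, ny):
    """Split a bounding box into an nx-by-ny grid of (xmin, ymin, xmax, ymax) tiles."""
    dx = (xmax - xmin) / nx
    dy = (ymax - ymin) / ny
    tiles = []
    for j in range(ny):
        for i in range(nx):
            tiles.append((xmin + i * dx, ymin + j * dy,
                          xmin + (i + 1) * dx, ymin + (j + 1) * dy))
    return tiles


def gdalwarp_command(src, dst, extent, t_srs, res):
    """Build (but do not run) one gdalwarp invocation for a single output tile."""
    return ["gdalwarp",
            "-t_srs", t_srs,                       # target SRS (placeholder value)
            "-te", *[str(v) for v in extent],      # tile extent in target SRS units
            "-tr", str(res), str(res),             # same pixel size for every tile
            "-wo", "NUM_THREADS=ALL_CPUS",         # warp option from the message above
            src, dst]


# Hypothetical example: a 4 x 2 grid over a 100 x 50 unit extent.
tiles = tile_extents(0.0, 0.0, 100.0, 50.0, 4, 2)
commands = [gdalwarp_command("input.tif", "tile_%d.tif" % n, ext, "EPSG:3857", 10.0)
            for n, ext in enumerate(tiles)]
```
</pre>
Each command could then be handed to subprocess.run, one per core or per VM; whether that beats a single multi-threaded gdalwarp run would need measuring.<br>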
<br>
Jan<br>
<br>
</font>
<div class="moz-cite-prefix">On 01/12/2013 02:38 AM, Kennedy, Paul
wrote:<br>
</div>
<blockquote
cite="mid:1E52512B-ADAD-45AD-810C-CA1B8E7F179A@fugro.com.au"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<div>Hi,</div>
<div>Yes, we are pretty sure we will see a significant benefit.
The processing algorithms are CPU-bound, not I/O-bound. Our
digital terrain model interpolations often run for many hours
(we do them overnight), but the underlying file is only a few
gigabytes. If we split them into multiple tile files and run
each in a dedicated process, the whole thing is quicker, but this
is messy and results in stitching errors. </div>
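<div><br>
</div>
<div>A rough sketch of a tile-per-process setup in Python, under assumptions: the function names are hypothetical, process_window is only a placeholder for the real CPU-bound interpolation, and the overlap ("halo") around each tile is one possible way to try to reduce seams at tile edges, not a confirmed fix for the stitching error described above.</div>
<pre>
```python
from concurrent.futures import ProcessPoolExecutor


def overlapping_windows(width, height, tile, halo):
    """Yield (core, padded) pixel windows as (x0, y0, x1, y1) tuples.

    The core windows partition the raster exactly; each padded window
    extends its core by up to 'halo' pixels, clipped to the raster bounds.
    """
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            x1, y1 = min(x0 + tile, width), min(y0 + tile, height)
            padded = (max(x0 - halo, 0), max(y0 - halo, 0),
                      min(x1 + halo, width), min(y1 + halo, height))
            yield (x0, y0, x1, y1), padded


def process_window(job):
    """Placeholder (hypothetical) for the CPU-bound step: a real version
    would compute over 'padded' and keep only the 'core' part of the result."""
    core, padded = job
    return core


def run_all(width, height, tile, halo, workers):
    """Process every window in a pool of worker processes; return the cores."""
    jobs = list(overlapping_windows(width, height, tile, halo))
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_window, jobs))
```
</pre>
<div>On platforms that start worker processes by spawning, a call such as run_all(20000, 20000, 2048, 32, 8) would need to sit under an if __name__ == "__main__": guard.</div>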
<div><br>
</div>
<div>Another example is gdalwarp. It takes quite some time with a
large data set and would be a good candidate for
parallelisation, as would gdaladdo. </div>
<div><br>
</div>
<div>I believe slower cores, but more of them, in PCs are the
future. My PC has 8, but they rarely get used to their
potential. <br>
<br>
I am certain there are some challenges here; that's why it is
interesting ;)</div>
<div><br>
Regards
<div>pk</div>
</div>
<div><br>
On 11/01/2013, at 6:54 PM, "Even Rouault" &lt;<a
moz-do-not-send="true"
href="mailto:even.rouault@mines-paris.org">even.rouault@mines-paris.org</a>&gt;
wrote:<br>
<br>
</div>
<blockquote type="cite">
<div>
<title>Re: [gdal-dev] does gdal support multiple simultaneous
writers to raster</title>
<!-- Converted from text/plain format -->
<p><font size="2">Hi,<br>
<br>
This is an interesting topic, with many "intersecting"
issues to deal with at<br>
different levels.<br>
<br>
First, are you confident that, in the use cases you imagine,
I/O access won't<br>
be the limiting factor? If so, serialization of I/O
could be acceptable,<br>
and this would just require an API with a dataset-level
mutex.<br>
<br>
There are several places where parallel write should be
addressed:<br>
- The GDAL core mechanisms that deal with the block cache<br>
- Each GDAL driver where parallel write would be
supported. I guess that GDAL<br>
drivers should advertise a specific capability<br>
- The low-level library used by the driver. In the case of
the GTiff driver, libtiff<br>
<br>
And finally, as Frank underlined, there are intrinsic
limitations due to the<br>
format itself. For a compressed TIFF, at some point, you
have to serialize the<br>
writing of the tile, because you cannot know in advance
the size of the<br>
compressed data, or at least have some coordination of the
writers so that a<br>
"next offset available" is properly synchronized between
them. The compression<br>
itself could be serialized.<br>
<br>
I'm not sure, however, whether what Jan mentioned, different
processes writing the same<br>
dataset, is doable.<br>
<br>
</font>
</p>
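<p><font size="2">The "next offset available" coordination described above can be sketched in Python as follows. This is an illustration only, under stated assumptions: it writes a toy container of zlib-compressed blobs, not a TIFF, it does not use GDAL or libtiff, and OffsetAllocator and write_tiles are hypothetical names. Each worker compresses its tile independently, and only the reservation of the output byte range is serialized behind a lock.</font></p>
<pre>
```python
import threading
import zlib
from concurrent.futures import ThreadPoolExecutor


class OffsetAllocator:
    """Hands out non-overlapping byte ranges; the only serialized step."""

    def __init__(self, start=0):
        self._next = start
        self._lock = threading.Lock()

    def reserve(self, size):
        """Atomically reserve 'size' bytes and return their start offset."""
        with self._lock:
            offset = self._next
            self._next += size
            return offset


def write_tiles(path, tiles, workers=4):
    """Compress tiles concurrently; return {tile_index: (offset, size)}."""
    allocator = OffsetAllocator()

    def work(item):
        i, raw = item
        data = zlib.compress(raw)              # size is unknown until this finishes
        offset = allocator.reserve(len(data))  # synchronized "next offset available"
        with open(path, "r+b") as f:           # each worker uses its own file handle
            f.seek(offset)
            f.write(data)
        return i, (offset, len(data))

    open(path, "wb").close()                   # create an empty output file
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(work, enumerate(tiles)))
```
</pre>
<p><font size="2">The returned index plays the role that a tile offset and byte count table would play in a real format; tiles may land in the file in any order, which is why the index has to be recorded.</font></p>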
</div>
</blockquote>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
gdal-dev mailing list
<a class="moz-txt-link-abbreviated" href="mailto:gdal-dev@lists.osgeo.org">gdal-dev@lists.osgeo.org</a>
<a class="moz-txt-link-freetext" href="http://lists.osgeo.org/mailman/listinfo/gdal-dev">http://lists.osgeo.org/mailman/listinfo/gdal-dev</a></pre>
</blockquote>
<br>
</body>
</html>