<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hello Even,<br>
<br>
I've had a chance to test the fix in trunk and can report that it
works very well: `gdalbuildvrt` completed in just over an hour,
with the progress meter now tracking the work much more
accurately.<br>
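<br>
For anyone repeating this with a similar number of tiles, here is a minimal sketch of driving the run from a file list rather than the shell command substitution used in the command quoted below. It relies on the documented `-input_file_list` option of `gdalbuildvrt`; the helper names `write_file_list` and `buildvrt_command` are hypothetical, and the sketch only builds the command without running it.<br>
<pre>
```python
import os
import tempfile


def write_file_list(root, list_path, suffix=".jpg"):
    """Write one source path per line and return how many were written."""
    count = 0
    with open(list_path, "w") as out:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # make the traversal order deterministic
            for name in sorted(filenames):
                # Case-insensitive match, like find's -iname
                if name.lower().endswith(suffix):
                    out.write(os.path.join(dirpath, name) + "\n")
                    count += 1
    return count


def buildvrt_command(vrt_path, list_path):
    """Return the argv for gdalbuildvrt; the caller decides whether to run it."""
    return ["gdalbuildvrt", "-input_file_list", list_path, vrt_path]
```
</pre>
The returned list can be passed to `subprocess.run` where GDAL is installed. Keeping the paths in a file avoids placing several hundred thousand arguments on one command line.<br>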
<br>
I have submitted an enhancement request regarding the VRT indexing
at <a class="moz-txt-link-rfc2396E" href="http://trac.osgeo.org/gdal/ticket/5762">&lt;http://trac.osgeo.org/gdal/ticket/5762&gt;</a>.<br>
<br>
Many thanks and best regards,<br>
<br>
Homme<br>
<br>
<div class="moz-cite-prefix">On 03/12/14 10:31, Homme Zwaagstra
wrote:<br>
</div>
<blockquote cite="mid:547EE679.9070007@geodata.soton.ac.uk"
type="cite">
Even,<br>
<br>
On 03/12/14 10:24, Even Rouault wrote:<br>
<span style="white-space: pre;">> Homme,<br>
><br>
>><br>
>> I've come up against a problem with `gdalbuildvrt`
taking a long time to<br>
>> create<br>
>> a VRT when it is passed a large number of source
datasets. I am trying<br>
>> to create<br>
>> a VRT file for a zoom level in a TMS structure
containing JPEG tiles. The<br>
>> command I'm using is:<br>
>><br>
>> gdalbuildvrt output.vrt `find ./tiles/18 -iname '*.jpg'
-printf "%p "`<br>
>><br>
>> where the number of tiles is:<br>
>><br>
>> $ find ./tiles/18 -iname '*.jpg' | wc -l<br>
>> 767104<br>
>><br>
>> The processing seemed to progress reasonably quickly
with the progress bar<br>
>> outputting '0... etc ...100 - done'. However
`gdalbuildvrt` continued<br>
>> running<br>
>> until I killed it 8 hours later. Looking at
`output.vrt` just before I<br>
>> killed<br>
>> the program showed it remained empty (0 bytes).<br>
><br>
> I've had a look at the code, and I spotted a potential
performance <br>
> problem when serializing the in-memory VRT into the XML with
a big number of <br>
> sources. I've just committed an improvement into trunk that
will make the <br>
> complexity of source serialization linear instead of
quadratic.</span><br>
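<br>
To illustrate the difference between quadratic and linear serialization, here is a minimal sketch of one common cause. It is an assumption for illustration, not a description of the actual change in trunk: if each new child is attached by walking the parent's sibling chain to its end, adding n children costs on the order of n squared steps, whereas remembering the last sibling makes each append constant time. The `Node` class and function names are hypothetical.<br>
<pre>
```python
class Node:
    """A tree node whose children form a singly linked sibling chain."""

    def __init__(self, name):
        self.name = name
        self.first_child = None
        self.next_sibling = None


def append_by_walking(parent, child):
    """Walk the sibling chain to its end on every call: O(n) per append.

    Returns the number of links followed, to make the cost visible.
    """
    steps = 0
    if parent.first_child is None:
        parent.first_child = child
        return steps
    last = parent.first_child
    while last.next_sibling is not None:
        last = last.next_sibling
        steps += 1
    last.next_sibling = child
    return steps


def append_all_with_tail(parent, children):
    """Remember the last sibling between appends: O(1) per append.

    Assumes the parent has no children yet.
    """
    tail = None
    for child in children:
        if tail is None:
            parent.first_child = child
        else:
            tail.next_sibling = child
        tail = child


def child_names(parent):
    """Collect child names in order, for comparing the two approaches."""
    names = []
    node = parent.first_child
    while node is not None:
        names.append(node.name)
        node = node.next_sibling
    return names
```
</pre>
Both approaches build the same tree. With several hundred thousand sources, only the second finishes in reasonable time.<br>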
<br>
Many thanks! I will give it a spin and report back...<br>
<br>
<span style="white-space: pre;">><br>
>><br>
>> Before digging any deeper is there something I'm
missing? Am I expecting<br>
>> too much of `gdalbuildvrt`, or indeed the VRT format,
in processing this<br>
>> many source<br>
>> datasets?<br>
>><br>
>> Conceptually in this instance it seems as if it would
be useful for a<br>
>> VRT file<br>
>> (and `gdalbuildvrt`) to reference the output of
`gdaltindex` or something<br>
>> similar. I'm not sure how efficiently source datasets
are indexed in<br>
>> VRTs and<br>
>> whether this might be contributing to the problem?<br>
><br>
> There's no indexing in VRT. So yes for that big number of
sources, there might <br>
> be performance problems since each RasterIO() request will
have to go test if <br>
> each source intersects the requested area of interest.
Adding an in-memory <br>
> spatial index after opening the VRT would likely be
possible, provided that <br>
> the non-negligible size of the VRT/XML doesn't make
opening it too slow. That <br>
> depends on the use cases.<br>
><br>
> Yes, perhaps referencing a shapefile tile index could be a
possible <br>
> enhancement.</span><br>
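<br>
As a rough illustration of the in-memory spatial index idea above, here is a minimal sketch of a uniform grid index over source extents. It is not GDAL code; the `GridIndex` class, its methods, and the tile names are hypothetical. A request window is tested only against the sources registered in the grid cells it touches, rather than against every source.<br>
<pre>
```python
from collections import defaultdict


class GridIndex:
    """Bucket source extents into fixed-size cells for fast window lookups."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (col, row) -> list of source names
        self.extents = {}               # source name -> (xmin, ymin, xmax, ymax)

    def _cell_range(self, xmin, ymin, xmax, ymax):
        size = self.cell_size
        cols = range(int(xmin // size), int(xmax // size) + 1)
        rows = range(int(ymin // size), int(ymax // size) + 1)
        return [(c, r) for c in cols for r in rows]

    def insert(self, source, xmin, ymin, xmax, ymax):
        """Register a source in every cell its extent touches."""
        self.extents[source] = (xmin, ymin, xmax, ymax)
        for cell in self._cell_range(xmin, ymin, xmax, ymax):
            self.cells[cell].append(source)

    def query(self, xmin, ymin, xmax, ymax):
        """Return the sources whose extent overlaps the requested window."""
        hits = set()
        for cell in self._cell_range(xmin, ymin, xmax, ymax):
            for source in self.cells.get(cell, ()):
                sx0, sy0, sx1, sy1 = self.extents[source]
                overlap_x = min(sx1, xmax) - max(sx0, xmin)
                overlap_y = min(sy1, ymax) - max(sy0, ymin)
                # Extents that only share an edge do not count as overlapping
                if overlap_x > 0 and overlap_y > 0:
                    hits.add(source)
        return sorted(hits)
```
</pre>
For regularly tiled sources such as a TMS zoom level, a cell size equal to the tile size keeps each bucket small. An R-tree would suit irregular extents better.<br>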
<br>
Ok, that's useful to know, thanks. Unless I hear back otherwise,
I'll submit an<br>
enhancement request on the issue tracker to bookmark the issue.<br>
<br>
Best regards,<br>
<br>
Homme<br>
<br>
<span style="white-space: pre;">><br>
><br>
> Even<br>
></span><br>
<br>
<br>
</blockquote>
<br>
</body>
</html>