<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Even,<br>
<br>
On 03/12/14 10:24, Even Rouault wrote:<br>
<span style="white-space: pre;">> Homme,<br>
><br>
>><br>
>> I've come up against a problem with `gdalbuildvrt` taking
a long time to<br>
>> create<br>
>> a VRT when it is passed a large number of source
datasets. I am trying<br>
>> to create<br>
>> a VRT file for a zoom level in a TMS structure containing
JPEG tiles. The<br>
>> command I'm using is:<br>
>><br>
>> gdalbuildvrt output.vrt `find ./tiles/18 -iname '*.jpg'
-printf "%p "`<br>
>><br>
>> where the number of tiles is:<br>
>><br>
>> $ find ./tiles/18 -iname '*.jpg' | wc -l<br>
>> 767104<br>
>><br>
>> The processing seemed to progress reasonably quickly with
the progress bar<br>
>> outputting '0... etc ...100 - done'. However
`gdalbuildvrt` continued<br>
>> running<br>
>> until I killed it 8 hours later. Looking at `output.vrt`
just before I<br>
>> killed<br>
>> the program showed it remained empty (0 bytes).<br>
><br>
> I've had a quick look at the code, and I spotted a potential
performance <br>
> problem when serializing the in-memory VRT into the XML with a
big number of <br>
> sources. I've just committed an improvement into trunk that
will make the <br>
> complexity of source serialization linear instead of
quadratic.</span><br>
<br>
Many thanks! I will give it a spin and report back...<br>
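<br>
To check my own understanding of "linear instead of quadratic", here is a rough Python sketch. I haven't read the commit, so this rests on an assumption about the cause rather than on the actual GDAL code: that each source node was being appended to a linked list of siblings by walking the list from its head. The `Node` class and all function names are made up for illustration.<br>
<pre>
```python
class Node:
    """One element in a singly linked list of sibling XML nodes (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.next = None


def append_by_walking(head, node):
    """Append by walking from the head: O(n) per call, O(n^2) over n calls."""
    steps = 0
    cur = head
    while cur.next is not None:
        cur = cur.next
        steps += 1
    cur.next = node
    return steps


def build_quadratic(n):
    """Attach n sources, locating the tail from scratch each time."""
    head = Node("band")
    total_steps = 0
    for i in range(n):
        total_steps += append_by_walking(head, Node(f"source{i}"))
    return head, total_steps


def build_linear(n):
    """Attach n sources while remembering the tail: O(1) per append."""
    head = Node("band")
    tail = head
    for i in range(n):
        tail.next = Node(f"source{i}")
        tail = tail.next
    return head, n


def names(head):
    """Collect the child names in list order, skipping the head node."""
    out = []
    cur = head.next
    while cur is not None:
        out.append(cur.name)
        cur = cur.next
    return out
```
</pre>
Both builders produce the same list, but the first takes n(n-1)/2 walking steps in total. With 767104 sources that is on the order of 10^11 steps, which would fit with the progress bar finishing long before the file was written.<br>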
<br>
<span style="white-space: pre;">><br>
>><br>
>> Before digging any deeper is there something I'm missing?
Am I expecting<br>
>> too much of `gdalbuildvrt`, or indeed the VRT format, in
processing this<br>
>> many source<br>
>> datasets?<br>
>><br>
>> Conceptually in this instance it seems as if it would be
useful for a<br>
>> VRT file<br>
>> (and `gdalbuildvrt`) to reference the output of
`gdaltindex` or something<br>
>> similar. I'm not sure how efficiently source datasets
are indexed in<br>
>> VRTs and<br>
>> whether this might be contributing to the problem?<br>
><br>
> There's no indexing in VRT. So yes for that big number of
sources, there might <br>
> be performance problems since each RasterIO() request will
have to test whether <br>
> each source intersects the requested area of interest.
Adding an in-memory <br>
> spatial index after opening the VRT would likely be possible,
provided that <br>
> the non-negligible size of the VRT/XML doesn't make opening
it too slow. That <br>
> depends on the use cases.<br>
><br>
> Yes, perhaps referencing a shapefile tile index could be a
possible <br>
> enhancement.</span><br>
<br>
Ok, that's useful to know, thanks. Unless I hear back otherwise,
I'll submit an<br>
enhancement request on the issue tracker to bookmark the issue.<br>
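<br>
For the tracker entry, here is a minimal Python sketch of what an in-memory index over source extents might look like, compared with testing every source on each request. None of this is GDAL API: `GridIndex`, `intersects` and `linear_scan` are hypothetical names, and the uniform grid is just my choice for brevity (an R-tree or quadtree would serve the same purpose). Extents are assumed to be (xmin, ymin, xmax, ymax) tuples.<br>
<pre>
```python
import math
from collections import defaultdict


def intersects(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes overlap with non-zero area."""
    return a[2] > b[0] and b[2] > a[0] and a[3] > b[1] and b[3] > a[1]


def linear_scan(extents, window):
    """Unindexed lookup: test every source against the requested window."""
    return sorted(n for n, box in extents.items() if intersects(box, window))


class GridIndex:
    """Bucket source extents into fixed-size cells for faster window lookups."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (cx, cy) maps to source names
        self.extents = {}               # source name maps to its box

    def _cell_range(self, box):
        """Return the inclusive range of grid cells that a box touches."""
        xmin, ymin, xmax, ymax = box
        c = self.cell_size
        return (math.floor(xmin / c), math.floor(ymin / c),
                math.floor(xmax / c), math.floor(ymax / c))

    def add(self, name, box):
        """Register a source in every cell its extent touches."""
        self.extents[name] = box
        cx0, cy0, cx1, cy1 = self._cell_range(box)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                self.cells[(cx, cy)].append(name)

    def query(self, window):
        """Return the sources intersecting the window, checking only nearby cells."""
        cx0, cy0, cx1, cy1 = self._cell_range(window)
        candidates = set()
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                candidates.update(self.cells.get((cx, cy), ()))
        return sorted(n for n in candidates
                      if intersects(self.extents[n], window))
```
</pre>
For a regular tile pyramid like mine, a cell size equal to the tile size means each query touches only the handful of cells under the requested window, whatever the total number of sources. Building the index still requires reading every source entry once when the VRT is opened.<br>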
<br>
Best regards,<br>
<br>
Homme<br>
<br>
<span style="white-space: pre;">><br>
><br>
> Even<br>
></span><br>
<br>
<br>
</body>
</html>