<div dir="ltr">Another thing that might speed up access is setting the config option <span style="color:rgb(0,0,0);font-family:Arial,Verdana,'Bitstream Vera Sans',Helvetica,sans-serif;letter-spacing:-0.018em">GDAL_DISABLE_READDIR_ON_OPEN = TRUE, either as an environment variable or on the command line. That should stop GDAL from reading the directory listing each time it opens a dataset. I have an application that reads one value from each of a large number of datasets, and setting this option made it run about 3 times faster.</span><div class="gmail_extra">
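As a rough sketch of the two ways to set it, a Python wrapper could build the call as follows. The helper name and the use_env flag are made up for this example; --config KEY VALUE is the generic GDAL command-line switch, and nothing is executed here:<br>
<pre>
```python
import os


def build_locationinfo_call(dataset_url, x, y, use_env=False):
    """Return (argv, env) for a gdallocationinfo call with directory
    listing disabled. Hypothetical helper: it only builds the call."""
    env = dict(os.environ)
    argv = ["gdallocationinfo"]
    if use_env:
        # Variant 1: set the option as an environment variable.
        env["GDAL_DISABLE_READDIR_ON_OPEN"] = "TRUE"
    else:
        # Variant 2: pass the option on the command line.
        argv += ["--config", "GDAL_DISABLE_READDIR_ON_OPEN", "TRUE"]
    # /vsicurl/ makes GDAL read the remote file over HTTP.
    argv += ["/vsicurl/" + dataset_url, "-geoloc", str(x), str(y)]
    return argv, env
```
</pre>
The returned pair could then be passed to subprocess.run(argv, env=env).<br>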
Luke<br><br><div class="gmail_quote">On Sun, Feb 2, 2014 at 2:12 PM, Jukka Rahkonen <span dir="ltr"><<a href="mailto:jukka.rahkonen@mmmtike.fi" target="_blank">jukka.rahkonen@mmmtike.fi</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi,<br>
<br>
I made a few tests and here are my conclusions. The hypothesis is that someone<br>
wants to build a DEM query service that uses gdallocationinfo for queries,<br>
with the DEM data accessed as files from a standard web site. I compared<br>
three alternatives:<br>
1) There are thousands of DEM files on the server and they are combined<br>
together with a VRT file.<br>
2) There is only one DEM file as BigTIFF.<br>
3) The DEM is split into tiles in an x/y/z tile directory structure, as in<br>
Google Maps or OpenStreetMap tiles.<br>
<br>
My test data covers Finland at a 10 m grid size; as Deflate-compressed<br>
TIFFs the files total about 10 GB.<br>
<br>
Before going on, keep in mind that speed needs indexes. The better the<br>
index, the less unnecessary data has to be read. In case 1) the first-level<br>
index is the VRT file. The second-level index, if it exists, is in the headers of<br>
the real DEM files. It may be possible to jump to the correct offset from the<br>
beginning of the DEM data and read only a part of the file. In case 2) the<br>
index is in the internal TIFF directory. If the BigTIFF is tiled, access<br>
to the tiles should be rather efficient. And finally, in case 3) the index is<br>
built into the directory structure and tiling schema used for saving the<br>
tiles. The schema is so well known that tile map service clients can<br>
directly ask for a certain file name if they know the coordinates and scale.<br>
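<br>
To illustrate how such a tiling schema works as an index, here is a minimal sketch of the usual OpenStreetMap "slippy map" calculation. It assumes Web Mercator tiles and a zoom/x/y.png path layout, and the function names are made up for this example:<br>
<pre>
```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return (x, y) of the Web Mercator tile containing the point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_path(lon_deg, lat_deg, zoom):
    """Build the relative file name a client could request directly."""
    x, y = lonlat_to_tile(lon_deg, lat_deg, zoom)
    return "{}/{}/{}.png".format(zoom, x, y)
```
</pre>
No directory listing or header read is needed: the coordinates and the zoom level alone give the file name.<br>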
<br>
Conclusions:<br>
<br>
1)<br>
- The whole VRT file must be read. Caching the VRT file would make subsequent<br>
requests faster.<br>
- For some reason gdallocationinfo wants to get the listing of the<br>
directory where the VRT file is. This is slow and generates lots of traffic<br>
if the thousands of DEM files are in the same directory. It would probably<br>
be faster to keep them in another directory.<br>
<br>
2)<br>
- The BigTIFF route is more straightforward, but gdallocationinfo still needs to<br>
do many big range reads.<br>
- In this case, too, gdallocationinfo reads the directory of the target file. It<br>
would be good to keep this directory small. Don't do as I did and keep,<br>
next to the BigTIFF DEM file that was the only file needed,<br>
also the VRT and the thousands of original DEMs from the previous test. At<br>
least this is a known issue now and I know how to avoid it. In my case<br>
reading the directory generated 2.2 MB of web traffic, all or most of it in vain.<br>
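<br>
For reference, the range reads mentioned above are ordinary HTTP requests with a Range header. A minimal sketch with Python's standard library, where the helper name, the URL and the byte offsets are placeholders and not values from the tests:<br>
<pre>
```python
import urllib.request


def make_range_request(url, first_byte, last_byte):
    """Build (but do not send) a request for an inclusive byte range."""
    request = urllib.request.Request(url)
    request.add_header("Range", "bytes={}-{}".format(first_byte, last_byte))
    return request


# Placeholder URL and offsets, for illustration only.
req = make_range_request("http://example.com/dem_10m.tif", 0, 16383)
print(req.get_header("Range"))
```
</pre>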
<br>
3)<br>
- I used the OpenStreetMap tile service as the test data for the third test. In<br>
this case gdallocationinfo knows exactly which tile to request and<br>
makes only one request. It also seems to cache some tiles on the client<br>
side, which means that queries for nearby locations may hit a cached tile<br>
and be very fast.<br>
<br>
Summary statistics:<br>
<br>
1) gdallocationinfo makes 6 requests and reads 6.4 MB of data<br>
2) gdallocationinfo makes 8 requests and reads 4.3 MB of data<br>
3) gdallocationinfo makes 1 request and reads 10 kB of data<br>
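<br>
To put these figures side by side, here is a quick calculation. It assumes 1 MB = 1000 kB, and the dictionary keys are just labels for this example:<br>
<pre>
```python
# Data volume per query in kB, taken from the summary statistics.
traffic_kb = {"vrt": 6.4 * 1000, "bigtiff": 4.3 * 1000, "tiles": 10}


def ratio_to_tiles(case):
    """How many times more data a case reads than the tiled case."""
    return traffic_kb[case] / traffic_kb["tiles"]


for case in ("vrt", "bigtiff"):
    print(case, round(ratio_to_tiles(case)))
```
</pre>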
<br>
The requests I used are these:<br>
<br>
1)<br>
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc 389559 6677412<br>
2)<br>
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc 389559 6677412<br>
3)<br>
gdallocationinfo frmt_wms_openstreetmap_tms.xml -geoloc 389559 6677412<br>
<br>
I know that the queried place in 3) is not the same, because the SRIDs of the data<br>
differ, and OSM does not return 16-bit DEM heights but 3-band RGB values instead,<br>
but that does not matter here; the idea is what is important.<br>
<br>
My conclusion is that you should cut your DEM into tiles with, for example,<br>
gdal2tiles or MapTiler. The result could actually be quite speedy, and<br>
using 126x126 tiles might make it a bit faster still. I hope that they<br>
can create tiles as 16-bit TIFFs.<br>
<br>
I am sure that these results are not scientifically sound but I am also<br>
sure that the difference between 6.4 MB/4.3 MB/10 kB is something to think<br>
about especially if you dream about a mobile service.<br>
<br>
I placed the requests that gdallocationinfo made during these tests at<br>
<a href="http://latuviitta.org/documents/gdallocationinfo_requests.txt" target="_blank">http://latuviitta.org/documents/gdallocationinfo_requests.txt</a><br>
<span class="HOEnZb"><font color="#888888"><br>
-Jukka Rahkonen-<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
</div></div></blockquote></div><br></div></div>