[gdal-dev] Fast Pixel Access

Jukka Rahkonen jukka.rahkonen at mmmtike.fi
Sun Feb 2 11:12:37 PST 2014


Hi,

I made a few tests and here are my conclusions. The scenario is that someone
wants to build a DEM query service that uses gdallocationinfo for
queries, with the DEM data accessed as files from a standard web server. I
compared three alternatives:
1) There are thousands of DEM files on the server and they are combined
with a VRT file.
2) There is only one DEM file, stored as a BigTIFF.
3) The DEM is split into tiles stored in an x/y/z tile directory structure,
as with Google Maps or OpenStreetMap tiles.

My test data covers Finland with a 10 m grid size; as deflate-compressed
TIFFs the files total about 10 GB.

Before going on, keep in mind that speed depends on indexes: the better the
index, the less unnecessary data has to be read. In case 1) the first-level
index is the VRT file. The second-level index, if it exists, is in the
headers of the actual DEM files; it may make it possible to jump to the
correct offset in the DEM data and read only a part of the file. In case 2)
the index is the internal TIFF directory. If the BigTIFF is tiled, access
to the tiles should be rather effective. Finally, in case 3) the index is
built into the directory structure and tiling schema used for saving the
tiles. The schema is so well known that tile map service clients can
directly request a certain file name if they know the coordinates and scale.
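
To illustrate why case 3) needs no index lookup, here is a minimal sketch of
how a client can derive a tile file name directly from a coordinate. It
assumes the OpenStreetMap "slippy map" convention (Web Mercator, z/x/y path
order) and a hypothetical ".tif" extension; a DEM tile set in another
projection or tiling scheme would need different arithmetic.

```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return (x, y) of the slippy-map tile containing a WGS84 point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_path(lon_deg, lat_deg, zoom, ext="tif"):
    """Build the relative z/x/y path; the extension is a hypothetical choice."""
    x, y = lonlat_to_tile(lon_deg, lat_deg, zoom)
    return f"{zoom}/{x}/{y}.{ext}"


if __name__ == "__main__":
    # Roughly Helsinki; the zoom level is an arbitrary example value.
    print(tile_path(24.94, 60.17, 10))
```

The point is that the "index" here is pure arithmetic on the client side, so
no request is spent on finding out where the data is.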

Conclusions:

1)
- The whole VRT file must be read. Caching the VRT file would make subsequent
requests faster.
- For some reason gdallocationinfo wants to get the directory listing of the
directory where the VRT file is. This is slow and generates lots of traffic
if the thousands of DEM files are in the same directory. It would probably
be faster to keep them in another directory.

2)
- The BigTIFF route is more straightforward, but gdallocationinfo still needs
to do many big range reads.
- In this case too, gdallocationinfo reads the directory of the target file.
It would be good to keep this directory small. Don't do as I did: my
directory held not only the BigTIFF DEM file, which was the only file needed,
but also the VRT and the thousands of original DEMs from the previous test.
At least this is a known issue now, and I know how to avoid it. In my case
reading the directory generated 2.2 MB of web traffic, all or most of it in
vain.

3)
- I used the OpenStreetMap tile service as the test data for the third test.
In this case gdallocationinfo knows exactly which tile to request and makes
only one request. It also seems to cache some tiles on the client side, which
means that queries for nearby locations may hit a cached tile and be very
fast.
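
Both the VRT and the BigTIFF tests triggered a directory listing. GDAL has a
configuration option, GDAL_DISABLE_READDIR_ON_OPEN, that is meant to stop
drivers from listing sibling files when a dataset is opened. The sketch below
only builds the command line with that option; whether it removes the listing
seen in these particular tests is an assumption that was not verified here.
The helper name build_locationinfo_cmd is hypothetical.

```python
import shlex


def build_locationinfo_cmd(dataset, x, y, disable_readdir=True):
    """Return the argument list for a gdallocationinfo point query."""
    cmd = ["gdallocationinfo"]
    if disable_readdir:
        # Assumption: this GDAL configuration option keeps the driver from
        # listing the directory that contains the dataset.
        cmd += ["--config", "GDAL_DISABLE_READDIR_ON_OPEN", "TRUE"]
    cmd += [dataset, "-geoloc", str(x), str(y)]
    return cmd


if __name__ == "__main__":
    cmd = build_locationinfo_cmd(
        "/vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif",
        389559, 6677412)
    # Print the command instead of running it, so no GDAL install is needed.
    print(" ".join(shlex.quote(part) for part in cmd))
```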

Summary statistics:

1) gdallocationinfo makes 6 requests and reads 6.4 MB of data
2) gdallocationinfo makes 8 requests and reads 4.3 MB of data
3) gdallocationinfo makes 1 request and reads 10 kB of data

Requests I used are these:

1)
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc 389559 6677412
2)
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc 389559 6677412
3)
gdallocationinfo  frmt_wms_openstreetmap_tms.xml -geoloc  389559 6677412

I know that the queried place in 3) is not the same, because the SRIDs of the
data differ, and that OSM returns 3-band RGB values rather than 16-bit DEM
heights. That does not matter here; the idea is what is important.

My conclusion is that you should cut your DEM into tiles with, for example,
gdal2tiles or MapTiler. The result could actually be quite speedy, and using
126x126 tiles could perhaps make it a bit faster still. I hope that these
tools can create tiles as 16-bit TIFFs.
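
As a rough illustration of why a smaller tile could help, the uncompressed
payload of one 16-bit single-band tile grows with the square of the tile
edge. The 256-pixel edge used for comparison is an assumption (a common tile
size), not something measured in these tests, and real transfers differ
because of compression and TIFF header overhead.

```python
def raw_tile_bytes(edge_px, bytes_per_sample=2, bands=1):
    """Uncompressed size in bytes of a square tile (16-bit => 2 bytes)."""
    return edge_px * edge_px * bytes_per_sample * bands


if __name__ == "__main__":
    # 256 is an assumed comparison size; 126 is the size mentioned above.
    for edge in (256, 126):
        print(edge, raw_tile_bytes(edge))
```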

I am sure that these results are not scientifically sound, but I am also
sure that the difference between 6.4 MB, 4.3 MB and 10 kB is something to
think about, especially if you dream about a mobile service.
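
To put the measured volumes into perspective, here is a small back-of-the-
envelope sketch. The 1 Mbit/s link speed is purely an assumed figure for a
slow mobile connection, MB and kB are taken as 10^6 and 10^3 bytes, and
latency per request is ignored.

```python
# Data volumes per query as measured in the tests above (bytes).
MEASURED_BYTES = {
    "1) VRT over many files": 6.4e6,
    "2) single BigTIFF": 4.3e6,
    "3) tile directory": 10e3,
}


def transfer_seconds(n_bytes, link_bits_per_s):
    """Pure transfer time, ignoring latency and protocol overhead."""
    return n_bytes * 8 / link_bits_per_s


def ratio_to_smallest(sizes):
    """How many times more data each alternative reads than the smallest."""
    smallest = min(sizes.values())
    return {name: size / smallest for name, size in sizes.items()}


if __name__ == "__main__":
    link = 1e6  # assumed 1 Mbit/s mobile link
    ratios = ratio_to_smallest(MEASURED_BYTES)
    for name, size in MEASURED_BYTES.items():
        print(name, round(transfer_seconds(size, link), 2), "s,",
              round(ratios[name]), "x")
```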

I put the requests that gdallocationinfo made during these tests at
http://latuviitta.org/documents/gdallocationinfo_requests.txt

-Jukka Rahkonen-



