[gdal-dev] Fast Pixel Access

David Baker (Geoscience) david.m.baker at chk.com
Wed Feb 5 05:25:16 PST 2014


Luke,

Thank you for this suggestion.  This took the access times from 15-20 seconds down to 1-3 seconds.  The majority of the time seems to be spent on the initial read of the VRT, as subsequent piped locations after the first are returned in under a second.  For my current application, this should be okay.

David


From: gdal-dev-bounces at lists.osgeo.org [mailto:gdal-dev-bounces at lists.osgeo.org] On Behalf Of Luke Roth
Sent: Monday, February 03, 2014 8:11 AM
To: Jukka Rahkonen
Cc: gdal-dev at lists.osgeo.org
Subject: Re: [gdal-dev] Fast Pixel Access

Another thing that might speed up access is setting the config option GDAL_DISABLE_READDIR_ON_OPEN to TRUE, either as an environment variable or on the command line.  That should stop GDAL from reading the directory each time it opens a dataset.  I have an application which reads one value from each of a large number of datasets, and setting this option made it run about 3 times faster.
Luke
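
The two ways of setting the option can be sketched as below. This is only a minimal illustration, not anything from the original test setup: the helper name build_query and the local file name dem_10m.vrt are placeholders, and the command is run only if gdallocationinfo and that file happen to be present.

```python
import os
import shutil
import subprocess


def build_query(dataset, x, y, use_env=True):
    """Return (argv, env) for a gdallocationinfo point query.

    The directory listing on open is disabled either through the
    environment (use_env=True) or through GDAL's generic
    "--config KEY VALUE" command-line switch (use_env=False).
    """
    env = dict(os.environ)
    argv = ["gdallocationinfo"]
    if use_env:
        env["GDAL_DISABLE_READDIR_ON_OPEN"] = "TRUE"
    else:
        argv += ["--config", "GDAL_DISABLE_READDIR_ON_OPEN", "TRUE"]
    argv += [dataset, "-geoloc", str(x), str(y)]
    return argv, env


# Placeholder dataset name; the coordinates are the ones used in the tests below.
argv, env = build_query("dem_10m.vrt", 389559, 6677412, use_env=False)
print(" ".join(argv))

# Only execute when the tool and the placeholder file actually exist.
if shutil.which("gdallocationinfo") and os.path.exists("dem_10m.vrt"):
    subprocess.run(argv, env=env, check=False)
```

Both variants should have the same effect; the environment variable is convenient when many queries are launched from one process.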

On Sun, Feb 2, 2014 at 2:12 PM, Jukka Rahkonen <jukka.rahkonen at mmmtike.fi> wrote:
Hi,

I made a few tests and here are my conclusions. The hypothesis is that
someone wants to build a DEM query service that uses gdallocationinfo for
queries, with the DEM data accessed as files from a standard web server. I
compared three alternatives:
1) There are thousands of DEM files on the server and they are combined
together with a VRT file.
2) There is only one DEM file, stored as a BigTIFF.
3) The DEM is split into tiles in an x/y/z tile directory structure, as in
Google Maps or OpenStreetMap tiles.

My test data covers Finland at a 10 m grid size; as Deflate-compressed
TIFFs the files total about 10 GB.

Before going on, keep in mind that speed needs indexes. The better the
index, the less unnecessary data there is to read. In case 1) the first
level index is the VRT file. The second level index, if it exists, is in
the headers of the real DEM files. It may be possible to jump to the correct
offset from the beginning of the DEM data and read only a part of the file.
In case 2) the index is in the internal TIFF directory. If the BigTIFF is
tiled, access to the tiles should be rather effective. And finally in case
3) the index is built into the directory structure and tiling schema that
is used for saving the tiles. The schema is so well known that tile map
service clients can directly ask for a certain file name if they know the
coordinates and scale.
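
To illustrate how the tiling schema itself acts as the index in case 3), here is a minimal sketch of the standard OpenStreetMap-style (Web Mercator) tile numbering. It assumes longitude/latitude input in degrees and the zoom/x/y path layout used by OSM; the function names and the ".tif" extension are made up for the example, and a DEM kept in a national projection would need its own tiling grid.

```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return the (x, y) tile indices containing a lon/lat point at a zoom level."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the far edge still map to an existing tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_path(lon_deg, lat_deg, zoom, ext="tif"):
    """Build the relative file name a client would request (zoom/x/y.ext)."""
    x, y = lonlat_to_tile(lon_deg, lat_deg, zoom)
    return f"{zoom}/{x}/{y}.{ext}"


# Hypothetical query point near Helsinki, zoom level chosen arbitrarily.
print(tile_path(24.94, 60.17, 10))
```

No lookup table or header read is needed: the client computes the file name from the coordinates and issues a single request.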

Conclusions:

1)
- The whole VRT file must be read. Caching the VRT file would make
subsequent requests faster.
- For some reason gdallocationinfo wants to get the directory listing of
the directory where the VRT file is. This is slow and generates lots of
traffic if the thousands of DEM files are in the same directory. It would
probably be faster to have them in another directory.

2)
- The BigTIFF route is more straightforward, but gdallocationinfo still
needs to do many big range reads.
- Also in this case gdallocationinfo reads the target file's directory. It
would be good to keep this directory small. Don't do as I did and keep in
the same directory not only the BigTIFF DEM file, which was the only file
needed, but also the VRT and the thousands of original DEMs from the
previous test. At least this is a known issue now and I know how to avoid
it. In my case reading the directory made 2.2 MB of web traffic, all or
most of it in vain.

3)
- I used the OpenStreetMap tile service as the test data for the third
test. In this case gdallocationinfo knows exactly which tile to request and
it makes only one request. It also seems to cache some tiles on the client
side, which means that queries for nearby locations may hit a cached tile
and be very fast.

Summary statistics:

1) gdallocationinfo makes 6 requests and reads 6.4 MB of data
2) gdallocationinfo makes 8 requests and reads 4.3 MB of data
3) gdallocationinfo makes 1 request and reads 10 kB of data

Requests I used are these:

1)
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc 389559 6677412
2)
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc 389559 6677412
3)
gdallocationinfo frmt_wms_openstreetmap_tms.xml -geoloc 389559 6677412

I know that the queried place in 3) is not the same because the SRIDs of
the datasets differ, and that OSM returns 3-band RGB values rather than
16-bit DEM heights, but that does not matter here; the idea is what is
important.

My conclusion is that you should cut your DEM into tiles with, for example,
gdal2tiles or MapTiler, and the result could actually be quite speedy;
perhaps using 126x126 tiles could make it a bit faster still. I hope those
tools can create tiles as 16-bit TIFFs.

I am sure that these results are not scientifically sound, but I am also
sure that the difference between 6.4 MB, 4.3 MB and 10 kB is something to
think about, especially if you dream about a mobile service.
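
To put those three figures side by side, here is a rough back-of-the-envelope sketch. The request counts and byte totals are the ones from the summary statistics; the 1 Mbit/s link speed and the reading of MB as 10^6 bytes are assumptions made only for illustration, and latency per request is ignored.

```python
# Figures reported in the summary statistics (MB taken as 10**6 bytes).
RESULTS = {
    "VRT over thousands of files": {"requests": 6, "bytes": 6.4e6},
    "single BigTIFF": {"requests": 8, "bytes": 4.3e6},
    "x/y/z tiles": {"requests": 1, "bytes": 10e3},
}

# Assumption: a 1 Mbit/s mobile link. This is not a measured value.
ASSUMED_LINK_BITS_PER_SECOND = 1_000_000


def transfer_seconds(num_bytes, bits_per_second=ASSUMED_LINK_BITS_PER_SECOND):
    """Pure transfer time for a payload, ignoring round-trip latency."""
    return num_bytes * 8 / bits_per_second


baseline = RESULTS["x/y/z tiles"]["bytes"]
for name, r in RESULTS.items():
    ratio = r["bytes"] / baseline
    secs = transfer_seconds(r["bytes"])
    print(f"{name}: {r['requests']} requests, "
          f"{ratio:.0f}x the tile transfer, ~{secs:.2f} s at the assumed speed")
```

Under these assumptions a single point query costs several hundred times more transfer with the VRT or BigTIFF layout than with tiles.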

I placed the requests which gdallocationinfo made during these tests into
http://latuviitta.org/documents/gdallocationinfo_requests.txt

-Jukka Rahkonen-


_______________________________________________
gdal-dev mailing list
gdal-dev at lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev

