[COG] Fwd: Cloud optimized GeoTIFF configuration specifics [SEC=UNOFFICIAL]

Even Rouault even.rouault at spatialys.com
Fri Jun 15 04:18:14 PDT 2018


Hi,

replying here to a thread that started privately.

> ---------- Forwarded message ---------
> From: Kouzoubov Kirill <Kirill.Kouzoubov at ga.gov.au>
> Date: Wed, Jun 13, 2018, 8:26 PM
> Subject: RE: Cloud optimized GeoTIFF configuration specifics
> [SEC=UNOFFICIAL]
> To: Chris Holmes <cholmes at radiant.earth>, Seth Fitzsimmons
> <seth.fitzsimmons at radiant.earth>
> 
> Hi Chris,
> 
> 
> 
> I fully agree that a single file with everything in it (bands, overviews,
> stats) would be ideal from a usability and even a data-management
> perspective, particularly if overviews can be made lossy (JPEG) and not
> take too much space, so they would only be meant for visualisation, not
> computation. Purely from a format perspective I don't see it as
> inefficient or problematic. But when you start taking into account
> existing software in its current form, a number of problems come to light
> 
> 
> 
> ·         A larger header means a much slower open
> 
> 
> 
> This is purely a GDAL implementation issue: fetching more bytes when
> opening a file is not a problem as such, it's that GDAL will make many
> more requests (which is slow). And currently there is no way to hint GDAL
> to fetch more bytes on open, even if you know the characteristics of your
> collection. Storing detailed metadata about the file on a faster storage
> medium addresses this problem, but then you need a bespoke reader.
> 
> 

I just added the following, which was missing from the /vsicurl/ doc at
http://gdal.org/gdal_virtual_file_systems.html#gdal_virtual_file_systems_vsicurl
"""
Starting with GDAL 2.3, the GDAL_INGESTED_BYTES_AT_OPEN
configuration option can be set to impose the number of bytes read in one
GET call at file opening (can help performance to read Cloud optimized geotiff
with a large header).
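For example (a hypothetical invocation; the URL is a placeholder, and this
assumes GDAL >= 2.3):

```shell
# Read the first 64 KB in a single GET at open time, instead of several
# smaller requests; useful for COGs with large headers.
gdalinfo --config GDAL_INGESTED_BYTES_AT_OPEN 65536 \
    /vsicurl/https://example.com/big_cog.tif
```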

Related to that, there's a point I have in mind. The current COG definition 
asks for the TileOffsets and TileByteCounts tag values to be located just 
after the IFD they refer to, and before the next IFD. I'm not totally 
convinced this is the appropriate organization for COGs with very large 
dimensions, and thus very large TileOffsets and TileByteCounts arrays. If 
your processing only needs to access one tile, reading the whole TileOffsets 
and TileByteCounts arrays can hurt performance. GDAL, when built against its 
internal libtiff copy, can use an optimization to avoid reading the whole 
arrays, reading just the parts needed to get the offsets and byte counts of 
the blocks it accesses.
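As an illustration of that partial read (my own sketch, not GDAL's actual 
code): in a classic TIFF the TileOffsets and TileByteCounts values are 
4-byte integers, so the entries for a single tile can be fetched with a tiny 
ranged GET rather than reading the whole arrays:

```python
def tile_entry_range(array_file_offset, tile_row, tile_col, tiles_across,
                     entry_size=4):
    """Byte range (start, length) of one tile's entry within a TileOffsets
    or TileByteCounts value array. entry_size is 4 for classic TIFF uint32
    entries, 8 for BigTIFF uint64 entries."""
    tile_index = tile_row * tiles_across + tile_col
    return array_file_offset + tile_index * entry_size, entry_size

# Entry for tile (row=3, col=5) in an image 20 tiles across, with the
# TileOffsets values stored at file offset 1000:
start, length = tile_entry_range(1000, 3, 5, 20)
# start = 1000 + (3 * 20 + 5) * 4 = 1260, length = 4
```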

But this is mostly a concern for very large files. For example, if you take a 
10,000 x 10,000 file with 512 x 512 blocks, the combined size of both arrays 
is 3200 bytes: 2 uint32 values for each of the ceil(10,000 / 512)^2 = 400 
tiles. You have to go to about 190,000 x 190,000 pixels to reach the 
megabyte size.
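That arithmetic can be checked in a few lines (two entries per tile, 4 bytes 
each in a classic TIFF):

```python
import math

def tile_arrays_size(width, height, block=512, entry_size=4):
    """Combined byte size of the TileOffsets and TileByteCounts value
    arrays: two entries (offset + byte count) per tile."""
    tiles = math.ceil(width / block) * math.ceil(height / block)
    return 2 * entry_size * tiles

print(tile_arrays_size(10_000, 10_000))    # 3200 (400 tiles)
print(tile_arrays_size(190_000, 190_000))  # 1107072, i.e. about 1 MB
```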

In that case, the following organization would probably be better:

Minimum header:
- IFD1 (full resolution image)
- tag values of IFD1 except TileOffsets and TileByteCounts (essentially 
GeoTIFF tag values)
- IFD2 (overview)
- tag values of IFD2 except TileOffsets and TileByteCounts

Extended header:
- TileOffsets and TileByteCounts of IFD1
- TileOffsets and TileByteCounts of IFD2 (the order of this line and the 
previous one could be switched indifferently)

Imagery:
- Imagery of IFD2
- Imagery of IFD1

A reader would have to read at least the minimum header, whatever it will do 
with the file. It could then decide to read the extended header completely, 
or just part of it, depending on how much of the imagery it will process.

The only drawback of this organization is that it might require changes in 
libtiff to generate it, and that's not necessarily trivial to do... 
(libtiff would have no issue reading such a file, of course)


> 
> ·         Bands in separate files are easier to read concurrently
> 
> 
> 
> Again, purely an implementation problem of GDAL. To have concurrent reads
> from the same file you have to open that file multiple times. The cache
> should make subsequent opens cheaper, but that's not guaranteed. Nothing
> that cannot be fixed with a bespoke reader, but if you just want to use
> GDAL, or more likely a convenient wrapper for it in some dynamic language,
> this becomes a bit of a problem. And there is no storage saving from
> putting several bands into one file: even if they share the same
> geospatial metadata, it all gets repeated, from what I understand; not
> sure if that's a requirement of the TIFF standard or just a limitation of
> TIFF writer libraries.
> 

Starting with GDAL 2.3, the GDAL GTiff driver can issue parallel requests if 
a pixel request intersects several tiles. With HTTP 1.1, this will create 
parallel connections. If HTTP 2.0 is enabled (and the libcurl version and 
server support it), HTTP 2.0 multiplexing is used.

This can be controlled with the GDAL_HTTP_VERSION configuration option:
"""
GDAL_HTTP_VERSION=1.0/1.1/2/2TLS (GDAL >= 2.3). Specify HTTP version to use.
Will default to 1.1 generally (except on some controlled environments,
like Google Compute Engine VMs, where 2TLS will be the default).
Support for HTTP/2 requires curl 7.33 or later, built against nghttp2.
"2TLS" means that HTTP/2 will be attempted for HTTPS connections only,
whereas "2" means that HTTP/2 will be attempted for HTTP or HTTPS.
"""

There's currently no optimization to issue parallel requests when the bands 
are separate (PLANARCONFIG=SEPARATE in TIFF parlance) instead of 
pixel-interleaved (PLANARCONFIG=CONTIG), but one could probably be added. 
And that doesn't require the bands to be in separate files.

That said, I'm not completely convinced that this would result in 
(significant) performance wins. The above optimization about parallel 
requests for several intersecting tiles was done for the Google Compute 
Engine + Storage environment, and I found benchmarking it to be super 
tricky. Timings tend not to be repeatable (their variance is huge). For 
example, deciding whether HTTP 1.1 parallel connections (several TCP 
sockets) or HTTP 2.0 multiplexing (a single TCP socket, but with 
multiplexing of requests and responses) is the best choice tended to be 
super difficult to assess (the timing difference was not that big), hence I 
only enabled HTTP 2 by default for the particular environment I tested.

In fact the question is more general than parallelizing requests to get 
different bands. Imagine that the data is not compressed, you have N bands, 
and the number of bytes for one block of a band is B. Consider the case of a 
single tile we want to read. With PLANARCONFIG=CONTIG, you have a single 
block of size N*B. With PLANARCONFIG=SEPARATE, you have N blocks of size B. 
So if you decide to do a parallelized read in the PLANARCONFIG=SEPARATE 
case, why not also artificially split your single request in the 
PLANARCONFIG=CONTIG case and do a parallelized read as well? (The advantage 
of PLANARCONFIG=CONTIG is the reduced amount of metadata.)
In an ideal world, parallelizing the read of a contiguous sequence of ranges 
shouldn't help at all: your single connection should already deliver at 
maximum speed. But perhaps splitting would help performance a bit in 
practice.
There is probably a minimum number of bytes below which splitting the 
request into 2 GETs is going to be slower than doing a single big request. 
There is probably also a maximum number of parallel channels beyond which 
performance will decrease.
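As a back-of-the-envelope model (my own sketch with made-up numbers, not a 
benchmark): if a request on a warm connection costs latency plus 
size/bandwidth, and each fresh HTTP 1.1 connection adds a setup cost, then 
splitting into two parallel GETs only pays off once half the transfer time 
exceeds that setup cost:

```python
def single_get(size, latency, bw):
    # One request on an already-established connection
    return latency + size / bw

def split_get(size, latency, bw, setup, n=2):
    # n parallel requests, each on a fresh connection (TLS setup cost)
    return latency + setup + (size / n) / bw

# Made-up numbers: 50 ms latency, 100 MB/s bandwidth, 100 ms setup per
# new connection
lat, bw, setup = 0.05, 100e6, 0.1
assert split_get(1e6, lat, bw, setup) > single_get(1e6, lat, bw)      # 1 MB: splitting loses
assert split_get(100e6, lat, bw, setup) < single_get(100e6, lat, bw)  # 100 MB: splitting wins
```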

Parallelization to read non-contiguous sequences can help a bit, since you 
can save the latency of serial requests to the server (with HTTP/2 
multiplexing in particular, at least in theory). Instead of doing, on the 
same socket: ask range R1, wait for server initial processing, get data for 
range R1, ask range R2, wait for server initial processing, get data for 
range R2. You can do: ask range R1, ask range R2 without waiting for the 
server, wait for server initial processing, get data for R1 (or R2, 
depending on server optimizations), get data for R2 (or R1). But sometimes 
establishing HTTP 1.1 parallel connections can give a small win (if you can 
reuse your HTTP 1.1 sockets; otherwise the TLS establishment time will be 
adverse).
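In the same crude model (my own sketch, idealized numbers), the saving from 
multiplexing two range requests over one connection, versus issuing them 
serially, is roughly one round-trip latency:

```python
def serial(n, latency, size, bw):
    # ask R1, wait, receive R1; ask R2, wait, receive R2; ...
    return n * (latency + size / bw)

def multiplexed(n, latency, size, bw):
    # ask all ranges back to back; responses stream over one connection
    return latency + n * size / bw

lat, bw, size = 0.05, 100e6, 1e6
saving = serial(2, lat, size, bw) - multiplexed(2, lat, size, bw)
# saving equals lat (one round trip) in this idealized model
```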

I don't claim to be an expert on maximizing the throughput of HTTP 
connections, so take the above as the result of my modest experiments.


> 
> Also, how do multiple bands + overviews work together? Can you point me to
> a resource that explains embedded overviews? Just looking at generated
> files with embedded overviews, they look like multi-band TIFFs, so I'd
> like to understand that aspect properly.
> 

Using tiffdump / tiffinfo plus some reading of the TIFF spec will help. 
Overviews (different IFDs in TIFF parlance) and bands (Samples in TIFF 
parlance) are completely different concepts.


Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com


More information about the COG mailing list