From vincent.sarago at gmail.com  Mon Apr 16 12:14:04 2018
From: vincent.sarago at gmail.com (Vincent Sarago)
Date: Mon, 16 Apr 2018 15:14:04 -0400
Subject: [Landsat-pds] COG Format discussion
Message-ID:

Sorry for hijacking this mailing list, but as the Landsat dataset is one of
the biggest open COG datasets, a discussion about the evolution of the
format here made sense to us.

A couple of weeks ago we started an internal discussion about the COGEO
format. It’s great to see how many people are using and implementing COG
right now (e.g. Planet, DigitalGlobe…). That said, we think there is still
room for improvement.

Here are some of our ideas:
- Add webp compression to libtiff. Even if webp wasn’t a big success in
  terms of adoption, the format is still a good option in comparison to
  JPEG and PNG.
- Improve mask storage inside COG. For some technical reason, when you
  create a COG with internal masking, the mask is appended to the end of
  the IFD. One improvement could be to append the mask TILE data just
  after the imagery tile data.

We’d love to hear stories or comments from people about the format and how
they see it moving in the future.

From warmerdam at pobox.com  Mon Apr 16 14:59:51 2018
From: warmerdam at pobox.com (Frank Warmerdam)
Date: Mon, 16 Apr 2018 14:59:51 -0700
Subject: [Landsat-pds] COG Format discussion
In-Reply-To:
References:
Message-ID:

On Mon, Apr 16, 2018 at 12:14 PM, Vincent Sarago wrote:

> Sorry for hijacking this mailing list, but as the Landsat dataset is one
> of the biggest open COG datasets, a discussion about the evolution of the
> format here made sense to us.
>
> A couple of weeks ago we started an internal discussion about the COGEO
> format. It’s great to see how many people are using and implementing COG
> right now (e.g. Planet, DigitalGlobe…). That said, we think there is
> still room for improvement.
>
> Here are some of our ideas:
> - Add webp compression to libtiff.
> Even if webp wasn’t a big success in terms of adoption, the format is
> still a good option in comparison to JPEG and PNG.
>

Vincent, and other folks,

I'd be interested in whether webp is believed to have good enough
performance that it would be a substantial improvement over the Deflate
support already available in libtiff. I'm also interested in how it
compares to the new zstd support (
https://github.com/OSGeo/gdal/commit/1c60366a193e67ee90856e1008e3c17cb8524f60#diff-ce45424050585add924746240ffc2761).
I'm willing to support new codecs in libtiff if they add significant
value, but I don't want to just add every compression format known.

> - Improve mask storage inside COG. For some technical reason, when you
> create a COG with internal masking, the mask is appended to the end of
> the IFD. One improvement could be to append the mask TILE data just
> after the imagery tile data.
>

That is interesting. Even, can you comment on what it would take to ensure
extra mask IFDs are located near their corresponding imagery as part of
GDAL, and the implications for COG?

I assume you are suggesting a file with the "nodata" handled as a distinct
IFD like this (our internal serving format):

http://download.osgeo.org/gdal/data/gtiff/20160929_023611_0e0f_Browse.tif

...
TIFF Directory at offset 0x2070 (8304)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 857 Image Length: 438
  Tile Width: 128 Tile Length: 128
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: JPEG
  Photometric Interpretation: YCbCr
  YCbCr Subsampling: 2, 2
  Samples/Pixel: 3
  Planar Configuration: single image plane
  Reference Black/White:
     0:     0   255
     1:   128   255
     2:   128   255
  JPEG Tables: (142 bytes)
...
TIFF Directory at offset 0x2a0e (10766)
  Subfile Type: reduced-resolution image/transparency mask (5 = 0x5)
  Image Width: 857 Image Length: 438
  Tile Width: 128 Tile Length: 128
  Bits/Sample: 1
  Sample Format: unsigned integer
  Compression Scheme: AdobeDeflate
  Photometric Interpretation: transparency mask
  Samples/Pixel: 1
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
...

I consider this format very useful for some purposes (lossy compression
for the imagery, but lossless compression for the nodata masks), and I'd be
interested in methods to optimize it for COG use, even though it is pretty
rare for applications to properly support it.

Best regards,
Frank

>
> We’d love to hear stories or comments from people about the format and
> how they see it moving in the future.
>
> _______________________________________________
> Landsat-pds mailing list
> Landsat-pds at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/landsat-pds
>

-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows |
and watch the world go round - Rush    | Geospatial Software Developer

From even.rouault at spatialys.com  Tue Apr 17 00:54:40 2018
From: even.rouault at spatialys.com (Even Rouault)
Date: Tue, 17 Apr 2018 09:54:40 +0200
Subject: [Landsat-pds] COG Format discussion
In-Reply-To:
References:
Message-ID: <2370754.Q33c2g4Dgx@even-i700>

> I'd be interested in whether webp is believed to have good enough
> performance that it would be a substantial improvement over the Deflate
> support already available in libtiff. I'm also interested in how it
> compares to the new zstd support (
> https://github.com/OSGeo/gdal/commit/1c60366a193e67ee90856e1008e3c17cb8524f60#diff-ce45424050585add924746240ffc2761).
> I'm willing to support new codecs in libtiff if they add significant
> value, but I don't want to just add every compression format known.

You can only compare webp against deflate or zstd for its lossless
profile. The WebP website provides this comparison against PNG:
https://developers.google.com/speed/webp/docs/webp_lossless_alpha_study
which claims a 42% size improvement over PNG.

For lossy support, if you believe Mozilla (pushing for MozJpeg)
https://research.mozilla.org/2014/07/15/mozilla-advances-jpeg-encoding-with-mozjpeg-2-0/
rather than Google
https://developers.google.com/speed/webp/docs/webp_study
the picture is less clear: "We consider this study to be inconclusive when
it comes to the question of whether WebP and/or JPEG XR outperform JPEG by
any significant margin".

One potential advantage of webp is that lossy webp for imagery with
lossless alpha would be naturally supported (in comparison to the second
point below).
One drawback of webp is that it is really RGB[A] only (it would likely be
unwise to use it for 3- or 4-band images with other photometric
interpretations).

> > - Improve mask storage inside COG. For some technical reason, when you
> > create a COG with internal masking, the mask is appended to the end of
> > the IFD. One improvement could be to append the mask TILE data just
> > after the imagery tile data.
>
> I assume you are suggesting a file with the "nodata" handled as a distinct
> IFD like this (our internal serving format):

That was what I understood from a previous discussion with Vincent's team.

> That is interesting. Even, can you comment on what it would take to ensure
> extra mask IFDs are located near their corresponding imagery as part of
> GDAL and the implications for COG?
Such a layout of blocks would apply only for
Planar Configuration == single image plane.

That's certainly doable by the GDAL GTiff driver, but it would require the
copying of imagery to be done in a custom fashion, to properly interleave
imagery blocks with mask blocks, instead of using
GDALDatasetCopyWholeRaster() (for imagery) and
GDALRasterBandCopyWholeRaster() (for masks). Actually, implementation-wise,
that could be a new option of GDALDatasetCopyWholeRaster
(INTERLEAVE_MASK=YES). For very large images (dimensions > 100,000 pixels),
where the TileOffsets and TileByteCounts tags are big, this constant back
and forth between IFDs could be rather costly, as those tags are
reloaded/flushed each time you change the active IFD in libtiff.

The COG definition would have to be updated for that use case (probably as
an allowed extra optimization, rather than forcing people to use it), and
the validation script as well.

On the GDAL read side, the GTiff driver would also have to be updated to
detect this layout, and when reading a block of imagery it should issue a
GET range request that is big enough to fetch the imagery block and its
mask block at once (the base logic to optimize GET requests for a given
IRasterIO() request is already in place).

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com
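
As a rough illustration of the interleaved layout and the single GET range
request discussed in this thread, here is a minimal Python sketch. It does
not use GDAL or libtiff and is not how the GTiff driver is implemented; the
names interleave_layout and merged_range, the Block tuple and the max_gap
parameter are hypothetical and exist only for this example. It assumes one
mask tile per imagery tile, with sizes already known.

```python
from typing import List, Optional, Tuple

# (offset, byte_count), mirroring one entry of TileOffsets / TileByteCounts.
# Hypothetical helper type for this sketch only.
Block = Tuple[int, int]


def interleave_layout(image_sizes: List[int], mask_sizes: List[int],
                      start: int = 0) -> Tuple[List[Block], List[Block]]:
    """Place each mask tile directly after its imagery tile."""
    if len(image_sizes) != len(mask_sizes):
        raise ValueError("imagery and mask must have the same number of tiles")
    image_blocks: List[Block] = []
    mask_blocks: List[Block] = []
    offset = start
    for img_size, msk_size in zip(image_sizes, mask_sizes):
        image_blocks.append((offset, img_size))
        offset += img_size
        mask_blocks.append((offset, msk_size))
        offset += msk_size
    return image_blocks, mask_blocks


def merged_range(image_block: Block, mask_block: Block,
                 max_gap: int = 0) -> Optional[str]:
    """Return one HTTP Range value covering both blocks, or None if the
    blocks overlap or are separated by more than max_gap bytes."""
    first, second = sorted([image_block, mask_block])
    gap = second[0] - (first[0] + first[1])
    if gap < 0 or gap > max_gap:
        return None
    last_byte = second[0] + second[1] - 1  # HTTP ranges are inclusive
    return f"bytes={first[0]}-{last_byte}"


if __name__ == "__main__":
    img, msk = interleave_layout([1000, 1200], [50, 60], start=4096)
    print(merged_range(img[0], msk[0]))
    # Mask stored far from the imagery: no single range is returned.
    print(merged_range((4096, 1000), (900000, 50)))
```

With the interleaved layout, every imagery tile and its mask tile are
contiguous, so a reader can fetch both with one request. When the mask is
stored far from the imagery, merged_range returns None and a reader would
fall back to two separate requests.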