[gdal-dev] Call for discussion on RFC 63 : Sparse datasets improvements

Even Rouault even.rouault at spatialys.com
Mon Jul 11 08:31:58 PDT 2016


Le lundi 11 juillet 2016 17:11:59, Kurt Schwehr a écrit :
> Even,
> 
> I think this looks good and a big win for a lot of use cases.
> 
> Comments related to the flags defined in gcore/gdal.h:
> 
> - If possible, use an enum rather than bare defines.

They are flags that can be OR'ed, so you cannot really define an enumeration. Or 
am I missing something ?


> - Add comments to gdal.h explaining what the values are
Yep, will do

> - Should an unimplemented also have the _DATA flag set rather than _EMPTY?

Yes, this is the case of the default GDALRasterBand::IGetDataCoverageStatus(). 
I should also mention it in the doc that a driver that would implement the 
method, but in some cases would want to return _UNIMPLEMENTED should also set 
_DATA. 

> 
> A suggestion for this
> <https://github.com/OSGeo/gdal/compare/trunk...rouault:sparse_datasets?expa
> nd=1#diff-66f7e256611c32cddc2745d78b4fe59bR3787>, which looks brittle to
> me:
> 
>      if( nStatus != GDAL_DATA_COVERAGE_STATUS_EMPTY )
> 
> A helper function to check empty rather than a direct check will allow for
> later or'ing the status with other flags.  Or you can & with the flag.  But
> looking at it a second time, I think it might be more explicit if you did a
> check against having data:
> 
>     if( nStatus & GDAL_DATA_COVERAGE_STATUS_DATA )

Yes, that's probably better (assuming that _unimplemented comes with _data 
set)

Thanks

> 
> -kurt
> 
> On Sun, Jul 10, 2016 at 2:54 AM, Even Rouault <even.rouault at spatialys.com>
> 
> wrote:
> > Le dimanche 10 juillet 2016 11:32:48, Andrew C Aitchison a écrit :
> > > On Fri, 8 Jul 2016, Even Rouault wrote:
> > > > The topic of sparse dataset management come back regularly, so I've
> > > > decided to tackle it.
> > > > 
> > > > Please find
> > > > https://trac.osgeo.org/gdal/wiki/rfc63_sparse_datasets_improvements
> > 
> > for
> > 
> > > > review.
> > > 
> > > I know of several proprietary file formats where the data is tiled
> > > with the index indicating (perhaps implicitly) where the tile has no
> > > data.
> > > 
> > > A driver for such formats could have IReadBlock quickly return with
> > > a code to indicate NoData, rather than filling in the image data.
> > > As it stands that might mean extending CPLErr, but would
> > > that be helpful to the main library ?
> > 
> > That would have a significant impact on the whole code base, as well as
> > application code, so I didn't really considered that option and prefered
> > an auxiliary interface to be used by code aware and caring about special
> > behaviour for sparse datasets.
> > 
> > > Is this what is described by having the offset and byte count both zero
> > > ?
> > 
> > Yes, in TIFF, you have 2 arrays, one that indicates at which offset in
> > the file
> > a given tile/strip is located, and the other one for the number of bytes
> > of that tile/strip. GDAL uses offset = count = 0 as a convention for
> > missing blocks.
> > 
> > > ----
> > > 
> > > I don't really understand how GDAL_DATA_COVERAGE_STATUS values combine
> > > or
> > 
> > * If the requested window contains has no missing blocks, it returns
> > GDAL_DATA_COVERAGE_STATUS_DATA
> > * If the requested window has only missing blocks, it returns
> > GDAL_DATA_COVERAGE_STATUS_EMPTY
> > * If the requested window is a mix of both, it returns
> > GDAL_DATA_COVERAGE_STATUS_DATA | GDAL_DATA_COVERAGE_STATUS_EMPTY
> > 
> > > when pdfDataPct is valid.
> > 
> > It should be valid if the processing has not been stopped prematurely due
> > to
> > the nMaskFlagStop being triggered. For example if you have a dataset and
> > you
> > want a special processing (could be just an info "This dataset is
> > sparse") as
> > soon as it contains empty blocks, then you can query the whole dataset
> > extent
> > with nMaskFlagStop = GDAL_DATA_COVERAGE_STATUS_EMPTY. As soon as a
> > missing block is found, the function will exit, and will thus be unable
> > to determine
> > the percentage of valid data.
> > 
> > > In one of the formats above, the tile index has special values
> > > for "no data" and and for "data exists and could be retrieved/purchased
> > > if required". I'd consider mapping these to
> > 
> > GDAL_DATA_COVERAGE_STATUS_EMPTY
> > 
> > Clearly a missing block will cause GDAL_DATA_COVERAGE_STATUS_EMPTY to be
> > set.
> > 
> > > and GDAL_DATA_COVERAGE_STATUS_UNIMPLEMENTED. Does that make sense ?
> > 
> > GDAL_DATA_COVERAGE_STATUS_UNIMPLEMENTED is aimed at being returned when a
> > driver does not offer an implementation of the interface, and thus uses
> > the default dumb implementation that returns
> > GDAL_DATA_COVERAGE_STATUS_UNIMPLEMENTED
> > 
> > I realize that I didn't really document yet the semantics of those flags.
> > To be
> > done.
> > 
> > 
> > --
> > Spatialys - Geospatial professional services
> > http://www.spatialys.com
> > _______________________________________________
> > gdal-dev mailing list
> > gdal-dev at lists.osgeo.org
> > http://lists.osgeo.org/mailman/listinfo/gdal-dev

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com


More information about the gdal-dev mailing list