[gdal-dev] Call for discussion on RFC 63 : Sparse datasets improvements

Even Rouault even.rouault at spatialys.com
Sun Jul 10 02:54:17 PDT 2016


On Sunday 10 July 2016 11:32:48, Andrew C Aitchison wrote:
> On Fri, 8 Jul 2016, Even Rouault wrote:
> > The topic of sparse dataset management comes back regularly, so I've
> > decided to tackle it.
> > 
> > Please find
> > https://trac.osgeo.org/gdal/wiki/rfc63_sparse_datasets_improvements for
> > review.
> 
> I know of several proprietary file formats where the data is tiled
> with the index indicating (perhaps implicitly) where the tile has no
> data.
> 
> A driver for such formats could have IReadBlock quickly return with
> a code to indicate NoData, rather than filling in the image data.
> As it stands that might mean extending CPLErr, but would
> that be helpful to the main library ?

That would have a significant impact on the whole code base, as well as on 
application code, so I didn't really consider that option and preferred an 
auxiliary interface, to be used by code that is aware of sparse datasets and 
cares about handling them specially.

> 
> Is this what is described by having the offset and byte count both zero ?
Yes. In TIFF, you have two arrays: one gives the offset in the file at which 
a given tile/strip is located, and the other gives the number of bytes of 
that tile/strip. GDAL uses offset = count = 0 as a convention for missing 
blocks.
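
To make the convention concrete, here is a minimal sketch in plain Python (it 
does not use GDAL or libtiff). The function name and the sample arrays are 
made up for illustration; they just stand in for the offset and byte count 
arrays of a tiled TIFF.

```python
def missing_blocks(offsets, byte_counts):
    """Return the indices of blocks flagged as missing, i.e. those whose
    offset and byte count are both zero (the convention described above)."""
    if len(offsets) != len(byte_counts):
        raise ValueError("offset and byte count arrays must have the same length")
    return [
        i
        for i, (offset, count) in enumerate(zip(offsets, byte_counts))
        if offset == 0 and count == 0
    ]


# Hypothetical arrays for a file with 4 tiles, 2 of which were never written.
tile_offsets = [8, 0, 4104, 0]
tile_byte_counts = [4096, 0, 4096, 0]
print(missing_blocks(tile_offsets, tile_byte_counts))  # prints [1, 3]
```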
> 
> ----
> 
> I don't really understand how GDAL_DATA_COVERAGE_STATUS values combine
> or

* If the requested window has no missing blocks, it returns 
GDAL_DATA_COVERAGE_STATUS_DATA
* If the requested window has only missing blocks, it returns 
GDAL_DATA_COVERAGE_STATUS_EMPTY
* If the requested window is a mix of both, it returns 
GDAL_DATA_COVERAGE_STATUS_DATA | GDAL_DATA_COVERAGE_STATUS_EMPTY
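
The three cases above amount to OR-ing one flag per block. A minimal sketch 
in plain Python, not the actual GDAL code: the numeric flag values are 
assumed for illustration (check gdal.h for the real ones), and the grid of 
booleans is a hypothetical stand-in for "is this block present in the file".

```python
# Flag values assumed for illustration only.
DATA = 0x02   # stands for GDAL_DATA_COVERAGE_STATUS_DATA
EMPTY = 0x04  # stands for GDAL_DATA_COVERAGE_STATUS_EMPTY


def window_status(block_present):
    """Combine per-block presence into a single status bitmask.

    block_present is a list of rows, each a list of booleans:
    True if the block exists in the file, False if it is missing.
    """
    status = 0
    for row in block_present:
        for present in row:
            status |= DATA if present else EMPTY
    return status


print(window_status([[True, True], [True, True]]) == DATA)            # True
print(window_status([[False, False]]) == EMPTY)                       # True
print(window_status([[True, False], [True, True]]) == (DATA | EMPTY))  # True
```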

> when pdfDataPct is valid.

It should be valid if the processing has not been stopped prematurely by 
nMaskFlagStop being triggered. For example, if you have a dataset and you 
want some special processing (it could be just a message "This dataset is 
sparse") as soon as it contains empty blocks, then you can query the whole 
dataset extent with nMaskFlagStop = GDAL_DATA_COVERAGE_STATUS_EMPTY. As soon 
as a missing block is found, the function will exit, and will thus be unable 
to determine the percentage of valid data.
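
A minimal sketch of that early-exit behaviour, again in plain Python rather 
than the real implementation. The flag values are assumed, the percentage is 
counted in blocks rather than pixels for simplicity, and returning None for 
"percentage could not be determined" is a choice of this sketch only.

```python
# Flag values assumed for illustration only.
DATA = 0x02   # stands for GDAL_DATA_COVERAGE_STATUS_DATA
EMPTY = 0x04  # stands for GDAL_DATA_COVERAGE_STATUS_EMPTY


def coverage_status(blocks, mask_flag_stop=0):
    """Scan blocks (booleans: True = present) and return (status, data_pct).

    If a flag listed in mask_flag_stop gets set, the scan stops at once and
    data_pct is None, since the remaining blocks were never examined.
    """
    status = 0
    seen = 0
    with_data = 0
    for present in blocks:
        status |= DATA if present else EMPTY
        seen += 1
        if present:
            with_data += 1
        if status & mask_flag_stop:
            return status, None  # stopped early: percentage unknown
    data_pct = 100.0 * with_data / seen if seen else None
    return status, data_pct


blocks = [True, False, True, True]
print(coverage_status(blocks))                        # full scan
print(coverage_status(blocks, mask_flag_stop=EMPTY))  # stops at 2nd block
```

The first call scans everything and can report a percentage; the second one 
answers "is this dataset sparse?" as cheaply as possible and gives up on the 
percentage.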

> 
> In one of the formats above, the tile index has special values
> for "no data" and for "data exists and could be retrieved/purchased
> if required". I'd consider mapping these to GDAL_DATA_COVERAGE_STATUS_EMPTY

Clearly a missing block will cause GDAL_DATA_COVERAGE_STATUS_EMPTY to be set.

> and GDAL_DATA_COVERAGE_STATUS_UNIMPLEMENTED. Does that make sense ?

GDAL_DATA_COVERAGE_STATUS_UNIMPLEMENTED is meant to be returned when a 
driver does not offer an implementation of the interface, and thus falls 
back to the default dumb implementation, which returns 
GDAL_DATA_COVERAGE_STATUS_UNIMPLEMENTED.
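
From the caller's point of view, that flag therefore means "the driver cannot 
tell", not "the block is missing". A minimal sketch of how calling code might 
interpret a returned status, in plain Python with assumed flag values; the 
function name and the returned strings are hypothetical.

```python
# Flag values assumed for illustration only.
UNIMPLEMENTED = 0x01  # stands for GDAL_DATA_COVERAGE_STATUS_UNIMPLEMENTED
DATA = 0x02           # stands for GDAL_DATA_COVERAGE_STATUS_DATA
EMPTY = 0x04          # stands for GDAL_DATA_COVERAGE_STATUS_EMPTY


def describe_coverage(status):
    """Turn a status bitmask into a short human-readable verdict."""
    if status & UNIMPLEMENTED:
        # Default implementation: nothing is known about sparseness.
        return "unknown"
    if (status & DATA) and (status & EMPTY):
        return "sparse"
    if status & EMPTY:
        return "empty"
    return "dense"


for status in (UNIMPLEMENTED, DATA, EMPTY, DATA | EMPTY):
    print(status, describe_coverage(status))
```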

I realize that I haven't really documented the semantics of those flags yet. 
To be done.


-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

