<div dir="ltr">Even,<div><br></div><div>I think this looks good and is a big win for a lot of use cases.</div><div><br></div><div>Comments related to the flags defined in gcore/gdal.h:</div><div><br></div><div>- If possible, use an enum rather than bare defines.</div><div>- Add comments to gdal.h explaining what the values are.</div><div>- Should an unimplemented status also have the _DATA flag set rather than _EMPTY?</div><div><br></div><div>A suggestion <a href="https://github.com/OSGeo/gdal/compare/trunk...rouault:sparse_datasets?expand=1#diff-66f7e256611c32cddc2745d78b4fe59bR3787">for this</a>, which looks brittle to me:</div><div><br></div><div> if( nStatus != GDAL_DATA_COVERAGE_STATUS_EMPTY )</div><div><br></div><div>A helper function to check empty rather than a direct check will allow for later or'ing the status with other flags. Or you can & with the flag. But looking at it a second time, I think it might be more explicit if you did a check against having data:</div><div><br></div><div> if( nStatus & <span style="font-size:12.8px">GDAL_DATA_COVERAGE_STATUS_DATA )</span></div><div><span style="font-size:12.8px"><br></span></div><div>-kurt</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jul 10, 2016 at 2:54 AM, Even Rouault <span dir="ltr"><<a href="mailto:even.rouault@spatialys.com" target="_blank">even.rouault@spatialys.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Sunday 10 July 2016 11:32:48, Andrew C Aitchison wrote:<br>
> On Fri, 8 Jul 2016, Even Rouault wrote:<br>
> > The topic of sparse dataset management comes back regularly, so I've<br>
> > decided to tackle it.<br>
> ><br>
> > Please find<br>
> > <a href="https://trac.osgeo.org/gdal/wiki/rfc63_sparse_datasets_improvements" rel="noreferrer" target="_blank">https://trac.osgeo.org/gdal/wiki/rfc63_sparse_datasets_improvements</a> for<br>
> > review.<br>
><br>
> I know of several proprietary file formats where the data is tiled<br>
> with the index indicating (perhaps implicitly) where the tile has no<br>
> data.<br>
><br>
> A driver for such formats could have IReadBlock quickly return with<br>
> a code to indicate NoData, rather than filling in the image data.<br>
> As it stands that might mean extending CPLErr, but would<br>
> that be helpful to the main library ?<br>
<br>
</span>That would have a significant impact on the whole code base, as well as<br>
application code, so I didn't really consider that option and preferred an<br>
auxiliary interface to be used by code aware and caring about special<br>
behaviour for sparse datasets.<br>
<span class=""><br>
><br>
> Is this what is described by having the offset and byte count both zero ?<br>
</span>Yes, in TIFF, you have 2 arrays, one that indicates at which offset in the file<br>
a given tile/strip is located, and the other one for the number of bytes of<br>
that tile/strip. GDAL uses offset = count = 0 as a convention for missing<br>
blocks.<br>
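<br>
As a rough illustration of that convention, here is a minimal sketch. The<br>
BlockIndex struct and the IsBlockMissing / CountMissingBlocks helpers are<br>
hypothetical names made up for this example, not code from the GTiff driver;<br>
they only assume the two per-block arrays described above.<br>

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory copy of the two TIFF arrays:
// one file offset and one byte count per tile/strip.
struct BlockIndex {
    std::vector<std::uint64_t> offsets;
    std::vector<std::uint64_t> byteCounts;
};

// A block is treated as missing when offset == count == 0.
// Out-of-range indices are reported as "not missing"; that is an
// arbitrary choice for this sketch.
bool IsBlockMissing(const BlockIndex& index, std::size_t iBlock) {
    if (iBlock >= index.offsets.size() || iBlock >= index.byteCounts.size())
        return false;
    return index.offsets[iBlock] == 0 && index.byteCounts[iBlock] == 0;
}

// Count how many blocks of the index follow the "missing" convention.
std::size_t CountMissingBlocks(const BlockIndex& index) {
    std::size_t nMissing = 0;
    for (std::size_t i = 0; i < index.offsets.size(); ++i) {
        if (IsBlockMissing(index, i))
            ++nMissing;
    }
    return nMissing;
}
```

A caller would fill BlockIndex from the file's offset and byte count arrays<br>
and test each block before deciding whether to read it.<br>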
<span class="">><br>
> ----<br>
><br>
> I don't really understand how GDAL_DATA_COVERAGE_STATUS values combine<br>
> or<br>
<br>
</span>* If the requested window has no missing blocks, it returns<br>
GDAL_DATA_COVERAGE_STATUS_DATA<br>
* If the requested window has only missing blocks, it returns<br>
GDAL_DATA_COVERAGE_STATUS_EMPTY<br>
* If the requested window is a mix of both, it returns<br>
GDAL_DATA_COVERAGE_STATUS_DATA | GDAL_DATA_COVERAGE_STATUS_EMPTY<br>
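<br>
The three cases above amount to OR-ing one flag per block over the window. A<br>
minimal sketch, assuming illustrative flag values (the real definitions are<br>
the ones in gcore/gdal.h) and hypothetical helper names:<br>

```cpp
#include <vector>

// Illustrative stand-ins for the GDAL_DATA_COVERAGE_STATUS_* flags.
// The numeric values are assumptions; only "distinct bits" matters here.
enum CoverageStatus : int {
    kStatusUnimplemented = 0x01,
    kStatusData = 0x02,
    kStatusEmpty = 0x04
};

// blockPresent[i] is true when block i of the requested window exists.
// Returns kStatusData, kStatusEmpty, or both OR-ed together.
// An empty window yields 0 in this sketch.
int WindowCoverageStatus(const std::vector<bool>& blockPresent) {
    int status = 0;
    for (bool present : blockPresent)
        status |= present ? kStatusData : kStatusEmpty;
    return status;
}

// Bit tests keep working if more flags are OR-ed in later,
// unlike a direct comparison against a single value.
bool HasData(int status) { return (status & kStatusData) != 0; }
bool HasEmpty(int status) { return (status & kStatusEmpty) != 0; }
```

With these helpers, a window mixing present and missing blocks satisfies both<br>
HasData() and HasEmpty().<br>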
<br>
> when pdfDataPct is valid.<br>
<br>
It should be valid if the processing has not been stopped prematurely due to<br>
the nMaskFlagStop being triggered. For example if you have a dataset and you<br>
want a special processing (could be just an info "This dataset is sparse") as<br>
soon as it contains empty blocks, then you can query the whole dataset extent<br>
with nMaskFlagStop = GDAL_DATA_COVERAGE_STATUS_EMPTY. As soon as a missing<br>
block is found, the function will exit, and will thus be unable to determine<br>
the percentage of valid data.<br>
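<br>
To make the early-exit behaviour concrete, here is a minimal sketch. The<br>
flag values, the CoverageResult struct and the ScanBlocks function are all<br>
assumptions for illustration, and the percentage is computed per block rather<br>
than per pixel to keep it short.<br>

```cpp
#include <cstddef>
#include <vector>

// Illustrative stand-ins for the GDAL_DATA_COVERAGE_STATUS_* flags
// (values are assumptions).
enum CoverageStatus : int {
    kStatusUnimplemented = 0x01,
    kStatusData = 0x02,
    kStatusEmpty = 0x04
};

// Hypothetical result type: pctValid tells whether dataPct can be trusted.
struct CoverageResult {
    int status;
    bool pctValid;
    double dataPct;
};

// Scan the blocks of a window. If maskFlagStop is non-zero, stop as soon
// as the accumulated status contains one of its bits; the percentage is
// then unknown because the remaining blocks were never visited.
CoverageResult ScanBlocks(const std::vector<bool>& blockPresent,
                          int maskFlagStop) {
    int status = 0;
    std::size_t nData = 0;
    for (bool present : blockPresent) {
        status |= present ? kStatusData : kStatusEmpty;
        if (present)
            ++nData;
        if (maskFlagStop != 0 && (status & maskFlagStop) != 0)
            return {status, false, 0.0};  // stopped early, no percentage
    }
    const double pct = blockPresent.empty()
                           ? 0.0
                           : 100.0 * static_cast<double>(nData) /
                                 static_cast<double>(blockPresent.size());
    return {status, true, pct};
}
```

Calling ScanBlocks(blocks, kStatusEmpty) answers "is this dataset sparse?"<br>
cheaply, while ScanBlocks(blocks, 0) visits everything and reports a<br>
percentage.<br>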
<span class=""><br>
><br>
> In one of the formats above, the tile index has special values<br>
> for "no data" and for "data exists and could be retrieved/purchased<br>
> if required". I'd consider mapping these to GDAL_DATA_COVERAGE_STATUS_EMPTY<br>
<br>
</span>Clearly a missing block will cause GDAL_DATA_COVERAGE_STATUS_EMPTY to be set.<br>
<span class=""><br>
> and GDAL_DATA_COVERAGE_STATUS_UNIMPLEMENTED. Does that make sense ?<br>
<br>
</span>GDAL_DATA_COVERAGE_STATUS_UNIMPLEMENTED is aimed at being returned when a<br>
driver does not offer an implementation of the interface, and thus uses the<br>
default dumb implementation that returns<br>
GDAL_DATA_COVERAGE_STATUS_UNIMPLEMENTED.<br>
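<br>
The idea can be sketched with a virtual method whose base version is the<br>
fallback. The class names, the method name and the flag values below are<br>
hypothetical and only illustrate the pattern, not the actual GDAL classes.<br>

```cpp
// Illustrative stand-ins for the GDAL_DATA_COVERAGE_STATUS_* flags
// (values are assumptions).
enum CoverageStatus : int {
    kStatusUnimplemented = 0x01,
    kStatusData = 0x02,
    kStatusEmpty = 0x04
};

// Hypothetical base class: drivers that do not override the method
// inherit the fallback and report "unimplemented".
class BandSketch {
public:
    virtual ~BandSketch() = default;
    virtual int GetCoverageStatus() const { return kStatusUnimplemented; }
};

// Hypothetical driver that knows whether it has missing blocks.
class SparseAwareBandSketch : public BandSketch {
public:
    explicit SparseAwareBandSketch(bool hasMissingBlocks)
        : hasMissingBlocks_(hasMissingBlocks) {}

    int GetCoverageStatus() const override {
        return hasMissingBlocks_ ? (kStatusData | kStatusEmpty) : kStatusData;
    }

private:
    bool hasMissingBlocks_;
};
```

Callers can then tell "the driver cannot say" apart from "the driver says<br>
there are missing blocks" by testing the unimplemented bit first.<br>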
<br>
I realize that I haven't really documented the semantics of those flags yet. To be<br>
done.<br>
<span class="im HOEnZb"><br>
<br>
--<br>
Spatialys - Geospatial professional services<br>
<a href="http://www.spatialys.com" rel="noreferrer" target="_blank">http://www.spatialys.com</a><br>
</span><div class="HOEnZb"><div class="h5"></div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">--<div><a href="http://schwehr.org" target="_blank">http://schwehr.org</a></div></div>
</div>