[gdal-dev] Removing default nodata value from netCDF driver ?

Julien Demaria julien.demaria at acri-st.fr
Mon Nov 4 17:18:35 PST 2019


Hi Even,

Are you sure that the driver uses a default fill value when the attribute is not present? bGotNoData seems to be set to false in this case, so SetNoDataValue() is never called and GetNoDataValue() should return pbSuccess=false, but maybe I'm wrong.

I ran into the same behaviour several years ago with the NetCDF Python module. After discussing it with the author, it seems that the NetCDF specs say that by default every NetCDF variable has a fill value, even if the _FillValue attribute is not present (in this case the library uses its default fill value for the corresponding type). But you can disable this default fill value behaviour using nc_def_var_fill() with NC_NOFILL. So it seems that generic applications should use nc_inq_var_fill() to know whether the default fill value should be used.
You can find this discussion here: https://github.com/Unidata/netcdf4-python/issues/209
So I think we should keep the existing behavior (but add a call to nc_inq_var_fill()) to stick to the NetCDF specs.
On the other hand, I think that this behavior is not very intuitive and not well known, and there are probably a lot of existing NetCDF files (wrongly) assuming that there is no fill value if the attribute is absent...
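
To make the rule above concrete, here is a minimal Python sketch of the decision a generic reader would take. It does not call the NetCDF library: the function name effective_fill, the attrs dictionary and the no_fill flag are hypothetical stand-ins for what nc_inq_var_fill() and the variable attributes would report, and the assumption that an explicit _FillValue always wins is mine. The default values are the NC_FILL_* constants from netcdf.h.

```python
# Default fill values per type, as defined by the NC_FILL_* constants in netcdf.h.
NC_DEFAULT_FILL = {
    "byte": -127,
    "ubyte": 255,
    "short": -32767,
    "ushort": 65535,
    "int": -2147483647,
    "uint": 4294967295,
    "float": 9.9692099683868690e+36,
    "double": 9.9692099683868690e+36,
}


def effective_fill(nc_type, attrs, no_fill):
    """Return (has_fill, value) for a variable.

    nc_type: type name, a key of NC_DEFAULT_FILL (hypothetical naming).
    attrs:   dict of the variable's attributes.
    no_fill: True if fill mode was disabled for the variable (NC_NOFILL).
    """
    # Assumption: an explicit _FillValue attribute always takes precedence.
    if "_FillValue" in attrs:
        return True, attrs["_FillValue"]
    # Fill mode disabled: there is no fill value at all.
    if no_fill:
        return False, None
    # Otherwise the library default for the type applies.
    return True, NC_DEFAULT_FILL[nc_type]
```

With this sketch, a short variable without the attribute still gets -32767 as its fill value, which is exactly the case that many existing files probably do not expect.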

For byte/ubyte types (your case) the NetCDF specs recommend not using the default fill value:
https://www.unidata.ucar.edu/software/netcdf/docs/attribute_conventions.html
"It is not necessary to define your own _FillValue attribute for a variable if the default fill value for the type of the variable is adequate. However, use of the default fill value for data type byte is not recommended."
https://www.unidata.ucar.edu/software/netcdf/docs/known_problems.html#ncdump_ubyte_fill
"There should be no default fill values when reading any byte type, signed or unsigned, because the byte ranges are too small to assume one of the values should appear as a missing value unless a _FillValue attribute is set explicitly."
After our discussion, the NetCDF Python module changed to ignore the default fill value for byte/ubyte types, so maybe the best choice for GDAL is also to stick to the NetCDF specs (though this is still not fully clear to me ;-) ). The current behavior of the GDAL driver is to force zero for byte/ubyte, which is very dangerous, and there is this comment: "Don't do default fill-values for bytes, too risky". So maybe it's a bug and the developer wanted to ignore the default, as the specs recommend?
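
The difference between the two policies for byte/ubyte variables can be sketched as follows. This is only an illustration of the behaviours as described in this thread, not code taken from the GDAL driver or from the NetCDF Python module; byte_nodata and its force_zero parameter are hypothetical names.

```python
def byte_nodata(attrs, force_zero):
    """Nodata value for a byte/ubyte variable, or None if there is none.

    attrs:      dict of the variable's attributes.
    force_zero: True  -> report 0 when _FillValue is absent (the behaviour
                         described above as the current one);
                False -> report no nodata when _FillValue is absent (what
                         the quoted NetCDF documentation recommends).
    """
    # An explicit _FillValue is honoured under both policies.
    if "_FillValue" in attrs:
        return attrs["_FillValue"]
    return 0 if force_zero else None


# A ubyte variable without a _FillValue attribute:
forced = byte_nodata({}, force_zero=True)     # 0, a value that may be valid data
ignored = byte_nodata({}, force_zero=False)   # None, no nodata reported
```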

Best,
Julien

________________________________________
From: gdal-dev [gdal-dev-bounces at lists.osgeo.org] on behalf of Even Rouault [even.rouault at spatialys.com]
Sent: Monday, 4 November 2019 18:14
To: gdal-dev at lists.osgeo.org
Subject: [gdal-dev] Removing default nodata value from netCDF driver ?

Hi,

It has been brought to my attention that the netCDF driver systematically
reports a nodata value when opening a dataset, even if no explicit _FillValue
metadata item is set in the product. This seems to be a behaviour that has
existed forever, but I'm not clear why. For byte/ubyte data types, the nodata
value is set to 0 by default.

But for some products, like Sentinel 3 S3A_SL_2_LST products, in the
LST_ancillary_ds.nc file, 0 is a valid value.

See following extract of ncdump on such a product:

        ubyte biome(rows, columns) ;
                biome:flag_meanings = "open_ocean irrigated_cropland rainfed_cropland mosaic_cropland mosaic_vegetation broadleaved_evergreen_forest closed_broadleaved_deciduous_forrest open_broadleaved_deciduous_forest closed_needleleaved_forest open_needleleaved_forest mixed_forest mosaic_forest mosaic_grassland shrubland grassland sparse_vegetation freshwater_flooded_forest saltwater_flooded_forest flooded_vegetation artificial_surface bare_area_unknown bare_area_orthents bare_area_sand bare_area_calcids bare_area_cambids bare_area_orthels water snow_and_ice unfilled" ;
                biome:flag_values = 0UB, 1UB, 2UB, 3UB, 4UB, 5UB, 6UB, 7UB, 8UB, 9UB, 10UB, 11UB, 12UB, 13UB, 14UB, 15UB, 16UB, 17UB, 18UB, 19UB, 20UB, 21UB, 22UB, 23UB, 24UB, 25UB, 26UB, 27UB, 28UB ;
                biome:long_name = "Gridded GlobCover surface classification code" ;

So per https://github.com/OSGeo/gdal/pull/1979, I'm proposing to remove this
default nodata value. Anybody seeing a side effect to this ?
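
To illustrate the problem with the biome variable, here is a small Python sketch with made-up pixel values (the sample list and the helper valid_pixels are hypothetical, not read from a real product). It shows what a client that honours the reported nodata value would keep.

```python
def valid_pixels(values, nodata):
    """Return the values a nodata-aware client would treat as valid."""
    if nodata is None:
        return list(values)
    return [v for v in values if v != nodata]


# Made-up sample of biome codes: 0 = open_ocean, 14 = grassland,
# 26 = water, 27 = snow_and_ice (per flag_values / flag_meanings above).
biome = [0, 0, 26, 14, 0, 27]

with_default_nodata = valid_pixels(biome, 0)    # open_ocean pixels are dropped
without_nodata = valid_pixels(biome, None)      # all six pixels are kept
```

With a default nodata of 0, every open_ocean pixel is silently treated as missing, although 0 is a valid class here.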

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com