<div dir="ltr"><div>Hi Julien<br></div><div><br></div><div>These improvements look really great! </div><div><br></div><div>Unfortunately I am on a long-term vacation and semi-retired from my GIS work. So I am sorry but I cannot really help with testing and committing the work...<br></div><div>That is why I did not respond sooner.<br></div><div><br></div><div>As the last "official" NetCDF maintainer I should at least give some comments.<br></div><div><br></div><div>Perhaps Even can test the patches and commit them, or someone could step in as a new NetCDF maintainer.</div><div><br></div><div>If possible, it would be best to split these improvements into several patches. In any case you should attach your patch(es) to new GDAL Trac ticket(s) (or to an existing ticket if a patch fixes a bug). I made significant improvements some time ago and I understand that it is hard to separate such improvements into several patches, so I don't think it is absolutely necessary to split them up, but it helps when debugging any regressions. In particular, the groups support, the I/O improvements and the NASA products support should have independent patches and tickets.</div><div><br></div><div>Regarding 10), it would be best to maintain backward compatibility, but if the issue is only with 1D datasets I don't think it matters that much.</div><div><br></div><div>Cheers,</div><div>Etienne</div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Feb 5, 2015 at 11:10 PM, Julien Demaria <span dir="ltr">&lt;<a href="mailto:Julien.Demaria@acri-st.fr" target="_blank">Julien.Demaria@acri-st.fr</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi GDAL team,<br>
<br>
I've implemented several improvements to the NetCDF driver and I would like to provide them to the community.<br>
Main goal of the changes is to add full support of NetCDF-4 including groups.<br>
NetCDF-4 is the future format of ESA Sentinel-3 products (no groups) and NASA Ocean Color team is switching their L2/L3 products to NetCDF-4 with groups (VIIRS has already switched to the new format in December).<br>
With these changes, the geolocation of NASA L2 products is automatically handled as geolocation arrays, so the products can be reprojected using gdalwarp.<br>
<br>
I validated with the autotests that nothing is broken in netcdf.py (except test 13, but see my point 5), netcdf_cf.py and hdf5.py, using both the NetCDF-3 and NetCDF-4 libraries.<br>
I've also tested the new functionalities on various NetCDF-4 files.<br>
I think the only possible regression could be for marginal cases where a file was seen directly as a dataset and is now seen as multiple subdatasets (for example if a file has only one var in the top group and has nested groups containing variables), but I think this is not very common.<br>
<br>
For the moment I have all these changes as separate commits in a local Git repository, on top of the latest gdal-1.11 branch. Let me know which changes you want and how I can provide them.<br>
<br>
Changes :<br>
<br>
1) Implement full support for NetCDF-4 groups on reading:<br>
- explore recursively all nested groups to create the subdatasets list<br>
- subdatasets in nested groups use the /group1/group2/.../groupn/var standard<br>
NetCDF-4 convention, except for variables in the root group, which do not<br>
have a leading slash for backward compatibility<br>
- when accessing a subdataset using NETCDF:$file:$path, the leading slash is optional<br>
- global attributes of each nested group are also collected in the GDAL dataset<br>
metadata, using the same convention /group1/group2/.../groupn/NC_GLOBAL#attr_name,<br>
except for the root group, which does not have a leading slash for backward compatibility<br>
- when searching for a variable containing auxiliary information on the selected subdataset,<br>
like coordinate variables or grid_mapping, we now also search in parent groups (using NCDFResolveVar).<br>
I know this is not currently specified in the CF convention, because CF does not know about groups,<br>
but it seems logical to me to support it: NetCDF-4 specifies that the dimensions of a group are<br>
shared with its nested groups, so the associated coordinate variables could be defined at the same level as their<br>
corresponding dimension.<br>
- references to coordinate variables through the "coordinates" attribute now also support absolute paths;<br>
this allows, for example, specifying coordinate variables located outside the group of the selected variable<br>
or its parents. Relative paths could be implemented if needed.<br>
This feature is used to add support for new NASA Ocean Color L2 products.<br>
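A minimal Python sketch of the naming and lookup rules described in point 1. It is only an illustration: groups are modelled as nested dictionaries instead of real NetCDF objects, and the names list_subdatasets, resolve_var and EXAMPLE_ROOT are hypothetical, not the driver's code (the driver itself uses NCDFResolveVar).<br>

```python
# Illustrative model only: a group is {"vars": [...], "groups": {name: group}}.
EXAMPLE_ROOT = {
    "vars": ["time"],
    "groups": {
        "g1": {
            "vars": ["x"],
            "groups": {"g2": {"vars": ["temp"]}},
        }
    },
}


def list_subdatasets(group, prefix=""):
    """Recursively collect variable paths from a group and its nested groups."""
    paths = []
    for var in group.get("vars", []):
        # Root-group variables keep their bare name (no leading slash).
        paths.append(f"{prefix}/{var}" if prefix else var)
    for name, child in group.get("groups", {}).items():
        paths.extend(list_subdatasets(child, f"{prefix}/{name}"))
    return paths


def resolve_var(root, group_path, var_name):
    """Search var_name in the given group, then in each parent up to the root.

    The leading slash of group_path is optional. Returns the variable path,
    or None if the variable is not found at any level.
    """
    parts = [p for p in group_path.split("/") if p]
    while True:
        group = root
        for part in parts:
            group = group["groups"][part]
        if var_name in group.get("vars", []):
            return "/" + "/".join(parts + [var_name]) if parts else var_name
        if not parts:
            return None  # reached the root without finding the variable
        parts.pop()  # move up one level and try again
```

With this model, a variable in /g1/g2 that needs the coordinate variable x finds it one level up, in /g1, and a lookup for time falls back to the root group.<br>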
<br>
2) Implement full read/write support for new NetCDF4 types NC_UBYTE, NC_USHORT, NC_UINT and NC_STRING, only if NETCDF_HAS_NC4 is defined (and only if format=NC4 for writing).<br>
Support implemented for variables and attributes.<br>
The NC_STRING type is supported for reading attributes (scalars and arrays) and is used for writing only for array attributes (scalars are still written as NC_CHAR).<br>
If NETCDF_HAS_NC4 is not defined or format!=NC4, NC_STRING array attributes are written as a single NC_CHAR string using the GDAL {v1,v2,...} convention.<br>
Add missing support for NC_BYTE in CreateBandMetadata() and NC_BYTE/SHORT in NCDFPut1DVar().<br>
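The fallback for string array attributes in point 2 can be sketched in Python as below. This only illustrates the {v1,v2,...} form; the function names are hypothetical, and values that themselves contain commas or braces are not handled here (how the driver treats them is not covered by this sketch).<br>

```python
def encode_string_array(values):
    """Join several strings into one string using the {v1,v2,...} form."""
    return "{" + ",".join(values) + "}"


def decode_string_array(text):
    """Split a {v1,v2,...} string back into a list.

    A string without the surrounding braces is treated as a single scalar.
    """
    if text.startswith("{") and text.endswith("}"):
        inner = text[1:-1]
        return inner.split(",") if inner else []
    return [text]
```

For example, the three values a, b and c would be stored in a single NC_CHAR attribute as {a,b,c}.<br>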
<br>
3) Add support for new NASA Ocean Color L2 products and ESA Sentinel-3 L1 or<br>
L2 products which use the NetCDF-4 format (with groups for NASA, see<br>
<a href="http://oceancolor.gsfc.nasa.gov/DOCS/FormatChange.html" target="_blank">http://oceancolor.gsfc.nasa.gov/DOCS/FormatChange.html</a>):<br>
- NASA products: simulate a "coordinates" variable attribute to detect CF<br>
geolocation arrays, and set bBottomUp to FALSE<br>
- ESA products: set bBottomUp to FALSE and disable warning on missing<br>
Conventions attribute<br>
<br>
4) Fix bug #4554 with a more generic solution by disabling the installation of the HDF5 atexit() cleanup routine using H5dont_atexit().<br>
The previous fix was to call GDALExit() (for the moment only defined in gdalwarp.cpp) at the end of every program, which is more painful.<br>
<br>
5) Fix implementation of GetScale/Offset to not always return pbSuccess=TRUE.<br>
Fix CopyMetadata to handle bands with only scale or offset.<br>
==> WARNING: this commit breaks the netcdf_13 autotest (which checks for scale/offset = 1.0/0.0 when no scale or offset is available), but to me it is not logical to always return pbSuccess=TRUE<br>
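The behaviour proposed in point 5 can be sketched in Python as follows. This is a simplified model, not the driver's C++ code: attributes are a plain dictionary, the function names are hypothetical, and scale_factor and add_offset are the usual CF attribute names, assumed here to be the ones consulted.<br>

```python
def get_scale(attrs):
    """Return (scale, success); success is True only if scale_factor exists."""
    if "scale_factor" in attrs:
        return float(attrs["scale_factor"]), True
    return 1.0, False  # neutral default, but reported as "not set"


def get_offset(attrs):
    """Return (offset, success); success is True only if add_offset exists."""
    if "add_offset" in attrs:
        return float(attrs["add_offset"]), True
    return 0.0, False  # neutral default, but reported as "not set"


def unpack(raw_value, attrs):
    """Apply scale and offset, whichever of the two are present."""
    scale, _ = get_scale(attrs)
    offset, _ = get_offset(attrs)
    return raw_value * scale + offset
```

A band with only a scale still unpacks correctly, because the missing offset falls back to 0.0 while its success flag stays false.<br>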
<br>
6) Optimize IReadBlock() and CheckData() handling of partial blocks in the x axis by re-using the GDAL block buffer instead of allocating a new temporary buffer for each block.<br>
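For readers unfamiliar with the term, a "partial block in the x axis" is the last block of a row when the raster width is not a multiple of the block width. The Python sketch below only shows how the valid width of such a block can be computed; it does not reproduce the buffer re-use itself, and the function names are hypothetical.<br>

```python
def blocks_per_row(block_x_size, raster_x_size):
    """Number of blocks needed to cover one row (ceiling division)."""
    return -(-raster_x_size // block_x_size)


def valid_block_width(block_x_index, block_x_size, raster_x_size):
    """Number of columns of this block that fall inside the raster."""
    x_offset = block_x_index * block_x_size
    return max(0, min(block_x_size, raster_x_size - x_offset))
```

With a raster 600 pixels wide and 256-pixel blocks, the third block of each row holds only 88 valid columns.<br>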
<br>
7) Force block size to 1 scanline for bottom-up datasets if nBlockYSize != 1 instead of raising a fatal error<br>
==> Solves a problem recently raised on the mailing list<br>
<br>
8) Implement Get/SetUnitType using the standard "units" NetCDF attribute<br>
<br>
9) Change the default block size to 256x256 instead of one scanline (this only affects files without NetCDF chunking)<br>
==> because I think this is better for random access to the data, but I'm not sure whether the community wants this change, which could impact performance<br>
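A rough way to compare the two block shapes from point 9 is to count how many blocks a read window touches. The Python sketch below is a back-of-the-envelope model under simple assumptions: every touched block is read in full, edge blocks are counted at full size, and caching is ignored. The function names are hypothetical.<br>

```python
def blocks_touched(x_off, y_off, width, height, block_x, block_y):
    """Count the blocks intersected by a window of the given size and offset."""
    first_col = x_off // block_x
    last_col = (x_off + width - 1) // block_x
    first_row = y_off // block_y
    last_row = (y_off + height - 1) // block_y
    return (last_col - first_col + 1) * (last_row - first_row + 1)


def pixels_read(x_off, y_off, width, height, block_x, block_y):
    """Pixels fetched if every touched block is read in full."""
    count = blocks_touched(x_off, y_off, width, height, block_x, block_y)
    return count * block_x * block_y
```

For a 100x100 window at offset (1000, 1000) in a raster 4096 pixels wide, scanline blocks (4096x1) touch 100 blocks and 409600 pixels, while 256x256 blocks touch 4 blocks and 262144 pixels. Reading whole rows favours scanline blocks instead, which is the performance trade-off mentioned above.<br>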
<br>
10) I've also implemented for my needs support for 1D variables by simulating 2D datasets with only one row (dimensionless variables are not supported for the moment),<br>
but this breaks backward compatibility because files containing only one variable and its associated 1D coordinate variables are now seen as multiple subdatasets...<br>
and maybe it is not the goal of GDAL to give access to variables that are not 2D rasters (but sometimes it's useful ;-) )<br>
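The idea in point 10 amounts to treating a variable of shape (n,) as a raster of shape (1, n). A minimal Python sketch, using a plain dictionary as a stand-in for a dataset and a hypothetical function name:<br>

```python
def as_single_row_raster(values):
    """Wrap a 1D sequence as a 2D raster with exactly one row."""
    row = list(values)
    return {"x_size": len(row), "y_size": 1, "rows": [row]}
```

A 1D variable with five values would then appear as a raster 5 pixels wide and 1 pixel high.<br>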
<br>
Thanks for GDAL!<br>
<br>
Julien<br>
<br>
</blockquote></div><br></div></div>