[gdal-dev] NetCDF driver improvements (including groups support)

Joaquim Luis jluis at ualg.pt
Mon Mar 30 04:30:41 PDT 2015



>
> Thanks a lot for your advice. My changes are already separated into
> different local Git commits, so I plan to deliver them as separate
> patches/tickets (I already did so for change 7, related to ticket #5291).

Hi Julien,

Please note that I had to reopen #5291 because unfortunately your patch,  
although it works, makes the reading unbearably slow.


Joaquim


>
> But I discussed this with Even, and before delivering the patches I have
> to add unit tests for each patch (especially for new features).
> Unfortunately I don't have a lot of free time to do this at the
> moment...
>
>
> Cheers,
>
> Julien
>
>
> From: Etienne Tourigny [mailto:etourigny.dev at gmail.com]
> Sent: Monday, 30 March 2015 03:52
> To: Julien Demaria
> Cc: gdal-dev at lists.osgeo.org
> Subject: Re: [gdal-dev] NetCDF driver improvements (including groups
> support)
>
>
> Hi Julien
>
>
> These improvements look really great!  
>
> Unfortunately I am on a long-term vacation and semi-retired from my GIS
> work. So I am sorry but I cannot really help with testing and committing
> the work...
>
> That is why I did not respond sooner.
>
>
> As the last "official" netcdf maintainer I should at least give some  
> comments.
>
>
> Perhaps Even can test the patches and commit them, or someone could step  
> in as a new netcdf maintainer.
>
>
> If it is possible, it would be best to separate these improvements into
> a number of patches. In any case you should add your patch(es) to a
> number of new GDAL Trac ticket(s) (or an existing ticket if it fixes a
> bug). I made significant improvements some time ago and I understand
> that it is hard to separate these improvements into several patches, so I
> don't think it is absolutely necessary to split them up, but it helps
> when debugging any regressions. In particular, the groups support, I/O
> improvements and NASA products should have independent patches and
> tickets.
>
>
> Regarding 10), it would be best to maintain backwards compatibility, but
> if the issue is only with 1D datasets I don't think it matters that much.
>
>
> Cheers,
>
> Etienne
>
>
> On Thu, Feb 5, 2015 at 11:10 PM, Julien Demaria  
> <Julien.Demaria at acri-st.fr> wrote:
>
> Hi GDAL team,
>
> I've implemented several improvements to the NetCDF driver and I would
> like to provide them to the community.
> The main goal of the changes is to add full support for NetCDF-4,
> including groups.
> NetCDF-4 is the future format of ESA Sentinel-3 products (no groups), and
> the NASA Ocean Color team is switching their L2/L3 products to NetCDF-4
> with groups (VIIRS already switched to the new format in December).
> With these changes, the geolocation of NASA L2 products is automatically
> handled as geolocation arrays, and the products can be reprojected using
> gdalwarp.
>
> I validated with the autotests that nothing is broken in netcdf.py
> (except test 13, but see my point 5), netcdf_cf.py and hdf5.py, using
> the NetCDF-3 and NetCDF-4 libraries.
> I've also tested the new functionality on various NetCDF-4 files.
> I think the only possible regression is for marginal cases where a
> file was previously seen directly as a dataset and is now seen as
> multiple subdatasets (for example if a file has only one variable in the
> top group and has nested groups containing variables), but I think this
> is not very common.
>
> For the moment I have all these changes as separate local Git commits
> on the latest gdal-1.11 branch. Let me know which changes you want and
> how I can provide them.
>
> Changes:
>
> 1) Implement full support for NetCDF-4 groups on reading:
>    - recursively explore all nested groups to create the subdatasets list
>    - subdatasets in nested groups use the standard NetCDF-4
>      /group1/group2/.../groupn/var convention, except for variables in
>      the root group, which do not have a leading slash, for backward
>      compatibility
>    - when accessing a subdataset using NETCDF:$file:$path, the leading
>      slash is optional
>    - global attributes of each nested group are also collected in the
>      GDAL dataset metadata, using the same convention
>      /group1/group2/.../groupn/NC_GLOBAL#attr_name, except for the root
>      group, which does not have a leading slash, for backward
>      compatibility
>    - when searching for a variable containing auxiliary information on
>      the selected subdataset, like coordinate variables or grid_mapping,
>      we now also search in parent groups (using NCDFResolveVar).
>      I know this is not specified at this time in the CF convention,
>      because CF does not know about groups, but it seems logical to me
>      to support this: NetCDF-4 specifies that the dimensions of a group
>      are shared with its nested groups, so the associated coordinate
>      variables could be defined at the same level as their
>      corresponding dimension.
>    - references to coordinate variables using the "coordinates"
>      attribute now also support absolute paths; this allows, for
>      example, specifying coordinate variables located outside the group
>      of the selected variable or its parents. Relative paths could be
>      implemented if needed.
>      This feature is used to add support for the new NASA Ocean Color L2
>      products.
>
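
To make the naming convention and the parent-group lookup in 1) concrete, here is a minimal Python sketch. It is not the driver's code: the nested-dict model of a file, the group and variable names, and the functions list_subdatasets and resolve_var are all hypothetical. resolve_var only mirrors the behaviour the message attributes to NCDFResolveVar (search the current group, then each parent).

```python
# Hypothetical in-memory model of a NetCDF-4 file: each group is a dict
# with "vars" (variable names) and "groups" (nested groups by name).
ROOT = {
    "vars": ["x", "y"],
    "groups": {
        "group1": {
            "vars": ["temp"],
            "groups": {"group2": {"vars": ["mask"], "groups": {}}},
        },
    },
}


def list_subdatasets(group, path=""):
    """Recursively collect variable paths from a group and its children.

    Root-group variables get no leading slash (backward compatibility);
    nested ones are named /group1/.../groupn/var.
    """
    names = [f"{path}/{v}" if path else v for v in group["vars"]]
    for name, child in group["groups"].items():
        names.extend(list_subdatasets(child, f"{path}/{name}"))
    return names


def resolve_var(root, group_path, var_name):
    """Look for var_name in group_path, then in each parent up to the root.

    The leading slash of group_path is optional. Returns the path of the
    first match, or None if the variable is not found.
    """
    parts = [p for p in group_path.split("/") if p]
    while True:
        group = root
        for p in parts:
            group = group["groups"][p]
        if var_name in group["vars"]:
            prefix = "/" + "/".join(parts) if parts else ""
            return f"{prefix}/{var_name}" if prefix else var_name
        if not parts:
            return None
        parts.pop()  # move up to the parent group


print(list_subdatasets(ROOT))
print(resolve_var(ROOT, "/group1/group2", "x"))
```

In this toy model, a coordinate variable such as x defined in the root group is found when resolving from /group1/group2, which is the case the parent-group search is meant to cover.
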
> 2) Implement full read/write support for the new NetCDF-4 types NC_UBYTE,
>    NC_USHORT, NC_UINT and NC_STRING, only if NETCDF_HAS_NC4 is defined
>    (and only if format=NC4 for writing).
>    Support is implemented for variables and attributes.
>    The NC_STRING type is supported for reading attributes (scalars and
>    arrays) and is used for writing only for array attributes (scalars
>    are still written as NC_CHAR).
>    If NETCDF_HAS_NC4 is not defined or format!=NC4, NC_STRING array
>    attributes are written as a single NC_CHAR string using the GDAL
>    {v1,v2,...} convention.
>    Add missing support for NC_BYTE in CreateBandMetadata() and
>    NC_BYTE/SHORT in NCDFPut1DVar().
>
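
The {v1,v2,...} fallback mentioned in 2) can be illustrated with a short Python sketch. The function names are hypothetical and this is only an assumption about the simplest form of the convention: the message does not describe how values that themselves contain commas or braces are handled, so the sketch does not attempt to escape them.

```python
def encode_array_attribute(values):
    """Join values into a single "{v1,v2,...}" string."""
    return "{" + ",".join(str(v) for v in values) + "}"


def decode_array_attribute(text):
    """Split a "{v1,v2,...}" string back into a list of strings.

    Text without the surrounding braces is treated as a scalar and
    returned as a one-item list.
    """
    if text.startswith("{") and text.endswith("}"):
        inner = text[1:-1]
        return inner.split(",") if inner else []
    return [text]


encoded = encode_array_attribute(["north", "south", "east"])
print(encoded)
print(decode_array_attribute(encoded))
```
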
> 3) Add support for new NASA Ocean Color L2 products and ESA Sentinel-3  
> L1 or
>    L2 products which use the NetCDF-4 format (with groups for NASA, see
>    http://oceancolor.gsfc.nasa.gov/DOCS/FormatChange.html):
>    - NASA products: simulate a "coordinates" variable attribute to  
> detect CF
>      geolocation arrays, and set bBottomUp to FALSE
>    - ESA products: set bBottomUp to FALSE and disable warning on missing
>      Conventions attribute
>
> 4) Fix bug #4554 with a more generic solution by disabling the
>    installation of the HDF5 atexit() cleanup routine using
>    H5dont_atexit().
>    The previous fix was to call GDALExit() (for the moment only defined
>    in gdalwarp.cpp) at the end of every program, which is more painful.
>
> 5) Fix the implementation of GetScale/Offset to not always return
>    pbSuccess=TRUE.
>    Fix CopyMetadata to handle bands with only a scale or only an offset.
>    ==> WARNING: this commit breaks the autotest netcdf_13 (which checks
>    for scale/offset = 1.0/0.0 if no scale or offset is available), but
>    to me it is not logical to always return pbSuccess=TRUE
>
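
The behaviour argued for in 5) can be sketched in Python as follows. This is not the patch itself: the function names are hypothetical, and the use of the CF attribute names scale_factor and add_offset as the source of the values is an assumption, since the message does not name them. The point is only that the default 1.0/0.0 is still returned, while the success flag tells the caller whether the value really came from the file.

```python
def get_scale(attrs):
    """Return (scale, success); success is False if no scale is stored."""
    if "scale_factor" in attrs:
        return float(attrs["scale_factor"]), True
    return 1.0, False  # neutral default, but flagged as "not set"


def get_offset(attrs):
    """Return (offset, success); success is False if no offset is stored."""
    if "add_offset" in attrs:
        return float(attrs["add_offset"]), True
    return 0.0, False


def copy_scale_offset(src_attrs):
    """Copy only the attributes that are actually present on the source.

    A band with only a scale (or only an offset) keeps just that one,
    instead of gaining an invented 1.0 or 0.0 on the copy.
    """
    out = {}
    scale, has_scale = get_scale(src_attrs)
    offset, has_offset = get_offset(src_attrs)
    if has_scale:
        out["scale_factor"] = scale
    if has_offset:
        out["add_offset"] = offset
    return out


print(get_scale({}))
print(copy_scale_offset({"scale_factor": 0.01}))
```
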
> 6) Optimize the handling of partial blocks along the x axis in
>    IReadBlock() and CheckData() by re-using the GDAL block buffer instead
>    of allocating a new temporary buffer for each block.
>
> 7) Force the block size to 1 scanline for bottom-up datasets if
>    nBlockYSize != 1 instead of raising a fatal error
>    ==> Solves a recent problem raised on the mailing list
>
> 8) Implement Get/SetUnitType using the standard "units" NetCDF attribute
>
> 9) Change the default block size to 256x256 instead of one scanline (only
>    affects files without NetCDF chunking)
>    ==> because I think this is better for random access to the data,
>    but I'm not sure whether the community wants this change, which could
>    impact performance
>
> 10) For my own needs I've also implemented support for 1D variables by
>     simulating 2D datasets with only one row (dimensionless variables are
>     not supported for the moment),
>     but this breaks backward compatibility because files containing
>     only one variable and its associated 1D coordinate variables are now
>     seen as multiple subdatasets...
>     and maybe it is not the goal of GDAL to give access to
>     non-2D-raster variables (but sometimes it's useful ;-) )
>
> Thanks for GDAL!
>
> Julien