[gdal-dev] Feasibility of expanding VRT schema to allow users to specify X/Y dimension for HDF data?

H. Joe Lee hyoklee at hdfgroup.org
Thu Aug 4 14:31:25 PDT 2016


Hi,

  My name is Joe Lee and I'm very interested in improving GDAL's
capability to access NASA HDF4/HDF5 data so that users can work with
HDF easily through GDAL. For example, my goal is to allow users to
translate any HDF data into GeoTIFF via gdal_translate.

  I've worked with diverse NASA HDF products and provided solutions for
visualizing the data correctly through Python/MATLAB/IDL/NCL [1]. I
know that many products other than HDF-EOS may not work well with
GDAL, because HDF is flexible and NASA data producers do not
necessarily follow the conventions that GDAL assumes.

  By default, the GDAL HDF4/HDF5 drivers use the following conventions
for unknown products.

    For HDF4 (frmts/hdf4/hdf4imagedataset.cpp):

    // Search for the starting "X" and "Y" in the names or take
    // the last two dimensions as X and Y sizes
    iXDim = nDimCount - 1;
    iYDim = nDimCount - 2;

    For HDF5 (frmts/hdf5/hdf5imagedataset.cpp):

    int     GetYIndex() const { return IsComplexCSKL1A() ? 0 : ndims - 2; }
    int     GetXIndex() const { return IsComplexCSKL1A() ? 1 : ndims - 1; }

  The above code works well as long as an unknown HDF product follows
this convention. In reality, however, HDF data can order the band, X,
and Y dimensions arbitrarily, for example:

  dset4D[XDim=360][YDim=180][Band1=2][Band2=3]
  dimindex: 0         1         2        3

  Since ndims=4, the Y index (ndims-2) becomes 2 and the X index
(ndims-1) becomes 3, so GDAL treats Band1 and Band2 as the Y and X
dimensions. In such a case, GDAL generates 360x180 bands of 2x3
images, instead of the desired 2x3 bands of 360x180 images.
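
  The arithmetic can be checked with a small standalone sketch. It
does not call GDAL; raster_layout is a hypothetical helper that only
mimics the "last two dimensions" default quoted above and, optionally,
user-supplied X/Y indices.

```python
from math import prod


def raster_layout(shape, x_dim=None, y_dim=None):
    """Return (x_size, y_size, band_count) for an N-D array shape.

    With no indices given, mimic the default convention:
    X is the last dimension and Y is the second to last.
    """
    ndims = len(shape)
    if x_dim is None:
        x_dim = ndims - 1
    if y_dim is None:
        y_dim = ndims - 2
    # Every dimension that is not X or Y is flattened into bands.
    band_count = prod(n for i, n in enumerate(shape) if i not in (x_dim, y_dim))
    return shape[x_dim], shape[y_dim], band_count


shape = (360, 180, 2, 3)  # dset4D[XDim][YDim][Band1][Band2]
default_layout = raster_layout(shape)
desired_layout = raster_layout(shape, x_dim=0, y_dim=1)
print(default_layout)  # (3, 2, 64800)
print(desired_layout)  # (360, 180, 6)
```

  The default yields 64,800 tiny bands, while explicit indices 0 and 1
yield 6 bands of 360x180 pixels.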

  Thus, I'm wondering if GDAL can expand the VRT schema so that a VRT
lets users specify the correct dimension indices, because hard-coding
the dimension order for each different NASA product in [1] is
impractical. For example, I'd like to suggest a new element such as
"SourceDimension", as shown below:

  <VRTRasterBand dataType="UInt16" band="1">
    <SimpleSource>
      <SourceFilename relativeToVRT="0">HDF4_SDS:UNKNOWN:"DATA_WITH_4D_DATASET.hdf":7</SourceFilename>
      <SourceDimension RasterXDim="0" RasterYDim="1" />
      <SourceBand>1</SourceBand>
      <SourceProperties RasterXSize="360" RasterYSize="180" DataType="UInt16" BlockXSize="360" BlockYSize="180" />
      ...
    </SimpleSource>
  </VRTRasterBand>
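
  To make the proposal concrete, here is a rough sketch of how a
reader could pick up the suggested element. SourceDimension and its
RasterXDim/RasterYDim attributes are only the proposal in this message,
not part of the existing VRT schema. read_source_dimension is a
hypothetical helper, and the parsing uses Python's standard
xml.etree.ElementTree rather than GDAL's own XML code.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the proposal; SourceDimension is NOT an existing
# VRT element, it is the element being suggested here.
VRT_FRAGMENT = """
<VRTRasterBand dataType="UInt16" band="1">
  <SimpleSource>
    <SourceFilename relativeToVRT="0">HDF4_SDS:UNKNOWN:"DATA_WITH_4D_DATASET.hdf":7</SourceFilename>
    <SourceDimension RasterXDim="0" RasterYDim="1" />
    <SourceBand>1</SourceBand>
  </SimpleSource>
</VRTRasterBand>
""".strip()


def read_source_dimension(simple_source, ndims):
    """Return (x_dim, y_dim), falling back to the last-two-dimensions default."""
    elem = simple_source.find("SourceDimension")
    if elem is None:
        return ndims - 1, ndims - 2
    x_dim = int(elem.get("RasterXDim", ndims - 1))
    y_dim = int(elem.get("RasterYDim", ndims - 2))
    # Reject indices that are out of range or point at the same dimension.
    if x_dim == y_dim or not (0 <= x_dim < ndims and 0 <= y_dim < ndims):
        raise ValueError("invalid SourceDimension indices")
    return x_dim, y_dim


band = ET.fromstring(VRT_FRAGMENT)
source = band.find("SimpleSource")
dims = read_source_dimension(source, ndims=4)
print(dims)
```

  When the element is absent, the helper falls back to the current
default, so existing VRT files would keep their present behaviour
under this assumption.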

  Once the user specifies the correct dimensions by editing the VRT,
the GDAL HDF4/HDF5 drivers can use them to read the data correctly
into GDAL's image buffer.
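
  As an illustration of what "reading the data correctly" could mean,
the sketch below extracts one band from a flat, row-major buffer given
arbitrary X/Y dimension indices. It is not the GDAL driver code. The
helpers strides_for and read_band are hypothetical, and the small shape
is a stand-in for the 360x180x2x3 example.

```python
def strides_for(shape):
    """Row-major (C order) element strides for the given shape."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def read_band(flat, shape, x_dim, y_dim, band_index):
    """Extract one band as a list of rows from a flat row-major buffer.

    band_index maps each remaining dimension index to a fixed position.
    """
    strides = strides_for(shape)
    # Offset contributed by the fixed (band) dimensions.
    base = sum(strides[d] * i for d, i in band_index.items())
    return [
        [flat[base + y * strides[y_dim] + x * strides[x_dim]]
         for x in range(shape[x_dim])]
        for y in range(shape[y_dim])
    ]


shape = (4, 3, 2, 2)  # small stand-in for [XDim][YDim][Band1][Band2]
flat = list(range(4 * 3 * 2 * 2))
band = read_band(flat, shape, x_dim=0, y_dim=1, band_index={2: 1, 3: 0})
print(len(band), len(band[0]))  # 3 rows (Y) of 4 pixels (X)
```

  Because X is the slowest-varying dimension here, each output row is
gathered with a large stride rather than copied contiguously, which is
the kind of reordering the drivers would need to do.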

  Do you think this is a sound and feasible approach to solving the
wrong X/Y dimension order problem? Or is there an existing solution
that can mitigate this problem in GDAL? The GEE project team has
experimented with the idea by creating a separate XML file [2], but I
think it's time to sync with the GDAL development team and come up
with the most elegant solution. In my opinion, VRT looks like the best
fit, and I hope the GDAL development team can give me some feedback on
this idea.

  Best Regards,



[1] http://hdfeos.org/zoo/
[2] https://wiki.earthdata.nasa.gov/pages/viewpage.action?pageId=65799385

