[gdal-dev] Feasibility of expanding VRT schema to allow users to specify X/Y dimension for HDF data?

Even Rouault even.rouault at spatialys.com
Fri Aug 5 00:59:13 PDT 2016


Frank,

From what I understand of Joe's needs, simply transposing a
2D raster wouldn't be enough. Here we would need to "transpose" pixels 
scattered through different subdatasets due to the Nd > 2 dimensionality of the 
original dataset, which would be impractical to express at the VRT level, and 
likely inefficient.
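
To make the scattering concrete, here is a minimal Python sketch (not GDAL code) that applies the "last two dimensions are Y and X" convention to Joe's 4D example. The function names default_layout and locate_in_default are hypothetical, and the row-major, 1-based band numbering over the leading dimensions is an assumption for illustration only; the drivers may enumerate bands differently.

```python
from math import prod


def default_layout(dim_sizes):
    """Raster layout when the last two dimensions are taken as Y and X."""
    *band_dims, y_size, x_size = dim_sizes
    return {"bands": prod(band_dims), "xsize": x_size, "ysize": y_size}


def locate_in_default(dim_sizes, index):
    """Map a full N-d index to (band, row, col) under the default convention.

    Assumption: leading dimensions are flattened row-major into a
    1-based band number.
    """
    *band_dims, _y_size, _x_size = dim_sizes
    *band_idx, row, col = index
    band = 0
    for size, i in zip(band_dims, band_idx):
        band = band * size + i
    return band + 1, row, col


# dset4D[XDim=360][YDim=180][Band1=2][Band2=3]
dims = (360, 180, 2, 3)
layout = default_layout(dims)

# Neighbouring pixels of the desired 360x180 image for (Band1=0, Band2=0)
first = locate_in_default(dims, (0, 0, 0, 0))
next_x = locate_in_default(dims, (1, 0, 0, 0))
last = locate_in_default(dims, (359, 179, 1, 2))
```

Under these assumptions the layout has 64800 bands of 3x2 pixels, and moving one step along XDim jumps 180 bands ahead, so each desired 360x180 image takes a single pixel from each of 64800 different bands.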

Something I've thought about would be to make Nd > 2 rasters native objects at 
the GDAL level, but this would likely have deep implications for the code base 
and should be considered carefully, and it would be of interest only to a limited 
set of drivers (netCDF, HDF4, HDF5, Rasdaman).

Even

> Brian / Even,
> 
> Certainly it is desirable for the HDF (and perhaps other super
> flexible formats like netcdf) to support an open option to select
> alternative axes.  But the ability to transpose a dataset could also
> be quite valuable in the VRT driver to "fix" any input transposed
> dataset.
> 
> I'm also not entirely certain why one couldn't supply an appropriately
> transposed geotransform to accomplish something similar.  This could
> be done without any code changes in the existing VRT format.
> 
> Best regards,
> Frank
> 
> On Thu, Aug 4, 2016 at 3:32 PM, Even Rouault <even.rouault at spatialys.com> wrote:
> > On Thursday 04 August 2016 16:31:25 H. Joe Lee wrote:
> >> Hi,
> >> 
> >>   My name is Joe Lee and I'm very interested in improving GDAL's
> >> capability to access NASA HDF4/HDF5 data so that users can work with
> >> HDF easily through GDAL. For example, my goal is to allow users to
> >> translate any HDF data into GeoTIFF via gdal_translate.
> >> 
> >>   I've worked with diverse NASA HDF products and provided solutions for
> >> visualizing data correctly through Python/MATLAB/IDL/NCL [1] and I
> >> know that many products, other than HDF-EOS, may not work well with
> >> GDAL because HDF is flexible and NASA data producers do not
> >> necessarily follow the conventions that GDAL uses.
> >> 
> >>   By default, the GDAL HDF4/HDF5 drivers use the following convention
> >> for unknown products.
> >> 
> >>   For HDF4 (frmts/hdf4/hdf4imagedataset.cpp):
> >> 
> >>     // Search for the starting "X" and "Y" in the names or take
> >>     // the last two dimensions as X and Y sizes
> >>     iXDim = nDimCount - 1;
> >>     iYDim = nDimCount - 2;
> >> 
> >>   For HDF5 (frmts/hdf5/hdf5imagedataset.cpp):
> >> 
> >>     int     GetYIndex() const { return IsComplexCSKL1A() ? 0 : ndims - 2; }
> >>     int     GetXIndex() const { return IsComplexCSKL1A() ? 1 : ndims - 1; }
> >>  
> >>   The above code works well as long as an unknown HDF product follows
> >> the above convention. However, in reality, HDF data can have an
> >> arbitrary order of Band, X and Y dimensions, like this:
> >> 
> >>   dset4D[XDim=360][YDim=180][Band1=2][Band2=3]
> >>   dimindex: XDim=0, YDim=1, Band1=2, Band2=3
> >> 
> >>   Since ndims=4, ndims-2 becomes 2 and ndims-1 becomes 3. In such a
> >> case, GDAL generates 360x180 bands of 2x3 images, instead of the
> >> desired 2x3 bands of 360x180 images.
> >> 
> >>   Thus, I'm wondering if GDAL can expand the VRT schema so that VRT
> >> allows users to specify the correct dimension index, because
> >> specifying the dimension order for each different NASA product in [1]
> >> is impractical. For example, I'd like to suggest a new tag like
> >> "SourceDimension", as below:
> >> 
> >>   <VRTRasterBand dataType="UInt16" band="1">
> >>     <SimpleSource>
> >>       <SourceFilename relativeToVRT="0">HDF4_SDS:UNKNOWN:"DATA_WITH_4D_DATASET.hdf":7</SourceFilename>
> >>       <SourceDimension RasterXDim="0" RasterYDim="1" />
> >>       <SourceBand>1</SourceBand>
> >>       <SourceProperties RasterXSize="360" RasterYSize="180" DataType="UInt16" BlockXSize="360" BlockYSize="180" />
> >>       ...
> >>     </SimpleSource>
> >>   </VRTRasterBand>
> >> 
> >>   Once the user specifies the correct dimensions by editing the VRT,
> >> it can be used by the GDAL HDF4/HDF5 drivers, and the HDF drivers read
> >> the data correctly into GDAL's image buffer.
> >> 
> >>   Do you think it's the right and a feasible approach to solve the
> >> wrong X/Y dimension order problem? Or do you have any other existing
> >> solution that can mitigate this problem in GDAL? The GEE project team
> >> has experimented with the idea by creating another separate XML file
> >> [2] but I think it's time to sync with the GDAL development team and
> >> come up with the most elegant solution. In my opinion, VRT looks best
> >> and I hope the GDAL development team can give me some feedback on this
> >> idea.
> > 
> > Joe,
> > 
> > I would rather suggest adding open options to the drivers and passing
> > them with the existing VRT OpenOptions element, rather than adding a new
> > element in the VRT that would be specific to a few drivers:
> > 
> >   <SimpleSource>
> >     <SourceFilename relativeToVRT="0">HDF4_SDS:UNKNOWN:"DATA_WITH_4D_DATASET.hdf":7</SourceFilename>
> >     <OpenOptions>
> >       <OOI key="RASTERXDIM">0</OOI>
> >       <OOI key="RASTERYDIM">1</OOI>
> >     </OpenOptions>
> >     <SourceBand>1</SourceBand>
> >     <SourceProperties RasterXSize="360" RasterYSize="180" DataType="UInt16" BlockXSize="360" BlockYSize="180" />
> >     ...
> >   </SimpleSource>
> > 
> > Which is equivalent to:
> > 
> > gdalinfo HDF4_SDS:UNKNOWN:"DATA_WITH_4D_DATASET.hdf":7 -oo RASTERXDIM=0 -oo RASTERYDIM=1
> > 
> > 
> > Even
> > 
> > --
> > Spatialys - Geospatial professional services
> > http://www.spatialys.com
> > _______________________________________________
> > gdal-dev mailing list
> > gdal-dev at lists.osgeo.org
> > http://lists.osgeo.org/mailman/listinfo/gdal-dev

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

