[gdal-dev] Re: HDF-EOS vs. GDAL: order of dimensions

Ivan Shmakov ivan at theory.asu.ru
Wed Dec 19 11:59:55 EST 2007


>>>>> Lucena, Ivan <ivan.lucena at pmldnet.com> writes:

 >> Personally I do not like this way. We already discussed the idea of
 >> separate option vs. filename encoding in regard to OGROpen() call
 >> and it also applies here. Though it seems I have not succeeded in
 >> convincing everyone that we should add open parameters to our open
 >> calls.

[...]

 >> Also I think it will be quite hard (or even impossible) to solve all
 >> HDF issues without user control.

 > You might be right, the dimension information on the subdataset
 > string is a little bit of a stretch and it might not be the best
 > solution for every situation. But I also agree with Frank about the
 > open options.

 > Use-case scenario: Right now I am facing problems with ImageServer
 > (it uses GDAL to read HDF), and the only solution would be to get the
 > latest updates to the HDF driver (or make my own changes to it),
 > recompile, and place it as a GDAL plugin on the path. That is not
 > very user friendly :)

	My opinion is that the ability to query the driver about the
	options available for a particular dataset, and to specify any
	of them at dataset opening time /without/ any string formatting
	or parsing, would be a handy addition to the GDAL API.

	However, are there any reasons to actually disallow options to
	be specified in the (sub)dataset name?
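	As a concrete illustration, options could be appended to the
	subdataset name as ``key=value'' components.  The syntax and
	the parsing sketched below are purely hypothetical -- GDAL
	defines no such convention at present:

```python
# Hypothetical sketch: peeling "key=value" options off the end of a
# subdataset name.  The separator and the option names are invented
# for illustration only, not an existing GDAL convention.

def split_subdataset_name(name):
    """Split 'HDF4_EOS:file.hdf:Grid:Field:dimorder=yx' into the plain
    dataset name and a dict of the trailing options."""
    parts = name.split(":")
    options = {}
    # Peel off trailing 'key=value' components.
    while parts and "=" in parts[-1]:
        key, _, value = parts[-1].partition("=")
        options[key] = value
        parts.pop()
    return ":".join(parts), options

name, opts = split_subdataset_name(
    "HDF4_EOS:modis.hdf:MODIS_Grid:Cloud_Mask:dimorder=yx:scale=0.01")
# name → 'HDF4_EOS:modis.hdf:MODIS_Grid:Cloud_Mask'
# opts → {'scale': '0.01', 'dimorder': 'yx'}
```

	An obvious caveat of such an encoding is that file names
	containing ``='' would need quoting or escaping.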

 > So, may I raise another suggestion if you don't mind?

 > What if we create an hdf-knowledge-base: a simple XML helper file
 > with details about particular HDF-EOS products?

 > That could contain a list of products and dataset names and the
 > particularities about them. So, a node in your XML could have:

 > product="MODISXYZ"
 > dataset="CLOUDSDATA"
 > processdate="20071001"

 > and then a set of particular issues about that dataset. That would
 > include not only dimension information but any other issue, like "how
 > to break down the bands of 16-bit Quality data?", etc.

 > The file could be deployed at <gdal>\data, but users should be able
 > to create their own "hdf-helper" files and place them in the same
 > folder as their data or on some kind of user path.

	I agree with this proposal.  It's unclear to me, however, how
	the dataset could be identified so that the database could be
	queried.  I see two ways of doing this:

	* relying on some identification string(s) in the HDF-EOS
	  metadata;

	* relying on the names, dimensions or order of the data sets.

	Either way, it seems to amount to guessing, as the actual
	product format may vary over time.

 > Anyway, that is just a humble suggestion. I understand the
 > frustration of dealing with HDF. The format is great, the hdf library
 > is good, and the GDAL HDF4 driver is also good, but there is a lack
 > of documentation on the files themselves, so what usually happens is
 > that we, programmers, read the documentation on the web and hard code
 > it.

	Actually, I disagree with this (mildly).

 > The question is: can we cover all the issues about HDF4 and hard code
 > them? If that is possible, forget about it.

	I agree with Andrey's opinion about the HDF-EOS-related issues
	above.  I certainly would prefer a solution which allows the
	settings that cannot be reliably retrieved from an HDF-EOS file
	to be overridden manually.

 > Frank, Can you imagine using this "helper files" solution to other
 > situation and drivers?

	Indeed, I've been thinking about a kind of ``helper files''
	(though of a somewhat different kind) to allow specified
	datasets to be ``stacked upon each other along the band axis'',
	forming a dataset with the same spatial dimensions and a larger
	number of bands...

	BTW, setting GCPs and specifying the projection fits quite
	neatly into the ``helper file'' framework.  I'm sure there are
	many other applications for them.

	Given the above, we stumble upon the fact that some of these
	tasks are already implemented by either the command-line
	interface or the dataset specifier parser.  How should these
	facilities relate to each other, were they extended to the
	point of ``overlapping'' one another?

	I believe that there should be a sort of generic ``options
	database'' inside GDAL.  If the values of the options for a
	particular dataset were allowed:

	* to come from the dataset specifier,

	* to be passed with an ::OpenOptions () call,

	* to be acquired from a helper file (XML or not),

	it would only be to the benefit of GDAL users -- both GDAL-based
	software developers and the users of software based on GDAL.
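	As a sketch of how such an ``options database'' might merge
	those three sources -- the function name and the precedence
	order below are assumptions for illustration, not an existing
	GDAL API:

```python
# Hypothetical sketch of the "options database" idea: options for one
# dataset merged from three sources, with later sources overriding
# earlier ones.  The precedence order shown (open call beats helper
# file beats dataset specifier) is an assumption, not GDAL behaviour.

def resolve_options(from_name, from_helper, from_open_call):
    """Merge option dicts; an explicit ::OpenOptions()-style call wins
    over the helper file, which wins over the dataset specifier."""
    merged = {}
    for source in (from_name, from_helper, from_open_call):
        merged.update(source or {})
    return merged

opts = resolve_options(
    {"dimorder": "xy"},                   # parsed from the subdataset name
    {"dimorder": "yx", "scale": "0.01"},  # from a helper file
    {"scale": "1.0"})                     # passed at open time
# → {'dimorder': 'yx', 'scale': '1.0'}
```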
