[gdal-dev] GDAL Raster Attribute Tables.

Sam Gillingham gillingham.sam at gmail.com
Thu May 2 15:23:48 PDT 2013


Hi Evan,

Response to your points below:

1) Good point about bool not being available in C. How about making
this an 8 bit int instead? The idea was to be able to express
True/False values without using a whole integer, thus saving space. We
included it in case future drivers support it.

2) I was meaning the CSL* family of functions from cpl_string.h since
the CPLString class won't be available from C so the 'char
**papszStrList' parameter will be manipulated with these functions.
Your suggestion of only the list being allocated and ValuesIO()
allocating the strings themselves make sense and will be incorporated
in the RFC.

3) Yes this would be cleaner - I will modify the RFC to include
GDALDefaultRasterAttributeTable.

4,5) regarding Clone() and Serialize(). Maybe these functions should
be unsupported for GDALRasterAttributeTable? We are trying to get the
RAT support on a similar footing to bands (eg. reading/writing the
data on demand and not holding all data in memory, plus being able to
access whole chunks at once). AFAIK the Band object does not support
Clone and Serialize and our feeling is that neither should the RAT
class. Perhaps the proposed derived GDALDefaultRasterAttributeTable
would have Clone() and Serialize() to support existing code (and it
can because it holds all the data in memory)?

6) Yes this is a good opportunity to remove unimplemented functions.

7) The wording is incorrect and will be changed.

8) Agreed.

We will make the changes to the RFC.

Sam and Pete


------------------------------

Message: 2
Date: Wed, 1 May 2013 11:52:30 +0200
From: Even Rouault <even.rouault at mines-paris.org>
To: gdal-dev at lists.osgeo.org
Subject: Re: [gdal-dev] GDAL Raster Attribute Tables.
Message-ID: <201305011152.30779.even.rouault at mines-paris.org>
Content-Type: Text/Plain;  charset="utf-8"

Le mercredi 01 mai 2013 01:29:30, Peter Bunting a ?crit :
> Hi All,
>
> Sam and I have prepared an RFC for our proposal for improving raster
> attribute table (RAT) support and put it on the trac.
>
> http://trac.osgeo.org/gdal/wiki/rfc40_enhanced_rat_support
>
> We'd be keen to here feedback on our proposal.
>

Peter,

Here are my observations and questions :

* Is the boolean data type really necessary ? There will be a difficulty with
the associated C bindings since 'bool' isn't a C type, but a C++ one.

* I'm not sure about what you meant with "The CSLString type will be used for
reading and writing strings". You probably meant the CPLString C++ class right
? I'm not sure how this would fly with ValuesIO(GDALRWFlag eRWFlag, int iField,
int iStartRow, int iLength, char **papszStrList). This is probably just an
implementation detail.
Related to that, when eRWFlag == GF_Read, you will probably need to document
that only the papszStrList array must be allocated with a size of iLength *
sizeof(char*) bytes, and that the papszStrList[i] values that will be set in
it will be allocated with the CPL allocation routines. So users will be
responsible to clear the values with CPLFree().

* For the HFA specialized implementation, I guess you would override almost
all methods of GDALRasterAttributeTable, and that you wouldn't use the
"std::vector<
GDALRasterAttributeField> aoFields" member variable for example
since you don't want the size of anValues/adfValues/aosValues arrays of
GDALRasterAttributeField to go out of control.
So perhaps a better design would be to completely remove the private members
of GDALRasterAttributeTable to make it an abstract class. The current
implementation, with the private member variables, would go to a
GDALDefaultRasterAttributeTable that would extend GDALRasterAttributeTable
(like HFARasterAttributeTable would do).
This would have of course some backward compatibility implications, especially
for C++ users, since GDALRasterAttributeTable would be no longer instanciable
(GDALDefaultRasterAttributeTable would), but I don't believe this would cause
much churn in existing user code bases, and would have only some limited
implication in GDAL code base. The C function GDALCreateRasterAttributeTable()
could be preserved and return an instance of GDALDefaultRasterAttributeTable.

* If your purpose is to address huge RATs, I'm not sure about what we can do
with the Serialize() method, which is for example used by the PAM mechanism
when translating datasets to formats that don't support natively RATs. Perhaps
in the case the RAT is detected to be large to be reasonably serialized (for
example 10 MB ?), this method could just return NULL (some check should be
done in the existing GDAL code base to see if we are robust to NULL being
returned. I can see at least that CPLAddXMLChild( psTree, psPam->poDefaultRAT-
>Serialize() ) at line 209 of gdalpamrasterband.cpp would need some care)

* In a similar situation (huge RAT), I'm not sure what the Clone() method
would do. GDALPamRasterBand::SetDefaultRAT() uses Clone() on that passed RAT
object.http://trac.osgeo.org/gdal/attachment/ticket/4903/4903.4.diff would do
the same on VRTRasterBand::SetDefaultRAT(). IdrisiRasterBand::SetDefaultRAT()
does that too. Perhaps that HFARasterAttributeTable::Clone() could return a
self-contained immutable copy of itself (i.e. it could survive the destruction
of the HFADataset object of the source RAT). Perhaps only in the huge RAT
case. In other cases, it could return a GDALDefaultRasterAttributeTable object
with all columns and rows in memory, that could be potentially modified by
calling code. Hum, that seems to be complicated...

In the SetDefaultRAT() use case, the use of Clone() is not meant at getting a
free-standing copy of the source object that can be altered, but rather to be
independant from the life cycle of the passed object. Would be probably better
addressed with some shared pointer mechanism, but the implication would be
really too huge and go well beyond changes in RAT management...

* Not linked with your proposal, but I noticed that gdal_rat.h exposes methods
that are not implemented, like GetRowMin(), GetRowMax(), GetColorOfValue().
Your work could be the opportunity to clean that by removing them.

* Just a detail: In the current state of your proposal, "The proposed
additions will have an impact on C binary compatibility because they change
the API" would be better rephrased as "The proposed additions will extend the
C API" (there's no compatibility problem since it is backward compatible).

* I'd like to see a mention of extending the Python autotest suite to test the
new API, both for their default implementation and specialized implementation
in the HFA driver.

Best regards,

Even


More information about the gdal-dev mailing list