[gdal-dev] Draft GDAL/OGR class hierarchy for GDAL 2.0

Dmitriy Baryshnikov bishop.dev at gmail.com
Thu Mar 27 12:13:56 PDT 2014


Hi Even,

I think that we can discuss the starting point for 2.0.
You start with idea "...to "refactor" a code base of 1.2 million lines.."
Maybe we have to start from white list - create ideal (brilliant) set of 
classes and create a well-designed foundation for GDAL. Also rewrite 
several wide used drivers (shp, tab, PostGIS, etc.). Than this will 
works very good, begin some kind of hackathon for rewriting overs.
Also we can add some testing framework inside GDAL (e.g. googletest) and 
start empty repo on github for GDAL 2.0.
The GDAL can be enhanced, we can move common things that use several 
drivers to core or replace some core thing with outer libraries. CMake 
maybe used as build system.
Yes this is deep refactoring, but maybe this is need for GDAL 2.0?

Best regards,
     Dmitry

27.03.2014 14:24, Even Rouault пишет:
> Hi,
>
>> Hi all,
>>
>> MY opinion is that's not a really good way to count how many classes you
>> will have and trying to have minimal classes.
>>
>> Drawing a model with UML is not easy : it's not easy to have a good model
>> from the beginning, it's not easy to name all members and methods
>> correctly, and the worst is inheritance.
>> You can think you spend too much time to make the model. But you can't say
>> if you will spend less or much time to change code because your model was
>> too simple.
> The issue is that I'm trying to "refactor" a code base of 1.2 million lines of
> C/C++ code (not mentionning the 140 000 lines of Python tests), and with 211
> existing drivers. So, given that nobody will probably ever want to fund the
> effort, it is completely excluded to have to edit each of those drivers in a non
> automated way...
> The end solution will necessary be a compromise. I would also like the existing
> C API to continue working as much as possible.
>
>> Because I don't know all the model of Gdal and the model can be "reset" I
>> have some questions :
>>
>> 1) Why GDALMajorObject not inherit of GDALIMajorObject ?
> It was just an omission in the UML diagram. My C++ sketch had this inheritance.
>
>> 2) Why all classes inherit (directly or not) of GDALMajorObject ?
> That's how it is done currently.
>
>> What do
>> you store in it ?
> It contains a list of list of strings, or in more concrete terms, metadata
> domains that contain metadata items.
>
>> Is it possible to put it as a member in classes ?
> That could have been a way of doing it, yes. Always the debate composition vs
> inheritance.
>
>> Have
>> you consider templates ?
> Actually, no. I tend to avoid them unless they are an obvious solution.
>
>> 3) An interface for Drivers ? geName(), getLongName(), getDescription() ?
>> Or maybe abstract ...
>>
>> 4) What is GDALDataset in your diagram ?
> It is the base class that a driver that supports both raster and vector would
> have to subclass.
>
>> 5) Where is the link between Driver (Gdal/Ogr) and Dataset (Raster/Vector)
>> ? Member, templates ?
> A Driver instanciates a Dataset by opening an existing one, or by creating a new
> one. But the Dataset object itself is a free-standing objects that is only owned
> by user code, and not by the Driver.
>
> For now, I've dropped my crazy UML diagram that nobody would be able to
> understand, even if I managed to make it work, and I'm pursuing my attempt
> just merging OGRDataSource inside GDALDataset. And make OGRDriver and
> OGRSFDriverRegistrar mostly disappear, or just create a minimized version of
> them, as compatibility layer for existing drivers, that forward the real work to
> GDALDriver and GDALDriverManager.
>
> Even
>
>> Florent
>>
>>
>> 2014-03-25 21:14 GMT+01:00 Even Rouault <even.rouault at mines-paris.org>:
>>
>>> Selon Etienne Tourigny <etourigny.dev at gmail.com>:
>>>
>>>> I also (respectfully) think that there are too many classes (mostly the
>>>> abstract classes and interfaces) which make it a bit hard to understand.
>>>>
>>>> I like Vincent's second suggestion to have 3 base Dataset classes
>>>> (GDALRasterDataset, GDALVectorDataset, GDALHybridDataset) which inherit
>>>> from GDALDataset, and an enum (vector,raster,hybrid) to easily query
>>> which
>>>> type the Dataset is (or which type its driver supports). You then
>>> subclass
>>>> the relevant Dataset class and it *should* work auto-magically for
>>> rasters.
>>>>   For example, a given Dataset which subclasses GDALDataset would be
>>>> migrated as  a subclass of GDALRasterDataset, which would imply that it
>>>> does not have a vector dataset. If the implementation changes to allow
>>>> vector, then you would change its parent and implement relevant vector
>>>> methods.
>>>>
>>>> My view on this is KISS with a minimum number of classes...
>>> Yes, the practice shows that the UML brainstorming leads to nowhere (at
>>> least
>>> with my current limited knowledge of C++ multi-inheritance)... So actually
>>> instead of creating new classes, I think that eventually there will be less
>>> classes... I'm not even sure about the interest of having subclasses of
>>> GDALDataset to distinguish between raster only, vector only or
>>> raster+vector. As
>>> I wrote previously that could be done as a driver metadata instread.
>>>
>>> So, for now, I've just "imported" the usefull methods of OGRDataSource into
>>> GDALDataset, and made a few changes here or there so that it works. Well
>>> except
>>> ogr_refcount.py, because of a funny discovery was that the ref counting of
>>> GDALDataset and OGRDataSource is (was) not the same. Starts at 1 for
>>> GDALDataset
>>> and 0 for OGRDataSource...
>>>
>>> The temporary result is in the following branch:
>>> https://github.com/rouault/gdal-1/commits/unification
>>>
>>> OGRDataSource is kept with just that :
>>>
>>> class CPL_DLL OGRDataSource : public GDALDataset
>>> {
>>> public:
>>>                          OGRDataSource();
>>>
>>>      virtual const char  *GetName() = 0;
>>>
>>>      static void         DestroyDataSource( OGRDataSource * );
>>>
>>>      OGRSFDriver        *GetOGRDriver() const;
>>>      void                SetOGRDriver( OGRSFDriver *poDriver );
>>>      virtual     const char* GetDriverName();
>>>
>>> protected:
>>>      friend class OGRSFDriverRegistrar;
>>>
>>>      OGRSFDriver        *m_poOGRDriver;
>>>
>>> };
>>>
>>> And it could eventually completely disappear (or maybe just remain as a
>>> convenience class if we don't want to touch existing drivers) if I manage
>>> to
>>> merge OGRSFDriver into GDALDriver, and get rid of OGRSFDriverRegistrar.
>>>
>>> Even
>>>
>>>> cheers
>>>> Etienne
>>>>
>>>>
>>>>
>>>> On Tue, Mar 25, 2014 at 9:13 AM, Vincent Mora
>>>> <vincent.mora at oslandia.com>wrote:
>>>>
>>>>> On 25/03/2014 11:44, Even Rouault wrote:
>>>>>
>>>>>> Selon Vincent Mora <vincent.mora at oslandia.com>:
>>>>>>
>>>>>>   On 24/03/2014 21:46, Even Rouault wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> "Release soon, release early", so for people who like UML diagrams
>>>>>>>> (there
>>>>>>>>
>>>>>>> is
>>>>>>>
>>>>>>>> also a prototype of C++ classes for those who don't like UML very
>>> much),
>>>>>>> here's
>>>>>>>
>>>>>>>> a blog entry with the outcome of my thoughts for a possible
>>>>>>>> re-organisation
>>>>>>>>
>>>>>>> of
>>>>>>>
>>>>>>>> the GDAL/OGR class hierarchy
>>>>>>>>
>>>>>>>>
>>>>>>>>   http://erouault.blogspot.ca/2014/03/draft-gdalogr-class-
>>>>>> hierarchy-for-gdal.html
>>>>>>
>>>>>>> I don't understand the need for HandleRasterData() and
>>>>>>> HandleVectorData().
>>>>>>>
>>>>>>> Isn't inheriting from GDALEmptyRasterDataset or
>>> GDALEmptyVectorDataset
>>>>>>> sufficient to convey the information that the class doesn't handle
>>> those
>>>>>>> kind of datasets.
>>>>>>>
>>>>>>> A call to GetLayerCount()/GetRasterCount() from the class user is
>>>>>>> sufficient to say "no vector/raster data here", and a dynamic _cast
>>> to
>>>>>>> GDALEmptyVectorDataset/GDALEmptyRasterDataset is even clearer to me,
>>> so
>>>>>>> HandleRasterData()/HandleVectorData() would be a third way to tell
>>> the
>>>>>>> same thing.
>>>>>>>
>>>>>> Actually, when thinking more about this, this kind of information
>>> should
>>>>>> be at
>>>>>> the driver level, and not at the dataset level. The idea is that there
>>>>>> are use
>>>>>> cases where you want to know if a driver can handle only vector, only
>>>>>> raster, or
>>>>>> both. For example if you still want to have separate Open dialog
>>> boxes for
>>>>>> raster or vector.
>>>>>>
>>>>>>> Also I'm not sure about the names GDALAbstract* since those are
>>> partial
>>>>>>> implementations.
>>>>>>>
>>>>>> I know I'm not very good at naming. Do you have an alternative
>>> proposal ?
>>>>>> The
>>>>>> issue is that with my draft we end up with a lot of classes and
>>>>>> interfaces, so
>>>>>> it is not obvious to find a good name to reflect their content.
>>>>>>
>>>>> Partial sound ok to me, but I not so good at naming  ?
>>>>>
>>>>> Considering the number of classes, I gave it some thoughts this morning
>>>>> and, like Dmitry, thought of merging the Abtract (Partial) into the
>>>>> interface before realizing that, even if it works (you can overload de
>>>>> default function in Empty) it's a bit ugly and I ended up preferring
>>> your
>>>>> solution.
>>>>>
>>>>> An altenative would be to have a Dataset with both aspects and provide
>>>>> three partial specialisations: one for vector (it will behave like
>>> empty
>>>>> raster) one for raster, and one for both. Code duplications could be
>>>>> avoided by implementing protected member functions in the Dataset
>>> class and
>>>>> simply calling them in the implementations of partial specialization.
>>> This
>>>>> solution avoid the diamond shaped inheritance diagram of hell :)
>>>>>
>>>>>     Even
>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> gdal-dev mailing list
>>>>> gdal-dev at lists.osgeo.org
>>>>> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>>>>>
>>>
>>> _______________________________________________
>>> gdal-dev mailing list
>>> gdal-dev at lists.osgeo.org
>>> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>>>
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>
>



More information about the gdal-dev mailing list