<div dir="ltr">Dmitry - I am not sure that a wholescale refactor is the way to go - many formats will be dropped because quite a few do not have active maintainers, and development is mostly on a volunteer basis (correct me if I am wrong, Even).<div>
<br></div><div>As an example, the netcdf driver has been mostly maintained by myself for the past few years (with the help of a few users and devs), and it has been about 99% voluntary work. And I will be unable to work on it for the next year, so it probably would not be upgraded if the API changes drastically. And I am sure there are quite a few drivers in the same situation.<br>
<div><br></div><div>It there a way to think of a new design, while retaining backwards compatibility for the drivers that would not be updated? Especially for the raster drivers, which would probably need less changes.</div>
</div><div><br></div><div>cheers</div><div>Etienne</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Mar 27, 2014 at 4:13 PM, Dmitriy Baryshnikov <span dir="ltr"><<a href="mailto:bishop.dev@gmail.com" target="_blank">bishop.dev@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Even,<br>
<br>
I think that we can discuss the starting point for 2.0.<br>
You start with idea "...to "refactor" a code base of 1.2 million lines.."<br>
Maybe we have to start from white list - create ideal (brilliant) set of classes and create a well-designed foundation for GDAL. Also rewrite several wide used drivers (shp, tab, PostGIS, etc.). Than this will works very good, begin some kind of hackathon for rewriting overs.<br>
Also we can add some testing framework inside GDAL (e.g. googletest) and start empty repo on github for GDAL 2.0.<br>
The GDAL can be enhanced, we can move common things that use several drivers to core or replace some core thing with outer libraries. CMake maybe used as build system.<br>
Yes this is deep refactoring, but maybe this is need for GDAL 2.0?<br>
<br>
Best regards,<br>
Dmitry<br>
<br>
27.03.2014 14:24, Even Rouault пишет:<div class="HOEnZb"><div class="h5"><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi,<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi all,<br>
<br>
MY opinion is that's not a really good way to count how many classes you<br>
will have and trying to have minimal classes.<br>
<br>
Drawing a model with UML is not easy : it's not easy to have a good model<br>
from the beginning, it's not easy to name all members and methods<br>
correctly, and the worst is inheritance.<br>
You can think you spend too much time to make the model. But you can't say<br>
if you will spend less or much time to change code because your model was<br>
too simple.<br>
</blockquote>
The issue is that I'm trying to "refactor" a code base of 1.2 million lines of<br>
C/C++ code (not mentionning the 140 000 lines of Python tests), and with 211<br>
existing drivers. So, given that nobody will probably ever want to fund the<br>
effort, it is completely excluded to have to edit each of those drivers in a non<br>
automated way...<br>
The end solution will necessary be a compromise. I would also like the existing<br>
C API to continue working as much as possible.<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Because I don't know all the model of Gdal and the model can be "reset" I<br>
have some questions :<br>
<br>
1) Why GDALMajorObject not inherit of GDALIMajorObject ?<br>
</blockquote>
It was just an omission in the UML diagram. My C++ sketch had this inheritance.<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
2) Why all classes inherit (directly or not) of GDALMajorObject ?<br>
</blockquote>
That's how it is done currently.<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
What do<br>
you store in it ?<br>
</blockquote>
It contains a list of list of strings, or in more concrete terms, metadata<br>
domains that contain metadata items.<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Is it possible to put it as a member in classes ?<br>
</blockquote>
That could have been a way of doing it, yes. Always the debate composition vs<br>
inheritance.<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Have<br>
you consider templates ?<br>
</blockquote>
Actually, no. I tend to avoid them unless they are an obvious solution.<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
3) An interface for Drivers ? geName(), getLongName(), getDescription() ?<br>
Or maybe abstract ...<br>
<br>
4) What is GDALDataset in your diagram ?<br>
</blockquote>
It is the base class that a driver that supports both raster and vector would<br>
have to subclass.<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
5) Where is the link between Driver (Gdal/Ogr) and Dataset (Raster/Vector)<br>
? Member, templates ?<br>
</blockquote>
A Driver instanciates a Dataset by opening an existing one, or by creating a new<br>
one. But the Dataset object itself is a free-standing objects that is only owned<br>
by user code, and not by the Driver.<br>
<br>
For now, I've dropped my crazy UML diagram that nobody would be able to<br>
understand, even if I managed to make it work, and I'm pursuing my attempt<br>
just merging OGRDataSource inside GDALDataset. And make OGRDriver and<br>
OGRSFDriverRegistrar mostly disappear, or just create a minimized version of<br>
them, as compatibility layer for existing drivers, that forward the real work to<br>
GDALDriver and GDALDriverManager.<br>
<br>
Even<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Florent<br>
<br>
<br>
2014-03-25 21:14 GMT+01:00 Even Rouault <<a href="mailto:even.rouault@mines-paris.org" target="_blank">even.rouault@mines-paris.org</a>><u></u>:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Selon Etienne Tourigny <<a href="mailto:etourigny.dev@gmail.com" target="_blank">etourigny.dev@gmail.com</a>>:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I also (respectfully) think that there are too many classes (mostly the<br>
abstract classes and interfaces) which make it a bit hard to understand.<br>
<br>
I like Vincent's second suggestion to have 3 base Dataset classes<br>
(GDALRasterDataset, GDALVectorDataset, GDALHybridDataset) which inherit<br>
from GDALDataset, and an enum (vector,raster,hybrid) to easily query<br>
</blockquote>
which<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
type the Dataset is (or which type its driver supports). You then<br>
</blockquote>
subclass<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
the relevant Dataset class and it *should* work auto-magically for<br>
</blockquote>
rasters.<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
For example, a given Dataset which subclasses GDALDataset would be<br>
migrated as a subclass of GDALRasterDataset, which would imply that it<br>
does not have a vector dataset. If the implementation changes to allow<br>
vector, then you would change its parent and implement relevant vector<br>
methods.<br>
<br>
My view on this is KISS with a minimum number of classes...<br>
</blockquote>
Yes, the practice shows that the UML brainstorming leads to nowhere (at<br>
least<br>
with my current limited knowledge of C++ multi-inheritance)... So actually<br>
instead of creating new classes, I think that eventually there will be less<br>
classes... I'm not even sure about the interest of having subclasses of<br>
GDALDataset to distinguish between raster only, vector only or<br>
raster+vector. As<br>
I wrote previously that could be done as a driver metadata instread.<br>
<br>
So, for now, I've just "imported" the usefull methods of OGRDataSource into<br>
GDALDataset, and made a few changes here or there so that it works. Well<br>
except<br>
ogr_refcount.py, because of a funny discovery was that the ref counting of<br>
GDALDataset and OGRDataSource is (was) not the same. Starts at 1 for<br>
GDALDataset<br>
and 0 for OGRDataSource...<br>
<br>
The temporary result is in the following branch:<br>
<a href="https://github.com/rouault/gdal-1/commits/unification" target="_blank">https://github.com/rouault/<u></u>gdal-1/commits/unification</a><br>
<br>
OGRDataSource is kept with just that :<br>
<br>
class CPL_DLL OGRDataSource : public GDALDataset<br>
{<br>
public:<br>
OGRDataSource();<br>
<br>
virtual const char *GetName() = 0;<br>
<br>
static void DestroyDataSource( OGRDataSource * );<br>
<br>
OGRSFDriver *GetOGRDriver() const;<br>
void SetOGRDriver( OGRSFDriver *poDriver );<br>
virtual const char* GetDriverName();<br>
<br>
protected:<br>
friend class OGRSFDriverRegistrar;<br>
<br>
OGRSFDriver *m_poOGRDriver;<br>
<br>
};<br>
<br>
And it could eventually completely disappear (or maybe just remain as a<br>
convenience class if we don't want to touch existing drivers) if I manage<br>
to<br>
merge OGRSFDriver into GDALDriver, and get rid of OGRSFDriverRegistrar.<br>
<br>
Even<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
cheers<br>
Etienne<br>
<br>
<br>
<br>
On Tue, Mar 25, 2014 at 9:13 AM, Vincent Mora<br>
<<a href="mailto:vincent.mora@oslandia.com" target="_blank">vincent.mora@oslandia.com</a>><u></u>wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 25/03/2014 11:44, Even Rouault wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Selon Vincent Mora <<a href="mailto:vincent.mora@oslandia.com" target="_blank">vincent.mora@oslandia.com</a>>:<br>
<br>
On 24/03/2014 21:46, Even Rouault wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi,<br>
<br>
"Release soon, release early", so for people who like UML diagrams<br>
(there<br>
<br>
</blockquote>
is<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
also a prototype of C++ classes for those who don't like UML very<br>
</blockquote></blockquote></blockquote></blockquote></blockquote>
much),<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
here's<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
a blog entry with the outcome of my thoughts for a possible<br>
re-organisation<br>
<br>
</blockquote>
of<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
the GDAL/OGR class hierarchy<br>
<br>
<br>
<a href="http://erouault.blogspot.ca/2014/03/draft-gdalogr-class-" target="_blank">http://erouault.blogspot.ca/<u></u>2014/03/draft-gdalogr-class-</a><br>
</blockquote></blockquote>
hierarchy-for-gdal.html<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I don't understand the need for HandleRasterData() and<br>
HandleVectorData().<br>
<br>
Isn't inheriting from GDALEmptyRasterDataset or<br>
</blockquote></blockquote></blockquote></blockquote>
GDALEmptyVectorDataset<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
sufficient to convey the information that the class doesn't handle<br>
</blockquote></blockquote></blockquote></blockquote>
those<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
kind of datasets.<br>
<br>
A call to GetLayerCount()/<u></u>GetRasterCount() from the class user is<br>
sufficient to say "no vector/raster data here", and a dynamic _cast<br>
</blockquote></blockquote></blockquote></blockquote>
to<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
GDALEmptyVectorDataset/<u></u>GDALEmptyRasterDataset is even clearer to me,<br>
</blockquote></blockquote></blockquote></blockquote>
so<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
HandleRasterData()/<u></u>HandleVectorData() would be a third way to tell<br>
</blockquote></blockquote></blockquote></blockquote>
the<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
same thing.<br>
<br>
</blockquote>
Actually, when thinking more about this, this kind of information<br>
</blockquote></blockquote></blockquote>
should<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
be at<br>
the driver level, and not at the dataset level. The idea is that there<br>
are use<br>
cases where you want to know if a driver can handle only vector, only<br>
raster, or<br>
both. For example if you still want to have separate Open dialog<br>
</blockquote></blockquote></blockquote>
boxes for<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
raster or vector.<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Also I'm not sure about the names GDALAbstract* since those are<br>
</blockquote></blockquote></blockquote></blockquote>
partial<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
implementations.<br>
<br>
</blockquote>
I know I'm not very good at naming. Do you have an alternative<br>
</blockquote></blockquote></blockquote>
proposal ?<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
The<br>
issue is that with my draft we end up with a lot of classes and<br>
interfaces, so<br>
it is not obvious to find a good name to reflect their content.<br>
<br>
</blockquote>
Partial sound ok to me, but I not so good at naming ?<br>
<br>
Considering the number of classes, I gave it some thoughts this morning<br>
and, like Dmitry, thought of merging the Abtract (Partial) into the<br>
interface before realizing that, even if it works (you can overload de<br>
default function in Empty) it's a bit ugly and I ended up preferring<br>
</blockquote></blockquote>
your<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
solution.<br>
<br>
An altenative would be to have a Dataset with both aspects and provide<br>
three partial specialisations: one for vector (it will behave like<br>
</blockquote></blockquote>
empty<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
raster) one for raster, and one for both. Code duplications could be<br>
avoided by implementing protected member functions in the Dataset<br>
</blockquote></blockquote>
class and<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
simply calling them in the implementations of partial specialization.<br>
</blockquote></blockquote>
This<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
solution avoid the diamond shaped inheritance diagram of hell :)<br>
<br>
Even<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
</blockquote>
______________________________<u></u>_________________<br>
gdal-dev mailing list<br>
<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a><br>
<a href="http://lists.osgeo.org/mailman/listinfo/gdal-dev" target="_blank">http://lists.osgeo.org/<u></u>mailman/listinfo/gdal-dev</a><br>
<br>
</blockquote></blockquote>
<br>
______________________________<u></u>_________________<br>
gdal-dev mailing list<br>
<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a><br>
<a href="http://lists.osgeo.org/mailman/listinfo/gdal-dev" target="_blank">http://lists.osgeo.org/<u></u>mailman/listinfo/gdal-dev</a><br>
<br>
</blockquote></blockquote>
<br>
______________________________<u></u>_________________<br>
gdal-dev mailing list<br>
<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a><br>
<a href="http://lists.osgeo.org/mailman/listinfo/gdal-dev" target="_blank">http://lists.osgeo.org/<u></u>mailman/listinfo/gdal-dev</a><br>
<br>
<br>
</blockquote>
<br>
______________________________<u></u>_________________<br>
gdal-dev mailing list<br>
<a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a><br>
<a href="http://lists.osgeo.org/mailman/listinfo/gdal-dev" target="_blank">http://lists.osgeo.org/<u></u>mailman/listinfo/gdal-dev</a></div></div></blockquote></div><br></div>