[gdal-dev] HANA driver proposal

Even Rouault even.rouault at spatialys.com
Mon Jul 19 12:26:50 PDT 2021


Maxim,

thanks for the introduction.

I assume the driver would depend on the ODBC library, and would require 
users to build https://github.com/SAP/odbc-cpp-wrapper as the 
corresponding ODBC driver ?

As GDAL is an almost always growing beast, we need to be careful about 
what we accept in the upstream repository, and consider policies about 
how to retire things that are no longer useful, adequately maintained or 
becoming a nuisance for the overall project. I had quite a hard time a 
few months ago doing some spring cleanup (see the "Consider driver 
removal" thread in 
https://lists.osgeo.org/pipermail/gdal-dev/2021-January/date.html), and 
it would probably be good to have a formal RFC covering the whole 
life-cycle of a driver, and not just how to get new code added.

Below is a non-exhaustive list of points that should be covered IMHO. A 
number of them are open questions where input from the community is welcome.

- criteria for acceptance:

     * (obvious) source code contributed to the repository should follow 
its licensing terms

     * is the driver of sufficient general interest ? For example, we 
likely don't want to accept drivers for an in-house file format of a 
company/organization that isn't distributed outside it.

     * who will review the code ? That can be really problematic. I've 
just merged today the Zarr driver but nobody was available to review it 
unfortunately. Once we are able to use the sponsorship through 
NumFOCUS, we'll hopefully be in better shape to have more reviewers, 
but I'm not sure we'd want to spend the sponsorship to review code for 
proprietary services. It should likely be the responsibility of the 
proprietary service to find an interested reviewer.

     * should we require CI tests ? Some tests, especially ones relying 
on external services, can be flaky.

- each (at least, new) driver should have at least one name in front of 
it in the list in 
https://github.com/OSGeo/gdal/wiki/Maintainers-per-sub-system . What 
should we expect from the maintainer: monitoring of the mailing list and 
issue tracker (at least biweekly ?), some form of handling of it 
(making sure the issue is well described, and some time frame to address 
it when relevant), reviewing pull requests in their "area of 
responsibility" (... and outside it. For example, I see a number of QGIS 
developers having reviewed your pull requests, but did you help 
review their pull requests ? I don't mean to single out anybody in 
particular, as it is a general pattern in most GDAL driver contributions 
too), participation in general / cross-driver maintenance tasks, ...

- if someone refactors GDAL internals, who is responsible for adjusting 
the various drivers to build and work properly ? Open question honestly, 
and from my experience, this is a really problematic one. I'd expect 
someone who wants to change some internal API (or enable a new class of 
compiler warnings, or use a new code analyzer, etc etc) to have perhaps 
interest and knowledge in 3 or 5 drivers and be willing to do the 
changes in them, but besides that, adjusting all the other drivers is a 
real burden (reminder: GDAL has > 250 drivers) and discourages such 
efforts. But even if we assumed that each driver had a maintainer 
(the reality is more like 10% of the drivers have a maintainer), 
coordinating with 200 maintainers to gather patches from each of them 
to have a working build isn't something that is going to work well.

- how can we get a sense of what is really used in GDAL ? Nobody really 
knows which parts of GDAL are used nowadays, except when we get bug 
reports or questions on them. Should we add some (opt-in) telemetry 
(that could potentially be enabled through GUI applications like QGIS) ?

- how do we decide if some code should be removed ?

     * There are different situations depending on whether the driver is 
for a file format (files can stay forever, and GDAL is in a number of 
cases the only open-source way of accessing them. But conversely older 
GDAL releases are there forever too, with some effort to build them), a 
database or a web-accessible service. And depending on their nature: 
FOSS vs proprietary.

     * Code for which there's no longer any declared maintainer, or 
whose maintainer has been found unresponsive (how do we declare that a 
maintainer is unresponsive?)

     * Code for which there are too many tickets unaddressed or is known 
(or assumed) to be broken ?

- how do we actually retire code ? A PSC mention outlining the reasons 
(among ones mentioned above) ? I like the experiment I did with the 
runtime deprecation of a few drivers (the 
GDALIsDriverDeprecatedForGDAL35StillEnabled() thing) to get some 
feedback when we're unsure about the real value of a driver before 
completely wiping it off (we recently got a ticket asking for the 
re-enabling of one of the drivers that was marked for removal in 3.5), 
and perhaps this could be the norm when driver removal is considered

- I haven't looked at the procedures of the Linux kernel, but perhaps 
there's some interesting inspiration to be taken from there (at least 
for comparison)
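
To make the runtime-deprecation idea above a bit more concrete, here is a 
minimal sketch in Python. It does not reproduce the actual 
GDALIsDriverDeprecatedForGDAL35StillEnabled() code: the driver name 
"OLDFMT", the ENABLE_DEPRECATED_DRIVER_<NAME> variable and all function 
names are hypothetical, chosen only for illustration. The principle is 
that a driver slated for removal refuses to work unless the user 
explicitly opts in, and the error message tells them where to report 
that they still need it.

```python
import os

# Hypothetical set of drivers scheduled for removal (illustrative name only).
DEPRECATED_DRIVERS = {"OLDFMT"}


class DriverDeprecatedError(RuntimeError):
    """Raised when a deprecated driver is used without an explicit opt-in."""


def is_deprecated_driver_still_enabled(name, env=None):
    """Return True if the driver may be used.

    Non-deprecated drivers are always allowed. A deprecated driver is allowed
    only when the (hypothetical) ENABLE_DEPRECATED_DRIVER_<NAME> variable is
    set to a truthy value.
    """
    env = os.environ if env is None else env
    key = name.upper()
    if key not in DEPRECATED_DRIVERS:
        return True
    flag = env.get("ENABLE_DEPRECATED_DRIVER_" + key, "NO")
    return flag.upper() in ("YES", "TRUE", "ON", "1")


def open_with_driver(name, path, env=None):
    """Stand-in for a driver's open function, guarded by the deprecation gate."""
    if not is_deprecated_driver_still_enabled(name, env):
        raise DriverDeprecatedError(
            "Driver %s is planned for removal. Set ENABLE_DEPRECATED_DRIVER_%s=YES "
            "to keep using it, and tell the project you still need it."
            % (name, name.upper())
        )
    # A real driver would open the dataset here; the sketch just echoes its input.
    return (name, path)
```

With such a gate, a user who hits the error has to take an explicit step 
to continue, which is what generates the feedback about whether the 
driver is still in use.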

Does anyone want to take the lead on such an RFC ?

Even

On 19/07/2021 at 13:38, Rylov, Maxim via gdal-dev wrote:
>
> Dear GDAL/OGR Project Steering Committee,
>
> We, the SAP HANA Spatial Team, would like to add HANA support to the 
> GDAL library.
> SAP HANA <https://www.sap.com/products/hana.html> is an in-memory 
> database with an OGC-compliant 
> <http://www.opengeospatial.org/resource/products/details/?pid=1303> spatial 
> engine 
> <https://help.sap.com/viewer/cbbbfc20871e4559abfd45a78ad58c02/2.0.04/en-US/e1c934157bd14021a3b43b5822b2cbe9.html>.
> A free community edition of SAP HANA is available here 
> <https://www.sap.com/cmp/td/sap-hana-express-edition.html>.
>
> GDAL/OGR supports a great variety of data formats and databases like 
> PostgreSQL, MySQL,
> MS SQL, Oracle, IBM DB2 etc. Therefore, the proposed HANA driver would 
> complement
> the list of currently supported databases.
>
> The implementation of the new vector driver for HANA including tests 
> and its integration
> into the CI would be done by our team of course with some guidance 
> from the GDAL community.
> The driver’s footprint in the code should be minimal, as the whole 
> implementation will reside in
> the subfolder 
> https://github.com/OSGeo/gdal/tree/master/gdal/ogr/ogrsf_frmts/hana.
>
> Our team already gathered some experience of working with open source 
> communities.
> A few months ago, we introduced support for HANA in QGIS 
> <https://github.com/qgis/QGIS/pull/34988> and now we continue 
> maintaining our
> contribution 
> <https://github.com/qgis/QGIS/pulls?q=is%3Apr+is%3Aclosed+HANA>. We 
> keep track of new issues/bugs, CI runs, API changes, documentation 
> improvements
> related to HANA and make sure that they are fixed/implemented as soon 
> as possible. We hope that this
> information will minimize your concerns about the future maintenance 
> of the driver.
>
> Kind regards,
> Maxim Rylov on behalf of the HANA Spatial Team
>
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev

-- 
http://www.spatialys.com
My software is free, but my time generally not.
