[gdal-dev] Un-vendoring a number of third-party libraries?

Even Rouault even.rouault at spatialys.com
Fri Dec 15 06:35:51 PST 2023


Hi,

I'm considering removing from the source tree a number of third-party 
libraries that we have vendored over the years: zlib, libpng, libjpeg, 
giflib, liblerc.

All of them are widely available in most packaging environments. In that 
list, only zlib is required (currently either as external or internal lib).

I believe the main reason for having them vendored is now mostly 
historical, dating back to a time when there was no packaging system on 
Windows. Now that we have Conda-Forge and vcpkg, it is easy to have those 
dependencies installed.

For libjpeg, there was a particular history related to 12-bit JPEG 
support, which required using the internal copy built in a special way. 
But a couple of months ago, libjpeg-turbo 3.0 was released with 
unified support for both 8-bit and 12-bit JPEG in the same build, and 
the latest libtiff and GDAL releases are able to make use of it. Hence 
this justification no longer holds. Furthermore, GDAL's libjpeg copy is 
still good-ol' libjpeg 6b, without all the SIMD accelerations that are 
now in libjpeg-turbo, hence using GDAL's internal copy of libjpeg is 
definitely no longer recommended.

For internal libpng, we have a small old patch to accept some invalid 
files ("Make screwy MSPaint "zero chunks" only a warning, not error", 
https://trac.osgeo.org/gdal/ticket/3416). I don't think losing that 
patch would be critical... At worst, we could try to get it accepted 
upstream.

Benefits of un-vendoring those libraries:

- currently, we must take care of updating them regularly, in particular 
to make sure they integrate the latest fixes for their vulnerabilities.

- they complicate the GDAL build scripts and configuration. For example, 
drivers can't be built as plugins if they depend on one of those 
libraries built as internal (because the internal copy is built into 
libgdal, but not exported, hence a plugin can't use its symbols, and 
thus the driver must itself be built into libgdal core). We must also 
resort to tricks to rename their symbols, to avoid clashes when 
integrating GDAL with other software which uses the corresponding 
external library.

- they require exceptions to static analyzers (cppcheck, coverity scan), 
since they don't use the same coding standards as GDAL
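
To illustrate the symbol clash mentioned above, here is a toy Python 
model of a flat, process-wide symbol table. It is only a sketch, not 
linker or GDAL code: the load() helper and the "gdal_" prefix are made 
up for the example, and it assumes the usual ELF behaviour where the 
first loaded definition of a global symbol wins.

```python
# Toy model of a flat, process-wide symbol table (not real linker code).
# Assumption: the first definition of a name that gets loaded wins, as with
# default ELF resolution of global symbols.

def load(symbol_table, library, symbols, prefix=""):
    """Register a library's exported symbols; return the names that clashed."""
    clashes = []
    for name in symbols:
        exported = prefix + name
        if exported in symbol_table:
            clashes.append(exported)  # the earlier definition keeps winning
        else:
            symbol_table[exported] = library
    return clashes


# Hypothetical scenario: an application links the system libpng together
# with a GDAL build that embeds its own copy of libpng.
PNG_SYMBOLS = ["png_create_read_struct", "png_read_info"]

table = {}
load(table, "system libpng", PNG_SYMBOLS)
unrenamed = load(table, "libgdal (internal libpng)", PNG_SYMBOLS)

table = {}
load(table, "system libpng", PNG_SYMBOLS)
# "gdal_" is an illustrative prefix, not necessarily the one GDAL uses.
renamed = load(table, "libgdal (internal libpng)", PNG_SYMBOLS, prefix="gdal_")

print(unrenamed)  # both names clash
print(renamed)    # no clash once the internal copy is prefixed
```

With external libraries only, there is a single definition of each 
symbol in the process, so no renaming is needed in the first place.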

Looking around a bit at different open source build recipes of GDAL 
(Debian, Conda-Forge, vcpkg, OSGeo4W, gisinternals, rasterio-wheels), 
the proposed changes should have a modest impact, as they already mostly 
use external libraries. Here is what I've identified (I may have missed 
things) as requiring changes from the maintainers of those distributions 
to keep the same level of functionality:

- gisinternals doesn't seem to have a liblerc build

- rasterio-wheels doesn't seem to have libpng and giflib builds
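
For packagers who want to check that a build kept the same drivers once 
the internal copies are gone, something like the following could be 
used. This is a minimal sketch assuming the GDAL Python bindings, where 
gdal.GetDriverByName() returns None for a driver that is not available. 
The missing_drivers() helper and the fake_lookup() stand-in are made up 
for this example. liblerc provides a compression codec rather than a 
driver, so it is not covered by this check.

```python
def missing_drivers(names, get_driver=None):
    """Return the driver short names for which the lookup returns None."""
    if get_driver is None:
        # Imported lazily so the helper can be exercised without GDAL installed.
        from osgeo import gdal
        get_driver = gdal.GetDriverByName
    return [name for name in names if get_driver(name) is None]


# Stand-in lookup for illustration: pretends only the JPEG driver was built.
def fake_lookup(name):
    return object() if name == "JPEG" else None


print(missing_drivers(["JPEG", "PNG", "GIF"], fake_lookup))  # ['PNG', 'GIF']
```

Against a real build, calling missing_drivers(["JPEG", "PNG", "GIF"]) 
without the second argument would query the installed GDAL instead.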

As far as our code base is concerned, apart from the obvious removal of 
code and simplification of the build system, there would be some changes 
in CI configurations (for example, the Android CI build would need at 
least a preliminary step to cross-compile zlib).


Potential candidates, but would remain in-tree for now:

- libtiff: compulsory dependency. GDAL has been the main driver for most 
libtiff development over the last 10 years, and the GDAL autotest suite 
tortures libtiff much more than libtiff's own test suite, hence it is 
quite convenient to have the capability of vendoring it. In addition, 
"staging codecs" (that is, codecs not yet integrated in official 
libtiff), currently JPEG-XL (a few years ago this was the LERC codec), 
can't be built against an external libtiff.

- libgeotiff: compulsory dependency. If one uses internal libtiff, one 
also must use internal libgeotiff because of the renaming of symbols 
done when using internal libtiff.

- shapelib: compulsory dependency. The default external shapelib build 
uses 32-bit file offsets, whereas the internal shapelib is built with 
64-bit offset support (to use .DBF files > 2 GB). We don't have build 
support for using it as an external lib.

- libjson-c: compulsory dependency. I initially put it in the list of 
candidates to unvendor, as it is quite widely available, but now I 
recall that upstream libjson-c has an issue (especially/only on Windows) 
with non-C locales when parsing/outputting floating point numbers, which 
we have patched in our internal copy by using GDAL locale-safe 
functions. Ideally that should be fixed upstream, but porting our 
changes is not immediately trivial.

- libqhull (used for the gdal_grid linear algorithm, which requires a 
Delaunay triangulation of the points): that one could be a candidate for 
unvendoring, as it is available in a number of distributions, but 
there's currently an issue with scipy, which bundles it without renaming 
the symbols, hence if linking GDAL against external libqhull, and using 
GDAL + scipy, we have a clash of symbols 
(https://github.com/conda-forge/qgis-feedstock/issues/284#issuecomment-1356490896). 
When using internal libqhull, GDAL does rename its symbols, which works 
around this (scipy) issue.
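
As a side note on the 2 GB limit mentioned for shapelib above: a .DBF 
file is a header followed by fixed-size records, so the limit is easy to 
work out. The sketch below assumes that "32-bit file offset" means a 
signed 32-bit integer; the header and record sizes are made-up values, 
and the helper names are not shapelib functions.

```python
# Largest file position representable in a signed 32-bit offset (assumption:
# the default build stores offsets in a signed 32-bit integer).
MAX_OFFSET_32 = 2**31 - 1  # 2147483647 bytes, just under 2 GiB


def record_offset(header_size, record_size, index):
    """Byte offset at which record number `index` (0-based) starts."""
    return header_size + index * record_size


def fits_in_32_bits(offset):
    """True if the offset can be stored in a signed 32-bit integer."""
    return 0 <= offset <= MAX_OFFSET_32


# Hypothetical table: 321-byte header, 201-byte records.
HEADER_SIZE = 321
RECORD_SIZE = 201

for index in (10_000_000, 11_000_000):
    offset = record_offset(HEADER_SIZE, RECORD_SIZE, index)
    print(index, offset, fits_in_32_bits(offset))
# 10000000 2010000321 True
# 11000000 2211000321 False
```

With these sizes, records past roughly the 10.7 millionth one can't be 
reached with 32-bit offsets, which is the case the 64-bit build handles.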

Non-candidates:

- pcidsk sdk (for PCIDSK driver): doesn't seem to be packaged. We don't 
have build support for using it as external lib.

- libopencad (used by CAD driver): doesn't seem to be packaged

- libcsf (used by PCRaster driver): doesn't seem to be packaged

- infback9: that code originally comes from the "contrib" part of zlib, 
to add Deflate64 support (a non-backwards compatible extension of 
Deflate, sometimes used by the Windows zipper I believe). We don't have 
build support for using it as an external lib.

- degrib and g2clib (used by the GRIB driver): they originally came from 
third-party sources, but they aren't widely packaged and we have heavily 
patched them (there was no real possibility of collaboration with the 
authors of that software at the time when we needed to make those 
changes). We don't have build support for using them as external 
libraries. For better or worse, they should be considered as GDAL code 
now...

- hdf-eos (used by HDF4 driver): originally comes from a third-party 
source, but the GDAL copy was heavily patched a long time ago. We don't 
have build support for using it as an external lib. For better or worse, 
it should be considered as GDAL code now...


Thoughts? (Given the length of the email, it should probably be 
formalized as an RFC. I'll do that, unless there is a massive uprising 
against the proposal...)

Even

-- 
http://www.spatialys.com
My software is free, but my time generally not.


