[gdal-dev] Un-vendoring a number of third-party libraries?
Even Rouault
even.rouault at spatialys.com
Fri Dec 15 06:35:51 PST 2023
Hi,
I'm considering removing from the source tree a number of third-party
libraries that we have vendored over the years: zlib, libpng, libjpeg,
giflib, liblerc.
All of them are widely available in most packaging environments. In that
list, only zlib is required (currently either as external or internal lib).
I believe the main reason for having them vendored is now mostly
historical, dating back to times when there was no packaging system on
Windows. Now that we have Conda-Forge or vcpkg, it is easy to have those
dependencies installed.
For libjpeg, there was a particular history related to 12-bit JPEG
support, which required using the internal copy built in a special way,
but a couple of months ago libjpeg-turbo 3.0 was released with
unified support for both 8-bit and 12-bit JPEG in the same build, and
the latest libtiff and GDAL releases are able to make use of it. Hence
this justification no longer holds. Furthermore, GDAL's libjpeg copy is
still good-ol' libjpeg 6b, without all the SIMD accelerations that are
now in libjpeg-turbo, hence it is definitely not recommended any more to
use GDAL's internal copy of libjpeg.
For internal libpng, we have a small old patch to accept some invalid
files ("Make screwy MSPaint "zero chunks" only a warning, not error",
https://trac.osgeo.org/gdal/ticket/3416). I don't think losing that
patch would be critical... At worst, we could try to have it accepted
upstream.
Benefits of un-vendoring those libraries:
- currently, we must take care of updating them regularly, in particular
to make sure they integrate the latest fixes for their vulnerabilities.
- they complicate the GDAL build scripts and configuration. For example,
drivers can't be built as plugins if they depend on one of those
libraries built as internal (because the internal copy is built into
libgdal, but not exported, hence a plugin can't use its symbols, and
thus the driver must itself be built into libgdal core). We also must do tricks to
rename their symbols to avoid clashes when integrating GDAL with other
software which uses the corresponding external library.
- they require exceptions to static analyzers (cppcheck, coverity scan),
since they don't use the same coding standards as GDAL.
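
To illustrate the symbol clash mentioned in the second item, here is a
minimal Python sketch that models exported symbols as plain sets. The two
function names are borrowed from libpng purely as examples, and the
"gdal_" prefix is an assumption made for the illustration; the real
renaming happens at build time on the C symbols, not in Python.

```python
# Toy model of a symbol clash: each library is just a set of exported names.
# The names and the "gdal_" prefix are illustrative assumptions only.

def clashes(exports_a, exports_b):
    """Return the symbol names exported by both libraries, sorted."""
    return sorted(set(exports_a) & set(exports_b))


def rename(exports, prefix):
    """Model a build-time rename: prepend a prefix to every symbol."""
    return {prefix + name for name in exports}


# An external library and an un-renamed internal copy export the same names.
external_lib = {"png_read_image", "png_create_read_struct"}
internal_copy = {"png_read_image", "png_create_read_struct"}

# Without renaming, every symbol is defined twice in the final process.
print(clashes(external_lib, internal_copy))

# With a prefix applied to the internal copy, the two sets no longer overlap.
print(clashes(external_lib, rename(internal_copy, "gdal_")))
```

Once the internal copies are gone there is only one definition of each
symbol left, so no such renaming is needed for those libraries.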
Looking around a bit at different open source build recipes for GDAL
(Debian, Conda-Forge, vcpkg, OSGeo4W, gisinternals, rasterio-wheels),
those proposed changes should have modest impact, as they already mostly
use external libraries. What I've identified (I may have missed things)
to require changes from the maintainers of those distributions to keep
the same level of functionality:
- gisinternals doesn't seem to have a liblerc build
- rasterio-wheels doesn't seem to have libpng and giflib builds
As far as our code base is concerned, apart from the obvious removal of
code and simplification of the build system, there would be some changes
in CI configurations (for example, the Android CI build would need at
least a preliminary step that cross-compiles zlib).
Potential candidates, but would remain in-tree for now:
- libtiff: compulsory dependency. GDAL has been the main driver for most
libtiff development over the last 10 years, and GDAL's autotest suite
tortures libtiff much more than libtiff's own test suite, hence it is
quite convenient to have the capability of vendoring it. In addition,
"staging codecs" (that is, codecs not yet integrated in official
libtiff), currently JPEG-XL (a few years ago this was the LERC codec),
can't be built against an external libtiff.
- libgeotiff: compulsory dependency. If one uses internal libtiff, one
also must use internal libgeotiff because of the renaming of symbols
done when using internal libtiff.
- shapelib: compulsory dependency. The default external shapelib build
uses 32-bit file offsets, whereas the internal shapelib is built with 64-bit
offset support (to use .DBF files > 2 GB). We don't have build support
for using it as external lib.
- libjson-c: compulsory dependency. I initially put it in the list of
candidates to unvendor, as it is quite widely available, but now I
recall that upstream libjson-c has an issue (especially/only on Windows)
with non-C locales when parsing/outputting floating point numbers, which
we have patched in our internal copy by using GDAL locale-safe
functions. Ideally that should be fixed upstream, but it is not
trivial to port our changes.
- libqhull (used for the gdal_grid linear algorithm, which requires a
Delaunay triangulation of the points): that one could be a candidate for
unvendoring, as it is available in a number of distributions, but
there's currently an issue with scipy, which bundles it without renaming
the symbols; hence if GDAL is linked against external libqhull and
GDAL + scipy are used together, we have a clash of symbols
(https://github.com/conda-forge/qgis-feedstock/issues/284#issuecomment-1356490896).
When using internal libqhull, GDAL does rename its symbols, which works
around this (scipy) issue.
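
As a rough illustration of the shapelib item above: the size of a .DBF
file follows from three header fields, so one can check whether it still
fits within a signed 32-bit file offset. This is only a sketch assuming
the usual dBase header layout (record count as a little-endian uint32 at
byte 4, header length and record length as uint16 at bytes 8 and 10). The
function names are made up for this example and are not part of shapelib
or GDAL.

```python
import struct

# Largest file offset representable with a signed 32-bit integer.
MAX_SIGNED_32BIT_OFFSET = 2**31 - 1


def dbf_data_size(header: bytes) -> int:
    """Header length plus total record bytes, from the first 12 header bytes.

    The optional trailing end-of-file marker byte is ignored.
    """
    if len(header) < 12:
        raise ValueError("need at least 12 header bytes")
    # '<' = little-endian, '4x' skips the version byte and the 3 date bytes,
    # 'I' = record count, 'H' = header length, 'H' = record length.
    num_records, header_len, record_len = struct.unpack_from("<4xIHH", header)
    return header_len + num_records * record_len


def needs_64bit_offsets(header: bytes) -> bool:
    """True if the file extends past what a signed 32-bit offset can address."""
    return dbf_data_size(header) > MAX_SIGNED_32BIT_OFFSET


# Hypothetical header: 30 million records of 100 bytes, 65-byte header.
big = struct.pack("<B3BIHH", 3, 123, 12, 15, 30_000_000, 65, 100)
print(dbf_data_size(big), needs_64bit_offsets(big))
```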
Non-candidates:
- pcidsk sdk (for PCIDSK driver): doesn't seem to be packaged. We don't
have build support for using it as external lib.
- libopencad (used by CAD driver): doesn't seem to be packaged
- libcsf (used by PCRaster driver): doesn't seem to be packaged
- infback9: that code originally comes from the "contrib" part of zlib,
to add Deflate64 support (non-backwards compatible extension of Deflate,
sometimes used by the Windows zipper, I believe). We don't have build support
for using it as external lib.
- degrib and g2clib (used by the GRIB driver): they originally came from
third-party sources, but they aren't widely packaged and we have heavily
patched them (there was no real possibility of collaboration with the
authors of that software at the time when we needed to make those
changes). We don't have build support for using them as external
libraries. For better or worse, they should be considered as GDAL code
now...
- hdf-eos (used by HDF4 driver): originally comes from a third-party
source, but GDAL's copy was heavily patched a long time ago. We don't have
build support for using it as external lib. For better or worse, it
should be considered as GDAL code now...
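
For the infback9 item above, here is a small sketch of how one might spot
ZIP entries that use Deflate64, using only Python's standard zipfile
module. It relies on the ZIP format assigning compression method number 9
to Deflate64; the helper name deflate64_members is hypothetical. As far
as I know, zipfile can list such entries but cannot decompress them.

```python
import io
import zipfile

# Compression method number assigned to Deflate64 by the ZIP format.
ZIP_DEFLATE64 = 9


def deflate64_members(infos):
    """Return the names of entries whose compression method is Deflate64."""
    return [info.filename for info in infos if info.compress_type == ZIP_DEFLATE64]


# Build a small ordinary archive in memory; its entry is stored (method 0).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("a.txt", "hello")

# Listing the central directory does not require decompressing anything.
with zipfile.ZipFile(buf) as zf:
    print(deflate64_members(zf.infolist()))
```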
Thoughts? (Given the length of the email, it should probably be
formalized as an RFC. I'll do that, unless there is a massive uprising
against the proposal...)
Even
--
http://www.spatialys.com
My software is free, but my time generally not.