[gdal-dev] Requiring numpy for the Python bindings

Greg Troxel gdt at lexort.com
Wed Dec 6 06:23:14 PST 2023


Laurențiu Nicola via gdal-dev <gdal-dev at lists.osgeo.org> writes:

> 1. do the platforms you care about package Firefox, librsvg, or any
> other popular software that's using Rust?

pkgsrc supports multiple operating systems, multiple versions of those
systems, and multiple cpu architectures.  There are probably over a
hundred such 3-tuples.  pkgsrc has firefox, thunderbird, librsvg,
py-cryptography, matrix-synapse, and more, just as you'd expect from any
other packaging system.

On some of these platforms, rust is functional.  On some of them, it is
not.  Guessing somewhat wildly, when firefox is not available, it's at
least 50% due to rust not being available.

Keeping rust going has been an ongoing source of work and problems.
This is partly because of not having a reasonable bootstrap story from
C/C++, and mostly because the singleton compiler (rust is technically a
language but in practice it is a single implementation, so far) has very
aggressive requirements for the bootstrap compiler.

> 2. do you have any reasons to believe that numpy will require Rust in
> the future? I skimmed the existing NEPs, including the 2.0 roadmap,
> and there's no mention of Rust, so it's unlikely to be on the plate
> for the next, say, 5 years.

I have no particular reason to believe it will happen.  I have seen many
things switch to rust and cause portability problems, and there appears
to be little concern for that during the switching process.  I don't
know if that's not understanding, not caring, thinking that switching is
so important that it is worth the harm, or something else I haven't
thought of.  I therefore find it hard to reason about these choices made
by others.

> 3. if numpy ever requires Rust, do you expect that GDAL will be unable
> to support both the Rust-enabled numpy, and a previous version at the
> same time?

Eventually, yes, that's how it usually works.  The pattern in general is
that some package becomes difficult because of a non-portable
dependency, and then people add in the previous version.  pkgsrc has
done this with py-cryptography.  But nobody maintains the old version,
and it becomes less and less reasonable to keep it around.

> 4. if numpy ever requires Rust, do you expect that the platforms you care about will stop packaging and/or updating it?

See 1; "platforms" isn't really apt here.  pkgsrc will certainly
continue to try to support rust and will almost certainly succeed
on things like recent versions of NetBSD on x86_64, aarch64, and
riscv64.  Rust will almost certainly be non-functional on minority CPU
types and older systems.

> 5. according to Debian's popcon, numpy has about 10x more users than
> GDAL [1] [2], so the packagers will be under a lot of pressure to
> support it anyway

Yes, but pressure doesn't work when it's technically infeasible.

> 6. if you're more worried about the availability of an up-to-date Rust
> toolchain, note that a lot of "foundational" Rust libraries have
> relatively conservative toolchain requirements. For example, the numpy
crate (unrelated to the Python package) supports Rust 1.48, which
> is from November 2020 [3].

Generally, from a security viewpoint, it's only ok to run maintained
versions.  But yes, if rust portability deteriorates, it may be that we
add versioned rust packages and keep supporting an old one.



I didn't mean this to be "it would be terrible to require numpy".  I am
just raising the portability issues because I find they often don't get
considered.  When A adds a dependency on P, then the set of systems (the
os-osversion-cpuarch tuples) A runs on is limited to the set P runs on,
and the future A' is limited to the future P'.  py-cryptography is a
good example of a package that used to run quite widely and now does
not.

Already, numpy requires a fortran compiler and BLAS type libs, which
don't really seem necessary for GDAL-type things.



I'm not really clear on why py-gdal uses numpy, and a quick scan of the
code makes it look like the imports happen at runtime, with some of them
guarded by try/except.  A situation where you can still install py-gdal,
the "return me this layer as a numpy object" functions raise exceptions,
and everything else works seems much better than having the install
fail.


What benefit does a hard requirement bring us, that is more important
than the downside of the current and future portability issues?

