[postgis-devel] Upgrade paths (again)

Sandro Santilli strk at kbt.io
Wed Sep 14 09:04:31 PDT 2022


On Wed, Sep 14, 2022 at 11:04:00AM -0400, Greg Troxel wrote:
> Sandro Santilli <strk at kbt.io> writes:
> 
> > I've changed the 'upgrade-paths' code so as not to require PostgreSQL at
> > all but keep the current model of installing explicitly stated upgrade
> > paths from old supported versions. The new aim is simply avoiding the
> > need to run updates on system catalogs.
> 
> Who is explicitly stating this and when?

The PostGIS source tree explicitly states it in:

    extensions/upgradeable_versions.mk

At *install* time (make install), upgrade scripts are installed to
upgrade from each of the versions listed therein to the version being
installed.
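
As a rough illustration of that install-time step (a sketch, not the
actual Makefile logic): PostgreSQL names extension update scripts
`<extension>--<old>--<new>.sql`, so the set of files installed for one
extension could be derived as below. The function name and the version
list are made up for the example; the real list is the one in
extensions/upgradeable_versions.mk.

```python
def upgrade_script_names(extension, upgradeable_versions, installed_version):
    """Return one update-script filename per listed old version.

    Follows the PostgreSQL naming convention <extension>--<old>--<new>.sql.
    """
    return [
        f"{extension}--{old}--{installed_version}.sql"
        for old in upgradeable_versions
    ]


# Hypothetical inputs, for illustration only.
for name in upgrade_script_names("postgis", ["2.5.1", "3.0.0"], "3.3.0"):
    print(name)
```

The output depends only on the version list and the version being
installed, never on what is present on the build machine.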

> Building a binary package has to have the same outcome whether or not
> postgis has ever been set up or in varying versions on the build
> machine.

This is what you get now and what you'll keep getting with the
current upgrade-paths branch (as of commit e1efd3d6c8).

> Anything that isn't in the binary package isn't available to the user.

And this is what prevents upgrades from releases which came out
AFTER the release of PostGIS found in your database (which aren't
necessarily downgrades, if you by policy kept tracking a stable
branch).

> Are you saying you want to pivot from:
> 
>   have the build/install process (and thus what is packaged) install
>   upgrade scripts from N old versions to the current version
> 
> to
> 
>   have the build/install process install some proto-upgrade-scripts and
>   also some program (perhaps a shell script) that can be run something
>   like
> 
>     postgis-generate-upgrade-from 2.5.1
> 
>   and will output the upgrade script?

Yes, this is what I'm proposing.
The "upgrade-paths" branch contains such a script (in Perl), currently having
this syntax:

  postgis install-extension-upgrades [--pg_sharedir <dir>] [<from>...]
                Ensure files required to upgrade PostGIS from
                the given version are installed on the system.
                The <from> arguments may be either version numbers
                or PostgreSQL share directories to scan to find available
                ones.
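
To make the two kinds of <from> argument concrete, here is a small
sketch of how they could be told apart. It is written in Python purely
for illustration (the real script is in Perl and may well do this
differently); the function name and the version pattern are my own
assumptions.

```python
import os
import re

# Assumed shape of a version argument: 2.5, 2.5.1, 3.3.0dev, ...
VERSION_RE = re.compile(r"^\d+\.\d+(\.\d+)?[A-Za-z0-9]*$")


def classify_from_args(args):
    """Split <from> arguments into version numbers and directories to scan."""
    versions, sharedirs = [], []
    for arg in args:
        if os.path.isdir(arg):
            # An existing directory is treated as a share directory to scan.
            sharedirs.append(arg)
        elif VERSION_RE.match(arg):
            versions.append(arg)
        else:
            raise ValueError(f"neither a version nor a directory: {arg!r}")
    return versions, sharedirs


print(classify_from_args(["2.5.1", "3.0.0"]))
```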

>  Or also run it?

In its current state the "upgrade-script" is also running it, passing
it the hard-coded list found in "upgradeable_versions.mk", but this
means the *same* files would be installed by running `make install`
from multiple versions of PostGIS, and thus the same binary packages
would contain a lot of the same files. I'd evaluate whether to avoid
doing this part.

>  Or maybe just
>   query all the databases and do it, but it has to be a pgsql superuser,
>   or only upgrade databases it has privs on?

In its current incarnation, the `postgis install-extension-upgrades`
command does not implement querying databases. It could be a useful addition.

> From the packaging point of view, if you want to change the upgrade
> process it doesn't bother me.  But the "make && make install" cannot
> look into anything about the current system.  That breaks reproducible
> builds (not claiming we are fully there but a change in the set of
> installed files based on the current state is a bug) and will make the
> resulting package non-portable.

Ok, so the current state of the branch is still good for you (doesn't
query the system).

> > The new packaging problem with this model is that the SAME file would
> > be included in multiple packaged postgis versions, namely upgrade
> > paths in the form:
> >
> >     postgis--3.0.0--ANY.sql
> 
> I don't understand why that's a problem.  That file, for packaged
> version N, is how to upgrade from 3.0.0 to N.  It has the same name, and
> different contents, for every value of N.  That's true of most files in
> a package and it isn't a problem.

Maybe you didn't understand, I'm not using variables in that filename,
it's the actual name: `postgis--3.0.0--ANY.sql`.
The file with the SAME name would be contained in all future versions of
postgis IF we implement this approach of providing "generic upgrade
paths" (<old_known_version>--ANY) instead of the specific ones
( <old_known_version>--<package_version> ).

Why would we want "generic upgrade paths"? The idea was that each
version would install the "<this_version>--ANY" upgrade path, but
given the scenario you described (files in old packages are removed
before files in the new package are installed) this is now less
interesting. The only advantage of generic upgrade paths would be that
you could upgrade from any old version to any new version without having
to install any more upgrade paths (old -> ANY -> new).
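
The old -> ANY -> new chain relies on PostgreSQL being able to apply a
sequence of update scripts, preferring the shortest path. A minimal
sketch of that lookup over a set of installed file names follows. The
name `postgis--ANY--3.3.0.sql` for the second hop is an assumption made
for the example, and the search is a plain breadth-first search, not
PostgreSQL's own code.

```python
from collections import deque


def parse_script(filename, extension="postgis"):
    """Split '<ext>--<from>--<to>.sql' into (from, to); None for other files."""
    prefix, suffix = extension + "--", ".sql"
    if not (filename.startswith(prefix) and filename.endswith(suffix)):
        return None
    parts = filename[len(prefix):-len(suffix)].split("--")
    return tuple(parts) if len(parts) == 2 else None


def find_upgrade_path(filenames, source, target, extension="postgis"):
    """Breadth-first search for the shortest chain of update scripts."""
    edges = {}
    for name in filenames:
        step = parse_script(name, extension)
        if step:
            edges.setdefault(step[0], []).append(step[1])
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of scripts connects source to target


# Hypothetical set of installed files: two generic paths plus one ANY hop.
installed = [
    "postgis--3.0.0--ANY.sql",
    "postgis--3.1.0--ANY.sql",
    "postgis--ANY--3.3.0.sql",
]
print(find_upgrade_path(installed, "3.0.0", "3.3.0"))  # ['3.0.0', 'ANY', '3.3.0']
```

With N old versions this needs N generic files plus one per new
version, rather than one file per (old, new) pair.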

> I don't understand why you don't want to install them, other than that
> there are many of them.

That's the main reason:

  # ls -l `pg_config --sharedir`/extension/{postgis,address_standardizer}* | wc -l
  5222

The other is that those over 5k files are often not enough to allow me
to upgrade, because we maintain stable branches and so at some point
old databases end up being in a stable version which cannot be
upgraded to a newer major release, so I have to create a bunch of
symlinks (or, with the new code, call a script) in any case.

> If you don't, then what is a DBA supposed to do?  Where are they going
> to get the upgrade scripts from?

The `postgis` script will take care of that, they just need to state
which versions they want to allow users to upgrade from.


--strk;

  Libre GIS consultant/developer
  https://strk.kbt.io/services.html

