[postgis-devel] Upgrade paths (again)

Sandro Santilli strk at kbt.io
Thu Aug 4 22:56:23 PDT 2022


On Mon, Aug 01, 2022 at 08:00:27AM -0400, Greg Troxel wrote:
> 
> Sandro Santilli <strk at kbt.io> writes:
> 
> >> When I read
> >>   https://postgis.net/docs/manual-3.2/postgis_administration.html#upgrading
> >> I don't really understand
> >>   - how that relates to this discussion
> >
> > This discussion is about how to implement the ability for users to
> > upgrade their EXTENSION-based PostGIS code in their databases.
> > We're talking about the procedure documented as "soft" upgrade (ie:
> > not requiring dump/restore), that is:
> >
> >   https://postgis.net/docs/manual-3.2/postgis_administration.html#soft_upgrade_extensions
> 
> It would be great to add to the manual
>  - that the previous section (non-extension) is basically obsolete
>  - the command to run to tell you which mode you are in
>  - how to move from old-way to extension

It would !  Here are the sources, if you want to give it a try:

  https://git.osgeo.org/gitea/postgis/postgis/src/branch/master/doc/administration.xml

> >>   - what packaging system should do, if anything, when
> >>     * replacing postgis 3.1.x with 3.2.x
> >>     * changing from pgsql 12 with postgis 3.2 to pgsql 13 with postgis 3.2
> >>     * changing from pgsql 12 with postgis 3.2 to pgsql 13 with postgis 3.3
> >
> > This is a great question. So far getting rid of everything installed
> > by old package and installing the new package seems to work, but
> > requires the new package to install a file for each older postgis
> > version from which a user is allowed to upgrade from. This currently
> > means about 400 files if I'm not mistaken, you probably have a better
> > idea as I assume you have a manifest of what gets installed ?
> 
> That's half the question.  331 files.  83 each for base, raster,
> topology, and 82 for tiger_geocoder.

I've made tiger_geocoder consistent with the rest last saturday, with
commit bfcf1fe960f4be78bd958933b7d2c66e35d8e660 -- are these numbers
coming from 3.3.0dev ?

> The other half is whether the package upgrade itself should do anything
> to databases, but I am convinced that it can't because postgresql may
> not be running.

I'm thinking an administrative command would not really need
PostgreSQL running but could just look at the available extension
version in the extension/ directory, and install upgrade paths from
that information. If the package would run this command, it would
need to do so BEFORE uninstalling the old package, beacuse
uninstalling the old package would get rid of those old extension
files thus hiding the important information from the admin command.

> > What I proposed above was NOT installing any of those files but
> > provide an administration command that would install only the ones
> > which could be needed (not easy to tell because it depends on what
> > version of PostGIS extension is found installed in any database served
> > by a any cluster using a specific PostgreSQL install directory).
> 
> So that command would have to be run by root, which can write the
> extension dir, and root would have to be a pgsql superuser so it can
> query the db, and that would have to happen while the db is running, and
> after that, somebody could upgrade?

See above, my new idea is access to PostgreSQL backend would not be
required.

> >>   - what a package manager should do if pgsql is not running when the
> >>     above upgrades happen  (pkgsrc typically doesn't do a lot of this,
> >>     and leaves the user to dump/restore across postgresql version
> >>     changes, i.e. dump, nuke db dir, upgrade, restore.
> >
> > With my proposal it's all to be defined and your input is welcome.
> > With current situation there's no need for pgsql to be running during
> > system upgrades.
> 
> Yes, but that doesn't upgrade the dbs.  I think it just about has to be
> that way.

Yeah, I think it has to be that way beacause "shit happens", on
database upgrade, and we don't want a problem with a single database
preventing a system-level upgrade (PostGIS is supposedly able to run
half-broken as long as the shared library name does not change across
upgrades)

> > but I suspect 99.9% of users are using postgis "as an extension".
> 
> The upgrade docs don't make one think that.

[...]

> a full "psql db -c 'foo bar'" in the upgrade docs would be nice.

As above: contributions welcome !

> > That's the first change I'm proposing, but I'm foreseeing a setup
> > in which we STOP installing "an increasing number of files" but
> > rather provide an administrative command to install them on demand.
> 
> It's not clear to me that this is a win.

Is the new idea explored above making it more of a win to you ?

The "win" of doing this at "package configuration time" is that
we'd won't need anymore to maintain a list of "upgradeable versions"
which often gets outdated by new patch-level releases coming out from
different branches:

  https://git.osgeo.org/gitea/postgis/postgis/src/branch/master/extensions/upgradeable_versions.mk

If an upgrade path is missing, the system administrator would just
invoke a postgis admin command (or a package re-configuration command)
and get the missing parts.

> >> I realize Windows doesn't really support symlinks in a reasonable way,
> >> and presumably this "duplicated files" is a Windows-only concern.  If
> >> that's off base, please straighten me out.
> >
> > You're right that the "duplicated files" is a windows only issue,
> > but I'm also concerned about increasing number of symlinks...
> 
> It strikes me that this is really postgresql bug.

This was one of the reasons I didn't want to move to extension and
kept my love for non-extension / unpackaged...

All I could do was drafting an idea to improve PostgreSQL but did not
find fundings to implement that idea. Here's where I'm left:

  https://commitfest.postgresql.org/38/3654/

--strk;

  Libre GIS consultant/developer
  https://strk.kbt.io/services.html


More information about the postgis-devel mailing list