Dealing with topology data corruption on upgrade to 3.6.0

Regina Obe lr at pcorp.us
Wed Oct 29 18:09:40 PDT 2025


> On Wed, Oct 15, 2025 at 03:11:00PM -0400, Regina Obe wrote:
> > > https://trac.osgeo.org/postgis/ticket/5983
> >
> > I'm thinking of a plan that will work as follows to fix.
> >
> > 1) roll back the catalog updates, that should fix existing installs
> > without having to restore from a huge backup
> > 2) Rename the existing domain and topogeometry type - something like
> > topogeometry_small
> > 3) Create new types (with the existing name, topogeometry, topoelement
> etc).
> > 4) Put in autocasts to topogeometry_small -> topogeometry
> 
> I like the idea of having 2 distinct types so not enforce an on-disk growth of
> topologies that do not need large IDs.
> 

Yeah, that would be ideal, and it would be less painful to roll existing installs back to the lower-integer version, since in theory far fewer records would have been damaged.

But there is a lot more code and testing involved for that path forward.


> > Unfortunately all this kinda breaks our rule of no new functions or
> > types in a micro, but I don't see how we have much choice here.
> 
> I guess we could consider 3.6 dead and immediately jump on 3.7 if we don't
> want to break that rule.
> 

That would require releasing 3.7 pretty soon, though, which might break packager rules.  I'm not sure how many packagers have a rule that once they've released a 3.6 for a given PG version, only 3.6 micros can flow into that package, or whether, since our .so is major-versioned, it's okay to jump to 3.7.

I also ran into an issue using pg_upgrade from 3.5 to 3.6, which I need to revisit to see what the error was, because I ended up just upgrading my old cluster to 3.6 before upgrading to the new cluster to get around the issue.


> What concerns me most is that what would fix old tables or even just old
> records could break new tables or new records. Think about the situation in
> which someone added NEW TopoGeometry objects to an existing table, after
> the upgrade. Some rows will REALLY have 64bit integers written on disk and
> some other will have 32bit integers instead. The only way we MIGHT have to
> tell what kind of data is really in what record would be to check the xmin
> attribute and compare it the the xmin of the postgis extension version record.
> 
> --strk;
> 
>   Libre GIS consultant/developer 🎺
>   https://strk.kbt.io/services.html

I'm not quite sure how to compare xmins.  How do you compare them?

For example, I can do an equality check:

select xmin, * FROM pg_extension WHERE xmin = 16676274;

But inequality does not work:

SELECT xmin, * FROM pg_extension WHERE xmin > 16676274;    --  ERROR:  operator does not exist: xid > integer

select xmin, * FROM pg_extension WHERE xmin::bigint > 16676274;  -- ERROR:  cannot cast type xid to bigint



I presume that's because xmin is stored as a 32-bit value that wraps around, and gets frozen/recycled by vacuum or some such thing, so a higher xmin might not guarantee a newer transaction.
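FWIW, a couple of workarounds I'm aware of (a sketch, untested here): casting through text gets the raw 32-bit value out, and age(xid) gives a wraparound-aware distance from the current transaction:

-- xid has no direct cast to bigint, but a cast through text works
SELECT xmin, * FROM pg_extension WHERE xmin::text::bigint > 16676274;

-- age(xid) counts transactions between that xid and the current one,
-- so a smaller age means a more recently written row
SELECT xmin, age(xmin) FROM pg_extension;

If I remember right, PG 13+ also has a 64-bit xid8 type (e.g. pg_current_xact_id()) that does have comparison operators, but row xmin is still a plain 32-bit xid, so that doesn't help for old rows.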

Since you need to visit every record anyway, I still feel it's safer to check whether each record is corrupted using what we've been discussing here:

https://trac.osgeo.org/postgis/ticket/5983


Probably the fastest path to a fix, though not ideal, is to detect during the topology upgrade that there is corruption in at least one table, include a function users can run to repair the corruption, and instruct them to do so.  Then, come 3.7, we can put in the grand topogeometry_small/big scheme, if we truly feel the size impacts performance significantly for users.

It's a shame we can't just use numeric, but we rely on sequences.  Numeric values are stored with variable length https://www.postgresql.org/docs/10/datatype-numeric.html so I imagine that would be a better solution for people truly concerned about space.
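For anyone curious, the on-disk footprint can be eyeballed with pg_column_size (a sketch; actual sizes depend on the value and on tuple alignment/padding):

SELECT pg_column_size(1::int)      AS int4_bytes,
       pg_column_size(1::bigint)   AS int8_bytes,
       pg_column_size(1::numeric)  AS numeric_small_bytes,
       pg_column_size(123456789012345678901234567890::numeric) AS numeric_big_bytes;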


More information about the postgis-devel mailing list