[PostGIS] #5983: Data corruption in topology.topoelement and topology.topogeometry after upgrade to 3.6.0
PostGIS
trac at osgeo.org
Wed Oct 29 08:10:38 PDT 2025
#5983: Data corruption in topology.topoelement and topology.topogeometry after
upgrade to 3.6.0
-----------------------+---------------------------
Reporter: packi | Owner: robe
Type: defect | Status: new
Priority: blocker | Milestone: PostGIS 3.6.1
Component: topology | Version: 3.6.x
Resolution: | Keywords:
-----------------------+---------------------------
Comment (by strk):
Replying to [comment:18 francoisb]:
>
> The upgrade path from 3.6.0 is indeed challenging, because one needs to
distinguish:
> - Topogeometry columns that need fixing, because created before the
3.6.0 upgrade, not somehow manually fixed already, and still have bad
values.
> - Topogeometry columns that do NOT need fixing, because created after
the 3.6.0 upgrade (and have correct values to begin with), or the user
somehow managed to fix the bad values already.
I think it's even worse as the SAME TABLE could contain records that need
fixing and records that do not... Although I suspect it would be very
unlikely that someone added more records to a corrupted topology table.
> > Or we could use some fuzzy logic to check if any data value would be
invalid, and bet on that being caused by the corruption.
>
> Exactly. A possible heuristic to answer this: All topogeometry columns
have a built-in `CHECK` constraint to enforce a correct `type` subfield
(from 1 to 4, depending on layer geometry type). Bad topogeometry values
have garbage for this `type` (depending on memory contents, usually 0 at
the beginning of a session), which is therefore violating the constraint.
An actual value other than the one of the 4 valid ones defined in the
`CHECK` constraint, and this in any row of a table, could be a criteria to
trigger data recovery path for the entire table.
There is also a built-in "xmin" field in each record that tells you the
identifier of the transaction which introduced that record. That value
could be compared with the "xmin" value of the record identifying the
PostGIS version (for example the "postgis_version" record in pg_proc).
Using too much fuzzy heuristics could be very dangerous so we should maybe
provide the fix as a separate script to optionally run ?
--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/5983#comment:21>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.
More information about the postgis-tickets
mailing list