[PostGIS] #5983: Potential data corruption in topology.topoelement after upgrade to 3.6.0
PostGIS
trac at osgeo.org
Wed Oct 8 00:53:22 PDT 2025
#5983: Potential data corruption in topology.topoelement after upgrade to 3.6.0
-----------------------+---------------------------
Reporter: packi | Owner: robe
Type: defect | Status: new
Priority: blocker | Milestone: PostGIS 3.6.1
Component: topology | Version: 3.6.x
Resolution: | Keywords:
-----------------------+---------------------------
Comment (by francoisb):
The `topology.topogeometry` composite type appears to be affected too,
probably with the same root cause. I can file a separate ticket if
applicable.
One observation about the "garbage" values, which are not entirely random.
Consider:
- Topology `topo_1` with topology ID 1
- Table `table_1` with a column `topogeom` of type `topology.topogeometry`
- Layer ID 1 of `topo_1` of type poly (type 3), is deployed in column
`topogeom`
After the 3.6.0 upgrade, the `relation` table for the topology looks all
good:
{{{
SELECT
topogeo_id,
array_agg(element_id ORDER BY element_id) AS element_ids
FROM topo_1.relation
GROUP BY 1;
}}}
Output limited to one example topogeometry:
{{{
│ topogeo_id │ element_ids │
│ 57 │ {134,138} │
}}}
However, the `topogeom` column has garbage in all `id` and `type`
subfields:
{{{
SELECT topogeom
FROM table_1;
}}}
Output limited to the row corresponding to the example topogeo_id = 57
from above:
{{{
│ topogeom │
│ (1,1,12884901945,262144) │
}}}
Expected is: (1,1,57,3)
Selecting subfields individually or with `.*` doesn't change the behavior:
{{{
SELECT (topogeom).id
FROM table_1;
}}}
{{{
│ id │
│ 12884901945 │
}}}
{{{
SELECT (topogeom).*
FROM table_1;
}}}
{{{
│ topology_id │ layer_id │ id │ type │
│ 1 │ 1 │ 12884901945 │ 0 │
}}}
The value for `type` is truely random and inconsistent. It doesn't change
if the same query is executed immediately after (presumably memory
contents doesn't change), but it will change if other queries are run in
between, or across different sessions; a hint that it depends on some
unrelated memory contents.
The value for `id`, however, is consistently the same and predictable. The
hex representation for `12884901945` as 64-bit:
00000003 00000039
- `3` appears in the 32 most significant bits, the expected `type` value
- `57` (in decimal, or `39` in hex) appears in the 32 least significant
bits, the expected `id` value
Clearly, the actual bytes for `id` are 32-bit in reality, but the
PostgreSQL/PostGIS code is incorrectly trying to read 64-bit. The extra
32-bit read are from the `type`, because this field happens to be
contiguous in the topogeometry composite type.
One could programmatically compute the correct `topogeometry` values, with
a single SQL UPDATE query, by parsing the bytes of the garbage `id`, to
recover correct `id` and `type`.
--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/5983#comment:8>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.
More information about the postgis-tickets
mailing list