[PostGIS] #5983: Potential data corruption in topology.topoelement after upgrade to 3.6.0

PostGIS trac at osgeo.org
Wed Oct 8 00:53:22 PDT 2025


#5983: Potential data corruption in topology.topoelement after upgrade to 3.6.0
-----------------------+---------------------------
  Reporter:  packi     |      Owner:  robe
      Type:  defect    |     Status:  new
  Priority:  blocker   |  Milestone:  PostGIS 3.6.1
 Component:  topology  |    Version:  3.6.x
Resolution:            |   Keywords:
-----------------------+---------------------------
Comment (by francoisb):

 The `topology.topogeometry` composite type appears to be affected too,
 probably with the same root cause. I can file a separate ticket if
 applicable.

 One observation about the "garbage" values, which are not entirely random.

 Consider:
 - Topology `topo_1` with topology ID 1
 - Table `table_1` with a column `topogeom` of type `topology.topogeometry`
 - Layer ID 1 of `topo_1` of type poly (type 3), is deployed in column
 `topogeom`

 After the 3.6.0 upgrade, the `relation` table for the topology looks all
 good:

 {{{
 SELECT
   topogeo_id,
   array_agg(element_id ORDER BY element_id) AS element_ids
 FROM topo_1.relation
 GROUP BY 1;
 }}}

 Output limited to one example topogeometry:

 {{{
 │ topogeo_id │ element_ids │
 │         57 │ {134,138}   │
 }}}

 However, the `topogeom` column has garbage in all `id` and `type`
 subfields:

 {{{
 SELECT topogeom
 FROM table_1;
 }}}

 Output limited to the row corresponding to the example topogeo_id = 57
 from above:

 {{{
 │          topogeom        │
 │ (1,1,12884901945,262144) │
 }}}

 Expected is: (1,1,57,3)

 Selecting subfields individually or with `.*` doesn't change the behavior:

 {{{
 SELECT (topogeom).id
 FROM table_1;
 }}}

 {{{
 │     id      │
 │ 12884901945 │
 }}}

 {{{
 SELECT (topogeom).*
 FROM table_1;
 }}}

 {{{
 │ topology_id │ layer_id │     id      │ type │
 │           1 │        1 │ 12884901945 │    0 │
 }}}

 The value for `type` is truely random and inconsistent. It doesn't change
 if the same query is executed immediately after (presumably memory
 contents doesn't change), but it will change if other queries are run in
 between, or across different sessions; a hint that it depends on some
 unrelated memory contents.

 The value for `id`, however, is consistently the same and predictable. The
 hex representation for `12884901945` as 64-bit:

 00000003 00000039

 - `3` appears in the 32 most significant bits, the expected `type` value
 - `57` (in decimal, or `39` in hex) appears in the 32 least significant
 bits, the expected `id` value

 Clearly, the actual bytes for `id` are 32-bit in reality, but the
 PostgreSQL/PostGIS code is incorrectly trying to read 64-bit. The extra
 32-bit read are from the `type`, because this field happens to be
 contiguous in the topogeometry composite type.

 One could programmatically compute the correct `topogeometry` values, with
 a single SQL UPDATE query, by parsing the bytes of the garbage `id`, to
 recover correct `id` and `type`.
-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/5983#comment:8>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.


More information about the postgis-tickets mailing list