[postgis-tickets] [PostGIS] #5385: Postgres malloc assertion fail when using pg_cancel_backend with ST_AsMVT
PostGIS
trac at osgeo.org
Thu May 18 09:32:07 PDT 2023
#5385: Postgres malloc assertion fail when using pg_cancel_backend with ST_AsMVT
----------------------------+---------------------------
Reporter: gbartonowenstl | Owner: pramsey
Type: defect | Status: new
Priority: high | Milestone: PostGIS 3.3.3
Component: postgis | Version: 3.3.x
Keywords: malloc, crash |
----------------------------+---------------------------
We recently experienced some very odd behaviour: our Postgres database
would hit a malloc assertion failure that forced the entire database into
recovery mode, dropping all currently running queries for 10-20 seconds
while it recovered. As you can imagine, this caused significant disruption
to our service.
The log is as follows:
{{{
Apr 20 16:30:46 wag-prod postgres[12134]: [21] <database: - user: -
timestamp:2023-04-20 16:30:46.440 UTC - host: - commandtag: -
sqlerrorcode:00000 - pid:12134 - processstarttime:2023-04-19 01:54:51 UTC>
LOG: server process (PID 1037998) was terminated by signal 6: Aborted
<database: - user: - timestamp:2023-04-20 16:30:46.440 UTC - host: -
commandtag: - sqlerrorcode:00000 - pid:12134 - processstarttime:2023-04-19
01:54:51 UTC> DETAIL: Failed process was running: -- <URL>
<MVT Query>
Apr 20 16:30:46 wag-prod postgres[12134]: [22] <database: - user: -
timestamp:2023-04-20 16:30:46.440 UTC - host: - commandtag: -
sqlerrorcode:00000 - pid:12134 - processstarttime:2023-04-19 01:54:51 UTC>
LOG: terminating any other active server processes
}}}
{{{
bprod 10.82.122.108(57900) SELECT: malloc.c:2617: sysmalloc: Assertion
`(old_top == initial_top (av) && old_size == 0) || ((unsigned long)
(old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end
& (pagesize - 1)) == 0)' failed.
}}}
We traced the root cause to our recently increased use of
pg_cancel_backend(), called within the narrow window in which ST_AsMVT or
ST_AsMVTGeom queries were in flight.
I've created a reproduction that runs a standalone MVT query and calls
pg_cancel_backend in a loop until we hit this recovery mode case.
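
For readers without access to that script, the loop described above could look roughly like the following. This is only a sketch under stated assumptions, not the reporter's actual script: the helper names (`run_query`, `backend_pid`, `cancel_loop`, `reproduce`) and the `MVT_QUERY` placeholder are hypothetical, and `connect` is assumed to be any callable returning a DB-API style connection (for example a psycopg2 connection). The only server functions used are pg_backend_pid() and pg_cancel_backend().

```python
import threading
import time

# Placeholder only: the ticket elides the real MVT query, so substitute
# your own ST_AsMVT / ST_AsMVTGeom statement here.
MVT_QUERY = "SELECT 1"


def run_query(conn, sql):
    """Run sql once; return True if it completed, False if it raised
    (for example because it was cancelled). If the server has gone into
    recovery, the rollback itself raises and the error propagates."""
    try:
        cur = conn.cursor()
        cur.execute(sql)
        cur.fetchall()
        return True
    except Exception:
        conn.rollback()
        return False


def backend_pid(conn):
    """Return the server-side PID of the backend serving conn."""
    cur = conn.cursor()
    cur.execute("SELECT pg_backend_pid()")
    return cur.fetchone()[0]


def cancel_loop(admin_conn, pid, stop, interval=0.001):
    """Call pg_cancel_backend(pid) repeatedly until stop is set.
    Returns the number of cancel requests sent."""
    sent = 0
    cur = admin_conn.cursor()
    while not stop.is_set():
        cur.execute("SELECT pg_cancel_backend(%s)", (pid,))
        sent += 1
        time.sleep(interval)
    return sent


def reproduce(connect, sql=MVT_QUERY, rounds=1000):
    """Run sql `rounds` times on one connection while a second connection
    keeps cancelling it. Returns how many runs did not complete."""
    worker = connect()       # runs the MVT query
    admin = connect()        # issues the cancel requests
    admin.autocommit = True  # assumption: psycopg2-style attribute
    pid = backend_pid(worker)
    stop = threading.Event()
    canceller = threading.Thread(
        target=cancel_loop, args=(admin, pid, stop), daemon=True
    )
    canceller.start()
    interrupted = 0
    try:
        for _ in range(rounds):
            if not run_query(worker, sql):
                interrupted += 1
    finally:
        stop.set()
        canceller.join()
    return interrupted
```

With psycopg2 installed, this might be driven as `reproduce(lambda: psycopg2.connect(dsn))`, where `dsn` is your own connection string. Most runs should simply end with a cancelled-query error; the case reported here is the one where the backend aborts instead and the server enters recovery. Do not run this against a production database.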
--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/5385>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.