[postgis-users] Issue with unreasonable memory usage when analyzing table of geometry type

Peter Geoghegan pg at heroku.com
Tue Apr 9 09:47:36 PDT 2013


Hello,

I appear to have a problem running ANALYZE on a table with the Postgis
geometry type. The table in question is 1099 MB, but only has 10631
tuples.

The call stack during the ANALYZE usually looks like this:

(gdb) bt
#0 pglz_decompress (source=0x43f6bd98, dest=0x43fb5b2c "") at
pg_lzcompress.c:711
#1 0x0000000000493c17 in heap_tuple_untoast_attr_slice (attr=0x7f8f49b8fd90,
sliceoffset=0, slicelength=40) at tuptoaster.c:217
#2 0x0000000000880010 in pg_detoast_datum_slice (datum=0x7f8f49b8fd90,
first=0, count=40)
at fmgr.c:2266
#3 0x00007f8f4aa6e085 in gserialized_datum_get_gidx_p (gsdatum=140253393911184,
gidx=gidx at entry=0x7fffbbac5d40) at gserialized_gist.c:248
#4 0x00007f8f4aa6e170 in gserialized_datum_get_gbox_p (gsdatum=<optimized out>,
gbox=0x7fffbbac5ef0) at gserialized_gist.c:65
#5 0x00007f8f4aa379ec in compute_geometry_stats (stats=0x1310678,
fetchfunc=0x5a0ae7 <std_fetch_func>, samplerows=10631, totalrows=10631)
at geometry_estimate.c:865
#6 0x000000000059e02f in do_analyze_rel (onerel=0x7f9054bd1000,
vacstmt=0x12ca5a0,
acquirefunc=0x59f0f7 <acquire_sample_rows>, relpages=726, inh=0
'\000', elevel=17)
at analyze.c:525
#7 0x000000000059d5a9 in analyze_rel (relid=17173, vacstmt=0x12ca5a0,
bstrategy=0x130d9e8) at analyze.c:265
#8 0x0000000000612aab in vacuum (vacstmt=0x12ca5a0, relid=0, do_toast=1 '\001',
bstrategy=0x130d9e8, for_wraparound=0 '\000', isTopLevel=1 '\001') at
vacuum.c:247
#9 0x0000000000764c7a in standard_ProcessUtility (parsetree=0x12ca5a0,
queryString=0x12c9af8 "analyze verbose onesite_overview ;", params=0x0,
isTopLevel=1 '\001', dest=0x12ca928, completionTag=0x7fffbbac6910 "")
---Type <return> to continue, or q <return> to quit---
at utility.c:1027
#10 0x00007f9052a97e15 in pgss_ProcessUtility ()
from /home/pg/pgsql/lib/pg_stat_statements.so
#11 0x000000000076395b in ProcessUtility (parsetree=0x12ca5a0,
queryString=0x12c9af8 "analyze verbose onesite_overview ;", params=0x0,
isTopLevel=1 '\001', dest=0x12ca928, completionTag=0x7fffbbac6910 "")
at utility.c:332
#12 0x00000000007628e0 in PortalRunUtility (portal=0x1248398,
utilityStmt=0x12ca5a0,
isTopLevel=1 '\001', dest=0x12ca928, completionTag=0x7fffbbac6910 "")
at pquery.c:1185
#13 0x0000000000762ab8 in PortalRunMulti (portal=0x1248398,
isTopLevel=1 '\001',
dest=0x12ca928, altdest=0x12ca928, completionTag=0x7fffbbac6910 "") at
pquery.c:1316
#14 0x0000000000761fe6 in PortalRun (portal=0x1248398,
count=9223372036854775807,
isTopLevel=1 '\001', dest=0x12ca928, altdest=0x12ca928,
completionTag=0x7fffbbac6910 "") at pquery.c:814
#15 0x000000000075be41 in exec_simple_query (
query_string=0x12c9af8 "analyze verbose onesite_overview ;") at postgres.c:1046
#16 0x0000000000760190 in PostgresMain (argc=2, argv=0x1229380,
username=0x12291c8 "pg")
at postgres.c:3958
#17 0x0000000000700321 in BackendRun (port=0x124c370) at postmaster.c:3619
#18 0x00000000006ff991 in BackendStartup (port=0x124c370) at postmaster.c:3304
#19 0x00000000006fc7b3 in ServerLoop () at postmaster.c:1367
#20 0x00000000006fc0aa in PostmasterMain (argc=2, argv=0x1227230) at
postmaster.c:1127
#21 0x0000000000667225 in main (argc=2, argv=0x1227230) at main.c:199

This is with PostGIS 2.0.3.

These are very fat tuples - the geometry datums are very large. So I'm
surprised by this:

#5 0x00007f8f4aa379ec in compute_geometry_stats (stats=0x1310678,
fetchfunc=0x5a0ae7 <std_fetch_func>, samplerows=10631, totalrows=10631)
at geometry_estimate.c:865

It looks like acquire_sample_rows (called through acquirefunc) has
told ANALYZE to sample every single row! PostGIS apparently isn't
inclined to release memory for each row sampled, as they are
detoasted. The amount of memory allocated here tops out at about 2
Gigabytes.

I think what needs to happen here is that PostGIS needs to add a
type-specific typanalyze function to less aggressively size the
minimum number of tuples to sample (in other words, to provide a
custom std_typanalyze function, as some built-in types do).

Thanks
-- 
Peter Geoghegan


More information about the postgis-users mailing list