[PostGIS] #6078: Improve "raster_overviews" definition
PostGIS
trac at osgeo.org
Mon Jun 1 03:25:23 PDT 2026
#6078: Improve "raster_overviews" definition
--------------------------+---------------------------
Reporter: Laurenz Albe | Owner: pramsey
Type: enhancement | Status: new
Priority: medium | Milestone: PostGIS 3.6.3
Component: postgis | Version: 3.6.x
Keywords: |
--------------------------+---------------------------
A customer is having problems with the query
{{{
#!sql
SELECT o_table_name, overview_factor, o_raster_column, o_table_schema
FROM raster_overviews
WHERE r_table_schema = 'some_schema'
AND r_table_name = 'some_table'
AND r_raster_column = 'some_column'
ORDER BY overview_factor;
}}}
The execution plan is as follows:
{{{
Sort (cost=17083.32..17083.33 rows=1 width=196) (actual
time=9455.757..9455.762 rows=9 loops=1)
Sort Key: ((TRIM(BOTH FROM
split_part(pg_get_constraintdef(pg_constraint.oid), ','::text,
2)))::integer)
Sort Method: quicksort Memory: 27kB
Buffers: shared hit=1476736
-> Nested Loop (cost=9.28..17083.31 rows=1 width=196) (actual
time=45.400..9455.729 rows=9 loops=1)
Join Filter: (c.relnamespace = pg_constraint.connamespace)
Buffers: shared hit=1476733
-> Nested Loop (cost=8.86..17081.96 rows=1 width=208) (actual
time=1.561..314.613 rows=15113 loops=1)
Join Filter: (c.relnamespace = n.oid)
Rows Removed by Join Filter: 120840
Buffers: shared hit=146994
-> Nested Loop (cost=8.86..17080.62 rows=1 width=140)
(actual time=1.554..268.991 rows=15113 loops=1)
Buffers: shared hit=131881
-> Hash Join (cost=8.44..17072.13 rows=16 width=68)
(actual time=0.317..160.670 rows=15115 loops=1)
Hash Cond: (a.atttypid = t.oid)
Buffers: shared hit=10941
-> Seq Scan on pg_attribute a
(cost=0.00..15789.68 rows=485268 width=72) (actual time=0.007..125.174
rows=488960 loops=1)
Filter: (NOT attisdropped)
Buffers: shared hit=10937
-> Hash (cost=8.43..8.43 rows=1 width=4)
(actual time=0.020..0.021 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage:
9kB
Buffers: shared hit=4
-> Index Scan using
pg_type_typname_nsp_index on pg_type t (cost=0.41..8.43 rows=1 width=4)
(actual time=0.017..0.018 rows=1 loops=1)
Index Cond: (typname =
'raster'::name)
Buffers: shared hit=4
-> Index Scan using pg_class_oid_index on pg_class c
(cost=0.42..0.53 rows=1 width=72) (actual time=0.007..0.007 rows=1
loops=15115)
Index Cond: (oid = a.attrelid)
Filter: ((NOT
pg_is_other_temp_schema(relnamespace)) AND has_table_privilege(oid,
'SELECT'::text) AND ((relkind)::text = ANY ('{r,v,m,f}'::text[])))
Rows Removed by Filter: 0
Buffers: shared hit=120940
-> Seq Scan on pg_namespace n (cost=0.00..1.15 rows=15
width=68) (actual time=0.001..0.001 rows=9 loops=15113)
Buffers: shared hit=15113
-> Index Scan using
pg_constraint_conrelid_contypid_conname_index on pg_constraint
(cost=0.42..1.33 rows=1 width=12) (actual time=0.604..0.605 rows=0
loops=15113)
Index Cond: (conrelid = a.attrelid)
Filter: ((pg_get_constraintdef(oid) ~~
'%_overview_constraint(%'::text) AND
((split_part(split_part(pg_get_constraintdef(oid), '''::name'::text, 1),
''''::text, 2))::name = 'some_schema'::name) AND
((split_part(split_part(pg_get_constraintdef(oid), '''::name'::text, 2),
''''::text, 2))::name = 'some_schema'::name) AND
((split_part(split_part(pg_get_constraintdef(oid), '''::name'::text, 3),
''''::text, 2))::name = 'some_column'::name))
Rows Removed by Filter: 12
Buffers: shared hit=1329703
Planning:
Buffers: shared hit=387
Planning Time: 1.658 ms
Execution Time: 9455.844 ms
}}}
The problematic part is
{{{
-> Hash Join (... rows=16 ...) (actual ... rows=15115 ...)
Hash Cond: (a.atttypid = t.oid)
-> Seq Scan on pg_attribute a (... rows=485268 ...) (actual ...
rows=488960 ...)
Filter: (NOT attisdropped)
-> Hash (... rows=1 ...) (actual ... rows=1 ...)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Index Scan using pg_type_typname_nsp_index on pg_type t
(... rows=1 ...) (actual ... rows=1 ...)
Index Cond: (typname = 'raster'::name)
}}}
The estimates on both {{{pg_attribute}}} and {{{pg_type}}} are correct,
but the row count estimate on the join is horribly off. This causes
PostgreSQL to choose a nested loop join, which is the cause of the bad
performance.
The underlying reason for the bad estimate is that PostgreSQL cannot tell
at query planning time what the OID of the {{{raster}}} type is going to
be, so it cannot use the statistics on {{{pg_attribute}}} that would tell
it how many {{{raster}}} columns there are.
I looked at the definition of {{{raster_overviews}}}, and the join with
{{{pg_type}}} is unnecessary. Instead, you can use the {{{regtype}}}
pseudo-type (that has been around since 2002) to directly query
{{{pg_attribute}}}.
So the following patch would simplify the view definition and would
additionally give better estimates:
{{{
#!diff
diff --git a/raster/rt_pg/rtpostgis.sql.in b/raster/rt_pg/rtpostgis.sql.in
index 566a23b86..b02018c4b 100644
--- a/raster/rt_pg/rtpostgis.sql.in
+++ b/raster/rt_pg/rtpostgis.sql.in
@@ -8448,13 +8448,11 @@ CREATE OR REPLACE VIEW raster_overviews AS
FROM
pg_class c,
pg_attribute a,
- pg_type t,
pg_namespace n,
(SELECT connamespace, conrelid, conkey,
pg_get_constraintdef(oid) As consrc
FROM pg_constraint) AS s
- WHERE t.typname = 'raster'::name
- AND a.attisdropped = false
- AND a.atttypid = t.oid
+ WHERE a.attisdropped = false
+ AND a.atttypid = 'raster'::regtype
AND a.attrelid = c.oid
AND c.relnamespace = n.oid
AND c.relkind = ANY(ARRAY['r'::char, 'v'::char, 'm'::char,
'f'::char])
}}}
--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/6078>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.
More information about the postgis-tickets
mailing list