[postgis-users] Queries on partitioned table not behaving as expected

Rubio Vaughan rubio.vaughan at geodan.nl
Tue Nov 3 03:21:49 PST 2015


Thank you Bborie.

Very nice of you to reply. I got the idea for spatial constraints on 
partitioned tables from the PostGIS Cookbook (chapter 2, Extending 
inheritance - table partitioning), of which you are co-author. The 
example in that chapter uses spatial functions in the CHECK clause. Does 
the inclusion in that book mean that there are scenarios in which 
spatial constraints do work in combination with constraint exclusion?

I tried your suggestion, using a helper table with extents, and 
unfortunately the JOIN does not result in constraint exclusion, it still 
scans all indexes. Explicitly using the extent identifier in a WHERE 
clause does optimize the query (see below). It seems PostgreSQL is a bit 
sensitive about constraint exclusion. I might look at writing a custom 
function in order to achieve our goal.

Best regards,
Rubio

--------------------------------------------------------------------------------
EXPLAIN
WITH units AS (
  SELECT unit
  FROM ahn3_pointcloud.ahn_units
  WHERE ST_Intersects(geom, ST_MakeEnvelope(120740,486076,121074,486292, 
28992))
)
SELECT * FROM ahn3_pointcloud.ahn3_all a
JOIN units u ON a.ahn_unit = u.unit
LIMIT 10

"Limit  (cost=8.45..63.49 rows=10 width=66)"
"  CTE units"
"    ->  Index Scan using ahn_units_geom_idx on ahn_units (cost=0.14.<snip>
"          Index Cond: (geom && '0103000020407100000100000005000000000<snip>
"          Filter: _st_intersects(geom, '01030000204071000001000000050<snip>
"  ->  Hash Join  (cost=0.03..122289.50 rows=22216 width=66)"
"        Hash Cond: ((a.ahn_unit)::text = (u.unit)::text)"
"        ->  Append  (cost=0.00..105405.22 rows=4443223 width=42)"
"              ->  Seq Scan on ahn3_all a  (cost=0.00..0.00 rows=1 wid<snip>
"              ->  Seq Scan on c_25gn1 a_1 (cost=0.00..20843.10 rows=<snip>
"              ->  Seq Scan on c_30fz1 a_2 (cost=0.00..27248.34 rows=<snip>
"              ->  Seq Scan on c_30fz2 a_3 (cost=0.00..24983.57 rows=<snip>
"              ->  Seq Scan on c_25ez1 a_4  (cost=0.00..9141.21 rows=3<snip>
"              ->  Seq Scan on c_25dn2 a_5 (cost=0.00..23189.00 rows=<snip>
"        ->  Hash  (cost=0.02..0.02 rows=1 width=24)"
"              ->  CTE Scan on units u  (cost=0.00..0.02 rows=1 width=24)"

EXPLAIN
WITH units AS (
     SELECT unit
     FROM ahn3_pointcloud.ahn_units
     WHERE ST_Intersects(geom, 
ST_MakeEnvelope(120740,486076,121074,486292, 28992))
)
SELECT * FROM ahn3_pointcloud.ahn3_all a
WHERE a.ahn_unit = '25gn1'
LIMIT 10

"Limit  (cost=0.00..0.26 rows=10 width=42)"
"  ->  Append  (cost=0.00..23039.62 rows=878611 width=42)"
"        ->  Seq Scan on ahn3_all a  (cost=0.00..0.00 rows=1 width=60)"
"              Filter: ((ahn_unit)::text = '25gn1'::text)"
"        ->  Seq Scan on c_25gn1 a_1  (cost=0.00..23039.62 rows=878<snip>"
"              Filter: ((ahn_unit)::text = '25gn1'::text)"
--------------------------------------------------------------------------------


On 2-11-2015 18:28, Bborie Park wrote:
> That check constraint is not going to help you as it is too 
> complicated for the partitioning task. I don't believe the 
> partitioning constraint can be functional but needs to be simpler and 
> built upon a basic data type...
>
> Something like...
>
> CREATE TABLE max_extent AS (
>   id integer PRIMARY KEY,
>   geom geometry(POLYGON)
> );
>
> And then the parent/partition tables had a reference to pc_extent...
>
> CREATE TABLE ahn3_pointcloud.ahn3_all
> (
>   id integer NOT NULL,
>   pa pcpatch(7),
>   max_extent_id integer
> );
>
> Then your queries become...
>
> WITH max_extents AS (
> SELECT
>   id
> FROM max_extent
> WHERE PC_Intersects(geom, ST_MakeEnvelope(120740,486076,121074,486292, 
> 28992))
> )
> SELECT
> *
> FROM ahn3_all
> JOIN max_extents
> ON ahn3_all.max_extent_id = max_extents.id <http://max_extents.id>
>
> -bborie
>
> On Mon, Nov 2, 2015 at 8:45 AM, Rubio Vaughan <rubio.vaughan at geodan.nl 
> <mailto:rubio.vaughan at geodan.nl>> wrote:
>
>     Dear all,
>
>     We are trying to load a large LIDAR pointcloud dataset into
>     multiple tables in PostGIS, using the point cloud extension.
>     Eventually the whole data set will consist of 1400+ tables,
>     containing about 1 million records each. Using a union view on all
>     these tables would result in terrible performance, which is why we
>     are trying to optimize query performance using partitioned tables.
>     According to the documentation, the use of partitioned tables with
>     CHECK constraints should cause the query planner to only scan
>     those tables for which the CHECK constraint matches. Excerpt from
>     the documentation:
>     --------------------------------------------------------------------------------
>     *18.7.4. Other Planner Options*
>     constraint_exclusion(enum): Controls the query planner's use of
>     table constraints to optimize queries. The allowed values of
>     constraint_exclusion are on (examine constraints for all tables),
>     off (never examine constraints), and partition (examine
>     constraints only for inheritance child tables and UNION ALL
>     subqueries). partition is the default setting. It is often used
>     with inheritance and partitioned tables to improve performance.
>     When this parameter allows it for a particular table, the planner
>     compares query conditions with the table's CHECK constraints, and
>     omits scanning tables for which the conditions contradict the
>     constraints.
>     --------------------------------------------------------------------------------
>
>     However, as you can see from the example query below, the indexes
>     for all child tables are still scanned. I would expect the query
>     planner to only scan table c_25gn1, which contains the queried
>     region. Does anyone here have experience with partitioned tables?
>     I would be delighted to get some pointers for figuring out this
>     problem.
>
>     Thanks in advance for any help!
>
>     Best regards,
>     Rubio Vaughan
>
>     *General info:**
>     *--------------------------------------------------------------------------------
>     SELECT postgis_full_version();
>     "POSTGIS="2.2.0 r14208" GEOS="3.5.0-CAPI-1.9.0 r4084"
>     SFCGAL="1.2.0" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 2.0.1,
>     released 2015/09/15" LIBXML="2.9.1" LIBJSON="0.11.99" TOPOLOGY
>     (topology procs from "2.1.4 r12966" need upgrade) RASTER"
>
>     SHOW "constraint_exclusion";
>     "partition"
>     --------------------------------------------------------------------------------
>
>     *The parent table:*
>     --------------------------------------------------------------------------------
>     CREATE TABLE ahn3_pointcloud.ahn3_all
>     (
>     id integer NOT NULL,
>     pa pcpatch(7)
>     )
>     --------------------------------------------------------------------------------
>
>     *One of the child tables:*
>     --------------------------------------------------------------------------------
>     CREATE TABLE ahn3_pointcloud.c_25dn2
>     (
>       id serial NOT NULL,
>       pa pcpatch(7),
>       CONSTRAINT c_25dn2_pkey PRIMARY KEY (id),
>       CONSTRAINT check_extent CHECK (pc_intersects(pa,
>     st_geomfromtext('POLYGON((115000 487500,120000 487500,120000
>     481250,115000 481250,115000 487500))'::text, 28992)))
>     )
>     INHERITS (ahn3_pointcloud.ahn3_all)
>     --------------------------------------------------------------------------------
>
>     *Example query:*
>     --------------------------------------------------------------------------------
>     EXPLAIN SELECT COUNT(pa) FROM ahn3_pointcloud.ahn3_all
>     WHERE PC_Intersects(pa,
>     ST_MakeEnvelope(120740,486076,121074,486292, 28992))
>
>     "Aggregate  (cost=301989.36..301989.37 rows=1 width=32)"
>     "  ->  Append  (cost=0.00..301248.82 rows=296216 width=32)"
>     "        ->  Seq Scan on ahn3_all  (cost=0.00..0.00 rows=1 width=32)"
>     "              Filter:
>     (('010300002040710000010000000500000000000000407AFD4000000000<snip>
>     "        ->  Bitmap Heap Scan on c_25gn1 (cost=6064.85..59546.09
>     rows=58574 width=32)"
>     "              Recheck Cond:
>     ('010300002040710000010000000500000000000000407AFD40000<snip>
>     "              Filter:
>     _st_intersects('010300002040710000010000000500000000000000407<snip>
>     "              ->  Bitmap Index Scan on c_25gn1_idx 
>     (cost=0.00..6050.20 rows=175722 width=0)"
>     "                    Index Cond:
>     ('010300002040710000010000000500000000000000407AFD4<snip>
>     "        ->  Bitmap Heap Scan on c_30fz1 (cost=7990.39..77907.99
>     rows=76576 width=32)"
>     "              Recheck Cond:
>     ('010300002040710000010000000500000000000000407AFD40000<snip>
>     "              Filter:
>     _st_intersects('010300002040710000010000000500000000000000407<snip>
>     "              ->  Bitmap Index Scan on c_30fz1_idx 
>     (cost=0.00..7971.24 rows=229727 width=0)"
>     "                    Index Cond:
>     ('010300002040710000010000000500000000000000407AFD4<snip>
>     "        ->  Bitmap Heap Scan on c_30fz2 (cost=7393.57..71499.52
>     rows=70210 width=32)"
>     "              Recheck Cond:
>     ('010300002040710000010000000500000000000000407AFD40000<snip>
>     "              Filter:
>     _st_intersects('010300002040710000010000000500000000000000407<snip>
>     "              ->  Bitmap Index Scan on c_30fz2_idx 
>     (cost=0.00..7376.02 rows=210631 width=0)"
>     "                    Index Cond:
>     ('010300002040710000010000000500000000000000407AFD4<snip>
>     "        ->  Bitmap Heap Scan on c_25ez1 (cost=2624.69..26079.63
>     rows=25688 width=32)"
>     "              Recheck Cond:
>     ('010300002040710000010000000500000000000000407AFD40000<snip>
>     "              Filter:
>     _st_intersects('010300002040710000010000000500000000000000407<snip>
>     "              ->  Bitmap Index Scan on c_25ez1_idx 
>     (cost=0.00..2618.27 rows=77064 width=0)"
>     "                    Index Cond:
>     ('010300002040710000010000000500000000000000407AFD4<snip>
>     "        ->  Bitmap Heap Scan on c_25dn2 (cost=6714.83..66215.58
>     rows=65167 width=32)"
>     "              Recheck Cond:
>     ('010300002040710000010000000500000000000000407AFD40000<snip>
>     "              Filter:
>     _st_intersects('010300002040710000010000000500000000000000407<snip>
>     "              ->  Bitmap Index Scan on c_25dn2_idx 
>     (cost=0.00..6698.54 rows=195500 width=0)"
>     "                    Index Cond:
>     ('010300002040710000010000000500000000000000407AFD4<snip>
>     --------------------------------------------------------------------------------
>
>
>
>
>     _______________________________________________
>     postgis-users mailing list
>     postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org>
>     http://lists.osgeo.org/mailman/listinfo/postgis-users
>
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20151103/5896156f/attachment.html>


More information about the postgis-users mailing list