[postgis-users] Queries on partitioned table not behaving as expected
Rubio Vaughan
rubio.vaughan at geodan.nl
Tue Nov 3 03:22:39 PST 2015
Thank you Nicolas.
Going for a smaller set of bigger tables is a good suggestion. However,
considering I'm having trouble getting constraint exclusion to work
altogether I might have to look at other solutions (such as a custom
function).
Best regards,
Rubio
On 2-11-2015 19:53, Nicolas Ribot wrote:
> Hi,
>
> AFAIU, the restriction on partitioned tables is even bigger. From PG
> doc
> (http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html#DDL-PARTITIONING-CONSTRAINT-EXCLUSION)
> :
>
> "The following caveats apply to constraint exclusion:
>
> *
>
> Constraint exclusion only works when the query's WHERE clause
> contains constants (or externally supplied parameters). For
> example, a comparison against a non-immutable function such
> as|CURRENT_TIMESTAMP| cannot be optimized, since the planner
> cannot know which partition the function value might fall into at
> run time.
>
> "
> Using a subquery to provide values won't trigger the constraint exclusion.
> One solution could be an anonymous block or a function to explicitly
> write query with constant values got from a previous query.
>
> Rubio, take also care that a lot of partitioned tables is not
> recommended by Pg doc:
> "All constraints on all partitions of the master table are examined
> during constraint exclusion, so large numbers of partitions are likely
> to increase query planning time considerably. Partitioning using these
> techniques will work well with up to perhaps a hundred partitions;
> don't try to use many thousands of partitions."
>
> I would go for a smaller set of bigger tables (10's millions rows or so).
>
> Nicolas
>
> On 2 November 2015 at 18:28, Bborie Park <dustymugs at gmail.com
> <mailto:dustymugs at gmail.com>> wrote:
>
> That check constraint is not going to help you as it is too
> complicated for the partitioning task. I don't believe the
> partitioning constraint can be functional but needs to be simpler
> and built upon a basic data type...
>
> Something like...
>
> CREATE TABLE max_extent AS (
> id integer PRIMARY KEY,
> geom geometry(POLYGON)
> );
>
> And then the parent/partition tables had a reference to pc_extent...
>
> CREATE TABLE ahn3_pointcloud.ahn3_all
> (
> id integer NOT NULL,
> pa pcpatch(7),
> max_extent_id integer
> );
>
> Then your queries become...
>
> WITH max_extents AS (
> SELECT
> id
> FROM max_extent
> WHERE PC_Intersects(geom,
> ST_MakeEnvelope(120740,486076,121074,486292, 28992))
> )
> SELECT
> *
> FROM ahn3_all
> JOIN max_extents
> ON ahn3_all.max_extent_id = max_extents.id <http://max_extents.id>
>
> -bborie
>
> On Mon, Nov 2, 2015 at 8:45 AM, Rubio Vaughan
> <rubio.vaughan at geodan.nl <mailto:rubio.vaughan at geodan.nl>> wrote:
>
> Dear all,
>
> We are trying to load a large LIDAR pointcloud dataset into
> multiple tables in PostGIS, using the point cloud extension.
> Eventually the whole data set will consist of 1400+ tables,
> containing about 1 million records each. Using a union view on
> all these tables would result in terrible performance, which
> is why we are trying to optimize query performance using
> partitioned tables. According to the documentation, the use of
> partitioned tables with CHECK constraints should cause the
> query planner to only scan those tables for which the CHECK
> constraint matches. Excerpt from the documentation:
> --------------------------------------------------------------------------------
> *18.7.4. Other Planner Options*
> constraint_exclusion(enum): Controls the query planner's use
> of table constraints to optimize queries. The allowed values
> of constraint_exclusion are on (examine constraints for all
> tables), off (never examine constraints), and partition
> (examine constraints only for inheritance child tables and
> UNION ALL subqueries). partition is the default setting. It is
> often used with inheritance and partitioned tables to improve
> performance. When this parameter allows it for a particular
> table, the planner compares query conditions with the table's
> CHECK constraints, and omits scanning tables for which the
> conditions contradict the constraints.
> --------------------------------------------------------------------------------
>
> However, as you can see from the example query below, the
> indexes for all child tables are still scanned. I would expect
> the query planner to only scan table c_25gn1, which contains
> the queried region. Does anyone here have experience with
> partitioned tables? I would be delighted to get some pointers
> for figuring out this problem.
>
> Thanks in advance for any help!
>
> Best regards,
> Rubio Vaughan
>
> *General info:**
> *--------------------------------------------------------------------------------
> SELECT postgis_full_version();
> "POSTGIS="2.2.0 r14208" GEOS="3.5.0-CAPI-1.9.0 r4084"
> SFCGAL="1.2.0" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL
> 2.0.1, released 2015/09/15" LIBXML="2.9.1" LIBJSON="0.11.99"
> TOPOLOGY (topology procs from "2.1.4 r12966" need upgrade) RASTER"
>
> SHOW "constraint_exclusion";
> "partition"
> --------------------------------------------------------------------------------
>
> *The parent table:*
> --------------------------------------------------------------------------------
> CREATE TABLE ahn3_pointcloud.ahn3_all
> (
> id integer NOT NULL,
> pa pcpatch(7)
> )
> --------------------------------------------------------------------------------
>
> *One of the child tables:*
> --------------------------------------------------------------------------------
> CREATE TABLE ahn3_pointcloud.c_25dn2
> (
> id serial NOT NULL,
> pa pcpatch(7),
> CONSTRAINT c_25dn2_pkey PRIMARY KEY (id),
> CONSTRAINT check_extent CHECK (pc_intersects(pa,
> st_geomfromtext('POLYGON((115000 487500,120000 487500,120000
> 481250,115000 481250,115000 487500))'::text, 28992)))
> )
> INHERITS (ahn3_pointcloud.ahn3_all)
> --------------------------------------------------------------------------------
>
> *Example query:*
> --------------------------------------------------------------------------------
> EXPLAIN SELECT COUNT(pa) FROM ahn3_pointcloud.ahn3_all
> WHERE PC_Intersects(pa,
> ST_MakeEnvelope(120740,486076,121074,486292, 28992))
>
> "Aggregate (cost=301989.36..301989.37 rows=1 width=32)"
> " -> Append (cost=0.00..301248.82 rows=296216 width=32)"
> " -> Seq Scan on ahn3_all (cost=0.00..0.00 rows=1
> width=32)"
> " Filter:
> (('010300002040710000010000000500000000000000407AFD4000000000<snip>
> " -> Bitmap Heap Scan on c_25gn1
> (cost=6064.85..59546.09 rows=58574 width=32)"
> " Recheck Cond:
> ('010300002040710000010000000500000000000000407AFD40000<snip>
> " Filter:
> _st_intersects('010300002040710000010000000500000000000000407<snip>
> " -> Bitmap Index Scan on c_25gn1_idx
> (cost=0.00..6050.20 rows=175722 width=0)"
> " Index Cond:
> ('010300002040710000010000000500000000000000407AFD4<snip>
> " -> Bitmap Heap Scan on c_30fz1
> (cost=7990.39..77907.99 rows=76576 width=32)"
> " Recheck Cond:
> ('010300002040710000010000000500000000000000407AFD40000<snip>
> " Filter:
> _st_intersects('010300002040710000010000000500000000000000407<snip>
> " -> Bitmap Index Scan on c_30fz1_idx
> (cost=0.00..7971.24 rows=229727 width=0)"
> " Index Cond:
> ('010300002040710000010000000500000000000000407AFD4<snip>
> " -> Bitmap Heap Scan on c_30fz2
> (cost=7393.57..71499.52 rows=70210 width=32)"
> " Recheck Cond:
> ('010300002040710000010000000500000000000000407AFD40000<snip>
> " Filter:
> _st_intersects('010300002040710000010000000500000000000000407<snip>
> " -> Bitmap Index Scan on c_30fz2_idx
> (cost=0.00..7376.02 rows=210631 width=0)"
> " Index Cond:
> ('010300002040710000010000000500000000000000407AFD4<snip>
> " -> Bitmap Heap Scan on c_25ez1
> (cost=2624.69..26079.63 rows=25688 width=32)"
> " Recheck Cond:
> ('010300002040710000010000000500000000000000407AFD40000<snip>
> " Filter:
> _st_intersects('010300002040710000010000000500000000000000407<snip>
> " -> Bitmap Index Scan on c_25ez1_idx
> (cost=0.00..2618.27 rows=77064 width=0)"
> " Index Cond:
> ('010300002040710000010000000500000000000000407AFD4<snip>
> " -> Bitmap Heap Scan on c_25dn2
> (cost=6714.83..66215.58 rows=65167 width=32)"
> " Recheck Cond:
> ('010300002040710000010000000500000000000000407AFD40000<snip>
> " Filter:
> _st_intersects('010300002040710000010000000500000000000000407<snip>
> " -> Bitmap Index Scan on c_25dn2_idx
> (cost=0.00..6698.54 rows=195500 width=0)"
> " Index Cond:
> ('010300002040710000010000000500000000000000407AFD4<snip>
> --------------------------------------------------------------------------------
>
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> <mailto:postgis-users at lists.osgeo.org>
> http://lists.osgeo.org/mailman/listinfo/postgis-users
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org>
> http://lists.osgeo.org/mailman/listinfo/postgis-users
>
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20151103/4dd63525/attachment.html>
More information about the postgis-users
mailing list