[postgis-users] Queries on partitioned table not behaving as expected
Andy Colson
andy at squeakycode.net
Mon Nov 2 12:41:00 PST 2015
Yep, +1 do all this.
Additionally: select performance is not a reason to use partitioned
tables. Its only update/delete performance that's increased.
Ok, _some_ selects might be faster with partitions, but I'd bet not
indexed select statements. You'd have to benchmark it.
If you really want 1400+ tables you might be able to write yourself a
function that figures out what table/tables to query and generates a sql
statement dynamically.
Like from the docs:
http://www.postgresql.org/docs/9.4/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
(sorry, watch the word wrap)
-Andy
On 11/2/2015 12:53 PM, Nicolas Ribot wrote:
> Hi,
>
> AFAIU, the restriction on partitioned tables is even bigger. From PG doc
> (http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html#DDL-PARTITIONING-CONSTRAINT-EXCLUSION)
> :
>
> "The following caveats apply to constraint exclusion:
>
> *
>
> Constraint exclusion only works when the query's WHERE clause
> contains constants (or externally supplied parameters). For example,
> a comparison against a non-immutable function such
> as|CURRENT_TIMESTAMP| cannot be optimized, since the planner cannot
> know which partition the function value might fall into at run time.
>
> "
> Using a subquery to provide values won't trigger the constraint exclusion.
> One solution could be an anonymous block or a function to explicitly
> write query with constant values got from a previous query.
>
> Rubio, take also care that a lot of partitioned tables is not
> recommended by Pg doc:
> "All constraints on all partitions of the master table are examined
> during constraint exclusion, so large numbers of partitions are likely
> to increase query planning time considerably. Partitioning using these
> techniques will work well with up to perhaps a hundred partitions; don't
> try to use many thousands of partitions."
>
> I would go for a smaller set of bigger tables (10's millions rows or so).
>
> Nicolas
>
> On 2 November 2015 at 18:28, Bborie Park <dustymugs at gmail.com
> <mailto:dustymugs at gmail.com>> wrote:
>
> That check constraint is not going to help you as it is too
> complicated for the partitioning task. I don't believe the
> partitioning constraint can be functional but needs to be simpler
> and built upon a basic data type...
>
> Something like...
>
> CREATE TABLE max_extent AS (
> id integer PRIMARY KEY,
> geom geometry(POLYGON)
> );
>
> And then the parent/partition tables had a reference to pc_extent...
>
> CREATE TABLE ahn3_pointcloud.ahn3_all
> (
> id integer NOT NULL,
> pa pcpatch(7),
> max_extent_id integer
> );
>
> Then your queries become...
>
> WITH max_extents AS (
> SELECT
> id
> FROM max_extent
> WHERE PC_Intersects(geom,
> ST_MakeEnvelope(120740,486076,121074,486292, 28992))
> )
> SELECT
> *
> FROM ahn3_all
> JOIN max_extents
> ON ahn3_all.max_extent_id = max_extents.id <http://max_extents.id>
>
> -bborie
>
> On Mon, Nov 2, 2015 at 8:45 AM, Rubio Vaughan
> <rubio.vaughan at geodan.nl <mailto:rubio.vaughan at geodan.nl>> wrote:
>
> Dear all,
>
> We are trying to load a large LIDAR pointcloud dataset into
> multiple tables in PostGIS, using the point cloud extension.
> Eventually the whole data set will consist of 1400+ tables,
> containing about 1 million records each. Using a union view on
> all these tables would result in terrible performance, which is
> why we are trying to optimize query performance using
> partitioned tables. According to the documentation, the use of
> partitioned tables with CHECK constraints should cause the query
> planner to only scan those tables for which the CHECK constraint
> matches. Excerpt from the documentation:
> --------------------------------------------------------------------------------
> *18.7.4. Other Planner Options*
> constraint_exclusion(enum): Controls the query planner's use of
> table constraints to optimize queries. The allowed values of
> constraint_exclusion are on (examine constraints for all
> tables), off (never examine constraints), and partition (examine
> constraints only for inheritance child tables and UNION ALL
> subqueries). partition is the default setting. It is often used
> with inheritance and partitioned tables to improve performance.
> When this parameter allows it for a particular table, the
> planner compares query conditions with the table's CHECK
> constraints, and omits scanning tables for which the conditions
> contradict the constraints.
> --------------------------------------------------------------------------------
>
> However, as you can see from the example query below, the
> indexes for all child tables are still scanned. I would expect
> the query planner to only scan table c_25gn1, which contains the
> queried region. Does anyone here have experience with
> partitioned tables? I would be delighted to get some pointers
> for figuring out this problem.
>
> Thanks in advance for any help!
>
> Best regards,
> Rubio Vaughan
>
More information about the postgis-users
mailing list