[postgis-users] Queries on partitioned table not behaving as expected

Andy Colson andy at squeakycode.net
Mon Nov 2 12:41:00 PST 2015


Yep, +1 do all this.

Additionally: select performance is not a reason to use partitioned 
tables.  Its only update/delete performance that's increased.

Ok, _some_ selects might be faster with partitions, but I'd bet not 
indexed select statements.  You'd have to benchmark it.

If you really want 1400+ tables you might be able to write yourself a 
function that figures out what table/tables to query and generates a sql 
statement dynamically.

Like from the docs:

http://www.postgresql.org/docs/9.4/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN

(sorry, watch the word wrap)

-Andy


On 11/2/2015 12:53 PM, Nicolas Ribot wrote:
> Hi,
>
> AFAIU, the restriction on partitioned tables is even bigger. From PG doc
> (http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html#DDL-PARTITIONING-CONSTRAINT-EXCLUSION)
> :
>
> "The following caveats apply to constraint exclusion:
>
>   *
>
>     Constraint exclusion only works when the query's WHERE clause
>     contains constants (or externally supplied parameters). For example,
>     a comparison against a non-immutable function such
>     as|CURRENT_TIMESTAMP| cannot be optimized, since the planner cannot
>     know which partition the function value might fall into at run time.
>
> "
> Using a subquery to provide values won't trigger the constraint exclusion.
> One solution could be an anonymous block or a function to explicitly
> write query with constant values got from a previous query.
>
> Rubio, take also care that a lot of partitioned tables is not
> recommended by Pg doc:
> "All constraints on all partitions of the master table are examined
> during constraint exclusion, so large numbers of partitions are likely
> to increase query planning time considerably. Partitioning using these
> techniques will work well with up to perhaps a hundred partitions; don't
> try to use many thousands of partitions."
>
> I would go for a smaller set of bigger tables (10's millions rows or so).
>
> Nicolas
>
> On 2 November 2015 at 18:28, Bborie Park <dustymugs at gmail.com
> <mailto:dustymugs at gmail.com>> wrote:
>
>     That check constraint is not going to help you as it is too
>     complicated for the partitioning task. I don't believe the
>     partitioning constraint can be functional but needs to be simpler
>     and built upon a basic data type...
>
>     Something like...
>
>     CREATE TABLE max_extent AS (
>        id integer PRIMARY KEY,
>        geom geometry(POLYGON)
>     );
>
>     And then the parent/partition tables had a reference to pc_extent...
>
>     CREATE TABLE ahn3_pointcloud.ahn3_all
>     (
>        id integer NOT NULL,
>        pa pcpatch(7),
>        max_extent_id integer
>     );
>
>     Then your queries become...
>
>     WITH max_extents AS (
>     SELECT
>        id
>     FROM max_extent
>     WHERE PC_Intersects(geom,
>     ST_MakeEnvelope(120740,486076,121074,486292, 28992))
>     )
>     SELECT
>     *
>     FROM ahn3_all
>     JOIN max_extents
>        ON ahn3_all.max_extent_id = max_extents.id <http://max_extents.id>
>
>     -bborie
>
>     On Mon, Nov 2, 2015 at 8:45 AM, Rubio Vaughan
>     <rubio.vaughan at geodan.nl <mailto:rubio.vaughan at geodan.nl>> wrote:
>
>         Dear all,
>
>         We are trying to load a large LIDAR pointcloud dataset into
>         multiple tables in PostGIS, using the point cloud extension.
>         Eventually the whole data set will consist of 1400+ tables,
>         containing about 1 million records each. Using a union view on
>         all these tables would result in terrible performance, which is
>         why we are trying to optimize query performance using
>         partitioned tables. According to the documentation, the use of
>         partitioned tables with CHECK constraints should cause the query
>         planner to only scan those tables for which the CHECK constraint
>         matches. Excerpt from the documentation:
>         --------------------------------------------------------------------------------
>         *18.7.4. Other Planner Options*
>         constraint_exclusion(enum): Controls the query planner's use of
>         table constraints to optimize queries. The allowed values of
>         constraint_exclusion are on (examine constraints for all
>         tables), off (never examine constraints), and partition (examine
>         constraints only for inheritance child tables and UNION ALL
>         subqueries). partition is the default setting. It is often used
>         with inheritance and partitioned tables to improve performance.
>         When this parameter allows it for a particular table, the
>         planner compares query conditions with the table's CHECK
>         constraints, and omits scanning tables for which the conditions
>         contradict the constraints.
>         --------------------------------------------------------------------------------
>
>         However, as you can see from the example query below, the
>         indexes for all child tables are still scanned. I would expect
>         the query planner to only scan table c_25gn1, which contains the
>         queried region. Does anyone here have experience with
>         partitioned tables? I would be delighted to get some pointers
>         for figuring out this problem.
>
>         Thanks in advance for any help!
>
>         Best regards,
>         Rubio Vaughan
>



More information about the postgis-users mailing list