[postgis-users] Queries on partitioned table not behaving as expected

Rubio Vaughan rubio.vaughan at geodan.nl
Tue Nov 3 03:28:19 PST 2015


Thank you Remic.  I'm going to look into this and report back.

Cheers
Rubio

On 3-11-2015 11:45, Rémi Cura wrote:
>
> Hey,
> i tried to do exactly this,
> as it is currently the only thing preventing postgis (thus 
> pgpointcloud) to scale well.
> No strategy succeded, postgres mailing list was also clueless.
>
> So un short no way to use partitionning without extensive tricks with 
> postgres rules.
>
> Instead, it is very simple to design a table holding extent of each 
> lidar tables, then a fonction performing the check and creating the 
> union statment on the fly.
> Maybe you can even use rule on father table so this mechanism is 
> totally transparent to the user.
>
> Please keep us posted !
> Cheers,
> Remic
>
> Le 3 nov. 2015 05:47, "Andy Colson" <andy at squeakycode.net 
> <mailto:andy at squeakycode.net>> a écrit :
>
>     Yep, +1 do all this.
>
>     Additionally: select performance is not a reason to use
>     partitioned tables.  Its only update/delete performance that's
>     increased.
>
>     Ok, _some_ selects might be faster with partitions, but I'd bet
>     not indexed select statements.  You'd have to benchmark it.
>
>     If you really want 1400+ tables you might be able to write
>     yourself a function that figures out what table/tables to query
>     and generates a sql statement dynamically.
>
>     Like from the docs:
>
>     http://www.postgresql.org/docs/9.4/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
>
>     (sorry, watch the word wrap)
>
>     -Andy
>
>
>     On 11/2/2015 12:53 PM, Nicolas Ribot wrote:
>
>         Hi,
>
>         AFAIU, the restriction on partitioned tables is even bigger.
>         From PG doc
>         (http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html#DDL-PARTITIONING-CONSTRAINT-EXCLUSION)
>         :
>
>         "The following caveats apply to constraint exclusion:
>
>           *
>
>             Constraint exclusion only works when the query's WHERE clause
>             contains constants (or externally supplied parameters).
>         For example,
>             a comparison against a non-immutable function such
>             as|CURRENT_TIMESTAMP| cannot be optimized, since the
>         planner cannot
>             know which partition the function value might fall into at
>         run time.
>
>         "
>         Using a subquery to provide values won't trigger the
>         constraint exclusion.
>         One solution could be an anonymous block or a function to
>         explicitly
>         write query with constant values got from a previous query.
>
>         Rubio, take also care that a lot of partitioned tables is not
>         recommended by Pg doc:
>         "All constraints on all partitions of the master table are
>         examined
>         during constraint exclusion, so large numbers of partitions
>         are likely
>         to increase query planning time considerably. Partitioning
>         using these
>         techniques will work well with up to perhaps a hundred
>         partitions; don't
>         try to use many thousands of partitions."
>
>         I would go for a smaller set of bigger tables (10's millions
>         rows or so).
>
>         Nicolas
>
>         On 2 November 2015 at 18:28, Bborie Park <dustymugs at gmail.com
>         <mailto:dustymugs at gmail.com>
>         <mailto:dustymugs at gmail.com <mailto:dustymugs at gmail.com>>> wrote:
>
>             That check constraint is not going to help you as it is too
>             complicated for the partitioning task. I don't believe the
>             partitioning constraint can be functional but needs to be
>         simpler
>             and built upon a basic data type...
>
>             Something like...
>
>             CREATE TABLE max_extent AS (
>                id integer PRIMARY KEY,
>                geom geometry(POLYGON)
>             );
>
>             And then the parent/partition tables had a reference to
>         pc_extent...
>
>             CREATE TABLE ahn3_pointcloud.ahn3_all
>             (
>                id integer NOT NULL,
>                pa pcpatch(7),
>                max_extent_id integer
>             );
>
>             Then your queries become...
>
>             WITH max_extents AS (
>             SELECT
>                id
>             FROM max_extent
>             WHERE PC_Intersects(geom,
>             ST_MakeEnvelope(120740,486076,121074,486292, 28992))
>             )
>             SELECT
>             *
>             FROM ahn3_all
>             JOIN max_extents
>                ON ahn3_all.max_extent_id = max_extents.id
>         <http://max_extents.id> <http://max_extents.id>
>
>             -bborie
>
>             On Mon, Nov 2, 2015 at 8:45 AM, Rubio Vaughan
>             <rubio.vaughan at geodan.nl <mailto:rubio.vaughan at geodan.nl>
>         <mailto:rubio.vaughan at geodan.nl
>         <mailto:rubio.vaughan at geodan.nl>>> wrote:
>
>                 Dear all,
>
>                 We are trying to load a large LIDAR pointcloud dataset
>         into
>                 multiple tables in PostGIS, using the point cloud
>         extension.
>                 Eventually the whole data set will consist of 1400+
>         tables,
>                 containing about 1 million records each. Using a union
>         view on
>                 all these tables would result in terrible performance,
>         which is
>                 why we are trying to optimize query performance using
>                 partitioned tables. According to the documentation,
>         the use of
>                 partitioned tables with CHECK constraints should cause
>         the query
>                 planner to only scan those tables for which the CHECK
>         constraint
>                 matches. Excerpt from the documentation:
>         --------------------------------------------------------------------------------
>                 *18.7.4. Other Planner Options*
>                 constraint_exclusion(enum): Controls the query
>         planner's use of
>                 table constraints to optimize queries. The allowed
>         values of
>                 constraint_exclusion are on (examine constraints for all
>                 tables), off (never examine constraints), and
>         partition (examine
>                 constraints only for inheritance child tables and
>         UNION ALL
>                 subqueries). partition is the default setting. It is
>         often used
>                 with inheritance and partitioned tables to improve
>         performance.
>                 When this parameter allows it for a particular table, the
>                 planner compares query conditions with the table's CHECK
>                 constraints, and omits scanning tables for which the
>         conditions
>                 contradict the constraints.
>         --------------------------------------------------------------------------------
>
>                 However, as you can see from the example query below, the
>                 indexes for all child tables are still scanned. I
>         would expect
>                 the query planner to only scan table c_25gn1, which
>         contains the
>                 queried region. Does anyone here have experience with
>                 partitioned tables? I would be delighted to get some
>         pointers
>                 for figuring out this problem.
>
>                 Thanks in advance for any help!
>
>                 Best regards,
>                 Rubio Vaughan
>
>
>     _______________________________________________
>     postgis-users mailing list
>     postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org>
>     http://lists.osgeo.org/mailman/listinfo/postgis-users
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20151103/6d9f6068/attachment.html>


More information about the postgis-users mailing list