[postgis-users] Queries on partitioned table not behaving as expected

Rubio Vaughan rubio.vaughan at geodan.nl
Tue Nov 3 03:26:28 PST 2015


Andy,

Thanks for the suggestion. Select performance wasn't the goal in itself. 
But we're trying to keep the data in multiple tables for 
maintenance/update purposes. Since this severely affects performance we 
investigated partitioned tables in order to get performance close to  
single-table select performance. As you suggest, it looks like my next 
step will involve a custom function.

Best regards,
Rubio

On 2-11-2015 21:41, Andy Colson wrote:
> Yep, +1 do all this.
>
> Additionally: select performance is not a reason to use partitioned 
> tables.  Its only update/delete performance that's increased.
>
> Ok, _some_ selects might be faster with partitions, but I'd bet not 
> indexed select statements.  You'd have to benchmark it.
>
> If you really want 1400+ tables you might be able to write yourself a 
> function that figures out what table/tables to query and generates a 
> sql statement dynamically.
>
> Like from the docs:
>
> http://www.postgresql.org/docs/9.4/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN 
>
>
> (sorry, watch the word wrap)
>
> -Andy
>
>
> On 11/2/2015 12:53 PM, Nicolas Ribot wrote:
>> Hi,
>>
>> AFAIU, the restriction on partitioned tables is even bigger. From PG doc
>> (http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html#DDL-PARTITIONING-CONSTRAINT-EXCLUSION) 
>>
>> :
>>
>> "The following caveats apply to constraint exclusion:
>>
>>   *
>>
>>     Constraint exclusion only works when the query's WHERE clause
>>     contains constants (or externally supplied parameters). For example,
>>     a comparison against a non-immutable function such
>>     as|CURRENT_TIMESTAMP| cannot be optimized, since the planner cannot
>>     know which partition the function value might fall into at run time.
>>
>> "
>> Using a subquery to provide values won't trigger the constraint 
>> exclusion.
>> One solution could be an anonymous block or a function to explicitly
>> write query with constant values got from a previous query.
>>
>> Rubio, take also care that a lot of partitioned tables is not
>> recommended by Pg doc:
>> "All constraints on all partitions of the master table are examined
>> during constraint exclusion, so large numbers of partitions are likely
>> to increase query planning time considerably. Partitioning using these
>> techniques will work well with up to perhaps a hundred partitions; don't
>> try to use many thousands of partitions."
>>
>> I would go for a smaller set of bigger tables (10's millions rows or 
>> so).
>>
>> Nicolas
>>
>> On 2 November 2015 at 18:28, Bborie Park <dustymugs at gmail.com
>> <mailto:dustymugs at gmail.com>> wrote:
>>
>>     That check constraint is not going to help you as it is too
>>     complicated for the partitioning task. I don't believe the
>>     partitioning constraint can be functional but needs to be simpler
>>     and built upon a basic data type...
>>
>>     Something like...
>>
>>     CREATE TABLE max_extent AS (
>>        id integer PRIMARY KEY,
>>        geom geometry(POLYGON)
>>     );
>>
>>     And then the parent/partition tables had a reference to pc_extent...
>>
>>     CREATE TABLE ahn3_pointcloud.ahn3_all
>>     (
>>        id integer NOT NULL,
>>        pa pcpatch(7),
>>        max_extent_id integer
>>     );
>>
>>     Then your queries become...
>>
>>     WITH max_extents AS (
>>     SELECT
>>        id
>>     FROM max_extent
>>     WHERE PC_Intersects(geom,
>>     ST_MakeEnvelope(120740,486076,121074,486292, 28992))
>>     )
>>     SELECT
>>     *
>>     FROM ahn3_all
>>     JOIN max_extents
>>        ON ahn3_all.max_extent_id = max_extents.id 
>> <http://max_extents.id>
>>
>>     -bborie
>>
>>     On Mon, Nov 2, 2015 at 8:45 AM, Rubio Vaughan
>>     <rubio.vaughan at geodan.nl <mailto:rubio.vaughan at geodan.nl>> wrote:
>>
>>         Dear all,
>>
>>         We are trying to load a large LIDAR pointcloud dataset into
>>         multiple tables in PostGIS, using the point cloud extension.
>>         Eventually the whole data set will consist of 1400+ tables,
>>         containing about 1 million records each. Using a union view on
>>         all these tables would result in terrible performance, which is
>>         why we are trying to optimize query performance using
>>         partitioned tables. According to the documentation, the use of
>>         partitioned tables with CHECK constraints should cause the query
>>         planner to only scan those tables for which the CHECK constraint
>>         matches. Excerpt from the documentation:
>> --------------------------------------------------------------------------------
>>         *18.7.4. Other Planner Options*
>>         constraint_exclusion(enum): Controls the query planner's use of
>>         table constraints to optimize queries. The allowed values of
>>         constraint_exclusion are on (examine constraints for all
>>         tables), off (never examine constraints), and partition (examine
>>         constraints only for inheritance child tables and UNION ALL
>>         subqueries). partition is the default setting. It is often used
>>         with inheritance and partitioned tables to improve performance.
>>         When this parameter allows it for a particular table, the
>>         planner compares query conditions with the table's CHECK
>>         constraints, and omits scanning tables for which the conditions
>>         contradict the constraints.
>> --------------------------------------------------------------------------------
>>
>>         However, as you can see from the example query below, the
>>         indexes for all child tables are still scanned. I would expect
>>         the query planner to only scan table c_25gn1, which contains the
>>         queried region. Does anyone here have experience with
>>         partitioned tables? I would be delighted to get some pointers
>>         for figuring out this problem.
>>
>>         Thanks in advance for any help!
>>
>>         Best regards,
>>         Rubio Vaughan
>>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users



More information about the postgis-users mailing list