[postgis-users] Partitioning spatial table
Bill Binko
bill at binko.net
Fri Feb 17 11:26:29 PST 2006
Arnaud Lesauvage wrote:
> ....
>
>
> Do you have any better idea on how to implement this ?
>
I think I do! I know there is already another response, and his advice
is good... but, I have found that the partitioning solution you
mentioned really solves other problems such as data storage, archival
support, etc, and not necessarily query performance.
There are many reasons why your query might be slow, but a few simple
things I'd do are:
1) Make sure it's using the GiST index (as mentioned post a EXPLAIN
ANALYZE and we'll help). There are many reasons including some simple
ones like the sort_mem and shared_buffers parameters in postgresql.conf
(which by default are just plain wrong)
2) Create functional indexes on any spatial functions you run (such as
centroid() or area()) and expect to join to
3) Create Conditional indexes on the fields you're considering
partitioning on. This is a Biggie, so let me explain:
Lets say you have a field like admin_area_id that you were going to
partition on. You usually add a clause like 'WHERE admin_area_id = 4"
which will limit the rest of the query to 1/100 of the data.
Rather than partitioning the table, just create partitioned
(conditional) indexes:
First create an index on the field you're going to split on
CREATE INDEX theTable_admin_id on theTable (admin_id);
Cluster it so that the rows are sorted on-disk by admin_id:
CLUSTER theTable on theTable_admin_id;
And create one GiST Index for each value:
CREATE INDEX theTable_admin_gist_1 on theTable USING GIST (theShape
GIST_GEOMETRY_OPS) WHERE admin_area_id=1;
CREATE INDEX theTable_admin_gist_2 on theTable USING GIST (theShape
GIST_GEOMETRY_OPS) WHERE admin_area_id=2;
...
CREATE INDEX theTable_admin_gist_100 on theTable USING GIST (theShape
GIST_GEOMETRY_OPS) WHERE admin_area_id=100;
Now, whenever you apply the 'WHERE admin_area_id = 4' clause, the
'theTable_admin_gist_4' index will be used. This index will have a
spatial index that only includes the shapes where the admin_area_id =
4. It will find the entries very quickly...because it only searches 1%
of the shapes. It will load them quickly due to the CLUSTER command.
One of the nicest things about this is that there is monthly
administration of this or other cruft necessary with partitioning: just
run VACUUM FULL ANALYZE regularly, and you're done.
Hope this helps.
Bill
this will keep the items close together on disk, so that you get most of
the benefits of the
More information about the postgis-users
mailing list