[postgis-users] splitting a PostGIS db across multiple disks

Brian Modra brian at zwartberg.com
Tue Apr 6 07:58:26 PDT 2010


On 06/04/2010, P Kishor <punk.kish at gmail.com> wrote:
> I asked this question yesterday, and received a very helpful pointer
> from Ben Madin re. TABLESPACES. As noted in my reply in that thread, I
> am also investigating the possibility of splitting a single table
> across multiple disks.
>
> However, I am going to post this question in a different way in this new
> thread.
>
> Suppose I have a table FOO0 that stores info about every state in the
> union. I know that some of these states will have mongo number of
> rows, but I don't have to build all the states immediately. So, I
> start with a few states' worth data, putting it in the default
> /usr/local/pgsql/data location.
>
> Then I start outgrowing that disk, and need to add another state, so I
> add another disk, create a new tablespace, and create a new table
> called FOO1 in this new tablespace. Then I can store the new states in
> FOO1. As long as I break up my table into FOO0, FOO1, FOO2, and so on,
> I can store each FOOn in a new tablespace. And, as long as I ensure
> that each FOOn table contains a geographically consistent spatial
> extent, I can build logic in my application to query the correct
> table.
>
> So, lets say 0 lon to -10 lon data are stored in FOO0, and -10 lon to
> -20 lon in FOO1, then if the user requests data for -5 lon to -15 lon,
> I will have to query both FOO0 and FOO1.

If you split the data by lat-long extents, then you will need to also
create duplicates of some of the data... anything that intersects the
boundary must be duplicated...

I assume you have a table with state boundaries in it?
You could use this as an "index" to the table name. Then store a state
per table...

So when you do a spatial query, you first use the extent of the query
to find which state(s) are withing/intersecting/contain the query
area. Then you can generate one or more spatial queries to the tables
containing the states data.

> More work for me, but it is doable, no? Any insights on how to handle
> something like this?

yes

> A corollary question -- are their any speed advantages to actually
> creating multiple PostGIS instances, perhaps even splitting them
> across multiple machines? Of course, it is going to be a pain in the
> ass for me to maintain more than one instance of PostGres/PostGIS, so
> I am not thrilled at that possibility. I'd rather have a single
> instance just be managing data across multiple locations as required.

Using multiple machines means that parallel retrievals are possible.
It has to make it faster.
... but your idea of splitting data geographically probably makes this
unlikely to happen in the majority of queries.

Rather than split the data geographically, is there some property in
the tables that can be used to split it up... so that certain rows
(based on this key) go to different tables?
This may result in a better performance benefit, because a single
spatial query will hit multiple databases, returning rows in parallel.

But then again, all this depends on how you gather your data for a
query... to make what I just described work, you'd have to execute
multiple queries at once, i.e. multi-threading.

>
>
> --
> Puneet Kishor http://www.punkish.org
> Carbon Model http://carbonmodel.org
> Charter Member, Open Source Geospatial Foundation http://www.osgeo.org
> Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor
> Nelson Institute, UW-Madison http://www.nelson.wisc.edu
> -----------------------------------------------------------------------
> Assertions are politics; backing up assertions with evidence is science
> =======================================================================
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>


-- 
Brian Modra   Land line: +27 23 5411 462
Mobile: +27 79 69 77 082
5 Jan Louw Str, Prince Albert, 6930
Postal: P.O. Box 2, Prince Albert 6930
South Africa
http://www.zwartberg.com/



More information about the postgis-users mailing list