[postgis-users] Database Design for many tables
Dylan Beaudette
dylan.beaudette at gmail.com
Fri Jan 5 17:51:09 PST 2007
Thanks for the initial pointers Webb!
My replies are in-line below:
On Friday 05 January 2007 16:58, Webb Sprague wrote:
> On 1/5/07, Dylan Beaudette <dylan.beaudette at gmail.com> wrote:
> > On Friday 05 January 2007 01:07, Mark Cave-Ayland wrote:
> > > On Fri, 2007-01-05 at 09:53 +0100, Christian Braun wrote:
> > > > Hi all,
> > > >
> > > > I have a question regarding the database design of a PostGIS
> > > > database.
>
> It might be really interesting if either of you could either direct us
> to entity relation models of your data or post prototype schemas or
> some other design documentation (any attempt would be better than
> nothing); then we would have something to work with.
Our database setup is quite simple, and based in-part on the USDA-NCSS digital
soil survey products: SSURGO and STATSGO.
> With respect to soils, I would imagine that you would create a table
> for every classification that partitions the space under question,
> giving each polygon in a table a unique id. If the soils data is
> county by county shapefiles, you probably want to append it to a
> single table, as long as it is the "same thing".
The SURRGO database consists of a single spatial table with polygons, and
polygon ids, along with an assortment of non-spatial tables. All of the soil
surveys for CA, AZ, and NV have been loaded into these tables: i.e. combined.
The STATSGO database has a *nearly* identical structure, with non-spatial
tables that have the exact same names as their SSURGO brethren.
For the sake of clarity, I would like to keep the original table names (for
SSURGO and STATSGO), yet keep them within the same database to allow
cross-querying. A simple solution would be to prefix the names of the STATSGO
tables such that 'component' becomes 'ss_component' ... but there must be a
more elegant approach, no?
In the past (dark and murky) I have kept two separate databases (in MySQL).
This allowed some form of organization, isolation of namespace, and the
ability to do cross-DB queries by prefixing the tablenames with the db name.
i have heard that there is a similar mechanism for postgresql, but I cannot
figure out how to make tablespaces work nicely with the postgis extensions.
> For non geographic data, typical normalization etc etc stuff applies,
> but you will be able to bring in geographic information through joins
> on the various primary keys.
Right. I think that we have this bit sorted out, with the system providing
excellent response time and flexibility (way to go postgis team!!).
> If you can keep everything in one big database and have people
> remotely access it (not a problem usually) that would be easiest. If
> performance suffers, have a master and slave set up, pushing out
> updates after they are QA'ed, otherwise you will have a nightmare of
> version control.
Yes. We are using a single database, with local and remote access by the psql
client application and QGIS.
> If you have people editing shared data, lets talk :) ... Give us
> your SPECIFIC use cases and we can probably help (or at least sound
> intelligent).
Fortunately there will be none of that. This data in question is strictly
read-only by others on the team. They will be creating subsets (i.e. new
tables) derived from the original tables.
> If you are using rasters, I have no idea how to integrate postgis with
> them, but you should mention it to the list or in your rough design
> docs.
We'll keep the raster stuff to GRASS + starspan for now.
> If you don't know what an entity relationship model is, or what
> database normalization is, you probably need to read up on database
> design, but that would be worthwhile anyway. Start with wikipedia, as
> usual.
>
> W
Thanks again for the tips, and looking forward to any input!
Cheers,
--
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341
More information about the postgis-users
mailing list