[postgis-users] database design question

Dreas Nielsen dnielsen at halcyon.com
Fri Jan 25 10:41:29 PST 2008


Tim Bowden wrote:

>On Fri, 2008-01-25 at 14:09 +0200, Andre Schoonbee wrote:
>  
>
>>I do not have years of experience and am faced with a challenge:
>>My client has lots of vector data. Some of it is from a few years ago, and
>>they want to load all of it into postgis. The data covers a wide spectrum -
>>basically all spatial data for the country. This includes census data,
>>regions and the subsequent changes to the regions, national rainfall and
>>also regional rainfall, mining, roads and the changes to the roads over the
>>past 10 years, boreholes from 10 years ago that were subsequently replaced
>>by pipelines, veterinary data, etc...
>>
>>So some of the data is national data and some is regional data. But the
>>regional data is not always related to the current region, because the
>>regions have changed in the last couple of years. 
>>
>>So my question:  Is there a basic design concept that will cater for these
>>kinds of scenarios? Any ideas might help.
>>
>>Thanks
>>
>>Andre
>>    
>>
>
>Whilst there are great tools to put spatial data into postgis (ogr2ogr,
>shp2pgsql), imho it's most important to design according to what you
>want to get out.  I'd start with getting a firm idea of what type of
>reporting or questions are going to be asked about the data.  Once
>you've got that nailed you're well on the way to understanding what
>sort of design you'll need.
>
>If you've got as many different data sets as you indicate, you're in for
>quite a large job.  I'd strongly recommend getting some expert
>consulting advice on site as database design is often not a
>straightforward exercise.  Also start small, and add data sets
>gradually.  Iron out the kinks for each one before moving onto the next.
>
>HTH,
>Tim Bowden
>
>
>  
>
With such a large and diverse group of data sets, the spatial 
representation of the data is just one among many attributes that you 
are evidently looking to manage.  Therefore you are facing a data 
modeling problem that is much larger than just the representation of GIS 
data.  As recommended above, you may want to seek more general database 
design assistance than just PostGIS help.

If you're going to undertake this yourself, here are a couple of 
additions to the suggestions that have already been offered:

1. As noted, it is important to keep the uses of the data in mind.  
However, it is frequently also important to keep the natural structure 
of the data in mind, especially if the ultimate uses of the data are not 
clearly defined and firmly fixed.  That is, if you want to build a 
database that may be queried and used in ways that are not (and cannot 
be) currently fully described, then you should structure the data in a 
way that reasonably models the real world.  This will give you the 
greatest flexibility in being able to accommodate unanticipated future 
uses.  Viewing this model from the context of specific use cases may 
lead you to modify the model in some ways specifically to improve 
performance for those cases.  However, considering only specific use 
cases may result in a structure that is unable to answer other questions.

2. As Tim noted, working incrementally is a good way to approach a big 
task like this.  However, do not limit yourself to a bottom-up approach 
to building the data model.  Because of the variety of data types that 
you are evidently dealing with, you may succeed brilliantly at modeling 
one particular data type in this way, then find you have to tear the 
entire structure apart when you move on to integrate the next data 
type.  Top-down and bottom-up approaches need to move forward together.
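
To make point 1 a little more concrete, here is a minimal sketch of one 
way to model regions whose boundaries change over time, so that older 
regional data can still be matched to the region as it existed when the 
data was collected.  Everything in it is an assumption made for 
illustration: the table and column names are hypothetical, Python's 
built-in sqlite3 stands in for PostgreSQL so that the sketch runs 
anywhere, and the boundary is stored as WKT text where a PostGIS table 
would use a geometry column.

```python
import sqlite3

# Hypothetical schema; sqlite3 is only a stand-in for PostgreSQL/PostGIS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE region_version (
    region_id    TEXT NOT NULL,
    name         TEXT NOT NULL,
    valid_from   TEXT NOT NULL,  -- ISO date, inclusive
    valid_to     TEXT,           -- ISO date, exclusive; NULL = still current
    boundary_wkt TEXT NOT NULL,  -- would be a geometry column in PostGIS
    PRIMARY KEY (region_id, valid_from)
);
CREATE TABLE rainfall (
    region_id TEXT NOT NULL,
    obs_date  TEXT NOT NULL,     -- ISO date
    mm        REAL NOT NULL
);
""")

# One region with two versions: its boundary changed on 2005-01-01.
conn.executemany("INSERT INTO region_version VALUES (?, ?, ?, ?, ?)", [
    ("R1", "North", "1998-01-01", "2005-01-01",
     "POLYGON((0 0,4 0,4 4,0 4,0 0))"),
    ("R1", "North (enlarged)", "2005-01-01", None,
     "POLYGON((0 0,6 0,6 4,0 4,0 0))"),
])
conn.executemany("INSERT INTO rainfall VALUES (?, ?, ?)", [
    ("R1", "2001-03-15", 12.5),
    ("R1", "2007-03-15", 8.0),
])

def region_names_at_observation(conn):
    """Pair each rainfall record with the region version valid on its date."""
    return conn.execute("""
        SELECT r.obs_date, v.name
        FROM rainfall r
        JOIN region_version v
          ON v.region_id = r.region_id
         AND r.obs_date >= v.valid_from
         AND (v.valid_to IS NULL OR r.obs_date < v.valid_to)
        ORDER BY r.obs_date
    """).fetchall()

for obs_date, name in region_names_at_observation(conn):
    print(obs_date, name)
```

The idea is that a boundary change adds a row rather than overwriting 
one, so questions about either the current or the historical regions 
remain answerable.  Whether this is the right structure for your data 
depends on the top-down view of the whole model.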

Have fun.

Dreas




