[postgis-users] Best Practices for Configuring Large Data Sets

Bill Thoen bthoen at gisnet.com
Fri May 28 07:49:39 PDT 2010

I've got a database of Public Land Survey System (PLSS) data  for most  
of the contiguous United States, and I need to make it accessible to 
spatial queries using PostgreSQL and PostGIS. My basic question is how 
to physically configure it so that queries over any area are as 
efficient as possible.

Right now it is all stored in Shapefiles,  physically separated by state 
and within each state it's divided into three layers: township 
boundaries, section boundaries and quarter-section boundaries. To give 
you some idea of the scale of the entities, townships are square-ish 
polygons 6 miles on a side and contain 36 sections. Sections are approx 
1 sq mi in size. The files take up some space too  (for example, the 
data for Colorado is 862Mb and Wyoming is 763Mb.)

I need to use these data to perform queries on other national data sets 
to provide results for requests like: "Produce a count of all active 
mining claims west of the Mississippi (west of longitude 96W is good 
enough) by PLSS section that intersect any of the Inventoried Roadless 
Areas (IRAs)."

I think I know how to do queries like this, but it would be nice to not 
have to do them for each state. OTOH, if I combine the data so that it 
doesn't break at the state borders, every query is going to involve 
whopper-sized tables and the system might be too slow. I've thought 
about writing the queries starting out by filtering on an "Area of 
Interest" rectangle first, taking advantage of spatial indexing, but I 
have no feel yet for whether that will quickly enough reduce the load so 
that the queries don't take months to execute.

So if you have experience with this sort of thing, could I get your 
advice on how to balance files sizes to optimize performance and 
convenience?  Also if you know of any "red flag" conditions that I 
should watch out for where things become unstable or performance goes to 
pot (like MS Access' 2Gb limit. You DON'T want to get near that!) I'd 
appreciate knowing about them before stepping into them.

- Bill Thoen

- Bill Thoen 
  GISnet - www.gisnet.com

More information about the postgis-users mailing list