[postgis-users] Best Practices for Configuring Large Data Sets
Bill Thoen
bthoen at gisnet.com
Fri May 28 07:49:39 PDT 2010
I've got a database of Public Land Survey System (PLSS) data for most
of the contiguous United States, and I need to make it accessible to
spatial queries using PostgreSQL and PostGIS. My basic question is how
to physically configure it so that queries over any area are as
efficient as possible.
Right now it is all stored in Shapefiles, physically separated by state
and within each state it's divided into three layers: township
boundaries, section boundaries and quarter-section boundaries. To give
you some idea of the scale of the entities, townships are square-ish
polygons 6 miles on a side and contain 36 sections. Sections are approx
1 sq mi in size. The files take up some space too (for example, the
data for Colorado is 862Mb and Wyoming is 763Mb.)
I need to use these data to perform queries on other national data sets
to provide results for requests like: "Produce a count of all active
mining claims west of the Mississippi (west of longitude 96W is good
enough) by PLSS section that intersect any of the Inventoried Roadless
Areas (IRAs)."
I think I know how to do queries like this, but it would be nice to not
have to do them for each state. OTOH, if I combine the data so that it
doesn't break at the state borders, every query is going to involve
whopper-sized tables and the system might be too slow. I've thought
about writing the queries starting out by filtering on an "Area of
Interest" rectangle first, taking advantage of spatial indexing, but I
have no feel yet for whether that will quickly enough reduce the load so
that the queries don't take months to execute.
So if you have experience with this sort of thing, could I get your
advice on how to balance files sizes to optimize performance and
convenience? Also if you know of any "red flag" conditions that I
should watch out for where things become unstable or performance goes to
pot (like MS Access' 2Gb limit. You DON'T want to get near that!) I'd
appreciate knowing about them before stepping into them.
TIA,
- Bill Thoen
--
- Bill Thoen
GISnet - www.gisnet.com
303-786-9961
More information about the postgis-users
mailing list