[postgis-users] Enormous file geodatabase feature class

dnrg dananrg at yahoo.com
Fri Mar 7 02:53:43 PST 2008


Hi Webb,

Thanks for responding.
 
> Could you describe what you mean by "standardize"
> with an example? And do you mean "standardized
> against each other" or "standardized
> against a third specification"?

Sure. 

*Input*: 14 counties of parcel data; the PIN column may
be named PIN2, ID, etc. And each separate data set
has a varying number of attribute columns (some with
30+ columns).

*Desired output*: a merged parcels data set in
PostgreSQL/PostGIS with only ~8 selected attribute
columns, using column names of my choice. I guess
that would be a third specification. Something like:
PIN, COUNTY, OWNER, ADDR, ADDR2, CITY, STATE (US
data), ZIP. Just enough to identify parcel owners, and
for parcel owners to identify their own parcels.
Obviously, data types need to jibe; and it will be fun
to free attribute columns from wasteful types like 255
chars for STATE, or whatever. I've noticed that many
of these county parcel shapefiles are enormous in
part because the creators seem to accept the default
character field length of 255.
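
A minimal sketch of the renaming step described above, assuming each
county's attribute rows have already been read into Python dictionaries.
Only the target column names come from the list above; the county keys,
the source column names, and the helper names are hypothetical and would
need to be filled in per county.

```python
# Target schema, taken from the column list in this message.
TARGET_COLUMNS = ["PIN", "COUNTY", "OWNER", "ADDR", "ADDR2", "CITY", "STATE", "ZIP"]

# Hypothetical per-county mappings: source column name -> target column name.
COUNTY_MAPPINGS = {
    "county_a": {"PIN2": "PIN", "OWNER_NAME": "OWNER", "ADDRESS": "ADDR",
                 "CITY": "CITY", "ST": "STATE", "ZIPCODE": "ZIP"},
    "county_b": {"ID": "PIN", "OWNER": "OWNER", "ADDR1": "ADDR", "ADDR2": "ADDR2",
                 "CITY": "CITY", "STATE": "STATE", "ZIP": "ZIP"},
}


def standardize_row(county, row):
    """Return a dict with exactly TARGET_COLUMNS, dropping all other attributes."""
    mapping = COUNTY_MAPPINGS[county]
    out = {name: None for name in TARGET_COLUMNS}
    out["COUNTY"] = county
    for source_name, value in row.items():
        target = mapping.get(source_name)
        if target is not None:
            # Strip padding left over from fixed-width character fields.
            out[target] = value.strip() if isinstance(value, str) else value
    return out


def max_lengths(rows):
    """Longest observed string per target column, to help pick tighter widths."""
    widths = {name: 0 for name in TARGET_COLUMNS}
    for row in rows:
        for name, value in row.items():
            if isinstance(value, str):
                widths[name] = max(widths[name], len(value))
    return widths
```

Running max_lengths over the standardized rows gives a rough idea of how
narrow each column could be, instead of accepting 255 everywhere.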

Once in PostgreSQL/PostGIS, I can perform the analysis
I need, adding additional columns for the projected
wind, solar, and microhydro energy potential of each
parcel. 

Incidentally, if anyone wants to help, the project is
called ERMA / NC ERMA - the renewable Energy Resource
Mapping Application for North Carolina. No project web
site for it yet.

Helena Mitasova at NC State will be advising us on the
solar module (using GRASS--evidently there is already
a good module / model for this, and has been used in
Europe for assessing solar energy potential). Tobin
Bradley from Charlotte, NC has offered to help in some
capacity. I'm trying to put together a list of
volunteers / advisors.

And we have wind class rasters for all of North
Carolina (the data is a bit old, coming from TrueWind
LLC pre-LIDAR). My guess is that they used weather
station data + 10 meter DEMs, but who knows. Would
love to generate some new wind raster data using the
latest LIDAR, but the TrueWind model is proprietary.

We don't yet have a good algorithm for microhydro
potential; although we do have some of the best LIDAR
data in the US. 

> I bet you will have to write it yourself. It sounds
> like a big project. I could be wrong.

May end up having to do the merge manually. Not so
terrible for 14 counties. Will be painful for all 100
counties, and for future updates--obviously parcel
data isn't static, and can change weekly.

ERMA Phase I is for residential wind energy potential,
and the greatest wind energy potential is in the
Appalachian mountains (~14 Western NC counties).
Probably going to use MapServer as a platform to let
citizens discover the energy resources of their
parcels. Unless someone has a better suggestion.

> Are you getting shapefiles or geodatabases as source
> data?  I see no reason to import a shapefile into a
> geodb/ access thing (yuck!) as an
> intermediate step.

All shapefiles. Huge, honkin' shapefiles. One of them
is over 1G. This is the one QGIS choked on (but that
was QGIS on Windows--I still need to buy a new,
dual-core laptop with ~3-4G of RAM and put CentOS on
it).

> Postgresql is very scriptable, and would be my
> platform of choice for any big integration of 100's 
> county tables.  Or is there something I am missing?

Nope, sounds great.
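
As a rough sketch of what scripting the merge might look like, assuming
each county shapefile has already been loaded into its own staging table:
the Python below only builds an INSERT ... SELECT statement as text. The
table names, column names, the "the_geom" geometry column, and the helper
names are all assumptions for illustration, not anything from an existing
tool.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Single-quote an SQL string literal, doubling embedded single quotes.

    Sketch only: for untrusted input, prefer bound parameters in a database driver.
    """
    return "'" + value.replace("'", "''") + "'"


def build_merge_sql(target_table, staging_table, county, mapping,
                    geom_column="the_geom"):
    """Build one INSERT ... SELECT that copies a staging table into the merged table.

    mapping is {source column: target column}; the county name is written
    as a constant into an assumed "county" column of the target table.
    """
    targets = ["county"] + list(mapping.values()) + [geom_column]
    sources = ([quote_literal(county)]
               + [quote_ident(source) for source in mapping]
               + [quote_ident(geom_column)])
    return "INSERT INTO {} ({}) SELECT {} FROM {};".format(
        quote_ident(target_table),
        ", ".join(quote_ident(t) for t in targets),
        ", ".join(sources),
        quote_ident(staging_table),
    )


if __name__ == "__main__":
    # Hypothetical staging table and column names.
    print(build_merge_sql("parcels", "stage_county_a", "county_a",
                          {"pin2": "pin", "owner_name": "owner"}))
```

One statement per county could then be generated from a table of
mappings and fed to psql, which should scale to 100 counties better than
merging by hand.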

> [ ESRI's ] stored procedures and triggers it must
> use to maintain the consistency of the data.

> Is there a description of these somewhere? It would
> be nice if someone developed a standard set of
> postgis triggers to maintain topology (at least not
> allow inserts of malformed data), etc.

Nice idea. I personally wouldn't touch it, as this
would probably be violating some license agreement;
but the source for triggers and stored procedures is
probably viewable in SDE on Oracle through the
DBA_SOURCE view and other methods.

> Watch out: with open source whenever you say
> "someone could do X", the obvious retort is "why
> don't you do it and post the code?" :)

I'm slowly getting that. Probably takes me at least 3
times of reading / hearing something before it sinks
in. I know only a small amount of Python (having been
a Perl scripter in a previous life as a unix/nt
sysadmin), but hope to learn more. Don't think I'll be
casually picking up Java or C# any time soon (although
that's a goal I have for the distant future).

If I come up with code to programmatically do what I
need on parcels, I will gladly share it--and document
it. One of the drivers of ERMA is that we want it to
be successful, and built 100% on FOSS GIS, so that it
can be adopted by other states / countries / etc. It
could be done with ESRI technology, but not every
state / country has an ELA with ESRI (the state of
North Carolina has a statewide ELA with ESRI; but that
could evaporate in 5 years--who knows).

Another personal driver for me with ERMA is to learn
enough about GRASS / QGIS / PostgreSQL/PostGIS /
MapServer to contribute to the doc set and write
tutorials.

According to Karl Fogel in his fabulous book Producing
Open Source Software (thanks to whoever on this list
recommended it to me--I'd recommend it to other noobs
as well), documentation--especially examples and
tutorials--is particularly weak. If I develop
expertise with FOSS GIS, this is one area where
(hopefully) I can help. I see ERMA as having some
small potential for teaching others about FOSS GIS.

Karl Fogel's book:

   http://producingoss.com/

Need to re-read it, now that I've given it a
once-over.

> ESRI doesn't care about making open source better,
> or helping people get away from their products and
> fees. Period. Why should they?

Not directly I suppose. They have or had an open
source initiative called "52 degrees" (I think). Don't
recall what the objective is or was.

ESRI does at least benefit from open source, doesn't
it? Peter Schweitzer's MP (metadata parser) is
embedded in ArcCatalog--didn't realize that until I
had lunch with Peter one day and he told me (we were
both upset with the new ISO geospatial metadata
standard being copyrighted material; having to pay to
get documentation about the standard). And ESRI uses
GDAL, right?

That's not contributing in a direct way to reducing
its customer base (what for-profit corporation would
explicitly set out to do that?), but it does lend
additional credibility to what FOSS GIS people are
doing.

We can, and have, and will again, argue about what
level of support ESRI is giving to PostgreSQL (and
PostGIS). But what is inarguable is that, by doing
anything at all with PostgreSQL, they, I think, will
open their customers' minds to FOSS; at least a
teensy-weensy bit. Don't you think? An unintended
consequence perhaps, but a consequence of some note. 

Dana





