[postgis-devel] PostGIS 2.0.0SVN: TIGER Geocoder and TIGER's own Primary Keys and the use of tableA.statefp = tableB.statefp: I think its' a kludge

Paragon Corporation lr at pcorp.us
Wed Dec 21 03:04:28 PST 2011


Steve,

Come to think of it, we could, by having shp2pgsql not load in batch mode,
since it would naturally error out on records that fail and move on.  But as
Steve mentioned, you still have the issue of the face problem, with the tfids
not accounted for in what you keep.
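
To make the "error out and move on" idea concrete, here is a minimal
sketch.  It is not the loader's code: it uses Python's sqlite3 module as a
stand-in for PostgreSQL, and the edges table, the load_skipping_failures
helper and the sample rows are all hypothetical.  Each row is inserted in
its own transaction, so a duplicate tlid fails alone and the load goes on.

```python
import sqlite3


def load_skipping_failures(conn, rows):
    """Insert rows one statement at a time; skip rows that violate a constraint."""
    loaded, skipped = 0, []
    for tlid, statefp, fullname in rows:
        try:
            # Each row commits (or rolls back) on its own, so one bad
            # record does not abort the rest of the load.
            with conn:
                conn.execute(
                    "INSERT INTO edges (tlid, statefp, fullname) VALUES (?, ?, ?)",
                    (tlid, statefp, fullname),
                )
            loaded += 1
        except sqlite3.IntegrityError:
            # Duplicate primary key: record it and move on.
            skipped.append(tlid)
    return loaded, skipped


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edges (tlid INTEGER PRIMARY KEY, statefp TEXT, fullname TEXT)"
)

# Hypothetical sample: tlid 1002 lies on a shared boundary, so it shows up
# in both county files.
county_a = [(1001, "53", "Main St"), (1002, "53", "County Line Rd")]
county_b = [(1002, "53", "County Line Rd"), (1003, "53", "Oak Ave")]

for batch in (county_a, county_b):
    loaded, skipped = load_skipping_failures(conn, batch)
    print(loaded, skipped)
```

This only deals with the duplicate edge rows.  It does nothing about the
face issue, where the tfids on the rows that get skipped are lost.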

You are also very mistaken if you think our intent is to geocode against the
proper TIGER database, normalized and with all primary keys properly defined.
We in fact don't care much about that.

This is not an exercise in how to build a perfectly normalized tiger
database.  We have that with postgis topology, and in fact have a loader
for it that respects that perspective.

Our intent is to build a tiger geocoder that everyone can use and that is
easy enough for most PostGIS users to load data into.  With that said -- we
do want things coupled if it makes life easier for 90% of users at the
expense of the 10% that can take care of themselves.  We want it set up so
that everyone is able to load, regardless of what platform they are running
on.

Thanks,
Regina
http://www.postgis.us
> 
> > -----Original Message-----
> > From: Steve Walker [mailto:walker at mfgis.com]
> > Sent: Tuesday, December 20, 2011 11:32 PM
> > To: Paragon Corporation
> > Subject: Re: [postgis-devel] PostGIS 2.0.0SVN: TIGER Geocoder and 
> > TIGER's own Primary Keys and the use of tableA.statefp =
> > tableB.statefp: I think its' a kludge
> > 
> > Regina,
> > 
> > Thank you for the quick response.  
> > 
> > I'd like to keep looking at this.   
> > 
> > I do think there are ways of getting around the unique key
> > constraints, as you noted shp2pgsql is problematic, but ogr2ogr will
> > allow us to reject individual records that violate pkey constraints.
> > So, if as you suggested the geocoder is constrained by the limitations
> > inherent within shp2pgsql then I would again suggest decoupling the
> > geocoding functionality from the loading functionality.
> > 
> > Ideally, we should wish to geocode against the proper TIGER database,
> > normalized and with all primary keys properly defined, rather than
> > against one inherent with the limitations shp2pgsql provides us,
> > correct?
> > 
> > 
> > Beyond that, though, I'd like to dig further into the discussion of
> > the necessity of populating tables with the 'statefp' attribute.  I
> > think my example demonstrated one case where it was not necessary, and
> > perhaps I could help find others?
> > 
> > And that's what I've started trying to do.  Yet my biggest personal
> > challenge is reading through the sql code and the chained clauses, in
> > for example 'geocode.sql'.   A couple things that just made it tough for
> > me to understand - beyond my limited ability to chain together all the
> > clauses - is some of the uses of aliases in the sql code.   At one point
> > 'f' is an alias for 'featnames' at another point 'f' is an alias for
> > 'faces.'    I'm trying to re-write stuff by dropping the aliases in
> > favor of explicit table.attribute syntax so I can more explicitly see
> > the actual tables and attributes with which I'm working.
> > 
> > Select ... AS a  and Select ... AS b show up more than once, and I 
> > can't get a handle on whether they are the same 'a' and 'b' or 
> > different.
> > 
> > May I suggest some enhancements to the sql code so neophytes like
> > myself can make more sense of it, such as using fully qualified
> > table.attribute syntax and using more meaningful aliases than 'a'
> > and 'b.'?
> > 
> > Thanks for all your work on this, sorry to be a late comer to the
> > discussion, but I think I may be able to help some when I understand
> > more.   I do think I have a 100% complete and properly normalized TIGER
> > 2010 database to work against if that helps.
> > 
> > 
> > -S
> > 
> > 
> > 
> > 
> > 
> > On Tue, 2011-12-20 at 22:40 -0500, Paragon Corporation wrote:
> > > Steve,
> > >  
> > > Thanks for taking an interest in this.  Concerning some of your 
> > > observations.
> > >  
> > > 1) TLID  - correct these would be duplicated in counties which is why
> > > we don't use it as a primary key.  We were thinking of after the fact
> > > creating a routine that would purge duplicate tlids, but since
> > > shp2pgsql doesn't have a skip failures option, and we can't rely on
> > > people having ogr2ogr because it doesn't ship with PostGIS, we kept it
> > > as is.
> > >  
> > > It probably would help improve things if we made it a unique key at
> > > least.
> > >  
> > > 2) The statefp being needlessly added to all tables.  There is a very
> > > good reason for that.  The main reason is we need it there to take
> > > advantage of constraint exclusion.
> > > Since each set of records goes in its own table, as silly as it sounds
> > > we need statefp in there so that tables that couldn't possibly offer
> > > results can be skipped.
> > >  
> > > Hope that answers your questions,
> > > Regina
> > > http://www.postgis.us
> > >  
> > >  
> > 
> > --
> > Steve Walker
> > Middle Fork Geographic Information Services
> > (360)671-2505
> > 
> > 
> 




