[postgis-users] Geocoder (from extras)

Paragon Corporation lr at pcorp.us
Sun May 8 10:14:28 PDT 2011


Yah it does.  Though I think I did break it out for simplicity because they
were no longer updating that zip500 or whatever that file was on census
anymore.  So I opted to store in separate state schemas and build
differently.  I'm not sure that adds much overhead anyway.

One thing I did notice with the code which we are cleaning up is that all
the functions come in as VOLATILE because they have no markers on them.
Most should be STABLE or IMMUTABLE.  Also some functions that should have
been implemented as SQL were implemented as plpgsql.

We are changing all of these.  This should make most difference with batch
geocoding.  Because for example for those small mini queries where you have
all addresses on the same street, that can be cached since they will all use
the same result of that subquery.

Right now they can't because that piece is marked volatile.

I've committed some of these changes already, but still need to test on some
large data sets.  If anyone wants to help test those out, that would be
create.


To upgrade your scripts -- rerun the create_geocode.sql .  That will allow
you to keep your data but replace the scripts.

As far as plans go, some statements are best not cached.  That would
probably require some experimentation.  

Steve W.  Note that most of the queries are parameterized except I think for
a couple.  For those couple I'm not sure it makes sense to plan them since
the plan would be wrong for subsequent.  If as you said it you change it to
break out to separate tables wirting 52 someodd queries, then the plan would
be valid.

Right now I feel the low hanging fruit is the actually cacheability of the
results. 

I'm also noticing the normalize_address takes much longer that I would
expect it to.  For some its like 50% of the time.  I'm troubleshooting why
that is because that function doesn't touch too many tables (if any) besides
the ones we include and load as part of tiger geocoder.

Thanks,
Regina
http://www.postgis.us


-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Stephen
Frost
Sent: Sunday, May 08, 2011 10:37 AM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] Geocoder (from extras)

Stephen,

* Stephen Woodbridge (woodbri at swoodbridge.com) wrote:
> In a geocoder I wrote in 2000, I mapped city, state and/or zipcode 
> into a list of counties, then only searched those counties. This 
> greatly reduced the 3300 counties in Tiger to a small handful that 
> needed to be searched. The FIPS-4 document has a mapping of all place 
> names to counties. Zipcodes are easier to handle because you can just 
> index the records with zipcodes but if you wanted to widen the search, 
> then map the zipcode to a place name and then to counties.

Yeah, the geocoder does something similar to this (as I recall, we create a
list of zip codes to scan through, since that's indexed and the tables
aren't stored per-county in my setup).

	Thanks,

		Stephen





More information about the postgis-users mailing list