[postgis-users] Geocoding cross streets?

Stephen Woodbridge woodbri at swoodbridge.com
Tue Nov 29 10:03:52 PST 2011


On 11/29/2011 12:42 PM, Stephen Frost wrote:
> * Stephen Woodbridge (woodbri at swoodbridge.com) wrote:
>> I currently have some lists of names that are converted to optimized
>> pcre regular expressions. I use these to help separate the street
>> from the city name. The lists are only used to create header files
>> that contain the regular expressions that get compiled into the
>> code. The idea being that these names are reasonably static for a
>> given data set.
>
> Ah, ok, I see.  When converting this to a PG function, I'd probably want
> to go ahead and pull those lists from the TIGER data set and compile the
> regexps on PG backend startup instead.  Does it handle misspelled names
> or do any kind of "sounds like" searching on the city names?  I'm
> guessing 'no', but figured I'd ask anyway..

The lists that I have generated are pulled from a number of sources,
such as the actual TIGER data and the fips 4-2 placenames. I also have
some common abbreviations and misspellings, but it is not doing any
"sounds like" searching. I think I split the regular expressions into
separate state-specific expressions because putting them all into a
single expression exceeded some limit in PCRE.
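
To make the per-state idea concrete, here is a rough sketch in Python 
(not the actual PAGC code, which is C with PCRE). The city lists, the 
function names, and the pattern shape are all made up for illustration; 
the real lists are much larger and also carry abbreviations and 
misspellings.

```python
import re

# Hypothetical sample data; the real lists are far larger.
CITY_NAMES_BY_STATE = {
    "MA": ["CHELMSFORD", "NORTH CHELMSFORD", "BOSTON"],
    "NH": ["NASHUA", "MANCHESTER"],
}


def build_state_patterns(names_by_state):
    """Compile one anchored city-name pattern per state."""
    patterns = {}
    for state, names in names_by_state.items():
        # Sort longest first so multi-word names are tried before
        # shorter names they end with.
        ordered = sorted(names, key=len, reverse=True)
        alternatives = "|".join(re.escape(name) for name in ordered)
        patterns[state] = re.compile(
            r"^(?P<street>.+?)\s+(?P<city>" + alternatives + r")$"
        )
    return patterns


STATE_PATTERNS = build_state_patterns(CITY_NAMES_BY_STATE)


def split_street_city(text, state):
    """Return (street, city), or None if no known city ends the text."""
    pattern = STATE_PATTERNS.get(state)
    if pattern is None:
        return None
    match = pattern.match(text.upper())
    if match is None:
        return None
    return match.group("street"), match.group("city")
```

With this sample data, split_street_city("101 W MLK AVE NORTH CHELMSFORD", "MA") 
gives ("101 W MLK AVE", "NORTH CHELMSFORD"). Keeping one pattern per 
state keeps each compiled expression small, which is the point of the 
split described above.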

The regular expressions are generated in Perl and are highly optimized.
You probably cannot read them and make much sense of them, but they are
extremely efficient to evaluate.
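
For a feel of what "optimized" can mean here, one common approach is to 
fold the name list into a trie so that shared prefixes are matched only 
once. The Python sketch below is only an assumption about the general 
technique, not the Perl code that generates the PAGC header files; the 
function names are hypothetical.

```python
import re


def build_trie(words):
    """Build a nested-dict trie; the "" key marks the end of a word."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return trie


def trie_to_regex(node):
    """Turn a trie node into a regex fragment with shared prefixes."""
    if "" in node and len(node) == 1:
        return ""  # nothing follows the end of this word
    branches = [
        re.escape(ch) + trie_to_regex(node[ch])
        for ch in sorted(key for key in node if key != "")
    ]
    if len(branches) == 1 and "" not in node:
        return branches[0]
    body = "(?:" + "|".join(branches) + ")"
    # If a word also ends at this node, the rest is optional.
    return body + "?" if "" in node else body


pattern = trie_to_regex(build_trie(["CHELMSFORD", "CHELSEA", "CHESTER"]))
compiled = re.compile(pattern)
```

For these three names the pattern comes out as 
CHE(?:L(?:MSFORD|SEA)|STER), which shows why the generated expressions 
are hard to read: the names themselves no longer appear whole, but the 
matcher never has to re-scan a common prefix.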

Also, you can take just that directory from PAGC and build it. It
should create a command-line executable that you can test with and run
under the debugger, valgrind, etc. Something like:

cd parseaddress
./configure
make
./parseaddress 101 W MLK AVE NORTH CHELMSFORD MA 01863

-Steve


