[postgis-users] Geocoding cross streets?
Stephen Woodbridge
woodbri at swoodbridge.com
Tue Nov 29 10:03:52 PST 2011
On 11/29/2011 12:42 PM, Stephen Frost wrote:
> * Stephen Woodbridge (woodbri at swoodbridge.com) wrote:
>> I currently have some lists of names that are converted to optimized
>> pcre regular expressions. I use these to help separate the street
>> from the city name. The lists are only used to create header files
>> that contain the regular expressions that get compiled into the
>> code. The idea being that these names are reasonably static for a
>> given data set.
>
> Ah, ok, I see. When converting this to a PG function, I'd probably want
> to go ahead and pull those lists from the TIGER data set and compile the
> regexps on PG backend startup instead. Does it handle misspelled names
> or do any kind of "sounds like" searching on the city names? I'm
> guessing 'no', but figured I'd ask anyway..
The lists that I have generated are pulled from a number of sources,
such as the actual TIGER data and the FIPS 4-2 placenames. I also have
some common abbreviations and misspellings, but it is not doing any
"sounds like" searching. I think that I broke the regular expressions
into separate state-specific regular expressions because putting them
all into a single expression exceeded some limit in pcre.
The regular expressions are created in perl and are highly optimized.
You probably cannot read the regexes and make much sense out of them,
but they are extremely efficient to evaluate.
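
As a rough illustration of the idea only (this is not the PAGC code, and
it does not reproduce the perl optimization step), here is a minimal
Python sketch that builds one regular expression per state from a list
of place names and uses it to split the street from the city. The name
lists, CITY_NAMES, build_city_regex and split_street_city are all
made up for the example, and it assumes the state and zip have already
been stripped from the input:

```python
import re

# Hypothetical per-state place-name lists. The real lists would come from
# TIGER data, FIPS place names, common abbreviations and misspellings.
CITY_NAMES = {
    "MA": ["CHELMSFORD", "NORTH CHELMSFORD", "BOSTON", "LOWELL"],
    "NH": ["NASHUA", "MANCHESTER"],
}


def build_city_regex(names):
    """Compile one regex that matches any listed name at the end of a string."""
    # Plain alternation of escaped names; sorted only for a stable pattern.
    alternation = "|".join(re.escape(n) for n in sorted(set(names)))
    # The street group is lazy, so the earliest position where a listed
    # city runs to the end of the string wins. That makes
    # "NORTH CHELMSFORD" preferred over "CHELMSFORD".
    return re.compile(r"^(?P<street>.*?\S)\s+(?P<city>" + alternation + r")$")


# One compiled regex per state, rather than one huge expression.
CITY_REGEX = {state: build_city_regex(names) for state, names in CITY_NAMES.items()}


def split_street_city(text, state):
    """Return (street, city), or None if no listed city ends the string."""
    rx = CITY_REGEX.get(state)
    if rx is None:
        return None
    m = rx.match(text.strip().upper())
    if m is None:
        return None
    return m.group("street"), m.group("city")


if __name__ == "__main__":
    print(split_street_city("101 W MLK AVE NORTH CHELMSFORD", "MA"))
```

A simple alternation like this gets slow as the lists grow, which is
presumably why the real expressions are optimized before being compiled
into the code.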
Also, you can take just that directory from PAGC and build it. It
should create a command-line executable that you can test with and run
under the debugger, valgrind, etc. Something like:
cd parseaddress
./configure
make
./parseaddress 101 W MLK AVE NORTH CHELMSFORD MA 01863
-Steve