[postgis-devel] Geocoder - Normalizing Addresses

Paragon Corporation lr at pcorp.us
Thu Jun 30 16:12:57 PDT 2011


Emily,

Can you put this one in as a ticket and we'll check on it.  We are in the
middle of making some other changes to normalize_address. What you say
sounds like a good idea offhand.

Thanks,
Regina 

-----Original Message-----
From: postgis-devel-bounces at postgis.refractions.net
[mailto:postgis-devel-bounces at postgis.refractions.net] On Behalf Of
egouge at refractions.net
Sent: Thursday, June 30, 2011 6:31 PM
To: postgis-devel at postgis.refractions.net
Subject: [postgis-devel] Geocoder - Normalizing Addresses


In my geocoding testing I've found an inconsistency in the geocoder
(normalizing addresses), and I'm not sure whether it is a bug or expected
behavior.

The following works as I would expect.

select normalize_address('949 N 3rd St, New Hyde Park, NY 11040');
              normalize_address              
---------------------------------------------
 (949,N,3rd,St,,,"New Hyde Park",NY,11040,t)


However, if a "," is added after the state, I get a completely different
answer, which is not what I would expect:

select normalize_address('949 N 3rd St, New Hyde Park, NY, 11040');
               normalize_address              
-----------------------------------------------
 (949,N,"3rd St, New Hyde",Park,,,,NY,11040,t)


I did some digging through the normalize_address function to try to see why
these cases are treated differently.

The source of the difference seems to be line 162, where the first attempt
to parse the location is made.

tempString := substring(fullStreet, '(?i),' || ws || '+(.*?)(,?' || ws
    || '*' || cull_null(state) || '$)');

This line requires the match to end with the state, with nothing after it.
However, in the second case a "," follows the state. I think a "," after
the state name should not affect how the location is parsed, and would
suggest either modifying the regular expression to allow an optional ","
after the state or stripping all trailing commas from the end of the string
before matching. However, my knowledge of normalizing is very limited, and
this may not be the correct way to deal with the problem.
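
A rough sketch of the two options, run directly in psql rather than inside
the function. Assumptions: the string reaching this line ends in "NY," as
described above, the literal 'NY' stands in for cull_null(state), and
[[:space:]] stands in for the function's ws variable, whose definition is
not shown here.

```sql
-- Current pattern shape: the state must be the last thing in the string,
-- so the trailing "," prevents a match and substring() returns NULL.
SELECT substring('949 N 3rd St, New Hyde Park, NY,',
                 '(?i),[[:space:]]+(.*?)(,?[[:space:]]*NY$)')
       AS current_pattern;

-- Option 1: also allow an optional "," (and whitespace) after the state.
SELECT substring('949 N 3rd St, New Hyde Park, NY,',
                 '(?i),[[:space:]]+(.*?)(,?[[:space:]]*NY,?[[:space:]]*$)')
       AS optional_comma;

-- Option 2: strip trailing commas and spaces first, then apply the
-- unchanged pattern.
SELECT substring(rtrim('949 N 3rd St, New Hyde Park, NY,', ', '),
                 '(?i),[[:space:]]+(.*?)(,?[[:space:]]*NY$)')
       AS stripped_first;
```

Option 2 leaves the regular expression untouched, which may be the smaller
change. Either approach would still need testing against other address
formats before going into the function.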

Emily






_______________________________________________
postgis-devel mailing list
postgis-devel at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-devel




