[postgis-users] Geocoding Issues with Route, ##-## house numbers; upgrade questions

Paragon Corporation lr at pcorp.us
Wed Jul 27 14:36:40 PDT 2011


Dan,

 

> Hi,
> I'm using and abusing the geocoder, and I've come across a couple issues:

> 1)  Routes
> example:  '1820  ROUTE 32, MODENA, NY 12548':


>  rating |        lon        |       lat        | address | predirabbrev | streetname | streettypeabbrev | postdirabbrev | internal | location | stateabbrev |  zip  | parsed
> --------+-------------------+------------------+---------+--------------+------------+------------------+---------------+----------+----------+-------------+-------+--------
>      22 | -73.9374945714286 | 40.6108123469388 |    1820 | E            | 32nd       | St               |               |          | New York | NY          | 11234 | t

> which is 85 miles away =)

I think I already fixed item 1.  I forget whether I committed the fix,
though.  I think I did, but I haven't committed anything for a while, since
I'm working on speeding things up.  Sadly, whatever makes things faster in
one version of PostgreSQL makes them slower in another, so I'm working
toward a comfortable balance, mostly by fiddling with index selectivity.


> 2) ##-## addresses

> example:  '112-31  196 STREET, SAINT ALBANS, NY'

>  rating |    lon     |    lat    | address | predirabbrev | streetname | streettypeabbrev | postdirabbrev | internal | location | stateabbrev |  zip  | parsed
> --------+------------+-----------+---------+--------------+------------+------------------+---------------+----------+----------+-------------+-------+--------
>      20 | -73.756229 | 40.693842 |         |              | 196th      | St               |               |          | New York | NY          | 11412 | t

> which is only .3 miles away, but note that it just ignored the house
> number.

I already have this one listed as a bug on my todo list:

http://trac.osgeo.org/postgis/ticket/886  (although yours looks like a
slightly different issue, which I may have already fixed)

> Questions:
> a.  Is there something I can do to pre-process either of these types of
> addresses to help the geocoder?
> b.  If I know that the zip code is correct, is there a setting I can
> adjust so that the geocoder never looks outside the provided zip code?

http://www.postgis.org/documentation/manual-svn/Geocode.html  (Give the
geometry filter option a try.  I haven't really stress tested it)
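
For illustration only, here is a minimal sketch of how the geometry filter
might be used to keep the search inside a known zip code.  It only builds a
parameterized query (psycopg2-style placeholders) and does not touch a
database.  Several things here are assumptions, not something confirmed in
this thread: that the filter is the third positional argument of geocode(),
that zip polygons are loaded in a tiger.zcta5 table with zcta5ce and
the_geom columns, and the helper name build_zip_restricted_geocode, which is
hypothetical.

```python
def build_zip_restricted_geocode(address, zip_code, max_results=1):
    """Return (sql, params) for a geocode() call limited to one zip polygon.

    Assumptions (not confirmed in this thread): geocode() takes the filter
    geometry as its third argument, and the zip polygons live in
    tiger.zcta5 with columns zcta5ce and the_geom.
    """
    sql = (
        "SELECT g.rating, "
        "ST_X(g.geomout) AS lon, ST_Y(g.geomout) AS lat, "
        "pprint_addy(g.addy) AS address "
        "FROM geocode(%(address)s, %(max_results)s, "
        "(SELECT ST_Union(the_geom) FROM tiger.zcta5 "
        "WHERE zcta5ce = %(zip)s)) AS g"
    )
    # Values are passed separately so the driver handles the quoting.
    params = {"address": address, "max_results": max_results, "zip": zip_code}
    return sql, params


sql, params = build_zip_restricted_geocode(
    "1820  ROUTE 32, MODENA, NY 12548", "12548"
)
```

The resulting pair could then be handed to a cursor, for example
cur.execute(sql, params) with psycopg2.  If the zip polygons are stored
under different table or column names in your install, only the subquery
needs to change.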

Revamping the rating so that you can better control the weighting scores is
also on my todo list, but that won't happen until I've tackled the speed
issues.

It's listed here: http://trac.osgeo.org/postgis/ticket/1111

You can add yourself to the cc of these tickets if you want to be notified
when they are amended/closed

> * According to normalize_address.sql, I'm using this version of the
>   Geocoder:
>   7616 2011-07-07 12:41:13Z
>   If this is the version I 'installed' - ie started with - do I still need
>   to run upgrade_geocoder.sh? what about

Yes - latest version is: 7632 2011-07-12  (so you are already behind :-) )

> * Missing_Indexes_Generate_Script()?

I now have that as part of the update script, to install missing indexes.
It runs pretty fast if you already have all the key indexes in place.

So the update script basically runs this function now:
http://www.postgis.org/documentation/manual-svn/Install_Missing_Indexes.html

> Lastly, a small contribution:  I noticed the geocoder was also having
> problems with addresses like '45 3 STREET' and '45 WEST 3 STREET', and I
> found that adding a suffix to the '3' ('3' -> '3RD') gave it a push in the
> right direction.  The regular expressions I'm using to catch these are:
>
>     foo=re.match(r'([0-9\-]+ +)([0-9]+)( +[a-zA-Z_]+)', street)
>     foo2=re.match(r'([0-9\-]+ +)([WESTASOUHNOR]+ )([0-9]+)( +[a-zA-Z_]+)', street)

Thanks - I'll check that out.
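
For anyone who wants to try the same pre-processing, here is a minimal
sketch of the idea, assuming the goal is simply to turn a bare numeric
street name into its ordinal form before handing the address to the
geocoder.  It is not the code Dan is running: the two quoted patterns are
merged into one, the direction is matched as a whole word (NORTH, SOUTH,
EAST, WEST) instead of a character class, and the names ordinal_suffix and
add_ordinal are made up for this example.

```python
import re

# House number (digits and hyphens), an optional spelled-out direction,
# a bare numeric street name, then the street type word.
BARE_NUMBER_STREET = re.compile(
    r'^([0-9\-]+ +)((?:NORTH|SOUTH|EAST|WEST) +)?([0-9]+)( +[a-zA-Z_]+)',
    re.IGNORECASE,
)


def ordinal_suffix(n):
    """Return the English ordinal suffix for n in upper case."""
    # 10 through 20 (and 110 through 120, etc.) all take TH.
    if 10 <= n % 100 <= 20:
        return 'TH'
    return {1: 'ST', 2: 'ND', 3: 'RD'}.get(n % 10, 'TH')


def add_ordinal(street):
    """Rewrite '45 3 STREET' as '45 3RD STREET'; leave other input alone."""
    m = BARE_NUMBER_STREET.match(street)
    if m is None:
        return street
    house, direction, number, street_type = m.groups()
    return (house + (direction or '') + number + ordinal_suffix(int(number))
            + street_type + street[m.end():])


print(add_ordinal('45 3 STREET'))
print(add_ordinal('45 WEST 3 STREET'))
```

Addresses that do not fit the pattern, such as '1820  ROUTE 32, MODENA, NY
12548' or one that already carries a suffix, are returned unchanged.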

Regina

http://www.postgis.us







