[postgis-tickets] [PostGIS] #2260: Benchmarking speed between built-in tiger normalizer and pagc_address_parser

PostGIS trac at osgeo.org
Wed Apr 3 05:10:46 PDT 2013


#2260: Benchmarking speed between built-in tiger normalizer and
pagc_address_parser
---------------------------------+------------------------------------------
 Reporter:  robe                 |       Owner:  robe         
     Type:  task                 |      Status:  new          
 Priority:  medium               |   Milestone:  PostGIS 2.1.0
Component:  pagc_address_parser  |     Version:  trunk        
 Keywords:                       |  
---------------------------------+------------------------------------------
 I've started to benchmark speed/quality differences between the built-in
 normalizer and the pagc one.  At first glance, the built-in normalizer
 appears to be faster.  This may be due to how I'm calling it; the fact
 that I currently have pagc compiled with debug flags, so it is spitting
 out a lot of notices; the fact that the built-in normalizer takes
 advantage of indexes and doesn't need to load the lookup tables (and is
 thus less sensitive to shared memory); a memory leak somewhere; or some
 combination of these and other factors.
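
 A comparison like this can be sketched as a small timing harness.  The
 Python below is only an illustration, not the benchmark used for this
 ticket: builtin_stub and pagc_stub are hypothetical stand-ins for calls
 into the two normalizers, and the sample address is made up.  Taking the
 best of several passes helps separate steady-state cost from one-time
 costs such as loading lookup tables.

```python
import time
from typing import Callable, Dict, Iterable


def time_normalizer(normalize: Callable[[str], object],
                    addresses: Iterable[str],
                    repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` full passes."""
    batch = list(addresses)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for addr in batch:
            normalize(addr)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical stand-ins; a real run would call the database functions instead.
def builtin_stub(addr: str) -> str:
    return addr.upper()


def pagc_stub(addr: str) -> str:
    return " ".join(addr.upper().split())


# Made-up input, repeated so the timings are measurable.
sample = ["123 main st, springfield"] * 1000

timings: Dict[str, float] = {
    "builtin": time_normalizer(builtin_stub, sample),
    "pagc": time_normalizer(pagc_stub, sample),
}
```

 Running both normalizers over the same input list, with debug output
 disabled, keeps the comparison from being skewed by notice logging.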

 Interestingly, since pagc normalizes better, geocoding speed has gone up
 a bit, so it ends up being a win anyway.

 I was also able to run it through addresses I couldn't geocode before,
 and this time they geocoded successfully.

 This suggests 2 approaches to using pagc:

 1) As a pure drop-in replacement for the existing normalizer
 2) As a complement, used to prenormalize difficult addresses
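
 The second approach can be sketched as a fallback chain: try the faster
 normalizer first, and only hand the address to the more thorough one when
 geocoding fails.  This is a minimal Python sketch under assumptions; the
 function geocode_with_fallback and the toy normalizers and geocoder below
 are hypothetical and are not part of PostGIS or pagc.

```python
from typing import Callable, Dict, Optional, Tuple

Point = Tuple[float, float]


def geocode_with_fallback(address: str,
                          fast_normalize: Callable[[str], str],
                          thorough_normalize: Callable[[str], str],
                          geocode: Callable[[str], Optional[Point]]
                          ) -> Tuple[Optional[Point], str]:
    """Geocode via the fast normalizer; retry with the thorough one on failure."""
    result = geocode(fast_normalize(address))
    if result is not None:
        return result, "fast"
    result = geocode(thorough_normalize(address))
    return result, ("thorough" if result is not None else "none")


# Toy stand-ins, for illustration only.
KNOWN: Dict[str, Point] = {"1 MAIN ST": (0.0, 0.0)}


def fast(addr: str) -> str:
    return addr.strip()


def thorough(addr: str) -> str:
    return " ".join(addr.upper().split())


def lookup(addr: str) -> Optional[Point]:
    return KNOWN.get(addr)


easy = geocode_with_fallback("1 MAIN ST", fast, thorough, lookup)
hard = geocode_with_fallback("1  main   st", fast, thorough, lookup)
miss = geocode_with_fallback("nowhere", fast, thorough, lookup)
```

 With this shape, the slower normalizer only costs time on the addresses
 that the faster one could not handle.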

-- 
Ticket URL: <http://trac.osgeo.org/postgis/ticket/2260>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.

