[postgis-tickets] [PostGIS] #2260: Benchmarking speed between built-in tiger normalizer and pagc_address_parser
PostGIS
trac at osgeo.org
Wed Apr 3 05:10:46 PDT 2013
#2260: Benchmarking speed between built-in tiger normalizer and
pagc_address_parser
---------------------------------+------------------------------------------
Reporter: robe | Owner: robe
Type: task | Status: new
Priority: medium | Milestone: PostGIS 2.1.0
Component: pagc_address_parser | Version: trunk
Keywords: |
---------------------------------+------------------------------------------
I've started to benchmark speed/quality differences between the built-in
normalizer and the pagc one. At first glance the built-in normalizer
appears to be faster. This may have to do with how I'm calling it; the
fact that my current pagc build is compiled with debug flags and so spits
out a lot of notices; the fact that the built-in normalizer takes
advantage of indexes and doesn't need to load the lookup tables (and is
thus less sensitive to shared memory); a memory leak somewhere; or some
combination of these and other things.
Interestingly, since pagc normalizes better, geocoding itself has sped up
a bit, so it ends up being a win anyway.
I was also able to run it on addresses I couldn't geocode before and
geocode them successfully.
This suggests two approaches to using pagc:
1) As a pure drop-in replacement for the existing normalizer
2) As a complement, used to prenormalize difficult addresses.
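
To make approach 2 concrete, here is a rough sketch of the control flow: try the faster normalizer first, and only fall back to the slower, more thorough one when geocoding fails. Everything in it is a stand-in for illustration. `geocode_with_fallback`, `toy_fast_normalize`, `toy_slow_normalize` and `toy_geocode` are hypothetical names, not functions from PostGIS or pagc, and the toy lookup table takes the place of the real geocoder.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[float, float]


def geocode_with_fallback(
    address: str,
    fast_normalize: Callable[[str], str],
    slow_normalize: Callable[[str], str],
    geocode: Callable[[str], Optional[Point]],
) -> Tuple[Optional[Point], str]:
    """Return (result, which_normalizer), trying the fast normalizer first."""
    result = geocode(fast_normalize(address))
    if result is not None:
        return result, "fast"
    # Only difficult addresses pay the cost of the slower normalizer.
    result = geocode(slow_normalize(address))
    if result is not None:
        return result, "slow"
    return None, "none"


# Toy stand-ins (hypothetical) so the sketch runs on its own.
KNOWN = {"1 MAIN ST": (42.0, -71.0)}


def toy_fast_normalize(address: str) -> str:
    # Cheap: upper-case and trim only.
    return address.upper().strip()


def toy_slow_normalize(address: str) -> str:
    # More thorough: also abbreviates STREET and collapses whitespace.
    return " ".join(address.upper().replace("STREET", "ST").split())


def toy_geocode(normalized: str) -> Optional[Point]:
    return KNOWN.get(normalized)
```

With this shape, the easy addresses never touch the slower normalizer, so its cost is only paid where it is likely to turn a failure into a match. Approach 1 would amount to passing the same normalizer for both arguments.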
--
Ticket URL: <http://trac.osgeo.org/postgis/ticket/2260>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.