[postgis-tickets] [PostGIS] #2260: Benchmarking speed between built-in tiger normalizer and pagc_address_parser
PostGIS
trac at osgeo.org
Thu Apr 25 02:39:44 PDT 2013
#2260: Benchmarking speed between built-in tiger normalizer and
pagc_address_parser
----------------------------------+-----------------------------------------
Reporter: robe | Owner: robe
Type: task | Status: closed
Priority: medium | Milestone: PostGIS 2.1.0
Component: pagc_address_parser | Version: trunk
Resolution: fixed | Keywords:
----------------------------------+-----------------------------------------
Changes (by robe):
* status: new => closed
* resolution: => fixed
Comment:
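Great, that works well on my EDB 64-bit install now, and also in my
pagc_normalize wrapper.
The speeds are about the same now (about 106 ms for pagc_normalize_address
versus about 100 ms for normalize_address on the 8 test rows), though I
suspect that with a larger set yours would outperform the built-in tiger one.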
{{{
testpostgis210=# SELECT address, pagc_normalize_address(address)
testpostgis210-# FROM test_parse;
SELECT address, pagc_normalize_address(address)
FROM test_parse;
address                                             | pagc_normalize_address
----------------------------------------------------+-------------------------------------------------------
529 Main Street, Boston MA, 02129                   | (529,,MAIN,St,,,Boston,MA,02129,t)
77 Massachusetts Avenue, Cambridge, MA 02139        | (77,,MASSACHUSETTS,Ave,,,Cambridge,MA,02139,t)
25 Wizard of Oz, Walaford, KS 99912323              | (25,,"WIZARD OF",,,"# OZ WALAFORD","KS 99912323",,,t)
26 Capen Street, Medford, MA                        | (26,,CAPEN,St,,,Medford,MA,,t)
124 Mount Auburn St, Cambridge, Massachusetts 02138 | (124,,"MOUNT AUBURN",St,,,Cambridge,MA,02138,t)
950 Main Street, Worcester, MA 01610                | (950,,MAIN,St,,,Worcester,MA,01610,t)
949 N 3rd St, New Hyde Park, NY, 11040              | (949,N,3,St,,,"New Hyde Park",NY,11040,t)
8401 W 35W Service Dr NE, Blaine, MN 55449          | (8401,W,"35 W","Svc Dr",NE,,Blaine,MN,55449,t)
(8 rows)
Time: 106.295 ms
}}}
{{{
testpostgis210=# SELECT address, normalize_address(address) FROM test_parse;
SELECT address, normalize_address(address) FROM test_parse;
address                                             | normalize_address
----------------------------------------------------+-------------------------------------------------
529 Main Street, Boston MA, 02129                   | (529,,Main,St,,,Boston,MA,02129,t)
77 Massachusetts Avenue, Cambridge, MA 02139        | (77,,Massachusetts,Ave,,,Cambridge,MA,02139,t)
25 Wizard of Oz, Walaford, KS 99912323              | (25,,"Wizard of Oz",,,,Walaford,KS,99912323,t)
26 Capen Street, Medford, MA                        | (26,,Capen,St,,,Medford,MA,,t)
124 Mount Auburn St, Cambridge, Massachusetts 02138 | (124,,"Mount Auburn",St,,,Cambridge,MA,02138,t)
950 Main Street, Worcester, MA 01610                | (950,,Main,St,,,Worcester,MA,01610,t)
949 N 3rd St, New Hyde Park, NY, 11040              | (949,N,3rd,St,,,"New Hyde Park",NY,11040,t)
8401 W 35W Service Dr NE, Blaine, MN 55449          | (8401,W,35W,"Svc Dr",NE,,Blaine,MN,55449,t)
(8 rows)
Time: 100.177 ms
}}}
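The suspicion that a larger set would separate the two normalizers could be
checked along these lines. This is only a sketch: it reuses the test_parse
table and the two functions from the runs above, and the repeat factor of
1000 is an arbitrary assumption, not something measured in this ticket.
{{{
-- Sketch only: repeat the sample rows 1000 times (arbitrary factor) so that
-- per-call cost outweighs fixed per-query overhead.
\timing on

-- count() forces every call to be evaluated without printing all the rows.
SELECT count(pagc_normalize_address(address))
FROM test_parse CROSS JOIN generate_series(1, 1000) AS n;

SELECT count(normalize_address(address))
FROM test_parse CROSS JOIN generate_series(1, 1000) AS n;
}}}
Running each statement a few times and comparing the reported Time values
would smooth out caching effects from the first call.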
I have one minor gripe: your function is not schema-aware, which
required me to strip the tiger schema off in my input function. I'll
ticket that as a separate issue, but will go ahead and change my wrapper
function for now.
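To illustrate the kind of workaround meant here, the following is a minimal
sketch and not the actual wrapper code. The name tiger.pagc_lex is used purely
as a hypothetical example of a schema-qualified input, and the sketch assumes
the tiger schema is on the search_path so the unqualified name still resolves.
{{{
-- Hypothetical input: strip a leading "tiger." qualifier if present.
SELECT regexp_replace('tiger.pagc_lex', '^tiger\.', '') AS unqualified_name;

-- An already unqualified name passes through unchanged.
SELECT regexp_replace('pagc_lex', '^tiger\.', '') AS unqualified_name;

-- Assumption: the tiger schema must then be reachable via search_path.
SET search_path = public, tiger;
}}}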
--
Ticket URL: <http://trac.osgeo.org/postgis/ticket/2260#comment:29>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.