[postgis-users] standardize_address: Rule does not work
Paragon Corporation
lr at pcorp.us
Tue Oct 27 15:08:38 PDT 2015
Hi Iris,
There are actually two standard_address functions:
http://postgis.net/docs/manual-2.2/standardize_address.html
1) takes a single line (which is what you are using) and internally relies on parse_address which does have a lot of hard-coded assumptions about things (particularly US and Canada assumptions)
stdaddr standardize_address(text lextab, text gaztab, text rultab, text address);
To check if you are being bitten by internal hard-codings in parse_address, then try the parse_address function and see if what you get back makes sense:
http://postgis.net/docs/manual-2.2/parse_address.html
2) Is a two line standardizer that takes micro and macro
(I apologize I neglected to provide an example for the second variant since I use the first mostly. I'll add some documentation to that effect) and more detail about what is considered micro/ macro
standardize_address(text lextab, text gaztab, text rultab, text micro, text macro);
micro is like - house_num street suffix
macro is like - city, state, zip country
Steve and I had discussed redoing the parse_address function to not use a hard-coded set of data, but that's another thing on our todo. Would be nice if others are willing to help out on this effort (either via development or monetary support) and also our upcoming next generation geocoder effort which we plan to use the address_standardizer with.
Thanks,
Regina
http://www.postgis.us
http://postgis.net
-----Original Message-----
From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Stephen Woodbridge
Sent: Tuesday, October 27, 2015 10:41 AM
To: postgis-users at lists.osgeo.org
Subject: Re: [postgis-users] standardize_address: Rule does not work
Hi Iris,
It has been able 3+ years since I did the initial integration of pagc into postgres and I'm afraid I have forgotten a lot in that time. But here are my thoughts.
* I do not think the code explicitly hard codes any mapping like str => street so I would check that you lex and gaz tables are truly empty and that there is not some other table in the search_path that it is finding. I think this is likely the issue.
* The pagc code was not built around parsing UTF8 character strings. It might be ok, but it might not handle them well. I don't think anyone has done any testing of this so any feedback as you work on your project would be interesting and please open tickets if you find specific issues.
* One needed enhancement to the the pagc parser is that it currently has no ability separate inline street types from their street name. IE:
street types that are appended to the street name without a space.
* I just pushed rules2txt and txt2rules to https://sourceforge.net/p/pagc/code/HEAD/tree/branches/sew-refactor/postgresql/
You might find these scripts helpful because the convert rules to human readable tokens and back to numbers.
Regina did the final integration of standardize_address into postgis and she might have some addition thoughts.
-Steve
On 10/27/2015 6:00 AM, Iris Rititnger (Terraplan) wrote:
> Hi,
>
> I am trying to use the standardize_address function to standardize
> german street names. But I have a problem when the street is
> recognized as some kind of special token and then followed by a space.
>
> When I try
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein
> strasse 11')
> I get the result I wish for: name -- HEIN STRASSE; house_num -- 11
> When I try select * from
> standardize_address('lex_d','gaz_d','rules_d','Hein str 11') I get an
> empty row as result.
> When I try
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein
> str11') I get the result I wish for: name -- HEIN STR; house_num -- 11
>
> My lex and gaz tables are empty. My rule table only has two rules
> 1 1 0 -1 5 5 1 -1 1 1
> 1 0 -1 5 1 -1 1 1
> It seems that str is recognized as something different even though my
> gaz and lex tables are empty.
>
> When I add the rule 1 2 0 -1 5 5 1 -1 1 1 and define
> ;1;"STR";"street";2; in my lex table I still have the same problem
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein
> str11') gives me a result of name -- HEIN street and house_num -- 1
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein str
> 11') returns an empty row.
>
> To double check I installed pagc on my computer. Here I get the
> desired result when I define the rule 1 2 0 -1 5 5 1 -1 1 1, but this
> does not seem to work in Postgis.
>
> Do you know why it is not working. Can I somehow get postgres to show
> me the input and output tokens it tries to use like pagc shows me?
>
> Thanks,
> Iris
>
>
_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/postgis-users
More information about the postgis-users
mailing list