[postgis-users] standardize_address: Rule does not work
Stephen Woodbridge
woodbri at swoodbridge.com
Tue Oct 27 07:41:29 PDT 2015
Hi Iris,
It has been 3+ years since I did the initial integration of pagc
into postgres and I'm afraid I have forgotten a lot in that time. But
here are my thoughts.
* I do not think the code explicitly hard codes any mapping like str =>
street, so I would check that your lex and gaz tables are truly empty and
that there is not some other table in the search_path that it is
finding. I think this is likely the issue.
* The pagc code was not built around parsing UTF8 character strings. It
might be ok, but it might not handle them well. I don't think anyone has
done any testing of this so any feedback as you work on your project
would be interesting and please open tickets if you find specific issues.
* One needed enhancement to the pagc parser is that it currently has
no ability to separate inline street types from their street name, i.e.
street types that are appended to the street name without a space.
* I just pushed rules2txt and txt2rules to
https://sourceforge.net/p/pagc/code/HEAD/tree/branches/sew-refactor/postgresql/
You might find these scripts helpful because they convert rules to
human-readable tokens and back to numbers.
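
For the first point above, a minimal sketch of how one might check for
stray tables. It assumes the table names lex_d, gaz_d and rules_d from
the original message and uses only the standard PostgreSQL catalogs; it
does not show what the standardizer does internally.

```sql
-- Which schemas are searched, in order, for unqualified table names?
SHOW search_path;

-- Every table or view named lex_d, gaz_d or rules_d, in any schema.
-- More than one row per name means an unintended copy could be picked up.
SELECT n.nspname AS schema_name,
       c.relname AS table_name,
       c.relkind
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname IN ('lex_d', 'gaz_d', 'rules_d')
  AND c.relkind IN ('r', 'v', 'm')   -- table, view, materialized view
ORDER BY c.relname, n.nspname;

-- Row counts for the tables that the unqualified names resolve to now.
SELECT 'lex_d' AS table_name, count(*) AS row_count FROM lex_d
UNION ALL
SELECT 'gaz_d', count(*) FROM gaz_d
UNION ALL
SELECT 'rules_d', count(*) FROM rules_d;
```

If the second query lists a name in more than one schema, the schema
that comes first in search_path is the one an unqualified name resolves
to.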
Regina did the final integration of standardize_address into postgis and
she might have some additional thoughts.
-Steve
On 10/27/2015 6:00 AM, Iris Rititnger (Terraplan) wrote:
> Hi,
>
> I am trying to use the standardize_address function to standardize
> german street names. But I have a problem when the street is recognized
> as some kind of special token and then followed by a space.
>
> When I try
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein
> strasse 11')
> I get the result I wish for: name -- HEIN STRASSE; house_num -- 11
> When I try
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein str 11')
> I get an empty row as result.
> When I try
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein str11')
> I get the result I wish for: name -- HEIN STR; house_num -- 11
>
> My lex and gaz tables are empty. My rule table only has two rules
> 1 1 0 -1 5 5 1 -1 1 1
> 1 0 -1 5 1 -1 1 1
> It seems that str is recognized as something different even though my
> gaz and lex tables are empty.
>
> When I add the rule 1 2 0 -1 5 5 1 -1 1 1 and define
> ;1;"STR";"street";2; in my lex table I still have the same problem
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein
> str11') gives me a result of name -- HEIN street and house_num -- 1
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein str
> 11') returns an empty row.
>
> To double check I installed pagc on my computer. Here I get the desired
> result when I define the rule 1 2 0 -1 5 5 1 -1 1 1, but this does not
> seem to work in Postgis.
>
> Do you know why it is not working? Can I somehow get postgres to show
> me the input and output tokens it tries to use like pagc shows me?
>
> Thanks,
> Iris
>
>