[postgis-users] standardize_address: Rule does not work

Stephen Woodbridge woodbri at swoodbridge.com
Tue Oct 27 07:41:29 PDT 2015


Hi Iris,

It has been more than 3 years since I did the initial integration of pagc 
into postgres and I'm afraid I have forgotten a lot in that time. But 
here are my thoughts.

* I do not think the code explicitly hard codes any mapping like str => 
street, so I would check that your lex and gaz tables are truly empty and 
that there is not some other table with the same name in the search_path 
that it is finding instead.  I think this is likely the issue.

* The pagc code was not built around parsing UTF8 character strings. It 
might be ok, but it might not handle them well. I don't think anyone has 
done any testing of this so any feedback as you work on your project 
would be interesting and please open tickets if you find specific issues.

* One needed enhancement to the pagc parser is that it currently has 
no ability to separate inline street types from their street name, i.e., 
street types that are appended to the street name without a space.

* I just pushed rules2txt and txt2rules to 
https://sourceforge.net/p/pagc/code/HEAD/tree/branches/sew-refactor/postgresql/
You might find these scripts helpful because they convert rules to human 
readable tokens and back to numbers.
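
To illustrate what reading a rule as tokens looks like, here is a minimal 
Python sketch. It is not the rules2txt script itself. The function name 
rule_to_text is made up for this example, and the token numbers are a 
small subset as I recall them from the address standardizer 
documentation, so please treat them as assumptions and verify them before 
relying on them.

```python
# Partial token tables (assumed values, recalled from the address
# standardizer documentation; extend and verify as needed).
INPUT_TOKENS = {0: "NUMBER", 1: "WORD", 2: "TYPE"}
OUTPUT_TOKENS = {1: "HOUSE", 5: "STREET"}


def rule_to_text(rule):
    """Render a numeric rule string as readable token names.

    Assumed layout: input tokens, -1, output tokens, -1, rule type, rank.
    """
    nums = [int(n) for n in rule.split()]
    first = nums.index(-1)               # end of the input tokens
    second = nums.index(-1, first + 1)   # end of the output tokens
    inputs = nums[:first]
    outputs = nums[first + 1:second]
    rule_type, rank = nums[second + 1:second + 3]

    # Unknown numbers are kept visible instead of being guessed.
    ins = " ".join(INPUT_TOKENS.get(n, "?%d" % n) for n in inputs)
    outs = " ".join(OUTPUT_TOKENS.get(n, "?%d" % n) for n in outputs)
    return "%s -> %s (type %d, rank %d)" % (ins, outs, rule_type, rank)


if __name__ == "__main__":
    # The three rules quoted in the message below.
    for r in ("1 1 0 -1 5 5 1 -1 1 1",
              "1 0 -1 5 1 -1 1 1",
              "1 2 0 -1 5 5 1 -1 1 1"):
        print(rule_to_text(r))
```

Read this way, a rule only fires when the lexer produces exactly the 
input token sequence on its left side, so seeing the tokens by name makes 
it easier to spot which sequence is missing from the rules table.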

Regina did the final integration of standardize_address into postgis and 
she might have some additional thoughts.

-Steve

On 10/27/2015 6:00 AM, Iris Rititnger (Terraplan) wrote:
> Hi,
>
> I am trying to use the standardize_address function to standardize
> german street names.  But I have a problem when the street is recognized
> as some kind of special token and then followed by a space.
>
> When I try
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein
> strasse 11')
> I get the result I wish for: name -- HEIN STRASSE; house_num -- 11
> When I try
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein str 11')
> I get an empty row as result.
> When I try
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein str11')
> I get the result I wish for: name -- HEIN STR; house_num -- 11
>
> My lex and gaz tables are empty.  My rule table only has two rules
> 1 1 0 -1 5 5 1 -1 1 1
> 1 0 -1 5 1 -1 1 1
> It seems that str is recognized as something different even though my
> gaz and lex tables are empty.
>
> When I add the rule 1 2 0 -1 5 5 1 -1 1 1 and define
> ;1;"STR";"street";2; in my lex table I still have the same problem
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein
> str11') gives me a result of name -- HEIN street and house_num -- 1
> select * from standardize_address('lex_d','gaz_d','rules_d','Hein str
> 11') returns an empty row.
>
> To double check I installed pagc on my computer.  Here I get the desired
> result when I define the rule 1 2 0 -1 5 5 1 -1 1 1, but this does not
> seem to work in Postgis.
>
> Do you know why it is not working?  Can I somehow get postgres to show
> me the input and output tokens it tries to use like pagc shows me?
>
> Thanks,
> Iris
>
>


