[postgis-users] standardize_address: Rule does not work

Iris Rittinger (Terraplan) mailinglisten at terraplan.com
Wed Oct 28 04:34:55 PDT 2015


Hi Regina, hi Steve,

thanks a lot for your answers.
When I try the parse_address function I do not get any useful result.
Seperating the macro and micro part in the standardize_address function 
does work and the result looks pretty good.

* The pagc code was not built around parsing UTF8 character strings.
  -- Yes, that is a problem.  It does not parse german letters

* One needed enhancement to the the pagc parser is that it currently has 
no ability separate inline street types from their street name
  -- Yes, that is in deed a problem for our use case

Would be nice if others are willing to help out on this effort (either via development or monetary support)
-- I will talk to our developers and my chef if there is a possiblity to get involved

and also our upcoming next generation geocoder effort which we plan to use the address_standardizer with
-- sound great, since that is what I am really looking for.  We have a reference table for geocoding and we would like to pass the incoming address against that table.

Thanks a lot,
Iris



Am 27.10.2015 um 23:08 schrieb Paragon Corporation:
> Hi Iris,
>
> There are actually two standard_address functions:
> http://postgis.net/docs/manual-2.2/standardize_address.html
>
>
>
> 1) takes a single line (which is what you are using) and internally relies on parse_address which does have a lot of hard-coded assumptions about things (particularly US and Canada assumptions)
> stdaddr standardize_address(text lextab, text gaztab, text rultab, text address);
>
> To check if you are being bitten by internal hard-codings in parse_address, then try the parse_address function and see if what you get back makes sense:
>
> http://postgis.net/docs/manual-2.2/parse_address.html
>
>
>
> 2)  Is a two line standardizer that takes micro and macro
> (I apologize I neglected to provide an example for the second variant since I use the first mostly.  I'll add some documentation to that effect) and more detail about what is considered micro/ macro
>
> standardize_address(text lextab, text gaztab, text rultab, text micro, text macro);
>
> micro is like -  house_num street suffix
> macro is like - city, state, zip country
>
>
> Steve and I had discussed redoing the parse_address function to not use a hard-coded set of data, but that's another thing on our todo.  Would be nice if others are willing to help out on this effort (either via development or monetary support) and also our upcoming next generation geocoder effort which we plan to use the address_standardizer with.
>
>
> Thanks,
> Regina
> http://www.postgis.us
> http://postgis.net
>
> -----Original Message-----
> From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Stephen Woodbridge
> Sent: Tuesday, October 27, 2015 10:41 AM
> To: postgis-users at lists.osgeo.org
> Subject: Re: [postgis-users] standardize_address: Rule does not work
>
> Hi Iris,
>
> It has been able 3+ years since I did the initial integration of pagc into postgres and I'm afraid I have forgotten a lot in that time. But here are my thoughts.
>
> * I do not think the code explicitly hard codes any mapping like str => street so I would check that you lex and gaz tables are truly empty and that there is not some other table in the search_path that it is finding.  I think this is likely the issue.
>
> * The pagc code was not built around parsing UTF8 character strings. It might be ok, but it might not handle them well. I don't think anyone has done any testing of this so any feedback as you work on your project would be interesting and please open tickets if you find specific issues.
>
> * One needed enhancement to the the pagc parser is that it currently has no ability separate inline street types from their street name. IE:
> street types that are appended to the street name without a space.
>
> * I just pushed rules2txt and txt2rules to https://sourceforge.net/p/pagc/code/HEAD/tree/branches/sew-refactor/postgresql/
> You might find these scripts helpful because the convert rules to human readable tokens and back to numbers.
>
> Regina did the final integration of standardize_address into postgis and she might have some addition thoughts.
>
> -Steve
>
> On 10/27/2015 6:00 AM, Iris Rititnger (Terraplan) wrote:
>> Hi,
>>
>> I am trying to use the standardize_address function to standardize
>> german street names.  But I have a problem when the street is
>> recognized as some kind of special token and then followed by a space.
>>
>> When I try
>> select * from standardize_address('lex_d','gaz_d','rules_d','Hein
>> strasse 11')
>> I get the result I wish for: name -- HEIN STRASSE; house_num -- 11
>> When I try select * from
>> standardize_address('lex_d','gaz_d','rules_d','Hein str 11') I get an
>> empty row as result.
>> When I try
>> select * from standardize_address('lex_d','gaz_d','rules_d','Hein
>> str11') I get the result I wish for: name -- HEIN STR; house_num -- 11
>>
>> My lex and gaz tables are empty.  My rule table only has two rules
>> 1 1 0 -1 5 5 1 -1 1 1
>> 1 0 -1 5 1 -1 1 1
>> It seems that str is recognized as something different even though my
>> gaz and lex tables are empty.
>>
>> When I add the rule 1 2 0 -1 5 5 1 -1 1 1 and define
>> ;1;"STR";"street";2; in my lex table I still have the same problem
>> select * from standardize_address('lex_d','gaz_d','rules_d','Hein
>> str11') gives me a result of name -- HEIN street and house_num -- 1
>> select * from standardize_address('lex_d','gaz_d','rules_d','Hein str
>> 11') returns an empty row.
>>
>> To double check I installed pagc on my computer.  Here I get the
>> desired result when I define the rule 1 2 0 -1 5 5 1 -1 1 1, but this
>> does not seem to work in Postgis.
>>
>> Do you know why it is not working.  Can I somehow get postgres to show
>> me the input and output tokens it tries to use like pagc shows me?
>>
>> Thanks,
>> Iris
>>
>>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users

-- 
TERRAPLAN
Iris Rittinger
GIS-Projektmanagement
Gewerbestraße 33
79227 Schallstadt
Fon 07664 - 5053 13
Fax 07664 - 5053 21
www.terraplan.com



More information about the postgis-users mailing list