[postgis-users] [slightly off-topic] Question on building a C address parser for an embedded geocoder

Paul Ramsey pramsey at opengeo.org
Sun Oct 21 09:03:35 PDT 2012


You can stuff things into an upper memory context, but I'm not sure
how wise that would be. It does, however, seem to be the only reasonable
approach to getting things to last much longer than a statement.
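
Something along these lines is what I mean -- a minimal sketch only;
StdParserState and load_lexicon() are made-up names, and the only real
API here is the memory context switch:

#include "postgres.h"
#include "utils/memutils.h"

/* hypothetical parser state -- contents are up to you */
typedef struct StdParserState
{
    int     nlexicon;
    /* ... lexicon, gazetteer and rule structures ... */
} StdParserState;

static StdParserState *parser_state = NULL;

static StdParserState *
get_parser_state(void)
{
    if (parser_state == NULL)
    {
        /*
         * Allocate in TopMemoryContext so the structure outlives the
         * per-statement context and is still there on the next call
         * in the same backend.
         */
        MemoryContext oldctx = MemoryContextSwitchTo(TopMemoryContext);

        parser_state = palloc0(sizeof(StdParserState));
        /* load_lexicon(parser_state);   hypothetical loader */

        MemoryContextSwitchTo(oldctx);
    }

    return parser_state;
}

The data then lives for the life of the backend, so you pay the build
cost once per connection rather than once per statement.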

P.

On Sat, Oct 20, 2012 at 9:12 PM, Stephen Woodbridge
<woodbri at swoodbridge.com> wrote:
> Hi Dev's,
>
> I am interested in writing an address standardizer in C that could be
> callable from SQL. I understand the basics of doing this from supporting
> pgRouting and writing some additional commands. It would get used something
> like:
>
> select * from standardize(address, city, state, country, postcode);
> select * from standardize(address_one_line);
>
> and would return a standardized set of fields like: house_num, street, city,
> state, country, postcode. These could then be used to create a standardized
> reference table, or they could be passed into the geocoder that searches
> the standardized reference table.
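>
> On the C side I am picturing the usual composite-return pattern,
> something like this rough sketch (the hard-coded values just stand in
> for whatever the real parser produces):
>
> #include "postgres.h"
> #include "fmgr.h"
> #include "funcapi.h"
> #include "access/htup_details.h"
> #include "utils/builtins.h"
>
> PG_MODULE_MAGIC;
>
> PG_FUNCTION_INFO_V1(standardize);
>
> Datum
> standardize(PG_FUNCTION_ARGS)
> {
>     TupleDesc   tupdesc;
>     Datum       values[6];
>     bool        nulls[6] = {false, false, false, false, false, false};
>     HeapTuple   tuple;
>
>     /* result type comes from the OUT parameters in CREATE FUNCTION */
>     if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
>         elog(ERROR, "function must be declared to return a record");
>
>     /* placeholder output -- the real parser would fill these in */
>     values[0] = CStringGetTextDatum("123");         /* house_num */
>     values[1] = CStringGetTextDatum("MAIN ST");     /* street    */
>     values[2] = CStringGetTextDatum("SPRINGFIELD"); /* city      */
>     values[3] = CStringGetTextDatum("MA");          /* state     */
>     values[4] = CStringGetTextDatum("US");          /* country   */
>     values[5] = CStringGetTextDatum("01101");       /* postcode  */
>
>     tuple = heap_form_tuple(BlessTupleDesc(tupdesc), values, nulls);
>     PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
> }
>
> with the SQL-level function declared with six OUT text parameters so
> the tuple descriptor above resolves.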
>
> What I am struggling with is how best to initialize the address
> parser/standardizer. The concept I have in mind is to have some tables that
> represent the lexicon, gazetteer, parsing rules, etc. This data could be
> specific to country and/or country-state. It could be fairly small or quite
> large. For example, there are about 40K unique city names based on the USPS
> zipcodes, and about 7K of them have duplicate standardizations based on
> state.
>
> On the one hand I can read these tables on every request and build the
> internal structures, parse the request, and throw out the internal
> structures.
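>
> Reading them is easy enough with SPI, roughly like this (table and
> column names are placeholders):
>
> #include "postgres.h"
> #include "executor/spi.h"
>
> static void
> load_lexicon(void)
> {
>     uint64 i;
>
>     if (SPI_connect() != SPI_OK_CONNECT)
>         elog(ERROR, "SPI_connect failed");
>
>     SPI_execute("SELECT word, stdword FROM std_lexicon", true, 0);
>
>     for (i = 0; i < SPI_processed; i++)
>     {
>         char *word = SPI_getvalue(SPI_tuptable->vals[i],
>                                   SPI_tuptable->tupdesc, 1);
>
>         /*
>          * Copy the value into the parser's own structures here;
>          * anything returned by SPI_getvalue is freed at SPI_finish().
>          */
>         (void) word;
>     }
>
>     SPI_finish();
> }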
>
> Basically, once the reference source records have been standardized, you
> should not be changing the above tables, because you want to standardize
> future search requests using the same rules that the reference road
> segments were standardized with.
>
> And ideally you do not want to spend the time to rebuild these internal
> structures on every search request.
>
> So is there a mechanism for building some internal data and holding on to it
> between requests? I suppose I could store it in a blob, but it would then
> need to be de-toasted on every search request.
>
> Maybe this is a non-issue, but it seems to impact the design depending
> on what options I might have and how they are implemented and accessed from
> the code.
>
> Thoughts?
>
> Thanks for any help or suggestions,
>   -Steve W
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users


