[postgis-users] RE: geocoding

bill at binko.net bill at binko.net
Fri May 20 11:40:37 PDT 2005


> I'm probably the person that you saw in threads working on a plpgsql
> geocoder.  Ralph is right, it is much more of a text parsing problem
> than a spatial problem.  PL/PGSQL and PL/Perl (great for splitting the
> text into component parts with regular expressions --  ie
> streetnumber, streetname, street prefix direction etc.)  together with
> good indexes and precalculated fields (like metaphones of the
> streetnames) can make a very effective text parser and matcher just
> utilizing what's good about the PostgreSQL database itself.  Throwing
> the line_interpolate_point or centroid (if a polygon based geocoder --
> from parcels or zipcodes etc.) from PostGIS into the mix is just
> iceing on the cake.

David,

What is the status on this?  I would also like to replace my geocoder.us
based geocoder (which I have to shell out to) with something in my
database.

Also, for performance reasons, remember that PostgreSQL has the ability to
index on things not in columns.  So you can do stuff like:

create index road_sdx on roads (soundex(roadname));

That doesn't create a column for the soundex, but does calculate it on
each update/insert and keeps it in the index (only).  It's like have a
column that only lives in the index.  You can also do stuff like:

create index road_sdx_lookup on roads(soundex(roadname), roadname, road_id);

If you do it in that order, queries like this will be very fast since they
never have to read the row from the table at all:

select roadname, road_id from roads where soundex(roadname) = 'B520';

Just some off-topic database --sorry for the digression.

Bill



More information about the postgis-users mailing list