MapServer and geocoding

imap at chesapeake.net
Wed Feb 23 22:55:02 PST 2000


Daniel,

I see a .ca domain.  Are you looking for a Canadian or US geocoder?  

TIGER data is US only. I'll comment on the suitability of that data.
The newer TIGER '98 data (and also the '97 data) are poorly coded for mapping
purposes. The feature class codes (CFCC) that you would use to distinguish
Interstates, US Highways, State Highways, etc. are miscoded in many instances.
I work for the people who produce the data and I can't even get them to
acknowledge the problem, but it does exist. Their use of the data is limited
to black and white, single-width paper maps, so they really don't see the
problem. It *IS*, however, of pretty good quality for geocoding purposes.
The '98 version is much improved in the address range department, but there
are still a lot of missing ranges. Expect to get about an 80% hit ratio. It
is very good in urban areas and worst in rural areas. To improve the hit
ratio, you need to supplement the TIGER vector data with the USPS ZIP+4 and
TIGER/ZIP+4 datasets to derive an interpolation that is slightly less
accurate, but very close to being exact.
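
For anyone unfamiliar with how address ranges turn into coordinates, here is
a minimal Python sketch of linear interpolation along one street segment. It
is an illustration only, not the code from any geocoder mentioned here. The
function name, the straight-line segment, and the odd/even parity check per
side of the street are all assumptions made for the example.

```python
def interpolate_address(house_number, from_addr, to_addr, start, end):
    """Estimate (lat, lon) for a house number on one side of a segment.

    from_addr / to_addr: address range for that side of the segment.
    start / end: (lat, lon) of the segment end points.
    Assumes a straight segment; real street chains have shape points.
    """
    low, high = min(from_addr, to_addr), max(from_addr, to_addr)
    if not low <= house_number <= high:
        return None  # house number is outside this range
    # Assumption: a range covers one side of the street, so parity must match.
    if house_number % 2 != from_addr % 2:
        return None
    if to_addr == from_addr:
        t = 0.5  # single-address range: place it at the midpoint
    else:
        t = (house_number - from_addr) / (to_addr - from_addr)
    lat = start[0] + t * (end[0] - start[0])
    lon = start[1] + t * (end[1] - start[1])
    return (lat, lon)
```

A missing address range means there is nothing to interpolate against for
that segment, which is why the hit ratio drops where ranges are absent.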

Here is the real issue...  If you preserve all of the TIGER attributes in
Shapefile format, you will end up with a *huge* dataset.  Expect to occupy
about 25GB with all of the data that you will need for mapping and geocoding
with TIGER Shapefiles (all layers).  If you add the USPS ZIP+4 and TIGER/ZIP+4
datasets, that eats up another 5-6GB, which puts you in the 30GB range.
You may be able to reduce this slightly if you whack out some of the
.DBF attribute data.

The commercial geocoders (MapInfo MapMarker, for example) somehow crunch
all of that data onto 2 CD-ROMs, which is great, but the software costs you
$9000 annually to run (and it is WinNT).

My company sells a geocoder which uses a binary version of TIGER that is
5.6GB for the entire US.  Still big data, but that data isn't suitable
for use with mapserv.  We actually have our own map engine that uses that
same data format.  This software is, however, significantly cheaper:
the cost is $995 for the geocoder, and the TIGER binary data is $175/region
or $1500 for the entire US (there are 10 regions/CD-ROMs total).  Also,
it will run on any Unix platform as well as WinNT.

If you *really* wanted a geocoder that uses ESRI shapefiles, I am
confident that I could build one of those in about a week at a
cost of $3000.  The deliverable I'd propose would be the
TIGER '98 dataset (you choose the layers you want) suitable for
mapping and geocoding purposes, plus a Unix/Solaris version of the
geocoding engine.  I'd retain the rights to the source code, but
grant your company an unlimited use license for the finished product.
That would NOT include the USPS ZIP+4 stuff.  I think there are some
licensing/resale issues with that data.  If you are interested in
having me port this, just fax me a purchase order and I'll get busy.

Just in case you are wondering...  Our current geocoder is a
command line program that generates expanded or terse/brief
output.  The engine is modelled after the Etak Eagle Geocoder
client for compatibility reasons.  Input is:  address|city|state|zip
The city AND state *OR* the zip is required.  If one or the other
is missing, it will still resolve.  See the
sample output as follows:

[cstuber at imap /geocoder1.2]$ geocoder
Enter address as: street|city|state|zip (^D quits)
-->5390 Sea Isle Rd|Memphis|TN|
Std. Primary House Number:      <5390>
Std. Matching Street Name:      <Sea Isle>
Std. Suffix Type:               <Road>
***
Match Count     <1>
Match Type      <1>
Match DB        <T>

Match Addr      <5390 SEA ISLE ROAD>
Match City      <MEMPHIS>
Match State     <TN>
Match Zip       <38119>
Match St FIPS   <47>
Match Cty FIPS  <157>
Match Lat       <35.09421>
Match Lon       <-89.88207>

In terse (line) mode, with the input line 5390 Sea Isle|||38119:

[cstuber at imap /geocoder1.2]$ geocoder -l
MatCnt|MatType|MatDB|MatAddr|MatCity|MatState|MatZip|MatSFIPS|MatCFIPS|MatLat|MatLon|Pop|Elev|
5390 Sea Isle|||38119
1|1|T|5390 SEA ISLE ROAD|MEMPHIS|TN|38119|47|157|35.09421|-89.88207|||
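
If you want to feed the terse output into another program (a MapServer
front end, for instance), the pipe-delimited record can be split against
the header line. The following is a minimal Python sketch, assuming every
line ends with one trailing "|" as in the sample above. The function name
parse_terse and the choice to convert MatCnt, MatLat and MatLon to numbers
are assumptions for the example, not part of the geocoder.

```python
HEADER = ("MatCnt|MatType|MatDB|MatAddr|MatCity|MatState|MatZip|"
          "MatSFIPS|MatCFIPS|MatLat|MatLon|Pop|Elev|")

def parse_terse(header, record):
    """Turn one terse output record into a dict keyed by the header names."""
    names = header.strip().rstrip("|").split("|")
    line = record.strip()
    # Drop only the single trailing delimiter; empty fields (Pop, Elev)
    # just before it must be kept so the column count still matches.
    if line.endswith("|"):
        line = line[:-1]
    values = line.split("|")
    if len(values) != len(names):
        raise ValueError("expected %d fields, got %d" % (len(names), len(values)))
    row = dict(zip(names, values))
    row["MatCnt"] = int(row["MatCnt"]) if row["MatCnt"] else 0
    for key in ("MatLat", "MatLon"):
        row[key] = float(row[key]) if row[key] else None
    return row

sample = "1|1|T|5390 SEA ISLE ROAD|MEMPHIS|TN|38119|47|157|35.09421|-89.88207|||"
match = parse_terse(HEADER, sample)
print(match["MatLat"], match["MatLon"])
```

The zip and FIPS fields are left as strings so that leading zeros survive.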

Redirection of both input and output is supported, so you can geocode in
batch mode.   One side note about porting this to use Shapefiles...
If your Shapefiles were pre-projected, your returned lat/lon
would be in the projected map coordinates.
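
For the batch mode mentioned above, the input file is just one
address|city|state|zip record per line. Here is a minimal Python sketch of
building those lines, enforcing the "city AND state, OR zip" rule before
anything is sent to the geocoder. The function name format_query and its
parameter names are hypothetical, made up for this example.

```python
def format_query(street, city="", state="", zip_code=""):
    """Build one input line in the form address|city|state|zip."""
    fields = [street, city, state, zip_code]
    if any("|" in field for field in fields):
        raise ValueError("fields must not contain the '|' delimiter")
    if not street.strip():
        raise ValueError("street address is required")
    has_city_state = bool(city.strip()) and bool(state.strip())
    has_zip = bool(zip_code.strip())
    if not (has_city_state or has_zip):
        raise ValueError("need city AND state, OR zip")
    return "|".join(field.strip() for field in fields)

# Write a small batch file, one record per line.
addresses = [
    ("5390 Sea Isle Rd", "Memphis", "TN", ""),
    ("5390 Sea Isle", "", "", "38119"),
]
with open("input.txt", "w") as batch:
    for street, city, state, zip_code in addresses:
        batch.write(format_query(street, city, state, zip_code) + "\n")
```

The resulting file could then be run through line mode with ordinary shell
redirection, e.g. geocoder -l < input.txt > output.txt.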


Regards,

Chris Stuber (mapsurfer)
Silicon Mapping Solutions, Inc.
PO Box 741
Owings, MD  20736

w 410-257-3187
f 410-257-1978


Stephen Lime wrote:
> 
> Nope, no geocoder. It's as much a data problem as a software problem. It probably isn't that hard to write one if you really understand the process. However, you need a decent set of databases to match against and that ain't cheap. About the only freebie dataset I know of is Tiger. Others on the list can speak to its suitability.
> 
> If you can kick out a coordinate then the app can be integrated with mapserver. An OpenSource geocoder based on free data would be very cool indeed.
> 
> Steve
> 
> <<< Daniel Morissette <danmo at videotron.ca>  2/23  3:56p >>>
> MapServer users,
> 
> Has anyone ever used the MapServer for an application that needed to do
> geocoding (address matching) on the fly?  I don't think that MapServer
> has any direct support for geocoding, does it?
> 
> Does anyone have recommendations for a library or package (OpenSource or
> commercial) that would complement MapServer well and allow an
> application to do geocoding on the fly (ideally something that runs on
> Solaris)?
> 
> Thanks,
> --
> ------------------------------------------------------------
>  Daniel Morissette                       danmo at videotron.ca
>               http://pages.infinit.net/danmo/
> ------------------------------------------------------------
>   Don't put for tomorrow what you can do today, because if
>       you enjoy it today you can do it again tomorrow.
