[OSGeo-Discuss] Automatic geocoding of PDF documents

Stark Hans-Jörg hansjoerg.stark at fhnw.ch
Sat Jan 14 17:05:09 EST 2012


Perhaps OpenAddresses (www.openaddresses.org) may also be helpful. It is far from complete yet, but for some regions the data is fairly dense (and complete where it was donated). It also provides geocoding as REST services (see the wiki).

cheers,
hj

On 14.01.2012 at 20:59, "Andrew Turner" <ajturner at highearthorbit.com> wrote:

> On Fri, Jan 13, 2012 at 6:00 PM, slesage <slesage at geo.gob.bo> wrote:
>> Hi,
>> 
>> does anybody know of any open-source software dedicated to automatic
>> geocoding of text documents? The idea of that "black box" would be:
>> * give, as an input, a text document or a PDF,
>> * receive, as an output, a list of place names with their coordinates / a
>> map of POIs corresponding to those places.
>> 
>> Using the geonames database (http://www.geonames.org/), the solution appears
>> to be only a full-text search, which could be done using Lucene
>> (https://lucene.apache.org/java/docs/index.html).
>> 
>> I found the metacarta solution
>> (http://www.metacarta.com/products-platform-geotag.htm) but couldn't find
>> any opensource solution.
> 
> The reason that there isn't an open-source solution is because it is
> Very Difficult. Even geocoding is difficult and until a short while
> ago there weren't any decent open-source geocoders. So we worked with
> Schuyler (formerly of Metacarta) to build an open-source one [1].
> 
> Your idea of using the Geonames gazetteer with Apache Lucene is interesting
> and I think I've seen it suggested before. However, at best it will
> find location names but will be missing any logic for disambiguation
> of words or relative locations. So you could likely find that "Paris"
> was mentioned, but not be sure if it's Paris, France or Paris, Texas, US.
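
The gazetteer-lookup idea and the "Paris" ambiguity can be sketched in a few lines of Python. Everything here is an assumption for illustration: the tiny in-memory gazetteer, its approximate coordinates and populations, and the helper names are all hypothetical, and "pick the most populous candidate" is only one naive heuristic, not what Geonames, Lucene or MetaCarta actually do.

```python
import re
from collections import namedtuple

Place = namedtuple("Place", "name country lat lon population")

# Hypothetical toy gazetteer. Coordinates and populations are rough,
# illustrative values, not authoritative data.
GAZETTEER = [
    Place("Paris", "FR", 48.857, 2.352, 2_100_000),
    Place("Paris", "US", 33.661, -95.556, 25_000),
    Place("La Paz", "BO", -16.5, -68.15, 750_000),
]


def build_index(places):
    """Map each lower-cased place name to all gazetteer entries sharing it."""
    index = {}
    for place in places:
        index.setdefault(place.name.lower(), []).append(place)
    return index


def find_places(text, index):
    """Return (name, candidates) for every gazetteer name found in the text.

    This is plain whole-word matching: it finds names, but it cannot tell
    which of several same-named places is meant.
    """
    hits = []
    for name, candidates in index.items():
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, re.IGNORECASE):
            hits.append((name, candidates))
    return hits


def pick_most_populous(candidates):
    """Naive disambiguation (an assumption): prefer the largest place."""
    return max(candidates, key=lambda place: place.population)


if __name__ == "__main__":
    index = build_index(GAZETTEER)
    text = "The meeting moved from Paris to La Paz."
    for name, candidates in find_places(text, index):
        best = pick_most_populous(candidates)
        print(name, len(candidates), best.country, best.lat, best.lon)
```

For the sample sentence, the lookup returns two candidates for "Paris", and the population heuristic picks the French one. That guess would be wrong for a document about Texas, which is exactly the missing disambiguation logic described above.
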
> 
> Gisgraphy [2] is an open-source option that says it provides Full-text
> searching. I don't know more about it though.
> 
> Definitely share what else you find or try.
> 
> Andrew
> 
> 
> [1] https://github.com/geocommons/geocoder
> [2] http://www.gisgraphy.com/download/index.htm
> 
>> 
>> Thanks for your suggestions.
>> 
>> Sylvain Lesage.
>> _______________________________________________
>> Discuss mailing list
>> Discuss at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/discuss
> 
> 
> 
> -- 
> Andrew Turner
> mobile: 248.982.3609
> andrew at fortiusone.com
> http://highearthorbit.com
> 
> http://geocommons.com           Helping build the Geospatial Web
> Introduction to Neogeography - http://oreilly.com/catalog/neogeography

