Re: [OSGeo-Discuss] Batch geocoding

Stark Hans-Jörg hansjoerg.stark at fhnw.ch
Fri Feb 4 06:49:50 PST 2011


From: discuss-bounces at lists.osgeo.org [mailto:discuss-bounces at lists.osgeo.org] On behalf of JP Glutting
Sent: Friday, 4 February 2011 15:34
To: OSGeo Discussions
Subject: Re: [OSGeo-Discuss] Batch geocoding

Hi Hans-Jörg,

I am not aware that Barcelona (the city?) has a very open geodata strategy, but that certainly would be nice. If you know of anyone in the area who might know more, I am more than willing to contact them.
[shj] My mistake. I thought I had read something about this some time ago, perhaps because Spain has strong support in the FOSS domain.

I have the addresses, do you mean the coordinates? The address format is not ideal, but I am pretty happy with them, as they seem to geocode well in the tests I have done.
[shj] yes, addresses along with co-ordinates.

The Yahoo API allows geocoding of 50k addresses a day, which is plenty for what I need. I am going to try to use that.
[shj] that'll certainly do.

Cheers,
JP

On Fri, Feb 4, 2011 at 3:26 PM, Stark Hans-Jörg <hansjoerg.stark at fhnw.ch> wrote:
Hi JP

Barcelona is unfortunately not well covered yet in OA (OpenAddresses). But I thought that Spain had a very "open" strategy in terms of providing geodata. If you manage to get Barcelona addresses (perhaps from the council or another "official body"), the OA team will insert them into OA and then you can use the REST service.

Good luck!
-hj

From: discuss-bounces at lists.osgeo.org [mailto:discuss-bounces at lists.osgeo.org] On behalf of JP Glutting
Sent: Friday, 4 February 2011 15:24
To: OSGeo Discussions
Subject: Re: [OSGeo-Discuss] Batch geocoding

Thanks for all the responses! I will track them all down and see how they work.

Stark, I have 146,472 addresses in the city of Barcelona. Many of them are duplicates, but at the very least I have 31,514 that need to be geocoded, although that would leave out many that I want to use. I am developing filters to strip out apartment numbers, etc., so that I can pull a single coordinate for a whole set of addresses. At least 6,000 of the original 146k are invalid, which leaves about 140k. It is a lot. I will take a look at openaddresses and do some testing.
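
The filtering idea described above (strip the apartment detail, then geocode each building only once) could be sketched roughly as follows. This is only an illustration: the address layout is an assumption (unit details after the first comma, as in "Carrer de Mallorca 401, 3r 2a"), and the names normalize and group_by_building are hypothetical, not part of any existing tool.

```python
import re
from collections import defaultdict

# Assumption: the floor/door detail follows the street number after a comma,
# e.g. "Carrer de Mallorca 401, 3r 2a". Real data may need other rules.
_UNIT_SUFFIX = re.compile(r",.*$")
_WHITESPACE = re.compile(r"\s+")


def normalize(address: str) -> str:
    """Drop everything after the first comma, collapse spaces, lowercase."""
    base = _UNIT_SUFFIX.sub("", address)
    return _WHITESPACE.sub(" ", base).strip().lower()


def group_by_building(addresses):
    """Map each normalized address to the original records that share it."""
    groups = defaultdict(list)
    for original in addresses:
        groups[normalize(original)].append(original)
    return dict(groups)
```

Only the keys of the resulting dictionary would need to be sent to a geocoder; the coordinate returned for a key can then be written back to every original record in its group.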

Thanks!
JP
On Fri, Feb 4, 2011 at 3:10 PM, Stark Hans-Jörg <hansjoerg.stark at fhnw.ch> wrote:
The OpenAddresses project (http://www.openaddresses.org) is supposed to solve exactly your problem.
You can use the provided geocoding services (http://code.google.com/p/openaddresses/wiki/RESTService)

OpenAddresses has some regions where data was donated - there you will get high-quality results. Unfortunately this is not yet globally available...

Hth
-hj

From: discuss-bounces at lists.osgeo.org [mailto:discuss-bounces at lists.osgeo.org] On behalf of JP Glutting
Sent: Friday, 4 February 2011 14:34
To: discuss at lists.osgeo.org
Subject: [OSGeo-Discuss] Batch geocoding

Hello,

I have a large set of addresses (around 150k) that I need to geocode for a study (my Masters thesis on heat-related mortality). I am looking into different solutions, but I can't find anything that seems like it would work properly.

I could script a solution using Google's Maps API, but there is a limit of 2,500 addresses per day (I can get around it with a little patience).
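
To put a number on "a little patience", here is a minimal sketch of the arithmetic, using the figures mentioned above (around 150k addresses, 2,500 requests per day). The function names days_needed and daily_batches are made up for illustration.

```python
import math


def days_needed(n_addresses: int, daily_limit: int) -> int:
    """Number of days required if at most daily_limit requests run per day."""
    return math.ceil(n_addresses / daily_limit)


def daily_batches(items, daily_limit):
    """Yield consecutive slices of items, each no longer than daily_limit."""
    for start in range(0, len(items), daily_limit):
        yield items[start:start + daily_limit]
```

With these figures, days_needed(150_000, 2_500) evaluates to 60, so the full set would take about two months of daily runs unless duplicates are removed first.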

Right now the best solution I am looking at is geopy (http://code.google.com/p/geopy/) for geocoding addresses. It seems like a good system, and I think I can use it to pull addresses out of my database and write back coordinates. The one thing I am not sure about, though, is whether I am actually allowed to use the Google API without my use being linked to a specific web page. The terms of service and the form for getting a Google API key require a URL linked to a Google account. In fact, it looks like the API can only be used through a web site:

"5.2 Account Key. After supplying Google with your account information and the URL of your Maps API Implementation, and accepting the Terms, you will be issued an alphanumeric key assigned to you by Google that is uniquely associated with your Google Account and the URL of your Maps API Implementation. Your Maps API Implementation must import the Google Maps APIs using this key as described in the Maps APIs Documentation <http://code.google.com/apis/maps/documentation/>, and Google will block requests with an invalid key or invalid URL. You may only obtain and use a key in accordance with these Terms and the Maps APIs Documentation."

So it looks like I can't even get it to work without a URL.

I can always write a script that loops through results extracted from the database, creates URLs and parses the XML results one at a time, but that seems like a fairly inelegant solution.
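
For reference, the loop described above (build a URL per address, fetch it, parse the XML) might look something like the sketch below. The endpoint, query parameters, and XML layout are all placeholders invented for illustration; they do not correspond to Google's or any other real service and would have to be replaced with whatever the chosen geocoder documents.

```python
import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical endpoint and parameters, for illustration only.
BASE_URL = "https://geocoder.example/geocode"


def build_url(address: str) -> str:
    """Encode the address into the query string of the placeholder endpoint."""
    query = urllib.parse.urlencode({"q": address, "output": "xml"})
    return BASE_URL + "?" + query


def parse_coordinates(xml_text: str):
    """Read (lat, lon) from an assumed <result><lat/><lon/></result> reply."""
    root = ET.fromstring(xml_text)
    lat, lon = root.findtext("lat"), root.findtext("lon")
    if lat is None or lon is None:
        return None  # no match, or a different response layout
    return float(lat), float(lon)


def http_get(url: str) -> str:
    """Fetch a URL and return the body as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def geocode_all(addresses, fetch=http_get, delay_seconds=1.0):
    """Geocode addresses one at a time, pausing between requests."""
    results = {}
    for address in addresses:
        results[address] = parse_coordinates(fetch(build_url(address)))
        time.sleep(delay_seconds)  # stay polite towards the service
    return results
```

Passing the fetch function as a parameter keeps the loop testable without network access, and the pause between requests is one simple way to avoid hammering a rate-limited service.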

Does anyone have any good ideas about how to geocode a few thousand addresses?

Many thanks,
JP

_______________________________________________
Discuss mailing list
Discuss at lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/discuss

