Re: [OSGeo-Discuss] Batch geocoding
hansjoerg.stark at fhnw.ch
Fri Feb 4 06:26:33 PST 2011
Barcelona is unfortunately not well covered yet in OA. But I thought that Spain has a very "open" strategy in terms of providing geodata. If you manage to get Barcelona addresses (perhaps from council or any other "official body") the OA team will insert these into OA and then you can use the REST service.
From: discuss-bounces at lists.osgeo.org [mailto:discuss-bounces at lists.osgeo.org] On Behalf Of JP Glutting
Sent: Friday, 4 February 2011 15:24
To: OSGeo Discussions
Subject: Re: [OSGeo-Discuss] Batch geocoding
Thanks for all the responses! I will track them all down and see how they work.
Stark, I have 146,472 addresses in the city of Barcelona. Many of them are duplicates, but at the very least I have 31,514 that need to be geocoded, although that would leave out many that I want to use. I am in the process of developing filters to clean out apartment numbers, etc., so that I can pull a single coordinate for a whole set of addresses. At least 6,000 of the original 146k are invalid, which leaves about 140k. It is a lot. I will take a look at openaddresses and do some testing.
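A rough sketch of the kind of filter described above, in Python. It assumes each address is a free-text string in which the first number after the street name is the house number and anything after it (floor, door, apartment) can be dropped. That pattern, the sample addresses and the function names are illustrative assumptions, not a description of the actual data.

```python
import re
from collections import defaultdict

# Assumed pattern: street name, optional comma, then the house number.
# Everything after the house number (floor, door, apartment) is ignored.
_BUILDING = re.compile(r"^\s*(?P<street>.*?\D)\s*,?\s*(?P<number>\d+)")


def building_key(address):
    """Return a normalized 'street number' key, or None if no house number is found."""
    match = _BUILDING.match(address)
    if match is None:
        return None
    # Lowercase and collapse repeated whitespace so near-duplicates compare equal.
    street = " ".join(match.group("street").strip(" ,").lower().split())
    return f"{street} {match.group('number')}"


def group_by_building(addresses):
    """Group raw addresses by building; return (groups, invalid)."""
    groups = defaultdict(list)
    invalid = []
    for address in addresses:
        key = building_key(address)
        if key is None:
            invalid.append(address)
        else:
            groups[key].append(address)
    return dict(groups), invalid


groups, invalid = group_by_building([
    "Carrer de Mallorca 401, 3r 2a",
    "Carrer de Mallorca, 401 4t 1a",
    "Plaça de Catalunya",
])
```

Only one lookup per key in `groups` would then be needed, and the resulting coordinate can be copied to every address in that group.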
On Fri, Feb 4, 2011 at 3:10 PM, Stark Hans-Jörg <hansjoerg.stark at fhnw.ch> wrote:
The OpenAddresses project (http://www.openaddresses.org) is supposed to solve exactly your problem.
You can use the provided geocoding services (http://code.google.com/p/openaddresses/wiki/RESTService)
OpenAddresses has some regions where data was donated - there you will get high-quality results. Unfortunately this is not yet globally available...
From: discuss-bounces at lists.osgeo.org [mailto:discuss-bounces at lists.osgeo.org] On Behalf Of JP Glutting
Sent: Friday, 4 February 2011 14:34
To: discuss at lists.osgeo.org
Subject: [OSGeo-Discuss] Batch geocoding
I have a large set of addresses (around 150k) that I need to geocode for a study (my Master's thesis on heat-related mortality). I am looking into different solutions, but I can't find anything that seems like it would work properly.
I could script a solution using Google's map API, but there is a limit of 2,500 addresses per day (I can get around it with a little patience).
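To put a number on "a little patience", here is a small Python sketch that splits the work into daily batches. The 2,500 figure is the daily limit mentioned above; the function names are made up for illustration.

```python
import math


def days_needed(total_addresses, daily_limit=2500):
    """Days required when at most `daily_limit` requests are sent per day."""
    return math.ceil(total_addresses / daily_limit)


def daily_batches(addresses, daily_limit=2500):
    """Yield successive slices of at most `daily_limit` addresses."""
    for start in range(0, len(addresses), daily_limit):
        yield addresses[start:start + daily_limit]


print(days_needed(150_000))  # 60
```

So the full set of roughly 150k addresses would take about two months at one batch per day, which is why removing duplicates before geocoding matters.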
Right now the best solution I am looking at is geopy for geocoding addresses (http://code.google.com/p/geopy/). It seems like a good system, and I think I can use it to pull addresses out of my database and write back coordinates. The one thing I am not sure about, though, is whether I am actually allowed to use the Google API without my use being linked to a specific web page. The terms of service and the form for getting a Google API key require a URL linked to a Google account. In fact, it looks like the API can only be used through a web site:
"5.2 Account Key. After supplying Google with your account information and the URL of your Maps API Implementation, and accepting the Terms, you will be issued an alphanumeric key assigned to you by Google that is uniquely associated with your Google Account and the URL of your Maps API Implementation. Your Maps API Implementation must import the Google Maps APIs using this key as described in the Maps APIs Documentation<http://code.google.com/apis/maps/documentation/>, and Google will block requests with an invalid key or invalid URL. You may only obtain and use a key in accordance with these Terms and the Maps APIs Documentation<http://code.google.com/apis/maps/documentation/>."
So it looks like I can't even get it to work without a URL.
I can always write a script that loops through results extracted from the database, creates URLs and parses the XML results one at a time, but that seems like a fairly inelegant solution.
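Whichever geocoder ends up being used, the database loop itself (pull addresses, geocode, write back coordinates) could look roughly like the Python sketch below. The SQLite table layout, the `status` column and the function names are assumptions made for illustration. The `geocode` argument is any callable that returns an object with `latitude` and `longitude` attributes, or `None` when nothing is found; as far as I know, that is how the `geocode` method of geocoders in current geopy releases behaves, but check it against the version you install.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) addresses table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS addresses ("
        "id INTEGER PRIMARY KEY, address TEXT NOT NULL, "
        "lat REAL, lon REAL, status TEXT)"
    )
    return conn


def geocode_pending(conn, geocode, limit=2500):
    """Geocode up to `limit` untried rows; return how many were located."""
    rows = conn.execute(
        "SELECT id, address FROM addresses WHERE status IS NULL LIMIT ?",
        (limit,),
    ).fetchall()
    found = 0
    for row_id, address in rows:
        location = geocode(address)
        if location is None:
            # Mark misses so they are not retried on every run.
            conn.execute(
                "UPDATE addresses SET status = 'not_found' WHERE id = ?",
                (row_id,),
            )
            continue
        conn.execute(
            "UPDATE addresses SET lat = ?, lon = ?, status = 'ok' WHERE id = ?",
            (location.latitude, location.longitude, row_id),
        )
        found += 1
    conn.commit()
    return found
```

Running `geocode_pending` once a day with `limit` set to the service's daily quota keeps the job resumable, since rows already marked `ok` or `not_found` are skipped on the next run.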
Does anyone have any good ideas about how to geocode a few thousand addresses?