[postgis-users] Tigerdata for AZ, AS and VI

Andy Colson andy at squeakycode.net
Tue Dec 13 12:31:34 PST 2011


And instead of running the update, try running:

explain analyze
SELECT ag.id, 
(geocode(ag.address1||','||ag.city||','||ag.state||','||ag.zip)) As geo
       FROM qliq.geo_biz_addr As ag
       WHERE ag.rating IS NULL
       ORDER BY zip
       LIMIT 100


Also, the order by zip, combined with the limit, means it has to pull 
every record, then sort by zip, then pull the first 100.  If you can 
drop one or the other it would run faster.

-Andy


On 12/13/2011 12:37 PM, Greg Williamson wrote:
> Ravi --
>
> Could you run this with "EXPLAIN ANALYZE ..." and post the results; that might give something of a clue as to what issues the planner is encountering.
>
> Greg W.
>
>
>
> ----- Original Message -----
>> From: Ravi ada<raviada at dobeyond.com>
>> To: 'PostGIS Users Discussion'<postgis-users at postgis.refractions.net>
>> Cc:
>> Sent: Monday, December 12, 2011 8:25 PM
>> Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
>>
>> T hanks Steve, That's what I thought too, I ran the
>> 'install_missing_indexes"
>> function, it ran for a few minutes and returned 't'. I am assuming it
>> ran
>> successfully. The performance is still same. I increased the work_mem to 4GB
>> in postgresql.conf. It is still not acceptable.
>>
>> Leo/Regina, anything specific that you want me to verify on my system?
>> Performance is terrible, I can never finish geocoding 3million addresses
>> with this performance.
>>
>> Any help is highly appreciated.
>>
>> Thanks
>> Ravi Ada
>>
>> -----Original Message-----
>> From: postgis-users-bounces at postgis.refractions.net
>> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Stephen
>> Woodbridge
>> Sent: Monday, December 12, 2011 9:04 PM
>> To: postgis-users at postgis.refractions.net
>> Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
>>
>> Hi Ravi,
>>
>> I do not have this setup on my machine, but I am willing to hazard a guess
>> that you are missing an index, but then I have no idea which
>> one(s) that might be. Leo and Regina are probably the experts in this, so I
>> would look over their past posts on the geocoder. You might also look at the
>> load and prep scripts in svn and see if there is an index there that you do
>> not have on your tables.
>>
>> Regards,
>>     -Steve
>>
>> On 12/12/2011 9:50 PM, Ravi ada wrote:
>>>   In these examples, they used only 2GB memory and 3GHz machine but
>>>   still achieved a blazing fast results. The same queries mentioned in
>>>   the link taking 10 and even 100 times more time to query a particular
>>>   address. I am using a 16GB, 6 Core AMD machine, dedicated to this
>>>   process. I did the tuning on postgresql config file based on the
>>>   recommendations. I am attaching my file here.. Please let me know if
>>>   the tuning parameters look good.
>>>   http://postgis.refractions.net/documentation/manual-svn/Geocode.html
>>>
>>>   This query is supposed to take only (61ms) but on my machine is was
>> (734ms).
>>>   SELECT g.rating, ST_X(g.geomout) As lon, ST_Y(g.geomout) As lat,
>>>       (addy).address As stno, (addy).streetname As street,
>>>       (addy).streettypeabbrev As styp, (addy).location As city,
>>>   (addy).stateabbrev As st,(addy).zip
>>>       FROM geocode('75 State Street, Boston MA 02109') As g;
>>>
>>>   Thanks
>>>   Ravi Ada
>>>
>>>   -----Original Message-----
>>>   From: postgis-users-bounces at postgis.refractions.net
>>>   [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
>>>   Ravi ada
>>>   Sent: Monday, December 12, 2011 7:05 PM
>>>   To: 'PostGIS Users Discussion'
>>>   Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
>>>
>>>   Thank you Steve. I downloaded AZ files again and loaded fine but
>>>   others are still the same problem. According to your explanation that
>> should be ok.
>>>
>>>   I got the postgis database loaded for all states now. I have about 3
>>>   mil addresses, may not all be normalized, which I am trying to batch
>>>   geocode them. I am using the example mentioned in this link.
>>>   http://www.postgresonline.com/journal/archives/181-pgscript_intro.html
>>>
>>>   I am even using 100 as a batch, my update query is too slow. Its
>>>   updating at
>>>   1500 per hour. That's too slow, I will never be able to finish them.
>>>
>>>   I have 16GB RAM, and 7200 rpm disk partitioned to hold the postgres
>>>   table spaces. I am not sure what makes it run faster. Anybody has done
>>>   so many addresses before? What makes the performance go faster? I am
>>>   attaching the query and query plan here. www.pastie.org/3008194
>>>
>>>   Any help is appreciated.
>>>
>>>   Thanks
>>>   Ravi Ada
>>>
>>>   -----Original Message-----
>>>   From: postgis-users-bounces at postgis.refractions.net
>>>   [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
>>>   Stephen Woodbridge
>>>   Sent: Monday, December 12, 2011 8:50 AM
>>>   To: postgis-users at postgis.refractions.net
>>>   Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
>>>
>>>   On 12/12/2011 9:18 AM, Ravi ada wrote:
>>>>   Hello All,
>>>>
>>>>   Has anyone experienced loading tigerdata into postgis database for
>>>>   Arizona, American Samoa and Virgin Islands. I getting
>> "*addr.dbf"
>>>>   cannot find errors. All the other states are loaded fine. I tried to
>>>>   download the shape files again thinking that they might have been
>>>>   corrupted during the transmission, but even after that I am getting
>>>>   the
>>>   same error.
>>>>
>>>>   Any ideas?
>>>
>>>   My download of Tiger has all the *addr* files for Arizona and I
>>>   believe I have accessed them all without a problem.
>>>
>>>
>>>   In general, the *addr* files are optional, and there are none for
>>>   Guam, American Samoa and Virgin Islands.
>>>
>>>   Typically if the county or county equivalent does not have roads with
>>>   address ranges in it, then it will not have any *addr* files. So it is
>>>   possible that a county in Arizona in say the desert might not have any
>>>   address ranges and therefore not have that file, but looking at the
>>>   list of counties in Arizona it looks like they all have those files.
>>>
>>>   -Steve W




More information about the postgis-users mailing list