[postgis-users] Tigerdata for AZ, AS and VI
Greg Williamson
gwilliamson39 at yahoo.com
Tue Dec 13 17:33:55 PST 2011
Have you run ANALYZE on this table recently (since the last index build or the last major change in data)?
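A minimal sketch of what I mean, assuming the table is qliq.geo_biz_addr as in the query quoted below (pg_stat_user_tables is the standard statistics view; adjust the schema and table names to your setup):

```sql
-- When were planner statistics last gathered for the table?
-- NULL in both columns means it has not been analyzed since stats were reset.
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE schemaname = 'qliq'
  AND relname = 'geo_biz_addr';

-- Refresh the statistics for just this table.
ANALYZE VERBOSE qliq.geo_biz_addr;
```

If last_analyze and last_autoanalyze are both older than the last big load or index build, the row estimates in the plan may be stale.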
The 4GB work_mem setting is unlikely to help with this. maintenance_work_mem is the setting that applies when building indexes and the like; work_mem (called sort_mem before PostgreSQL 8.0) controls how much RAM a sort will try to use before it starts using disk. You might try tinkering with that unless it is already large, but remember that each sort in a query can use this much RAM, so too aggressive a setting is bad.
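Something along these lines would show what the server is actually running with and let you try a value for one session without editing postgresql.conf. This is only a sketch: the '64MB' figure is an arbitrary illustration, not a recommendation for your machine.

```sql
-- Confirm the values the running server picked up from postgresql.conf.
SHOW work_mem;
SHOW maintenance_work_mem;

-- Try a different value for the current session only
-- ('64MB' is an illustrative number, not a tuned one).
SET work_mem = '64MB';

-- ... re-run the EXPLAIN ANALYZE here and compare the timings ...

-- Go back to the configured default for this session.
RESET work_mem;
```

A session-level SET is handy for experiments because it cannot affect other connections and disappears when you disconnect.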
HTH,
Greg W.
>________________________________
> From: Ravi Ada <raviada at dobeyond.com>
>To: Andy Colson <andy at squeakycode.net>; PostGIS Users Discussion <postgis-users at postgis.refractions.net>
>Sent: Tuesday, December 13, 2011 3:28 PM
>Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
>
>
>Andy,
>Here is the explain analyze output.
>"Limit (cost=0.00..14.10 rows=100 width=73) (actual time=4824.392..98929.180 rows=100 loops=1)"
>" -> Index Scan using geo_biz_addr_zip_idx on geo_biz_addr ag (cost=0.00..219048.99 rows=1553779 width=73) (actual time=4824.381..98925.304 rows=100 loops=1)"
>" Filter: (rating IS NULL)"
>"Total runtime: 98930.371 ms"
>
>
>Here is the output for the query without ORDER BY zip.
>"Limit (cost=0.00..7.06 rows=100 width=73) (actual time=63022.583..279475.286 rows=100 loops=1)"
>" -> Seq Scan on geo_biz_addr ag (cost=0.00..109741.62 rows=1553779 width=73) (actual time=63022.571..279474.529 rows=100 loops=1)"
>" Filter: (rating IS NULL)"
>"Total runtime: 279475.678 ms"
>
>Surprisingly, it took longer without the ORDER BY clause; that may be because the addresses are scattered across all the states and cities. In any case, 100 to 300 seconds to geocode 100 addresses is too long. I have work_mem set to 4GB in postgresql.conf.
>
>
>Thanks
>Ravi Ada
>On Tue, 13 Dec 2011 14:31:34 -0600, Andy Colson wrote
>> And instead of running the update, try running:
>>
>> explain analyze
>> SELECT ag.id,
>>   (geocode(ag.address1||','||ag.city||','||ag.state||','||ag.zip)) As geo
>> FROM qliq.geo_biz_addr As ag
>> WHERE ag.rating IS NULL
>> ORDER BY zip
>> LIMIT 100
>>
>> Also, the order by zip, combined with the limit, means it has to
>> pull every record, then sort by zip, then pull the first 100. If
>> you can drop one or the other it would run faster.
>>
>> -Andy
>>
>> On 12/13/2011 12:37 PM, Greg Williamson wrote:
>> > Ravi --
>> >
>> > Could you run this with "EXPLAIN ANALYZE ..." and post the results; that might give something of a clue as to what issues the planner is encountering.
>> >
>> > Greg W.
>> >
>> > ----- Original Message -----
>> >> From: Ravi ada <raviada at dobeyond.com>
>> >> To: 'PostGIS Users Discussion' <postgis-users at postgis.refractions.net>
>> >> Cc:
>> >> Sent: Monday, December 12, 2011 8:25 PM
>> >> Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
>> >>
>> >> Thanks Steve, that's what I thought too. I ran the 'install_missing_indexes' function; it ran for a few minutes and returned 't', so I am assuming it ran successfully. The performance is still the same. I increased work_mem to 4GB in postgresql.conf. It is still not acceptable.
>> >>
>> >> Leo/Regina, is there anything specific that you want me to verify on my system? Performance is terrible; I can never finish geocoding 3 million addresses with this performance.
>> >>
>> >> Any help is highly appreciated.
>> >>
>> >> Thanks
>> >> Ravi Ada
>> >>
>> >> -----Original Message-----
>> >> From: postgis-users-bounces at postgis.refractions.net
>> >> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Stephen Woodbridge
>> >> Sent: Monday, December 12, 2011 9:04 PM
>> >> To: postgis-users at postgis.refractions.net
>> >> Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
>> >>
>> >> Hi Ravi,
>> >>
>> >> I do not have this setup on my machine, but I am willing to hazard a guess that you are missing an index, but then I have no idea which one(s) that might be. Leo and Regina are probably the experts in this, so I would look over their past posts on the geocoder. You might also look at the load and prep scripts in svn and see if there is an index there that you do not have on your tables.
>> >>
>> >> Regards,
>> >> -Steve
>> >>
>> >> On 12/12/2011 9:50 PM, Ravi ada wrote:
>> >>> In these examples, they used only 2GB of memory and a 3GHz machine but still achieved blazing fast results. The same queries mentioned in the link are taking 10 and even 100 times more time on my machine to query a particular address. I am using a 16GB, 6-core AMD machine dedicated to this process. I tuned the postgresql config file based on the recommendations. I am attaching my file here. Please let me know if the tuning parameters look good.
>> >>> http://postgis.refractions.net/documentation/manual-svn/Geocode.html
>> >>>
>> >>> This query is supposed to take only 61ms, but on my machine it took 734ms.
>> >>> SELECT g.rating, ST_X(g.geomout) As lon, ST_Y(g.geomout) As lat,
>> >>>   (addy).address As stno, (addy).streetname As street,
>> >>>   (addy).streettypeabbrev As styp, (addy).location As city,
>> >>>   (addy).stateabbrev As st, (addy).zip
>> >>> FROM geocode('75 State Street, Boston MA 02109') As g;
>> >>>
>> >>> Thanks
>> >>> Ravi Ada
>> >>>
>> >>> -----Original Message-----
>> >>> From: postgis-users-bounces at postgis.refractions.net
>> >>> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Ravi ada
>> >>> Sent: Monday, December 12, 2011 7:05 PM
>> >>> To: 'PostGIS Users Discussion'
>> >>> Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
>> >>>
>> >>> Thank you Steve. I downloaded the AZ files again and they loaded fine, but the others still have the same problem. According to your explanation, that should be OK.
>> >>>
>> >>> I have the postgis database loaded for all states now. I have about 3 million addresses, which may not all be normalized, that I am trying to batch geocode. I am using the example mentioned in this link.
>> >>> http://www.postgresonline.com/journal/archives/181-pgscript_intro.html
>> >>>
>> >>> Even using a batch size of 100, my update query is too slow. It is updating at 1500 per hour. That's too slow; I will never be able to finish them.
>> >>>
>> >>> I have 16GB RAM and a 7200 rpm disk partitioned to hold the postgres tablespaces. I am not sure what would make it run faster. Has anybody done this many addresses before? What makes the performance go faster? I am attaching the query and query plan here: www.pastie.org/3008194
>> >>>
>> >>> Any help is appreciated.
>> >>>
>> >>> Thanks
>> >>> Ravi Ada
>> >>>
>> >>> -----Original Message-----
>> >>> From: postgis-users-bounces at postgis.refractions.net
>> >>> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Stephen Woodbridge
>> >>> Sent: Monday, December 12, 2011 8:50 AM
>> >>> To: postgis-users at postgis.refractions.net
>> >>> Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
>> >>>
>> >>> On 12/12/2011 9:18 AM, Ravi ada wrote:
>> >>>> Hello All,
>> >>>>
>> >>>> Has anyone had problems loading tigerdata into a postgis database for Arizona, American Samoa and Virgin Islands? I am getting "*addr.dbf" cannot find errors. All the other states loaded fine. I tried to download the shape files again, thinking that they might have been corrupted during transmission, but even after that I am getting the same error.
>> >>>>
>> >>>> Any ideas?
>> >>>
>> >>> My download of Tiger has all the *addr* files for Arizona and I believe I have accessed them all without a problem.
>> >>>
>> >>> In general, the *addr* files are optional, and there are none for Guam, American Samoa and Virgin Islands.
>> >>>
>> >>> Typically if the county or county equivalent does not have roads with address ranges in it, then it will not have any *addr* files. So it is possible that a county in Arizona in say the desert might not have any address ranges and therefore not have that file, but looking at the list of counties in Arizona it looks like they all have those files.
>> >>>
>> >>> -Steve W
>
>
>Thanks,
>Ravi Ada
>918-630-7381
>