[postgis-users] Tigerdata for AZ, AS and VI
Paragon Corporation
lr at pcorp.us
Thu Dec 15 17:52:00 PST 2011
Ravi,
I thought I had optimized those, but I may have missed something. The best
way to see what is going on is to run normalize_address on them and see how
it's reading the values.
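A minimal sketch of that check in Python, assuming a DB-API 2.0 cursor (for example one from psycopg2, with the %s parameter style) connected to a database that has the tiger geocoder installed. The helper names show_parse and print_parse are made up for illustration; normalize_address is the geocoder's own function.

```python
def show_parse(cursor, address):
    """Run normalize_address on one address and return its parsed fields.

    Assumption: `cursor` is a DB-API 2.0 cursor using the %s parameter
    style. `show_parse` is a hypothetical helper, not part of PostGIS.
    """
    cursor.execute("SELECT * FROM normalize_address(%s)", (address,))
    row = cursor.fetchone()
    # cursor.description holds one entry per column; item 0 is the name.
    names = [col[0] for col in cursor.description]
    return dict(zip(names, row))


def print_parse(cursor, addresses):
    """Print one 'field = value' line per parsed field for each address."""
    for address in addresses:
        print(address)
        for field, value in show_parse(cursor, address).items():
            print(f"    {field} = {value!r}")
```

Calling print_parse(cur, ["100 29TH EAST ST.", "I-35 HIGHWAY"]) lists the fields for the slow addresses, so a street name or street type that was read differently from what you expected stands out.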
At any rate, please file a bug ticket for this with some of the examples you
are trying and how long each one is taking.
http://trac.osgeo.org/postgis/newticket
Make sure to set the component to "tiger geocoder" in the ticket so it gets
assigned to me.
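One way to collect the timings for the ticket is sketched below in Python. It assumes a DB-API 2.0 cursor (for example from psycopg2) on a database with the tiger geocoder loaded. The names time_geocode and format_report are hypothetical helpers; geocode and pprint_addy are the geocoder's own functions.

```python
import time


def time_geocode(cursor, addresses, clock=time.perf_counter):
    """Time geocode() per address; return (address, seconds, row_count) tuples.

    Assumption: `cursor` is a DB-API 2.0 cursor using the %s parameter
    style. `clock` can be swapped out, which makes the helper testable.
    """
    results = []
    for address in addresses:
        start = clock()
        cursor.execute(
            "SELECT g.rating, pprint_addy(g.addy) FROM geocode(%s, 1) AS g",
            (address,),
        )
        rows = cursor.fetchall()
        results.append((address, clock() - start, len(rows)))
    return results


def format_report(results):
    """Render the timings as plain-text lines that can be pasted into a ticket."""
    return "\n".join(
        f"{seconds:8.3f} s  {rows} row(s)  {address}"
        for address, seconds, rows in results
    )
```

Running the same list twice and reporting both passes is worthwhile, since the first geocode in an area can be much slower than later ones because of caching.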
Thanks,
Regina
http://www.postgis.us
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On
> Behalf Of Ravi ada
> Sent: Thursday, December 15, 2011 12:35 PM
> To: 'PostGIS Users Discussion'
> Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
>
> Thank you. I was able to speed things up a bit by launching the query
> for each state and doing 6 states at a time. I can see that
> all processors are being used, with 80-85% memory
> utilization. However, I noticed that the geocode function
> takes forever to return for some addresses, such as
> '100 29TH EAST ST.' or 'I-35 HIGHWAY'. Some states have the
> convention of using numbers for street names. Numbered streets and
> highway service roads are the ones taking the most time. I
> wonder why? Any ideas on what we need to do to speed these up?
>
> Thanks,
> Ravi Ada
>
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On
> Behalf Of Stephen Woodbridge
> Sent: Wednesday, December 14, 2011 1:30 PM
> To: postgis-users at postgis.refractions.net
> Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
>
> On 12/14/2011 10:12 AM, Ravi ada wrote:
> > Thanks Steve.
> > I did not understand what you meant by '2. normalize the names as you
> > load the data'. Is this a step we need to do manually, or do the load
> > scripts do it automatically?
>
> the load scripts do this automatically.
>
> > I am normalizing my addresses before querying the reference data
> > (tiger data). Please clarify.
>
> You should use geocode() not geocode_address(), because
> geocode() will normalize the address in the same way that the
> reference addresses are normalized. If you split the address
> into fields yourself and call geocode_address(), that is NOT
> the same as normalizing the address. If you want to get good
> matches, you have to use the same normalize function for both
> the reference and the input addresses.
>
> These functions are not about performance; they are about
> using the tool the correct way.
>
> -Steve
>
> > Thanks
> > Ravi Ada
> >
> > -----Original Message-----
> > From: postgis-users-bounces at postgis.refractions.net
> > [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
> > Stephen Woodbridge
> > Sent: Wednesday, December 14, 2011 8:14 AM
> > To: postgis-users at postgis.refractions.net
> > Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
> >
> > Ravi,
> >
> > The process for geocoding follows this:
> >
> > Load the data:
> > 1. get a reference set of streets (ie: the Tiger data)
> > 2. normalize the names as you load the data
> > 3. build the indexes you need for the queries
> >
> > Query for an address:
> > 1. normalize the address on input
> > 2. query the normalized reference
> >
> > Ok, so you know most of this because you have already done it, but the
> > important part here is that you normalize BOTH the reference data and
> > the input to a query. This resolves things like:
> >
> > main street != main st
> >
> > because the normalize step parses the addresses and converts them into a
> > normalized standard form so that you can match. Yes, it takes time to
> > normalize the request, but if you don't normalize it, then there is a
> > good chance that you will not match an appropriate street in the
> > reference set.
> >
> > -Steve
> >
> > On 12/14/2011 9:06 AM, Ravi Ada wrote:
> >> Regina,
> >>
> >> Thanks so much for the reply. I ran
> >> missing_indexes_generate_script() and it did not return anything,
> >> so I am assuming all the indexes are in place. That may be because
> >> I ran install_missing_indexes() earlier.
> >> I changed the debug flag in geocode_address and it produced the very
> >> long query that it runs to geocode the address. I tried to cut and
> >> paste the query to get its plan, but I am getting errors. I will
> >> figure that out.
> >>
> >> My question is, do we use geocode or geocode_address for faster
> >> querying? I noticed that geocode_address takes the normalized address
> >> whereas geocode takes the address as a string parameter. Does adding
> >> a separate normalize_address call before geocode_address make
> >> it run any faster?
> >>
> >>
> >> Thanks
> >> Ravi Ada
> >>
> >>
> >>
> >>
> >> On Wed, 14 Dec 2011 04:18:27 -0500, Paragon Corporation wrote
> >>>> I just don't understand why the geocode function takes so long to
> >>>> return the coordinates. I am sure some of you on this list have
> >>>> batch geocoded millions of addresses. I may be missing just a
> >>>> simple configuration setting that would make a whole lot of
> >>>> difference in the speed. I don't know what it is. I am following
> >>>> the examples exactly from this link
> >>>> (http://postgis.refractions.net/documentation/manual-svn/Geocode.html)
> >>>>
> >>>> If someone familiar with the problem is willing to help me over a
> >>>> GoTo Meeting connection to my machine, I can arrange that too. I
> >>>> just have to move along with my project and meet the deadlines. I
> >>>> am already delayed, and everybody on my team is asking me for this
> >>>> every day.
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Ravi Ada
> >>>>
> >>>>
> >>> Ravi,
> >>>
> >>> Sorry been busy with raster stuff so haven't been tuned into this
> >>> discussion.
> >>>
> >>> 1) The indexes the loader generates are not the only ones needed.
> >>> Initially I kept changing the loader script, but since the
> >>> optimal indexes changed every time we changed the code, I created
> >>> a function that puts them in rather than bothering with the
> >>> loader (since a lot of people would already have their data loaded).
> >>>
> >>> Have you tried running that? I suspect you are just missing indexes,
> >>> as the timings you are getting are what I used to get earlier on.
> >>>
> >>> If you haven't run the update script (which runs this routine anyway)
> >>> or run this to get the generated script for the indexes you are
> >>> missing, you should.
> >>>
> >>> http://www.postgis.org/documentation/manual-svn/Missing_Indexes_Generate_Script.html
> >>>
> >>> 2) There are a couple of other things to note. The first address you
> >>> geocode in an area can take a lot more time because of data caching
> >>> effects in PostgreSQL. Take the example in the docs that you
> >>> describe.
> >>>
> >>> I can do a geocode of 75 State Street, Boston, MA, and if I
> >>> haven't done any geocoding in a while that takes about 1-3 seconds.
> >>>
> >>> Then if I do 80 State Street, Boston, MA, that subsequent call takes
> >>> anywhere from 60 ms to 150 ms.
> >>> I also don't have all the states loaded since I only needed it for
> >>> about 6 states, though that should just increase the planner time
> >>> rather than later times.
> >>>
> >>> 3) For debugging performance there is a variable in the
> >>> geocode_address function called var_debug. It's false by default;
> >>> change it to true.
> >>> That spits out the SQL being run, which is a better query to pass
> >>> to the planner to check.
> >>>
> >>> We were hoping to make these debugging features more publicly
> >>> exposed, e.g. via a config table, but haven't had the time to do that.
> >>>
> >>> Hope this all helps,
> >>> Regina
> >>> http://www.postgis.us
> >>>
> >>> _______________________________________________
> >>> postgis-users mailing list
> >>> postgis-users at postgis.refractions.net
> >>> http://postgis.refractions.net/mailman/listinfo/postgis-users
> >>
> >>
> >> Thanks,
> >> Ravi Ada
> >> 918-630-7381