[postgis-users] I've got a silly question about performance

Josh Livni josh at umbrellaconsulting.com
Fri Jan 18 11:21:01 PST 2008


Chris,

Good point - thanks for calling me out on that.   I think we're both 
making assumptions about a use case -- my assumption was he would 
probably be doing a similar calculation many times, and thus my bias 
towards using points rather than polys. 

Also, as noted - it's hard to say whether centroids or points would 
suffice or which is better.  If a parcel intersects two polygons, and 
you want to know which polygon it belongs to (and only assign it to one 
of the two that it intersects), and if you are working only with point 
proxies for parcels, then it's possible a centroid may be a more valid 
assignment than a point_on_surface, even if the centroid itself lies 
outside the parcel. 

Anyway, without having more detail on the actual use-case and workflow, 
I have no great opinion on what methodology might work best...

Cheers,
 
 -Josh


Chris Hermansen wrote:
> Josh, one of us is missing something here.
>
> Point on polygon is faster.  But in the method he proposed, first you
> have to calculate the points.  And getting centroids requires visiting
> all of the points on the polygon.
>
> So I'm not seeing how this could be faster.
>
> If he's doing it more than once on the same data set, I agree with your
> pre-calculation comment.
>
> Some other thoughts:
>
> Note that centroids would however be a bad choice, as they can easily be
> outside the polygon.
>
> Even points within the polygons are not necessarily a good choice,
> depending on how the two input geometries overlap.  I guess it would
> depend on how parcels are "really" assigned to disticts - ie can the
> assignment be replicated algorithmically?
>
> Josh Livni wrote:
>   
>> Well, and again assuming he doesn't mind getting the list of parcels
>> whose centroids intersect each district rather than the list of
>> parcels that intersect each district, checking for point-polygon
>> intersection is going to be faster than checking for polygon-polygon
>> intersection.  So for speed purposes it's probably a good way to go.
>>
>> For a first pass, after ensuring there is a gist index on the centroid
>> (or in this case point-on-surface - tho I'm not sure why), I would
>> expect the following to be faster than the equivalent query w/the full
>> polygons rather than the points:
>>   ...
>>    AND election.the_geom && ST_pointonsurface(parcel2.wkb_geometry)
>>    AND
>> ST_distance(election.the_geom,ST_pointonsurface(parcel2.wkb_geometry))
>> = 0
>>  ...
>> And if you really needed speed, adding a new geometry field with the
>> points precalculated and using that in the second line would make
>> things even faster.
>>
>>  -Josh
>>
>>
>>
>> Chris Hermansen wrote:
>>     
>>> If you want a list of parcels in each precinct, why are you first
>>> creating points?  That's just going to add extra - unnecessary -
>>> geoprocessing.
>>>
>>> Try st_intersects() and st_intersection().
>>>
>>> If you are concerned your parcels overlap more than one precinct, get
>>> the area() of each intersection(), get the max() of that, and group by
>>> your other variables.
>>>
>>> Gregory Williamson wrote:
>>>  
>>>       
>>>> The GIST index is the one you want / need for spatial operations. Make
>>>> sure that you ANALYZE <tablename>; after creating the index or
>>>> changing the data much (rule of thumb is maybe 10% but analyze is a
>>>> fast and fairly low impact operation).
>>>>
>>>> The ST_ functions usually (always?) invoke the && operator (which is
>>>> the spatial operator which will use the GIST indexes); they are
>>>> wrapper for the underlying functions.
>>>>
>>>> Posting your information [type of hardware, OS, postgreSQL and postGIS
>>>> versions, table structure and indexes, tables sizes, the query itself
>>>> and the results of EXPLAIN ANALYZE <query>] to the PostgreSQL
>>>> performance mail list might be worthwhile if this list doesn't help
>>>> enough.
>>>>
>>>> Greg Williamson
>>>> Senior DBA
>>>> Globexplorer LLC, a company owned by DigitalGlobe
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: postgis-users-bounces at postgis.refractions.net on behalf of
>>>> easpengren
>>>> Sent: Thu 1/17/2008 12:02 AM
>>>> To: postgis-users at postgis.refractions.net
>>>> Subject: Re: [postgis-users] I've got a silly question about
>>>> performance
>>>>
>>>>
>>>> OK, I think I've got that. When I imported these shape files into the
>>>> database, an index was created. I added a Gist index to both to see
>>>> what
>>>> would happen. Will that cause a problem?
>>>>
>>>> I do not understand Indices at all. I'll have to read up on them.
>>>>
>>>> We'll see how these things come out. I would like to have something
>>>> usable
>>>> in the next day or two.
>>>>
>>>> I'll likely end up creating a view of this query. Again, I'm still
>>>> figuring
>>>> this out. Speed is really quite important for me. I don't have the most
>>>> powerful machine doing this stuff (more RAM will likely show up this
>>>> week).
>>>> What I'd like to do is make sure the software is working as
>>>> efficiently as I
>>>> can get it.
>>>>
>>>> As I type this, I seem to have gotten a much quicker response from the
>>>> server.
>>>>
>>>> Wow.
>>>>
>>>> Eric
>>>>
>>>> Brent Wood-2 wrote:
>>>>    
>>>>         
>>>>> --- easpengren <phat-ass at thinkheavyindustries.com> wrote:
>>>>>
>>>>>      
>>>>>           
>>>>>> I'm still getting the hang of some of the finer points of creating
>>>>>> queries in
>>>>>> PostGIS, as is probably obvious with my last post.
>>>>>>
>>>>>> I've two tables, parcel2 that is a collection of parcels in a
>>>>>>         
>>>>>>             
>>>> county and
>>>>    
>>>>         
>>>>>> a
>>>>>> table election, which is a table of voting precincts in the same
>>>>>>         
>>>>>>             
>>>> county.
>>>>    
>>>>         
>>>>>> I'd
>>>>>> like to select all of the parcels in each precinct.
>>>>>>
>>>>>>         
>>>>>>             
>>>>> That query looks OK, but if parcels can be split across precincts,
>>>>>       
>>>>>           
>>>> then it
>>>>    
>>>>         
>>>>> won't necessarily give the correct answer.
>>>>>
>>>>>      
>>>>>           
>>>>>> I have this query:
>>>>>>
>>>>>> select precinct, sit_st_num, sit_st_dir, sit_st_nam, sit_st_typ,
>>>>>> city_code
>>>>>> from election, parcel2 where ST_contains(election.the_geom,
>>>>>> ST_pointonsurface(parcel2.wkb_geometry));
>>>>>>
>>>>>> This gets the job done, but it's very slow. What can I do to speed
>>>>>> this
>>>>>> up?
>>>>>>         
>>>>>>             
>>>>> See the PostGIS docs about creating spatial indices on the geometry
>>>>> columns in
>>>>> your two tables.
>>>>>
>>>>> http://postgis.refractions.net/docs/ch04.html#id2761842
>>>>> http://postgis.refractions.net/docs/ch04.html#id2761985
>>>>>
>>>>> If you have, or if you create them, then you need to modify your
>>>>>       
>>>>>           
>>>> query to
>>>>    
>>>>         
>>>>> use
>>>>> them:
>>>>>
>>>>> .... where election.the_geom && parcel2.wkb_geometry and
>>>>> ST_contains ...
>>>>>
>>>>> as described in:
>>>>> http://postgis.refractions.net/docs/ch04.html#id2762121
>>>>>
>>>>>
>>>>>
>>>>> Hope this helps...
>>>>>
>>>>>   Brent Wood
>>>>> _______________________________________________
>>>>> postgis-users mailing list
>>>>> postgis-users at postgis.refractions.net
>>>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>>>
>>>>>
>>>>>       
>>>>>           
>>>> -- 
>>>> View this message in context:
>>>> http://www.nabble.com/I%27ve-got-a-silly-question-about-performance-tp14910258p14913339.html
>>>>
>>>> Sent from the PostGIS - User mailing list archive at Nabble.com.
>>>>
>>>> _______________________________________________
>>>> postgis-users mailing list
>>>> postgis-users at postgis.refractions.net
>>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>
>>>> _______________________________________________
>>>> postgis-users mailing list
>>>> postgis-users at postgis.refractions.net
>>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>>       
>>>>         
>>>   
>>>       
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>     
>
>
>   



More information about the postgis-users mailing list