[postgis-users] need help optimizing spatial join / intersection query

Martin Davis mbdavis at refractions.net
Wed Jun 18 13:35:14 PDT 2008

I assume that the way PostgreSQL runs this query is sequentially on a 
single core?  That leaves 7 cores standing idly by.  Is there any simple 
way to get them involved?  Perhaps partition the data by some attribute 
and run multiple queries?

Mark, is there any chance of you posting your datasets for 
experimentation purposes?

Paul Ramsey wrote:
> Asked and answered? 15 minutes = 900 seconds / 12700 intersections =
> 70ms per intersection calculation. If your 10 rainfalls are fairly
> complex (what's the vertex count?) I don't think that's all that
> terrible.  Removing the intersects() test will make things modestly
> faster, but not earth-shattering.
> P
> On Wed, Jun 18, 2008 at 9:40 PM, Mark Phillips <mphillip at unca.edu> wrote:
>> Hi,
>> I am a relative newcomer to postgis and am trying to figure out how to best
>> optimize an interesting query.
>> I have two tables containing (multi)polygons, one representing drainage
>> basins, and the other representing rainfall amounts. The rainfall table has
>> an attribute giving the amount of rain in mm associated with each polygon.
>>     'basin' table:
>>          gid       integer,
>>          the_geom  geometry
>>     'rainfall' table:
>>         gid         integer,
>>         the_geom    geometry,
>>         rainamount  numeric
>> I want to compute the total volume of rain in each basin by taking the
>> intersection of each basin with each rainfall polygon, multiplying the area
>> of that intersection by the rain amount value for the corresponding rain
>> polygon, and adding up all the resulting totals for each basin, storing the
>> result in a new table.  I have spatial indexes on both tables, and I've
>> tried the following query using the && operator to make use of the indexes:
>>     create table basinrain as
>>         select bgid,
>>                sum(arearain) as totrain
>>           from (
>>                  select b.gid as bgid,
>>                         r.gid as rgid,
>>                         r.rainamount * area(intersection(b.the_geom,
>> r.the_geom)) as arearain
>>                    from basin b,
>>                         rain  r
>>                   where b.the_geom && r.the_geom
>>                     and intersects(b.the_geom, r.the_geom)
>>                  ) foo
>>           group by bgid
>> This seems to work just fine, but it is much slower than I would expect.  My
>> basin table has about 2200 rows; their size and geometric complexity is
>> roughly comparable to US county polygons.  The rain table has about 10 rows,
>> but each one represents a pretty complicated multipolygon with (many)
>> holes.  The query "select count(*) from basin, rain where basin.the_geom &&
>> rain.the_geom" executes very quickly and returns 12746, which I take to mean
>> that (a) my spatial indexes are in fact in place and working, and (b) there
>> are 12746 "possible" intersections to be computed in the bigger query
>> above.  On a dual quad-core 3GHz Xeon system with nothing else going on,
>> though, the bigger query takes about 15 minutes to run, which seems to me
>> like a long time for computing 12746 intersections / areas.  (I know that
>> comes out to an average of about 14 intersection/area computations per
>> second, which is way faster than I could do it by hand of course, but for
>> some reason I would expect it to be even faster than that.)
>> Is this surprising to anyone else?  Can someone suggest other ways to
>> optimize this?
>> Thanks in advance,
>> --Mark
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users

Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022

More information about the postgis-users mailing list