Yes, I can make the data available; I'll post it in the next day or so and send another note with a link.<br><br>I'd love to hear about ways to parallelize the query!<br><br>--Mark<br><br><div class="gmail_quote">

On Wed, Jun 18, 2008 at 4:35 PM, Martin Davis <<a href="mailto:mbdavis@refractions.net">mbdavis@refractions.net</a>> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

I assume PostgreSQL runs this query sequentially on a single core?  That leaves 7 cores standing idle.  Is there any simple way to get them involved?  Perhaps partition the data by some attribute and run multiple queries?<br>
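<br>
A minimal sketch of that partitioning idea, in Python, assuming basin.gid is a non-negative integer so that gid % N splits the basins into N disjoint groups. The names N_WORKERS, partition_queries, run_all, and the caller-supplied run_query function are hypothetical; run_query is assumed to open its own connection, since each PostgreSQL connection is served by its own backend process. The SQL is a flattened form of the query quoted below, and whether this actually scales across the cores has not been measured here.<br>

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical worker count: one per core on the 8-core machine in this thread.
N_WORKERS = 8

# Flattened version of the basin/rain query, restricted to one gid partition.
# Assumes gid is a non-negative integer, so gid % n falls in 0..n-1.
QUERY_TEMPLATE = """
select b.gid as bgid,
       sum(r.rainamount * area(intersection(b.the_geom, r.the_geom))) as totrain
  from basin b, rain r
 where b.the_geom && r.the_geom
   and intersects(b.the_geom, r.the_geom)
   and b.gid % {n} = {k}
 group by b.gid
"""


def partition_queries(n_workers=N_WORKERS):
    """Return one SQL string per partition; each basin gid matches exactly one."""
    return [QUERY_TEMPLATE.format(n=n_workers, k=k) for k in range(n_workers)]


def run_all(run_query, n_workers=N_WORKERS):
    """Run every partition concurrently and concatenate the returned rows.

    run_query is a caller-supplied (hypothetical) function that opens its own
    database connection, executes one SQL string, and returns a list of rows.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(run_query, partition_queries(n_workers))
        return [row for rows in results for row in rows]
```

Because every basin lands in exactly one partition and the grouping is per basin, the per-partition results can simply be concatenated.<br>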


<br>

Mark, is there any chance of you posting your datasets for experimentation purposes?<div><div></div><div class="Wj3C7c"><br>

<br>

Paul Ramsey wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Asked and answered? 15 minutes = 900 seconds / 12700 intersections =<br>

70ms per intersection calculation. If your 10 rainfalls are fairly<br>

complex (what's the vertex count?) I don't think that's all that<br>

terrible.  Removing the intersects() test will make things modestly<br>

faster, but not earth-shatteringly so.<br>
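
<br>
The per-intersection figure can be checked in a few lines of Python. This sketch uses the exact count of 12746 candidate pairs reported in the original message rather than the rounded 12700; the variable names are only illustrative.<br>

```python
# Reported wall-clock time for the full query: about 15 minutes.
total_seconds = 15 * 60

# Candidate basin/rain pairs returned by the && (bounding-box) count query.
candidate_pairs = 12746

# Average cost per candidate pair, in milliseconds (roughly 70 ms).
ms_per_pair = 1000 * total_seconds / candidate_pairs

# Equivalent throughput in pairs per second (roughly 14).
pairs_per_second = candidate_pairs / total_seconds

print(f"{ms_per_pair:.1f} ms per pair, {pairs_per_second:.1f} pairs per second")
```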

<br>

P<br>

<br>

On Wed, Jun 18, 2008 at 9:40 PM, Mark Phillips <<a href="mailto:mphillip@unca.edu" target="_blank">mphillip@unca.edu</a>> wrote:<br>

  <br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Hi,<br>

<br>

I am a relative newcomer to PostGIS and am trying to figure out how best to<br>

optimize an interesting query.<br>

<br>

I have two tables containing (multi)polygons, one representing drainage<br>

basins, and the other representing rainfall amounts. The rainfall table has<br>

an attribute giving the amount of rain in mm associated with each polygon.<br>

<br>

    'basin' table:<br>

         gid       integer,<br>

         the_geom  geometry<br>

<br>

    'rain' table:<br>

        gid         integer,<br>

        the_geom    geometry,<br>

        rainamount  numeric<br>

<br>

I want to compute the total volume of rain in each basin by taking the<br>

intersection of each basin with each rainfall polygon, multiplying the area<br>

of that intersection by the rain amount value for the corresponding rain<br>

polygon, and adding up all the resulting totals for each basin, storing the<br>

result in a new table.  I have spatial indexes on both tables, and I've<br>

tried the following query using the && operator to make use of the indexes:<br>

<br>

    create table basinrain as<br>

        select bgid,<br>

               sum(arearain) as totrain<br>

          from (<br>

                 select b.gid as bgid,<br>

                        r.gid as rgid,<br>

                        r.rainamount * area(intersection(b.the_geom, r.the_geom)) as arearain<br>

                   from basin b,<br>

                        rain  r<br>

                  where b.the_geom && r.the_geom<br>

                    and intersects(b.the_geom, r.the_geom)<br>

                 ) foo<br>

          group by bgid<br>

<br>

This seems to work just fine, but it is much slower than I would expect.  My<br>

basin table has about 2200 rows; their size and geometric complexity are<br>

roughly comparable to US county polygons.  The rain table has about 10 rows,<br>

but each one represents a pretty complicated multipolygon with (many)<br>

holes.  The query "select count(*) from basin, rain where basin.the_geom &&<br>

rain.the_geom" executes very quickly and returns 12746, which I take to mean<br>

that (a) my spatial indexes are in fact in place and working, and (b) there<br>

are 12746 "possible" intersections to be computed in the bigger query<br>

above.  On a dual quad-core 3GHz Xeon system with nothing else going on,<br>

though, the bigger query takes about 15 minutes to run, which seems to me<br>

like a long time for computing 12746 intersections / areas.  (I know that<br>

comes out to an average of about 14 intersection/area computations per<br>

second, which is way faster than I could do it by hand of course, but for<br>

some reason I would expect it to be even faster than that.)<br>

<br>

Is this surprising to anyone else?  Can someone suggest other ways to<br>

optimize this?<br>

<br>

Thanks in advance,<br>

<br>

--Mark<br>

<br>

<br>

_______________________________________________<br>

postgis-users mailing list<br>

<a href="mailto:postgis-users@postgis.refractions.net" target="_blank">postgis-users@postgis.refractions.net</a><br>

<a href="http://postgis.refractions.net/mailman/listinfo/postgis-users" target="_blank">http://postgis.refractions.net/mailman/listinfo/postgis-users</a><br>

<br>

<br>

    <br>

</blockquote>


<br>

  <br>

</blockquote>

<br></div></div><font color="#888888">

-- <br>

Martin Davis<br>

Senior Technical Architect<br>

Refractions Research, Inc.<br>

(250) 383-3022</font><div><div></div><div class="Wj3C7c"><br>

<br>


</div></div></blockquote></div><br>