[postgis-users] Optimizing a query with Intersects and Intersection

Wed Feb 6 06:40:48 PST 2008

    Hi all,

    I'm looking for advice on ways to optimize the following query:

SELECT species_id, season, count(*),
        sum(area(Intersection(geom, 
MPolyFromText(MultiPolygon('...'))))) as area
FROM distrib
WHERE geom && MPolyFromText(MultiPolygon('...')
        AND Intersects(geom, MPolyFromText(MultiPolygon('...'))
GROUP BY species_id, season;

    I'm running this query against a feature table containing about 150K 
features.

    The actual MultiPolygon used in the query is specified by the user 
of my application and have been omitted for conciseness. It is usually 
the contour of a county, state or country, or several of them, so it may 
be a complex feature.

    When the extent of the MultiPolygon is small comparing to the extent 
of the features in the distrib table, the query runs fast (10 to 20 
seconds).

    When the MultiPolygon is large, the query takes quite a lot more 
time. I don't know how much yet. My test case has been running for 40 
minutes now. The problem is that with a larger MultiPolygon, the && 
operator fails to exclude much of the features and PostgreSQL ends up 
running Intersects and Intersection for all polygons in the feature table.

    I've created a GIST index on the table, but it would only help the 
&& operation, which is helpless in my case.

    This isn't a big problem as the users won't usually select large 
polygons, and they have plenty to do while they wait for the results. 
But it would be good to optimize it if possible.

    Does anyone see a way to optimize this query further?

    Any help would be greatly appreciated.

    Cheers,

Ricardo