[postgis-users] ST_Contains() performance problem
Stephan Grüter
grueter at wuestundpartner.com
Sat Apr 26 10:09:20 PDT 2008
Hi Mark, Regina and Kevin
I think, I found the source of the problem.
It seems, that the planner of PG 8.2 always used internal functions
like "=" before external functions whereas the planner of PG 8.3
decides according to the estimated cost value of the external functions
and most of the postgis functions have a cost value of 1.
After I changed the estimated cost of st_geometry_overlap() (used by
&& operator)
and contains() from 1 to 2, the planner behaved exactly like the
former version and
the performance of the query was as good as before.
So I have a workaround for this particular case, but the question
remains, how to deal with this situation generally:
- Is it possible to disable the evaluation of function costs in
PG 8.3 with a configuration parameter (I didn't found one yet)?
- Are the default cost values of the postgis functions reasonable
for all or most situations or do we have to fine tune the values?
Regards,
Stephan
Am 25.04.2008 um 16:33 schrieb Mark Cave-Ayland:
> Stephan Grüter wrote:
>> Hi Kevin,
>> Am 25.04.2008 um 03:59 schrieb Kevin Neufeld:
>>> I find it curious that your old system has
>>> "retyp=8 AND (geom && ...) AND contains(...)"
>>>
>>> but you new system has
>>> "(geom && ...) AND contains(...) AND regtyp=8"
>>> I think PostgreSQL does short-circuit evaluation. Testing for
>>> integer equality and dropping out of a filter clause is
>>> significantly faster than testing a bounding box intersection, a
>>> contains operation, and then finishing with a test for integer
>>> equality.
>> Yes, I think the execution planner of PG 8.3 makes a very bad
>> decision in this case. But why? The query, the data,
>> the indexes and the clustering are identical on both systems.
>>> Are you sure you use the same query on both systems?
>> Yes, I'm sure. I noticed a performance degradation of an existing
>> web application and after some profiling I found the this query
>> as the source of the problem.
>> Regards,
>> Stephan
>
>
> Hmmm. It does seem strange that the evaluation order is different,
> however planner costs are influenced by the parameters in
> postgresql.conf, and to a less extent, the statistics sampling
> routine.
>
> So a good thing to do would be to compare the settings in
> postgresql.conf between both systems, and also check the statistics
> target for each of the columns within the join before going any
> further. Don't forget to restart PostgreSQL if you make any changes
> to the memory /cost settings.
>
>
> HTH,
>
> Mark.
>
> --
> Mark Cave-Ayland
> Sirius Corporation - The Open Source Experts
> http://www.siriusit.co.uk
> T: +44 870 608 0063
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
More information about the postgis-users
mailing list