[postgis-users] Performance boost with ST_ValueCount() - wondering why

Fri Jan 10 13:43:16 PST 2020

Hi Giuseppe,

Thank you for your reply. I provided the EXPLAIN (ANALYZE ON, BUFFERS ON)
output in the links in my initial email. Is that not what you meant?

Yes, the config between the two systems is very similar. I'm thinking the
shared buffers count differs between the two systems because the row counts
themselves are different. But why is that? I think I'll turn this question
over to the general Postgres Performance list now.

Shira

On Thu, Jan 9, 2020 at 3:23 PM Giuseppe Broccolo <g.broccolo.7 at gmail.com>
wrote:

> Hi Shira,
>
> there's the number of shared buffers involved in the hash aggregation that
> is different in the two execution plans, in PostgreSQL 9.6 the double of
> the pages are hitten compared to PostgreSQL 12.0. Here I assume that the
> configuration of the two DB engines is the same (same shared buffers,
> etc.). It would be good to check buffers behaviour during the query
> execution. Could you attach the output with the EXPLAIN(ANALYSE, BUFFERS)
> statement?
>
> Giuseppe.
>
> Il giorno gio 9 gen 2020 alle ore 20:48 Shira Bezalel <shira at sfei.org> ha
> scritto:
>
>> Hi List,
>>
>> This is a "yeah, but why?" type of question.
>>
>> I'm testing an upgrade from
>>
>> Postgres 9.6 and PostGIS 2.3
>>
>> to
>>
>> Postgres 12.1 and PostGIS 3.0
>>
>> One of our queries has gone from about 80 seconds to 30 seconds on the
>> new releases, which is great, but I'm just trying to figure out why. I
>> realize there are a host of different reasons why performance changes may
>> result across different versions and servers, but just wondering if it's
>> related to a specific performance enhancement in Postgres or PostGIS. Can't
>> seem to find anything in the release notes that would explain it.
>>
>> I've narrowed down the improvement to the part of the larger query that
>> issues an ST_ValueCount() against a large raster and then sums the results.
>> I can replicate the improvement with this simpler query subset:
>>
>> SELECT pvc.value, SUM(pvc.count) AS sum
>> FROM
>> (SELECT (ST_ValueCount(cv.rast, 1)).*
>> FROM calveg_whrtype_20m AS cv) AS pvc
>> GROUP BY pvc.value
>>
>> 9.6 plan <https://explain.depesz.com/s/W8HN>
>> 12.0 plan <https://explain.depesz.com/s/lIRS>
>>
>> Anything jump out as the reason for the improved plan in terms of changes
>> to Postgres or PostGIS? Something to do with the HashAggregate it seems,
>> but not sure why. And yes, this may have nothing to do with PostGIS per se,
>> so feel free to point me over to the Postgres Performance list, if so. Just
>> thought I'd start here since the raster function is involved.
>>
>> Thanks much!
>> Shira
>>
>>
>>
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-users
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users

-- 
Shira Bezalel
Database Administrator & Desktop Support Manager
San Francisco Estuary Institute
www.sfei.org
Ph: 510-746-7304
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20200110/de85aef2/attachment.html>