[postgis-users] Performance boost with ST_ValueCount() - wondering why

Mon Jan 13 08:21:10 PST 2020

Hi Giuseppe,

Here is the original explain output (You can also get this from clicking on
the "Source" tab in the Depsez links):

9.6

HashAggregate  (cost=42103.26..42105.26 rows=200 width=16) (actual
time=94062.150..94062.161 rows=59 loops=1)
  Group Key: (st_valuecount(cv.rast, 1, true, NULL::double precision[],
'0'::double precision)).value
  Buffers: shared hit=35395
  ->  Seq Scan on calveg_whrtype_20m cv  (cost=0.00..11323.26 rows=2052000
width=12) (actual time=90.687..94056.706 rows=15812 loops=1)
        Buffers: shared hit=35395
Planning time: 0.213 ms
Execution time: 94062.215 ms

12.1

HashAggregate  (cost=1088130.78..1088132.78 rows=200 width=16) (actual
time=44634.021..44634.031 rows=59 loops=1)
  Group Key: ((st_valuecount(cv.rast, 1, true, NULL::double precision[],
'0'::double precision))).value
  Buffers: shared hit=17664
  ->  Result  (cost=0.00..1057350.78 rows=2052000 width=12) (actual
time=348.651..44630.614 rows=15812 loops=1)
        Buffers: shared hit=17664
        ->  ProjectSet  (cost=0.00..10830.78 rows=2052000 width=32) (actual
time=348.637..44628.307 rows=15812 loops=1)
              Buffers: shared hit=17664
              ->  Seq Scan on calveg_whrtype_20m cv  (cost=0.00..47.52
rows=2052 width=31) (actual time=0.022..1.311 rows=2052 loops=1)
                    Buffers: shared hit=27
Planning Time: 0.381 ms
JIT:
  Functions: 12
  Options: Inlining true, Optimization true, Expressions true, Deforming
true
  Timing: Generation 3.299 ms, Inlining 10.618 ms, Optimization 124.105 ms,
Emission 80.956 ms, Total 218.978 ms
Execution Time: 44637.499 ms

Interesting about the stats improvement in 10. If/when I gain more insight
about this question, I'll keep this listed posted.

Thanks,
Shira

On Sat, Jan 11, 2020 at 3:54 AM Giuseppe Broccolo <g.broccolo.7 at gmail.com>
wrote:

> Hi Shira,
>
> On Fri, 10 Jan 2020, 21:44 Shira Bezalel, <shira at sfei.org> wrote:
>
>> Hi Giuseppe,
>>
>> Thank you for your reply. I provided the EXPLAIN (ANALYZE ON, BUFFERS ON)
>> output in the links in my initial email. Is that not what you meant?
>>
>
> I was meaning the vanilla output returned by the EXPLAIN, not just the
> (optimal, BTW) Depesz one.
>
> Yes, the config between the two systems is very similar. I'm thinking the
>> shared buffers count differs between the two systems because the row counts
>> themselves are different. But why is that? I think I'll turn this question
>> over to the general Postgres Performance list now.
>>
>
> If the tables are identical in the two system, a different count of rows
> could be to a better statistical sampling, so the planner is more accurate
> in planning the aggregation. And this could make sense cause since
> PostgreSQL 10 statistics have been improved.
>
> Also, consider that even slight differences between the systems can lead
> to significant differences. Would be good here to understand what is
> different in the configuration.
>
> Anyway, please keep us updated from any insight if you find something :)
>
> Giuseppe.
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users

-- 
Shira Bezalel
Database Administrator & Desktop Support Manager
San Francisco Estuary Institute
www.sfei.org
Ph: 510-746-7304
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20200113/c5daf650/attachment.html>