<div dir="ltr"><div>Hi,</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">вт, 27 июл. 2021 г. в 00:01, Justin Pryzby <<a href="mailto:pryzby@telsasoft.com">pryzby@telsasoft.com</a>>:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Mon, Jul 26, 2021 at 10:44:42PM +0200, Raúl Marín wrote:<br>

> Hi,<br>

> <br>

> As Paul is mentioning, it appears that you are assuming that having lots of<br>

> shared buffer hits is good but it isn't. A shared buffer hit meant that you<br>

> needed to read a page and it happened to be cached in memory; but indexes<br>

> are useful because they avoid the need of reading unnecessary pages (in<br>

> memory or in disk), so an ideal sort would be one that put the data you are<br>

> looking in the same pages (so all the rows in the page are useful) and the<br>

> ideal index would be one that knew which pages as fast as possible (also<br>

> reading as few index pages as possible).<br>

<br>

I'm not following along closely, but I suggest to look at whether the index is<br>

clustered or not.<br>

<br>

SELECT correlation FROM pg_stats WHERE attname=.. AND tablename=..<br>

<br>

If correlation is low, an index scan may touch many pages of the table, even if<br>

it returns only a fraction of its tuples.  In addition to reading the heap more<br>

randomly than sequentually.</blockquote><div><br></div><div>Correlation is not helpful here as it falls under chicken-and-egg problem: we're trying to prove that hilbert sort is faster, and correlation will show how perfectly hilbert-sorted the table is.</div></div></div>