<div dir="ltr"><div>Hi,</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">вт, 27 июл. 2021 г. в 00:01, Justin Pryzby <<a href="mailto:pryzby@telsasoft.com">pryzby@telsasoft.com</a>>:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Mon, Jul 26, 2021 at 10:44:42PM +0200, Raúl Marín wrote:<br>
> Hi,<br>
> <br>
> As Paul is mentioning, it appears that you are assuming that having lots of<br>
> shared buffer hits is good but it isn't. A shared buffer hit meant that you<br>
> needed to read a page and it happened to be cached in memory; but indexes<br>
> are useful because they avoid the need of reading unnecessary pages (in<br>
> memory or in disk), so an ideal sort would be one that put the data you are<br>
> looking in the same pages (so all the rows in the page are useful) and the<br>
> ideal index would be one that knew which pages as fast as possible (also<br>
> reading as few index pages as possible).<br>
<br>
I'm not following along closely, but I suggest to look at whether the index is<br>
clustered or not.<br>
<br>
SELECT correlation FROM pg_stats WHERE attname=.. AND tablename=..<br>
<br>
If correlation is low, an index scan may touch many pages of the table, even if<br>
it returns only a fraction of its tuples. In addition to reading the heap more<br>
randomly than sequentually.</blockquote><div><br></div><div>Correlation is not helpful here as it falls under chicken-and-egg problem: we're trying to prove that hilbert sort is faster, and correlation will show how perfectly hilbert-sorted the table is.</div></div></div>