<div dir="ltr"><div dir="ltr"><div>Possibly relevant: a presentation on how BRIN indexes can provide better performance and reduce storage for very large point datasets.</div><div><br></div><div><a href="https://www.postgresql-sessions.org/_media/8/gbroccolo_jrouhaud_pgsession_brin4postgis.pdf">https://www.postgresql-sessions.org/_media/8/gbroccolo_jrouhaud_pgsession_brin4postgis.pdf</a><br></div><br><div class="gmail_quote"><div dir="ltr">On Sat, Jan 12, 2019 at 8:29 AM Wenbo Tao &lt;<a href="mailto:taowenbo1993@gmail.com">taowenbo1993@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hello,<div><br></div><div>I was trying to build a GiST index on a geometry column in a table with 1 billion rows. It took an entire week to finish.</div><div><br></div><div>Then I reduced the number of rows by grouping nearby objects into clumps (using some clustering algorithm) and storing each clump as a single row, whose geometry column is the bounding box of all objects in that clump. The construction then went much faster: down to 12 hours. I did this because the query I need to answer is finding all objects whose bbox intersects a given rectangle. I can now query all clumps whose bbox intersects the rectangle.</div><div><br></div><div>So essentially, index construction is slow for a large number of small rows, but much faster for a smaller number of larger rows. Any intuition as to why this is the case would be greatly appreciated!</div><div><br><br></div></div></blockquote></div></div></div>