[postgis-users] Slow construction of GiST index, but better with smaller # of big rows

Bo Guo bo.guo at gisticinc.com
Sun Jan 13 09:00:41 PST 2019


The law of physics is in play here.  I think your approach is creative 
and valid.  I wish Postgis offered grid-base spatial index which grouped 
geometry BBOXs in grids with user defined levels /sizes.  A couple of 
other thoughts:

1. You may also look in to table partitioning to physically breakup the 
large table.

2. In addition, depend on how your data is used and whether or not the 
data is static, vector-tiling/cacheing the geometry on disk (out-side of 
database) may help.


On 1/12/19 9:28 AM, Wenbo Tao wrote:
> Hello,
>     I was trying to build a GiST index on a geometry column in a table 
> with 1 billion rows. It took an entire week to finish.
>     Then I reduced the number of rows by grouping closer objects into 
> one clump (using some clustering algorithm), and then compressed the 
> clump as one row (the geometry column becomes the bounding box of all 
> objects in that clump). The construction then went way faster -- down 
> to 12 hours. I did this because the query I need to answer is finding 
> all objects whose bbox intersects with a given rectangle. I can now 
> query all clumps whose bbox intersects with the rectangle.
>    So essentially, the index construction is slow for too many rows, 
> but much faster for a smaller # of bigger rows. Any intuition why this 
> is the case would be greatly appreciated!
> Thank you,
> Wenbo Tao
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20190113/4bab1d2e/attachment.html>

More information about the postgis-users mailing list