[postgis-users] Parallel spatial indexing for GiST?

Marco Boeringa marco at boeringa.demon.nl
Wed Sep 16 07:35:19 PDT 2020


Hi all,

This is probably more of a PostgreSQL question than a PostGIS one, but I 
have wondered if there is actually any work going on in allowing 
PostgreSQL / PostGIS to build GiST type spatial indexes in parallel, and 
/ or if this is even logically and technically feasible? According to 
the PostgreSQL documentation, only B-tree indexes can be indexed in 
parallel.

With the ever growing size of spatial databases like OpenStreetMap, with 
tables running into the 100s of million records, spatial indexing using 
GiST is one of the major bottle necks in re-creating or reloading a 
spatial PostGIS database. The indexing process seems highly CPU bound, 
with negligible disk activity for the majority of the time the indexing 
process runs, hence being able to take advantage of multiple cores seems 
like a possible big win. Nonetheless, there seems little to no mention 
of such (future) option for GiST type indexing when searching on the 
internet for relevant information.

I have no knowledge of the internals of PostgreSQL, so there may be 
technical limitations I am unaware of that make this impossible or 
technically very difficult to implement, but it would be nice to hear 
something about this even if it is not feasible.

Yes, I know there are BRIN type spatial indexes for PostGIS, which are 
comparatively super fast to create and lead to very small indexes even 
for ultra large tables, but from the little information and personal 
experience I gathered, BRIN seems most suited for Point data only, and 
for static, not updated data, due to its requirement of clustered data 
for efficiency (actually not a problem in my particular case, since I 
don't do updates, but only reloads). The few times I tried to use it for 
large, spatially clustered, Polygon data sets, it seemed less efficient 
when accessing the data spatially in a GIS, with clearly longer display 
times, although I don't have real benchmarks for that.

Most OpenStreetMap related tools like e.g. osm2pgsql also default to 
GiST, and probably with good reason.

Marco



More information about the postgis-users mailing list