[postgis-users] Parallel spatial indexing for GiST?
Marco Boeringa
marco at boeringa.demon.nl
Wed Sep 16 07:35:19 PDT 2020
Hi all,
This is probably more of a PostgreSQL question than a PostGIS one, but I
have wondered if there is actually any work going on in allowing
PostgreSQL / PostGIS to build GiST type spatial indexes in parallel, and
/ or if this is even logically and technically feasible? According to
the PostgreSQL documentation, only B-tree indexes can be indexed in
parallel.
With the ever growing size of spatial databases like OpenStreetMap, with
tables running into the 100s of million records, spatial indexing using
GiST is one of the major bottle necks in re-creating or reloading a
spatial PostGIS database. The indexing process seems highly CPU bound,
with negligible disk activity for the majority of the time the indexing
process runs, hence being able to take advantage of multiple cores seems
like a possible big win. Nonetheless, there seems little to no mention
of such (future) option for GiST type indexing when searching on the
internet for relevant information.
I have no knowledge of the internals of PostgreSQL, so there may be
technical limitations I am unaware of that make this impossible or
technically very difficult to implement, but it would be nice to hear
something about this even if it is not feasible.
Yes, I know there are BRIN type spatial indexes for PostGIS, which are
comparatively super fast to create and lead to very small indexes even
for ultra large tables, but from the little information and personal
experience I gathered, BRIN seems most suited for Point data only, and
for static, not updated data, due to its requirement of clustered data
for efficiency (actually not a problem in my particular case, since I
don't do updates, but only reloads). The few times I tried to use it for
large, spatially clustered, Polygon data sets, it seemed less efficient
when accessing the data spatially in a GIS, with clearly longer display
times, although I don't have real benchmarks for that.
Most OpenStreetMap related tools like e.g. osm2pgsql also default to
GiST, and probably with good reason.
Marco
More information about the postgis-users
mailing list