[postgis-users] Parallel spatial indexing for GiST?

Giuseppe Broccolo g.broccolo.7 at gmail.com
Sun Jan 24 09:25:15 PST 2021


Hi Marco,

Il giorno sab 23 gen 2021 alle ore 11:39 Marco Boeringa <
marco at boeringa.demon.nl> ha scritto:

> Hi Darafei,
>
> How does this relate to e.g. PostGIS with polygon or line data?
>

It doesn't. The patch Darafei shared in his reply added the possibility to
define a further support function to build the GiST, based on sorting the
data following some "natural criteria" depending on the data itself, which
makes the build faster - as pointed by Darafei, no parallelism for the
moment.

Just the support for the internal point data has been included in the
patch, based on Z-order.

To add the support for different geometries, like polygons, you would need
before to define some sorting algorithm similar to Z-order for points - not
sure the same algos can be used for other geometries, maybe reducing them
to the related bbox and considering the bbox vertices? Although PostGIS
lately included natural sorting of geometries based on Hilbert curves, can
someone else confirm this eventually?


> Faster indexing for point data only is only of limited value, there are
> already reasonable alternatives for point type data, e.g. BRIN indexes.
> However, for polygon and line data, GiST does still appear to be the most
> efficient index for accessing the data in spatial lookups, and tools like
> e.g. osm2pgsql default to them.
>

Actually, all the 2D geometries, geographies and 3D points can be indexed
using BRIN support, which internally reduces the indexed data to their
bboxes. Of course, with all the limitations imposed by using BRINs (few
operators supported, no kNN, etc.).


> Is this commit just the "groundwork" layed out to implement something like
> this in PostGIS, or does it require more work on the PostgreSQL side itself
> before that is even conceivable?
>

Exactly, just the new support to speed up the build has been added. The
implementation for PostGIS types should be done separately. This would mean
to define this support function
<https://github.com/glukhovn/postgres/blob/225a49161fae9388651373d4beb8dcba99059339/src/include/access/gist.h#L37>
and this other one
<https://github.com/glukhovn/postgres/blob/225a49161fae9388651373d4beb8dcba99059339/src/include/access/gist.h#L38>
in here
<https://github.com/postgis/postgis/blob/8b13c3e2f8366d902dbf516ec17de09ae84361f4/postgis/postgis.sql.in#L781>.
But I think we need to wait for PostgreSQL 14 to be released before.

Considering what is reported in the exchange of emails related to the patch
<https://www.postgresql.org/message-id/flat/1A36620E-CAD8-4267-9067-FB31385E7C0D@yandex-team.ru>,
the expected improvement in the speed of the index build is a factor x5 for
the internal point data in PostgreSQL. Should be tested then in PostGIS, as
said by Darafei.

Giuseppe.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20210124/bd129082/attachment.html>


More information about the postgis-users mailing list