<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hi Giuseppe,</p>
<p>Thanks for your insights regarding BRIN. <br>
</p>
<p>I actually do employ BRIN, but only for Point type geometry,
where I have the (subjective) feeling that the performance is
least degraded compared to GiST. I also set the 'pages_per_range'
parameter to a much smaller value than the default. Even with
small values for this parameter, the size and creation times of
the resulting index is nothing compared to GiST. <br>
</p>
<p>For Polygon data, the few times I tried using a BRIN type spatial
index, I had the feeling it was probably some 3-4 times slower
than GiST in terms of display times in a GIS, but these aren't
hard figures, because I did not really time it. I also had the
feeling that there was considerably more disk activity needed to
access the relevant geometries.</p>
<p>The data is from osm2pgsql, that initially spatially sorts the
data using the default PostGIS spatial sorting / clustering using
Hilbert curve. This should be efficient. I derive tables from
that, some of which are additionally being spatially clustered
depending on the processing they have had (for those tables I
actually also need to create GiST type spatial indexes, as the
PostgreSQL CLUSTER command cannot use BRIN as input for spatial
clustering, due to the nature of the index, it will fail with an
error / warning about this when you attempt it, and CLUSTER needs
a (spatial) index as input).</p>
<p>osm2pgsql itself already seems to optimize indexing, in the sense
that it launches multiple index processes against different tables
in parallel. This is a kind of "parallel indexing", but not
against a single table / spatial column. For the processes I
developed myself, this is not feasible though, and I would benefit
of having parallel GiST index creation for a single geometry
column.</p>
<p>Marco<br>
</p>
<div class="moz-cite-prefix">Op 16-9-2020 om 18:55 schreef Giuseppe
Broccolo:<br>
</div>
<blockquote type="cite"
cite="mid:CAFtuf8Bo_2OkFVmaAbyMemQ_wcqtdAexSfuX6229fYF48jriFw@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr">Hi Marco,<br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">Il giorno mer 16 set 2020
alle ore 15:35 Marco Boeringa <<a
href="mailto:marco@boeringa.demon.nl"
moz-do-not-send="true">marco@boeringa.demon.nl</a>> ha
scritto:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">[...]<br>
Yes, I know there are BRIN type spatial indexes for PostGIS,
which are <br>
comparatively super fast to create and lead to very small
indexes even <br>
for ultra large tables, but from the little information and
personal <br>
experience I gathered, BRIN seems most suited for Point data
only, and <br>
for static, not updated data, due to its requirement of
clustered data <br>
for efficiency (actually not a problem in my particular
case, since I <br>
don't do updates, but only reloads). The few times I tried
to use it for <br>
large, spatially clustered, Polygon data sets, it seemed
less efficient <br>
when accessing the data spatially in a GIS, with clearly
longer display <br>
times, although I don't have real benchmarks for that.<br>
<br>
Most OpenStreetMap related tools like e.g. osm2pgsql also
default to <br>
GiST, and probably with good reason.<br>
</blockquote>
<div><br>
</div>
<div>About BRIN in PostGIS: it internally works using bounding
boxes of geometries,</div>
<div>as GiST, so in principle you can use this index for any
geometry type, and as</div>
<div>far as you use intersect, contains, is_contained
operators for 2D geometries and</div>
<div>intersects for 3D ones in your geospatial queries.<br>
</div>
<div><br>
</div>
<div>You are right when you say that BRIN is more suitable for
"static" data, because</div>
<div>of how it internally works - creating a sort of summary
of which range of tuples are</div>
<div>included in the data pages physically stored, just to use
a few words. New entries <br>
</div>
<div>added during INSERTs or UPDATEs are properly summarised
in BRINs as far as <br>
</div>
<div>the new indexed values/geometries are included in
ranges/bounding boxes already <br>
</div>
<div>present in the index: in case new pages are created with
data which does not fall <br>
</div>
<div>within the last summarized range, the new ranges are not
automatically acquired <br>
</div>
<div>in the summary, and the related tuples remain
unsummarized until a new summarization <br>
</div>
<div>is invoked, automatically through a VACUUM or manually
through <code class="gmail-function">brin_summarize_range </code><br>
</div>
<div>or<code class="gmail-function"> brin_summarize_new_values<font
face="arial,sans-serif"> functions. This allows some
maintenance of the <br>
</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">index even with non static data,
of course with some limitation compared to GiST.<br>
</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif"><br>
</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">About the performance: being a
range index it surely performs worse compared <br>
</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">to Rtree indexes like GiST. How
much worse depends from several factors:<br>
<br>
</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">1) how the data pages are
physically stored: ranges are as more effective as
possible <br>
</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">as far as spatially close
geometries are adjacently stored even in physical pages
the <br>
</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">storage, so the initial import
of spatial data should need to be done following some</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">sorting criteria</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif"><br>
</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">2) BRIN granularity: performance
starts to be closer to an Rtree one as far as the size</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">of the block range is small.
This can be configured during index creation with the</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">parameter <code
class="gmail-literal">pages_per_range</code>, i.e. how
many pages are summarised per range.</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">Of course, the smaller the
number, the larger is the resulting BRIN and more time</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">is needed for the creation</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif"><br>
</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">GiSTs remain faster even with
2), but I'd suggest checking how the data was originally</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">imported into the geospatial DB
in order to be sure you could benefit as much as
possible</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">from a range index.<br>
</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif"><br>
</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">Hope it helps,</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif">Giuseppe.<br>
</font></code></div>
<div><code class="gmail-function"><font
face="arial,sans-serif"><br>
</font></code></div>
</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
postgis-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:postgis-users@lists.osgeo.org">postgis-users@lists.osgeo.org</a>
<a class="moz-txt-link-freetext" href="https://lists.osgeo.org/mailman/listinfo/postgis-users">https://lists.osgeo.org/mailman/listinfo/postgis-users</a></pre>
</blockquote>
</body>
</html>