<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hi Darafei,</p>
<p>Thanks for the added explanation, much appreciated.</p>
<p>Yes, it very much sounds like I ran into this particular issue.
In my specific case though, I was lucky enough to be able to
workaround this issue by modifying my SQL. The SQL used was a
nested WITH structure that creates a materialized view, and I had
put the 'ORDER BY geom' clause on the outer query. By moving the
clause to the inner most CTE, which directly referenced a table
that did have a spatial index on the geometry column, the
processing time for the creation of the 370M materialized view
went down from an undetermined >30hours, to under 4 hours. This
solved my particular issue.</p>
<p>I concur this sounds like needing a fix upstream.<br>
</p>
<p>Marco<br>
</p>
<div class="moz-cite-prefix">Op 11-12-2020 om 20:53 schreef Darafei
"Komяpa" Praliaskouski:<br>
</div>
<blockquote type="cite"
cite="mid:CAC8Q8tJ8aMbfLd9i8o=YpPh+EJ9kwc-e_iSpkT6EzxhYoZNJZg@mail.gmail.com">
<div dir="ltr">Hello Marco,
<div><br>
</div>
<div>This is exactly the case. The way Postgres accelerates the
sorting is by having a 8-byte "prefix" to sort on before
sorting on the actual field content. There is a problem in
parallel sort implementation in Postgres where this prefix (as
which PostGIS uses the sortable hash which is Hilbert key
currently) where this thing will get calculated in parallel
workers, sorted, and then calculated again on the merge stage
in main worker, and then sorted again. Since it's a relatively
slow op to calculate it performance may degrade to
single-thread-sort-like. To fight that, _ST_SortableHash is
exposed for advanced users: it immutably emits the key as an
integer and after the sort that integer can be passed from
parallel workers to main worker to be used verbatim. Best
practices around this are not established yet and your
feedback is welcome.<br>
<br>
For what I've noticed for now there is a significant
difference in speed if you do something along those lines:<br>
select _ST_SortableHash(geom), * from table order by 1;<br>
instead of <br>
select * from table order by _ST_SortableHash(geom);<br>
There may be other tricks to consider.<br>
<br>
Please note that the best way forward on this is to fix issue
in upstream postgres instead of this workaround.</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Fri, Dec 11, 2020 at 10:36
PM Marco Boeringa <<a href="mailto:marco@boeringa.demon.nl">marco@boeringa.demon.nl</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi
Paul,<br>
<br>
Could you potentially eleborate a bit about #4805,
'_ST_SortableHash' <br>
exposed?<br>
<br>
I looked up the bug tracker issue, and have followed the
links, but the <br>
information provided there is more of a stub, and at least one
reference <br>
suggests mainly a link with BRIN type spatial indexes. On the
other <br>
hand, it suggests a generic performance issue with Postgres
and sorting <br>
very large spatial tables on the geometry column ('ORDER BY
geom').<br>
<br>
I must admit I actually ran into to something that sounds like
the <br>
latter, where attempting to 'ORDER BY geom' on a non-indexed
spatial <br>
column of a >370M table, resulted in a relatively short
(few hours) <br>
parallel execution of the query with multiple 'parallel
workers' visible <br>
in pgAdmin, which was subsequently followed by > 30 hours
of the 'client <br>
backend' doing single threaded work, with virtually no CPU,
RAM or disk <br>
write activity. Since the database runs on SSD, this was
unexpected. <br>
Even a full sequential scan of the entire table should only
have cost a <br>
fraction of that time. I finally killed the process, not
wanting to wait <br>
any longer.<br>
<br>
Marco Boeringa<br>
<br>
Op 11-12-2020 om 19:09 schreef Paul Ramsey:<br>
> Since there were a few quick additions to beta1, I have
dropped a beta2. The next release will be rc1 and then final,
barring any emergency.<br>
><br>
> * Enhancements *<br>
> - #4814, Do not drop empty geometry components when
converting<br>
> to GEOS (Sandro Santilli)<br>
> - #4815, Fix GEOS conversion of POINT EMPTY to retain
type<br>
> (Sandro Santilli)<br>
> - #4813, ST_MakeValid removing NaN coordinates (Sandro
Santilli)<br>
><br>
><br>
>> On Dec 9, 2020, at 4:25 PM, Paul Ramsey <<a
href="mailto:pramsey@cleverelephant.ca" target="_blank">pramsey@cleverelephant.ca</a>>
wrote:<br>
>><br>
>> This is a beta1 release, for testing and quality
assurance, to be followed shortly by a final release. If
you're interested in the stability and usability of PostGIS,
please take a little time to ensure that you can build and use
this release.<br>
>><br>
>> <a
href="https://download.osgeo.org/postgis/source/postgis-3.1.0beta1.tar.gz"
rel="noreferrer" target="_blank">https://download.osgeo.org/postgis/source/postgis-3.1.0beta1.tar.gz</a><br>
>> <a
href="https://postgis.net/2020/12/09/postgis-3.1.0beta1/"
rel="noreferrer" target="_blank">https://postgis.net/2020/12/09/postgis-3.1.0beta1/</a><br>
>><br>
>> There have been a few changes since the alpha3
release.<br>
>><br>
>> * Breaking changes *<br>
>><br>
>> - #4214, Deprecated ST_Count(tablename,...),
ST_ApproxCount(tablename, ...)<br>
>> ST_SummaryStats(tablename, ..),<br>
>> ST_Histogram(tablename, ...),
ST_ApproxHistogram(tablename, ...),<br>
>> ST_Quantile(tablename, ...),
ST_ApproxQuantile(tablename, ...) removed.<br>
>> (Darafei Praliaskouski)<br>
>><br>
>> * Enhancements *<br>
>><br>
>> - #4801, ST_ClusterKMeans supports weights in
POINT[Z]M geometries (Darafei Praliaskouski)<br>
>> - #4804, ST_ReducePrecision (GEOS 3.9+) allows
valid precision reduction (Paul Ramsey)<br>
>> - #4805, _ST_SortableHash exposed to work around
parallel soring performance issue<br>
>> in Postgres. If your table is huge, use
ORDER BY _ST_SortableHash(geom)<br>
>> instead of ORDER BY geom to make parallel
sort faster (Darafei Praliaskouski)<br>
>> - #4625, Correlation statistics now calculated.<br>
>> Run ANALYZE for BRIN indexes to start
kicking in.<br>
>> (Darafei Praliaskouski)<br>
>> - Fix axis order issue with urn:ogc:def:crs:EPSG in
ST_GeomFromGML()<br>
>> (Even Roualt)<br>
>><br>
> _______________________________________________<br>
> postgis-users mailing list<br>
> <a href="mailto:postgis-users@lists.osgeo.org"
target="_blank">postgis-users@lists.osgeo.org</a><br>
> <a
href="https://lists.osgeo.org/mailman/listinfo/postgis-users"
rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/postgis-users</a><br>
_______________________________________________<br>
postgis-users mailing list<br>
<a href="mailto:postgis-users@lists.osgeo.org" target="_blank">postgis-users@lists.osgeo.org</a><br>
<a
href="https://lists.osgeo.org/mailman/listinfo/postgis-users"
rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/postgis-users</a><br>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">Darafei "Komяpa" Praliaskouski<br>
OSM BY Team - <a href="http://openstreetmap.by/"
target="_blank">http://openstreetmap.by/</a><br>
</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
postgis-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:postgis-users@lists.osgeo.org">postgis-users@lists.osgeo.org</a>
<a class="moz-txt-link-freetext" href="https://lists.osgeo.org/mailman/listinfo/postgis-users">https://lists.osgeo.org/mailman/listinfo/postgis-users</a>
</pre>
</blockquote>
</body>
</html>