[postgis-users] postgis-3.1.0beta2 released

Marco Boeringa marco at boeringa.demon.nl
Fri Dec 11 12:31:05 PST 2020


Hi Darafei,

Thanks for the added explanation, much appreciated.

Yes, it very much sounds like I ran into this particular issue. In my 
specific case though, I was lucky enough to be able to workaround this 
issue by modifying my SQL. The SQL used was a nested WITH structure that 
creates a materialized view, and I had put the 'ORDER BY geom' clause on 
the outer query. By moving the clause to the inner most CTE, which 
directly referenced a table that did have a spatial index on the 
geometry column, the processing time for the creation of the 370M 
materialized view went down from an undetermined >30hours, to under 4 
hours. This solved my particular issue.

I concur this sounds like needing a fix upstream.

Marco

Op 11-12-2020 om 20:53 schreef Darafei "Komяpa" Praliaskouski:
> Hello Marco,
>
> This is exactly the case. The way Postgres accelerates the sorting is 
> by having a 8-byte "prefix" to sort on before sorting on the actual 
> field content. There is a problem in parallel sort implementation in 
> Postgres where this prefix (as which PostGIS uses the sortable hash 
> which is Hilbert key currently) where this thing will get calculated 
> in parallel workers, sorted, and then calculated again on the merge 
> stage in main worker, and then sorted again. Since it's a relatively 
> slow op to calculate it performance may degrade to 
> single-thread-sort-like. To fight that, _ST_SortableHash is exposed 
> for advanced users: it immutably emits the key as an integer and after 
> the sort that integer can be passed from parallel workers to main 
> worker to be used verbatim. Best practices around this are not 
> established yet and your feedback is welcome.
>
> For what I've noticed for now there is a significant difference in 
> speed if you do something along those lines:
> select _ST_SortableHash(geom), * from table order by 1;
> instead of
> select * from table order by _ST_SortableHash(geom);
> There may be other tricks to consider.
>
> Please note that the best way forward on this is to fix issue in 
> upstream postgres instead of this workaround.
>
> On Fri, Dec 11, 2020 at 10:36 PM Marco Boeringa 
> <marco at boeringa.demon.nl <mailto:marco at boeringa.demon.nl>> wrote:
>
>     Hi Paul,
>
>     Could you potentially eleborate a bit about #4805, '_ST_SortableHash'
>     exposed?
>
>     I looked up the bug tracker issue, and have followed the links,
>     but the
>     information provided there is more of a stub, and at least one
>     reference
>     suggests mainly a link with BRIN type spatial indexes. On the other
>     hand, it suggests a generic performance issue with Postgres and
>     sorting
>     very large spatial tables on the geometry column ('ORDER BY geom').
>
>     I must admit I actually ran into to something that sounds like the
>     latter, where attempting to 'ORDER BY geom' on a non-indexed spatial
>     column of a >370M table, resulted in a relatively short (few hours)
>     parallel execution of the query with multiple 'parallel workers'
>     visible
>     in pgAdmin, which was subsequently followed by > 30 hours of the
>     'client
>     backend' doing single threaded work, with virtually no CPU, RAM or
>     disk
>     write activity. Since the database runs on SSD, this was unexpected.
>     Even a full sequential scan of the entire table should only have
>     cost a
>     fraction of that time. I finally killed the process, not wanting
>     to wait
>     any longer.
>
>     Marco Boeringa
>
>     Op 11-12-2020 om 19:09 schreef Paul Ramsey:
>     > Since there were a few quick additions to beta1, I have dropped
>     a beta2. The next release will be rc1 and then final, barring any
>     emergency.
>     >
>     >   * Enhancements *
>     >    - #4814, Do not drop empty geometry components when converting
>     >             to GEOS (Sandro Santilli)
>     >    - #4815, Fix GEOS conversion of POINT EMPTY to retain type
>     >             (Sandro Santilli)
>     >    - #4813, ST_MakeValid removing NaN coordinates (Sandro Santilli)
>     >
>     >
>     >> On Dec 9, 2020, at 4:25 PM, Paul Ramsey
>     <pramsey at cleverelephant.ca <mailto:pramsey at cleverelephant.ca>> wrote:
>     >>
>     >> This is a beta1 release, for testing and quality assurance, to
>     be followed shortly by a final release. If you're interested in
>     the stability and usability of PostGIS, please take a little time
>     to ensure that you can build and use this release.
>     >>
>     >>
>     https://download.osgeo.org/postgis/source/postgis-3.1.0beta1.tar.gz
>     <https://download.osgeo.org/postgis/source/postgis-3.1.0beta1.tar.gz>
>     >> https://postgis.net/2020/12/09/postgis-3.1.0beta1/
>     <https://postgis.net/2020/12/09/postgis-3.1.0beta1/>
>     >>
>     >> There have been a few changes since the alpha3 release.
>     >>
>     >> * Breaking changes *
>     >>
>     >>   - #4214, Deprecated ST_Count(tablename,...),
>     ST_ApproxCount(tablename, ...)
>     >>            ST_SummaryStats(tablename, ..),
>     >>            ST_Histogram(tablename, ...),
>     ST_ApproxHistogram(tablename, ...),
>     >>            ST_Quantile(tablename, ...),
>     ST_ApproxQuantile(tablename, ...) removed.
>     >>            (Darafei Praliaskouski)
>     >>
>     >> * Enhancements *
>     >>
>     >>   - #4801, ST_ClusterKMeans supports weights in POINT[Z]M
>     geometries (Darafei Praliaskouski)
>     >>   - #4804, ST_ReducePrecision (GEOS 3.9+) allows valid
>     precision reduction (Paul Ramsey)
>     >>   - #4805, _ST_SortableHash exposed to work around parallel
>     soring performance issue
>     >>            in Postgres. If your table is huge, use ORDER BY
>     _ST_SortableHash(geom)
>     >>            instead of ORDER BY geom to make parallel sort
>     faster (Darafei Praliaskouski)
>     >>   - #4625, Correlation statistics now calculated.
>     >>            Run ANALYZE for BRIN indexes to start kicking in.
>     >>            (Darafei Praliaskouski)
>     >>   - Fix axis order issue with urn:ogc:def:crs:EPSG in
>     ST_GeomFromGML()
>     >>            (Even Roualt)
>     >>
>     > _______________________________________________
>     > postgis-users mailing list
>     > postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org>
>     > https://lists.osgeo.org/mailman/listinfo/postgis-users
>     <https://lists.osgeo.org/mailman/listinfo/postgis-users>
>     _______________________________________________
>     postgis-users mailing list
>     postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org>
>     https://lists.osgeo.org/mailman/listinfo/postgis-users
>     <https://lists.osgeo.org/mailman/listinfo/postgis-users>
>
>
>
> -- 
> Darafei "Komяpa" Praliaskouski
> OSM BY Team - http://openstreetmap.by/ <http://openstreetmap.by/>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20201211/3e02ac87/attachment.html>


More information about the postgis-users mailing list