[postgis-users] postgis-3.1.0beta2 released

Darafei "Komяpa" Praliaskouski me at komzpa.net
Fri Dec 11 11:53:03 PST 2020


Hello Marco,

This is exactly the case. The way Postgres accelerates the sorting is by
having a 8-byte "prefix" to sort on before sorting on the actual field
content. There is a problem in parallel sort implementation in Postgres
where this prefix (as which PostGIS uses the sortable hash which is Hilbert
key currently) where this thing will get calculated in parallel workers,
sorted, and then calculated again on the merge stage in main worker, and
then sorted again. Since it's a relatively slow op to calculate it
performance may degrade to single-thread-sort-like. To fight that,
_ST_SortableHash is exposed for advanced users: it immutably emits the key
as an integer and after the sort that integer can be passed from parallel
workers to main worker to be used verbatim. Best practices around this are
not established yet and your feedback is welcome.

For what I've noticed for now there is a significant difference in speed if
you do something along those lines:
select _ST_SortableHash(geom), * from table order by 1;
instead of
select * from table order by _ST_SortableHash(geom);
There may be other tricks to consider.

Please note that the best way forward on this is to fix issue in upstream
postgres instead of this workaround.

On Fri, Dec 11, 2020 at 10:36 PM Marco Boeringa <marco at boeringa.demon.nl>
wrote:

> Hi Paul,
>
> Could you potentially eleborate a bit about #4805, '_ST_SortableHash'
> exposed?
>
> I looked up the bug tracker issue, and have followed the links, but the
> information provided there is more of a stub, and at least one reference
> suggests mainly a link with BRIN type spatial indexes. On the other
> hand, it suggests a generic performance issue with Postgres and sorting
> very large spatial tables on the geometry column ('ORDER BY geom').
>
> I must admit I actually ran into to something that sounds like the
> latter, where attempting to 'ORDER BY geom' on a non-indexed spatial
> column of a >370M table, resulted in a relatively short (few hours)
> parallel execution of the query with multiple 'parallel workers' visible
> in pgAdmin, which was subsequently followed by > 30 hours of the 'client
> backend' doing single threaded work, with virtually no CPU, RAM or disk
> write activity. Since the database runs on SSD, this was unexpected.
> Even a full sequential scan of the entire table should only have cost a
> fraction of that time. I finally killed the process, not wanting to wait
> any longer.
>
> Marco Boeringa
>
> Op 11-12-2020 om 19:09 schreef Paul Ramsey:
> > Since there were a few quick additions to beta1, I have dropped a beta2.
> The next release will be rc1 and then final, barring any emergency.
> >
> >   * Enhancements *
> >    - #4814, Do not drop empty geometry components when converting
> >             to GEOS (Sandro Santilli)
> >    - #4815, Fix GEOS conversion of POINT EMPTY to retain type
> >             (Sandro Santilli)
> >    - #4813, ST_MakeValid removing NaN coordinates (Sandro Santilli)
> >
> >
> >> On Dec 9, 2020, at 4:25 PM, Paul Ramsey <pramsey at cleverelephant.ca>
> wrote:
> >>
> >> This is a beta1 release, for testing and quality assurance, to be
> followed shortly by a final release. If you're interested in the stability
> and usability of PostGIS, please take a little time to ensure that you can
> build and use this release.
> >>
> >> https://download.osgeo.org/postgis/source/postgis-3.1.0beta1.tar.gz
> >> https://postgis.net/2020/12/09/postgis-3.1.0beta1/
> >>
> >> There have been a few changes since the alpha3 release.
> >>
> >> * Breaking changes *
> >>
> >>   - #4214, Deprecated ST_Count(tablename,...),
> ST_ApproxCount(tablename, ...)
> >>            ST_SummaryStats(tablename, ..),
> >>            ST_Histogram(tablename, ...), ST_ApproxHistogram(tablename,
> ...),
> >>            ST_Quantile(tablename, ...), ST_ApproxQuantile(tablename,
> ...) removed.
> >>            (Darafei Praliaskouski)
> >>
> >> * Enhancements *
> >>
> >>   - #4801, ST_ClusterKMeans supports weights in POINT[Z]M geometries
> (Darafei Praliaskouski)
> >>   - #4804, ST_ReducePrecision (GEOS 3.9+) allows valid precision
> reduction (Paul Ramsey)
> >>   - #4805, _ST_SortableHash exposed to work around parallel soring
> performance issue
> >>            in Postgres. If your table is huge, use ORDER BY
> _ST_SortableHash(geom)
> >>            instead of ORDER BY geom to make parallel sort faster
> (Darafei Praliaskouski)
> >>   - #4625, Correlation statistics now calculated.
> >>            Run ANALYZE for BRIN indexes to start kicking in.
> >>            (Darafei Praliaskouski)
> >>   - Fix axis order issue with urn:ogc:def:crs:EPSG in ST_GeomFromGML()
> >>            (Even Roualt)
> >>
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at lists.osgeo.org
> > https://lists.osgeo.org/mailman/listinfo/postgis-users
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users
>


-- 
Darafei "Komяpa" Praliaskouski
OSM BY Team - http://openstreetmap.by/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20201211/6431037e/attachment.html>


More information about the postgis-users mailing list