[postgis-users] Making GeoHash ordering faster
Paul Norman
penorman at mac.com
Wed May 18 15:36:23 PDT 2016
Ordering tables by ST_GeoHash is a common technique to make them
spatially correlated on disk. During recent tests with osm2pgsql on a
machine with very fast disks, it was found over 20% of CPU time was
spent comparing strings.
This is because doing ORDER BY ST_GeoHash(...); does sorting in the
locale of the database, normally some language in UTF8. Because
ST_GeoHash returns strings in base-32 only using 0-9 b-z, the C
collation can be specified instead, allowing for much faster string
comparisons.
This can be done with ORDER BY ST_GeoHash(...) COLLATE "C";
The fastest ordering I am aware of is ORDER BY
ST_GeoHash(ST_Transform(ST_Envelope(geom),4326),10) COLLATE "C";. This
is faster than simply ORDER BY ST_GeoHash(ST_Transform(geom,4326)); by
about 40% in some cases.
I wrote some more in the blog post
http://paulnorman.ca/blog/2016/05/improve-your-st-geohash-sorting-with-these-three-simple-tricks/,
but the essentials are above.
More information about the postgis-users
mailing list