[postgis-users] Making GeoHash ordering faster

Paul Norman penorman at mac.com
Wed May 18 15:36:23 PDT 2016


Ordering tables by ST_GeoHash is a common technique to make them 
spatially correlated on disk. During recent tests with osm2pgsql on a 
machine with very fast disks, it was found over 20% of CPU time was 
spent comparing strings.

This is because doing ORDER BY ST_GeoHash(...); does sorting in the 
locale of the database, normally some language in UTF8. Because 
ST_GeoHash returns strings in base-32 only using 0-9 b-z, the C 
collation can be specified instead, allowing for much faster string 
comparisons.

This can be done with ORDER BY ST_GeoHash(...) COLLATE "C";

The fastest ordering I am aware of is ORDER BY 
ST_GeoHash(ST_Transform(ST_Envelope(geom),4326),10) COLLATE "C";. This 
is faster than simply ORDER BY ST_GeoHash(ST_Transform(geom,4326)); by 
about 40% in some cases.

I wrote some more in the blog post 
http://paulnorman.ca/blog/2016/05/improve-your-st-geohash-sorting-with-these-three-simple-tricks/, 
but the essentials are above.


More information about the postgis-users mailing list