[postgis-users] Advice on space efficient grid indexing

Thu Sep 29 01:11:37 PDT 2016

Hi all

We have quite large tables containing geometries (one table per geometry 
type), and we need to use them in a legacy application that has it's own 
locking system.
Basically, rows are locked on a grid basis (i.e. all geometries in a 
square cell are locked at once, the size of the cell cannot be changed).
Geometries are attributed to a cell according to the coordinates of the 
center of their bounding boxes, so for a given geometry we have only one 
and exactly one matching cell.

For now, we have added a CHAR column with the coordinates of the 
lower/left corner of the cell formated to always fit the column width (x 
coordinate left padded with zeros up to 9 characters, underscore, y 
coordinate left padded with zeros up to 9 characters).

It works, and computing this string is very fast (min/max coordinates of 
bouding box divided by two and "floored" to the closest grid point), but 
it takes a lot of storage (the column is a char(19)), and we now need to 
index it for faster access (it's not indexed for the moment).

I thought about a geohash, but postgis' geohash implementation is the 
standard one with a grid size that cannot be changed (and our data is 
not in lat/lon, it's in UTM coordinates).

What would be a more efficient way to store this cell coordinates so 
that disk usage would be limited and accessing the rows would be using 
the index efficiently ?
Using a point geometry would not be space efficient I believe, and it 
seems overkill since we do not need to do any spatial operation on this 
column (only equality queries, i.e. retrieving all rows in a given cell).
I also thought about a bigint array of two elements, but our database is 
mainly accessed through psqlodbc and I'm not sure that arrays are very 
efficient in this case.

Any idea or advice would be very welcome.

Regards
Arnaud