[postgis-users] Advice on space efficient grid indexing
Arnaud L.
arnaud.listes at codata.eu
Thu Sep 29 01:11:37 PDT 2016
Hi all
We have quite large tables containing geometries (one table per geometry
type), and we need to use them in a legacy application that has it's own
locking system.
Basically, rows are locked on a grid basis (i.e. all geometries in a
square cell are locked at once, the size of the cell cannot be changed).
Geometries are attributed to a cell according to the coordinates of the
center of their bounding boxes, so for a given geometry we have only one
and exactly one matching cell.
For now, we have added a CHAR column with the coordinates of the
lower/left corner of the cell formated to always fit the column width (x
coordinate left padded with zeros up to 9 characters, underscore, y
coordinate left padded with zeros up to 9 characters).
It works, and computing this string is very fast (min/max coordinates of
bouding box divided by two and "floored" to the closest grid point), but
it takes a lot of storage (the column is a char(19)), and we now need to
index it for faster access (it's not indexed for the moment).
I thought about a geohash, but postgis' geohash implementation is the
standard one with a grid size that cannot be changed (and our data is
not in lat/lon, it's in UTM coordinates).
What would be a more efficient way to store this cell coordinates so
that disk usage would be limited and accessing the rows would be using
the index efficiently ?
Using a point geometry would not be space efficient I believe, and it
seems overkill since we do not need to do any spatial operation on this
column (only equality queries, i.e. retrieving all rows in a given cell).
I also thought about a bigint array of two elements, but our database is
mainly accessed through psqlodbc and I'm not sure that arrays are very
efficient in this case.
Any idea or advice would be very welcome.
Regards
Arnaud
More information about the postgis-users
mailing list