[postgis-users] Any native compression strategies suited for a revisioning system?

Imre Samu pella.samu at gmail.com
Sun Dec 13 10:06:02 PST 2015


Hi Peter,

> I have been tasked with creating a revisioning system for geometries stored
> by PostGIS.
> Postgres's compression algorithm (PGLZ?) is not optimised for delta
> encoding or high compression ratios.
> So are there any options short of me implementing user-defined functions
> to wrap a compression library?
> .. a field would hold all revisions in a JSON data structure.

Two options I would evaluate for compression:
- the TWKB format [1]
- cstore_fdw (compressing JSONB data) [2] (for now it supports only loading
  or appending data)

[1] "TWKB (Tiny Well-Known Binary) format. TWKB is a compressed binary
format with a focus on minimizing the size of the output."
(PostGIS >= 2.2.0)
http://postgis.net/docs/manual-2.2/ST_GeomFromTWKB.html
http://postgis.net/docs/manual-2.2/ST_AsTWKB.html

"TWKB applies the following principles:
- Only store the absolute position once, and store all other positions as
  delta values relative to the preceding position.
- Only use as much address space as is necessary for any given value.
  Practically this means that "variable length integers" or "varints" are
  used throughout the specification for storing values in any situation where
  numbers greater than 128 might be encountered."
https://github.com/TWKB/Specification/blob/master/twkb.md
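
To make the two principles quoted above concrete, here is a minimal Python
sketch of delta encoding followed by varint packing. It is only an
illustration of the idea, not the actual TWKB byte layout (real TWKB also
carries a type/precision header and optional metadata, and in practice
ST_AsTWKB produces it for you). The function names and the `precision`
parameter are my own assumptions for the example.

```python
from typing import Iterable, Tuple


def zigzag(n: int) -> int:
    """Map a signed integer to an unsigned one so small magnitudes stay small."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def varint(n: int) -> bytes:
    """Encode an unsigned integer in 7-bit groups, lowest group first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_points(points: Iterable[Tuple[float, float]], precision: int = 5) -> bytes:
    """Scale coordinates to integers, then store each as a delta from the previous point."""
    scale = 10 ** precision
    out = bytearray()
    prev_x = prev_y = 0  # the first point is therefore stored as an absolute value
    for x, y in points:
        ix, iy = round(x * scale), round(y * scale)
        out += varint(zigzag(ix - prev_x))
        out += varint(zigzag(iy - prev_y))
        prev_x, prev_y = ix, iy
    return bytes(out)
```

With this sketch, vertices that lie close together cost one or two bytes per
ordinate instead of the 8 bytes of a double in WKB, which is where most of
the size reduction comes from.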

[2] "Compressing PostgreSQL JSONB data 6x using cstore_fdw"
https://www.citusdata.com/blog/14-marco/156-compressing-jsonb-using-cstore-fdw
https://www.citusdata.com/citus-products/cstore-fdw/cstore-fdw-quick-start-guide
https://github.com/citusdata/cstore_fdw


Regards,
  Imre

2015-12-13 15:47 GMT+01:00 Peter Devoy <peter at 3xe.co.uk>:

> I have been tasked with creating a revisioning system for geometries
> stored by PostGIS.  To avoid data duplication I thought about storing
> only diffs but because that introduces other complications I am
> thinking about just storing each revision in its entirety and having a
> compression mechanism minimize size on disk.
>
> However, it seems to me that, quite rightly, Postgres's compression
> algorithm (PGLZ?) is not optimised for delta encoding or high
> compression ratios.  So are there any options short of me implementing
> user-defined functions to wrap a compression library?
>
> For clarity, with this method I am thinking each geometry would have a
> corresponding row in which a field would hold all revisions in a JSON
> data structure.
>
>
> Peter