[postgis-devel] gzip support for ST_AsMVT

Bruce Rindahl bruce.rindahl at gmail.com
Mon Nov 4 10:11:27 PST 2019


It has been a while but I think all the browsers support gziped SVG as
input.  Could be another use case for this.

On Mon, Nov 4, 2019 at 9:31 AM Paul Ramsey <pramsey at cleverelephant.ca>
wrote:

>
>
> On Nov 4, 2019, at 9:21 AM, Yuri Astrakhan <yuriastrakhan at gmail.com>
> wrote:
>
> looks good, thanks again for the blazing fast reply & fix!  I wonder how
> soon the demand will bring it into core PG :)
>
>
> Well, keep up with the agitating :) if only getting it built out as a
> package by pgdg.
> I think core would resist it, on the theory that “compression is already
> done”, in the default TOAST storage. (I know, not *quite* the use case
> you’re looking at.)
> I think you can also already hack a gzip compression with the right calls
> in pgcrypto, though I didn’t quite figure that out, building an extension
> felt like more fun anyways.
> P.
>
>
> On Mon, Nov 4, 2019 at 11:19 AM Paul Ramsey <pramsey at cleverelephant.ca>
> wrote:
>
>> Text as a column can be quite unhappy if you feed it with binary data
>> (things like the printing code gets unhappy with the zero-byte parts, so
>> you can end up with unexpected stops in printouts).
>> I have no idea why text would have any advantage over bytea, since if you
>> check the source, they are both just typedefs over top of varlena.
>> It would be possible to do a gzip(text) returns bytea, but you’d still
>> have a bytea in the end, and it wouldn’t be nice and bidirectional, since
>> text::bytea works, but bytea::text doesn’t return the original ascii
>> encoding.
>>
>> There shouldn’t be any conversion cost in a cast, since the memory
>> structure of text and bytea is identical, it’s just re-typing a pointer.
>>
>> I can add a gzip(text) => bytea very easily though (heck, you can, just
>> create a new gzip function that looks exactly like the bytea one, but takes
>> text as the argument, it’ll work fine.
>>
>> ATB
>>
>> P
>>
>> On Nov 4, 2019, at 1:49 AM, Yuri Astrakhan <yuriastrakhan at gmail.com>
>> wrote:
>>
>> Paul, one (possibly newbie) question - I noticed you used `gzip(bytea)`,
>> and I read in [1] that text performs about 17% better (for read queries,
>> not sure about in-memory operations). Does it make sense to add gzip(text)
>> variant?   I concatenate ST_AsMVT() data with STRING_AGG, thus end up with
>> a TEXT, and wonder if there is a conversion cost.  Thanks!!
>>
>> [1] http://engineering.pivotal.io/post/bytea_versus_text_in_postgresql/
>>
>> On Mon, Nov 4, 2019 at 4:24 AM Yuri Astrakhan <yuriastrakhan at gmail.com>
>> wrote:
>>
>>> Paul rulez!!!! Thank you!!! :)
>>>
>>> Experimentation ensues...
>>>
>>> On Sun, Nov 3, 2019 at 9:29 PM Paul Ramsey <pramsey at cleverelephant.ca>
>>> wrote:
>>>
>>>> Excuse me,
>>>>
>>>> https://github.com/pramsey/pgsql-gzip
>>>>
>>>> P
>>>>
>>>> On Sun, Nov 3, 2019 at 6:24 PM Paul Ramsey <pramsey at cleverelephant.ca>
>>>> wrote:
>>>> >
>>>> > I could not find a gzip extension on the web, even though I swear one
>>>> > already exists.
>>>> > So, I wrote up one, it's quite a small piece of work, but hopefully
>>>> it serves.
>>>> >
>>>> > https://github.com/pramsey/psql-gzip
>>>> >
>>>> > ATB,
>>>> >
>>>> > P
>>>> >
>>>> > On Sun, Nov 3, 2019 at 11:57 AM Yuri Astrakhan <
>>>> yuriastrakhan at gmail.com> wrote:
>>>> > >
>>>> > > OSM -> PostgreSQL is done by Imposm 3, which updates database
>>>> tables on the daily/hourly/minute basis. Every time it runs, it generates a
>>>> list of changed tiles.  I do not know what process they use for it -- from
>>>> my perspective, I simply get the list of updated tiles on zoom 14 as a
>>>> file, and I could use it to regenerate cached tiles or purge them from
>>>> Varnish.  Perhaps Imposm maintainers could find a good use for that
>>>> functionality?
>>>> > >
>>>> > > [1]
>>>> https://imposm.org/docs/imposm3/latest/tutorial.html#expire-tiles
>>>> > >
>>>> > > On Sun, Nov 3, 2019 at 2:41 PM Martin Davis <mtnclimb at gmail.com>
>>>> wrote:
>>>> > >>
>>>> > >> Good background to know.  Quite a bit to grok there!
>>>> > >>
>>>> > >> One question: does any part of that toolchain regenerate tile
>>>> cache subsets depending on detection of feature change?  And if so, would
>>>> it be useful to have a DB function which can determine the set of tile ids
>>>> that need to be refreshed (i.e. by mapping a (set of or single) geometry
>>>> (or envelopes) into a set of tile ids?
>>>> > >>
>>>> > >> On Sun, Nov 3, 2019 at 11:00 AM Yuri Astrakhan <
>>>> yuriastrakhan at gmail.com> wrote:
>>>> > >>>
>>>> > >>> Martin, I am working on improving OpenMapTiles tooling [1] - the
>>>> ultimate goal is to have tiles generated in real time from the up-to-date
>>>> OSM data, and serve them directly to user's browser via some caching layer
>>>> (i.e. Varnish). The tools already contain postserve - a simple python
>>>> server that queries for MVT tiles (no compression yet, but can be easily
>>>> added)
>>>> > >>>
>>>> > >>> The other task is tile pre-generation using tilelive-copy
>>>> (nodejs) - I wrote a tilelive-pgquery plugin [2] that queries PG for the
>>>> tile, compresses it, and passes it on to tilelive-copy for storage.
>>>> > >>>
>>>> > >>> [1] OpenMapTiles tools -
>>>> https://github.com/openmaptiles/openmaptiles-tools
>>>> > >>> [2] tilelive-pgquery -
>>>> https://www.npmjs.com/package/tilelive-pgquery
>>>> > >>>
>>>> > >>> On Sun, Nov 3, 2019 at 12:55 PM Martin Davis <mtnclimb at gmail.com>
>>>> wrote:
>>>> > >>>>
>>>> > >>>> Great to hear that ST_AsMVT is useful.
>>>> > >>>>
>>>> > >>>> The other PostGIS capability that is useful for web spatial
>>>> applications is the (recently enhanced) ST_AsGeoJSON.  This should also be
>>>> gzipped over the wire.  So this suggests a modular gzip capability would be
>>>> more useful.
>>>> > >>>>
>>>> > >>>> If this isn't provided in Postgres in some way (now or in near
>>>> term) perhaps we should just add a ST_Gzip function to PostGIS.
>>>> > >>>>
>>>> > >>>> Out of curiosity, what platform do you use for your external
>>>> gzipping layer?
>>>> > >>>>
>>>> > >>>> On Sun, Nov 3, 2019 at 8:29 AM nyurik <yuriastrakhan at gmail.com>
>>>> wrote:
>>>> > >>>>>
>>>> > >>>>> The amazing ST_AsMVT() has two common usage patterns:  copy
>>>> resulting MVTs to
>>>> > >>>>> a tile cache (e.g. .mbtiles file or a materialized view), or
>>>> serve MVT to
>>>> > >>>>> the users (direct SQL->browser approach).  Both patterns still
>>>> require one
>>>> > >>>>> additional data processing step -- gziping.
>>>> > >>>>>
>>>> > >>>>> Thus, rather than having a horizontally scalable db plus a
>>>> simple IO-bound
>>>> > >>>>> SQL->Web or a SQL->store process, one has to add a relatively
>>>> CPU-intensive
>>>> > >>>>> gzipping layer.  This is especially relevant if I try to create
>>>> a PG table
>>>> > >>>>> with the pre-generated tiles - I must use an external data
>>>> compression
>>>> > >>>>> process to retrieve a tile, gzip it, and store it back, instead
>>>> of running a
>>>> > >>>>> single query for copying all tiles.  My cursory look at the
>>>> tile sizes
>>>> > >>>>> indicate gzipping shrinks MVTs 50% to 300%.
>>>> > >>>>>
>>>> > >>>>> Note that a similar CPU-intensive step - creating MD5 tile
>>>> hashes for a more
>>>> > >>>>> efficient storage - can be easily done with PG's `md5()`
>>>> function, whereas
>>>> > >>>>> `gzip()` doesn't appear to exist.
>>>> > >>>>>
>>>> > >>>>> I would like to propose two possible solutions:
>>>> > >>>>> * Implement ST_AsMVT(..., compress) parameter - NULL=no
>>>> compression,
>>>> > >>>>> 0-9=compression level.
>>>> > >>>>>    PROs:  adds just the required functionality to where it is
>>>> needed (YAGNI
>>>> > >>>>> principle), does not require ungzip yet (ST_AsMVT is a one way
>>>> function
>>>> > >>>>> without the corresponding MVT->Table method)
>>>> > >>>>>    CONs: less generic (unusable for non-MVT usage)
>>>> > >>>>> * Implement gzip() or ST_gzip()
>>>> > >>>>>    PROs:  a more generic approach not tied to MVTs
>>>> > >>>>>    CONs:  logically implies the need of ungzip(), requires PG
>>>> community to
>>>> > >>>>> agree this functionality is needed
>>>> > >>>>>
>>>> > >>>>> Thanks!
>>>> > >>>>>
>>>> > >>>>>
>>>> > >>>>>
>>>> > >>>>> --
>>>> > >>>>> Sent from:
>>>> http://postgis.17.x6.nabble.com/PostGIS-Dev-f3570762.html
>>>> > >>>>> _______________________________________________
>>>> > >>>>> postgis-devel mailing list
>>>> > >>>>> postgis-devel at lists.osgeo.org
>>>> > >>>>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>>> > >>>>
>>>> > >>>> _______________________________________________
>>>> > >>>> postgis-devel mailing list
>>>> > >>>> postgis-devel at lists.osgeo.org
>>>> > >>>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>>> > >>>
>>>> > >>> _______________________________________________
>>>> > >>> postgis-devel mailing list
>>>> > >>> postgis-devel at lists.osgeo.org
>>>> > >>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>>> > >>
>>>> > >> _______________________________________________
>>>> > >> postgis-devel mailing list
>>>> > >> postgis-devel at lists.osgeo.org
>>>> > >> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>>> > >
>>>> > > _______________________________________________
>>>> > > postgis-devel mailing list
>>>> > > postgis-devel at lists.osgeo.org
>>>> > > https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>>> _______________________________________________
>>>> postgis-devel mailing list
>>>> postgis-devel at lists.osgeo.org
>>>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>>
>>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>
>>
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20191104/bd942cff/attachment-0001.html>


More information about the postgis-devel mailing list