[postgis-devel] I'd like to remove this Toast discussion

Paul Ramsey pramsey at cleverelephant.ca
Fri Jun 30 08:51:54 PDT 2017


I feel like if we did it PgSQL side we'd have to be further up the chain,
working around the area of detoast_datum_slice. We might also need a change
to the decompressor though, to support partial decompression. Annoying, I
know I've looked at this problem several times and put it aside each time,
but I cannot remember why :)

P

On Fri, Jun 30, 2017 at 8:44 AM, Javier Santana <jsantana at carto.com> wrote:

>
> When I implemented the st_getgeometrytype version [1] (that Paul finally
> merged into postgis master) it was faster but Paul is right. My proposal
> here would be to change the pglz_decompress [2] function, called
> from PG_DETOAST_DATUM_SLICE (which is the macro we use to get a slice of
> the data).
>
> The algorithm seems easy enough to change and it might get into postgres
> 10.
>
> In any case, there are two things we could do to improve this:
> 1) use the same pglz_compress postgres has but calling it from postgis
> source and have a HEADER + COMPRESSED_GEOMETRY
> 2) if we do that, we could encode the geometry by component so it
>  compress better
>
> Cheers!
>
> [1] https://github.com/javisantana/postgis/commit/
> fbc45b0486be558f77b8c89de0f3f30274be2b90
> [2] https://github.com/postgres/postgres/blob/master/
> src/common/pg_lzcompress.c#L682
>
> On Fri, Jun 30, 2017 at 5:18 PM Paul Ramsey <pramsey at cleverelephant.ca>
> wrote:
>
>> On Thu, Jun 29, 2017 at 10:31 PM, Regina Obe <lr at pcorp.us> wrote:
>>
>>> https://postgis.net/docs/performance_tips.html#small_
>>> tables_large_objects
>>>
>>> This is the last bit of business before I can release PostGIS 2.3.3.
>>>
>>> I'd like to get rid of this whole discussion of TOAST as I don't think
>>> it's
>>> relevant anymore.
>>>
>>> Am I correct in assuming that?
>>>
>>> My assumption of why I think it's not relevant anymore is that we store
>>> cached boxes for most geometries except for Points and 2 point lines.
>>> So isn't it so that even if there is no index on a geometry, we do not
>>> need
>>> to fully detoast (or possibly not to detoast at all) the geometry to read
>>> the bounding box.
>>> That it is part of the head of the geometry?
>>>
>>
>> No, that is not so. If the object is large enough that it requires
>> toasting, the whole thing is stuffed off to the side table and replaced
>> with a reference in the main tuple. In order to get at that bbox on the
>> header, we have to .... go to the toast tables and re-assemble the thing.
>>
>> "Worse", since we use compression on large objects, the object is
>> compressed in the side tables and is pulled back in its entirety, then
>> decompressed, then just the header is read.
>>
>> Changing that would involve storing objects uncompressed and implementing
>> any compression we wanted internal to the postgis library. So we'd have
>> some logic to compress anything larger than N vertices, but would always
>> leave the header uncompressed so we could peak at it for free.
>>
>> That would *still* leave the toasting problem, which I'd have to
>> re-confirm whether a pg_datum_slice cleanly pulls out just one piece of the
>> toasted object or actually pulls the whole thing back first.
>>
>> P.
>>
>>
>>>
>>> I did a quick test of my states table - 52 records (granted they aren't
>>> huge, but toast size is about 14 MB)
>>>
>>> When I do something like below without forcing enable_seqscan = false; it
>>> doesn't use an index (well sometimes it does even without urging)
>>>
>>> SELECT stusps
>>> from tiger_data.state_all
>>> WHERE the_geom &&
>>> ST_Buffer('0101000020AD100000242F4BF9462754C0BD0F7DA33F524340'::
>>> geometry,
>>> 5);
>>>
>>> But whether it uses an index or not, the timing is still about the same -
>>> under 10ms.
>>>
>>> Though interestingly switching to
>>>
>>> SELECT stusps
>>> FROM tiger_data.state_all
>>> WHERE ST_Intersects(the_geom,
>>> ST_Buffer('0101000020AD100000242F4BF9462754C0BD0F7DA33F524340'::
>>> geometry,
>>> 5));
>>>
>>> Does always force an index scan.
>>>
>>>
>>> Thanks,
>>> Regina
>>>
>>>
>>> _______________________________________________
>>> postgis-devel mailing list
>>> postgis-devel at lists.osgeo.org
>>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>
> --
> -- CARTO CTO
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20170630/9987529f/attachment.html>


More information about the postgis-devel mailing list