[postgis-devel] I'd like to remove this Toast discussion

Paul Ramsey pramsey at cleverelephant.ca
Fri Jun 30 08:17:58 PDT 2017


On Thu, Jun 29, 2017 at 10:31 PM, Regina Obe <lr at pcorp.us> wrote:

> https://postgis.net/docs/performance_tips.html#small_tables_large_objects
>
> This is the last bit of business before I can release PostGIS 2.3.3.
>
> I'd like to get rid of this whole discussion of TOAST as I don't think it's
> relevant anymore.
>
> Am I correct in assuming that?
>
> My assumption of why I think it's not relevant anymore is that we store
> cached boxes for most geometries except for Points and 2 point lines.
> So isn't it so that even if there is no index on a geometry, we do not need
> to fully detoast (or possibly not to detoast at all) the geometry to read
> the bounding box.
> That it is part of the head of the geometry?
>

No, that is not so. If the object is large enough that it requires
toasting, the whole thing is stuffed off to the side table and replaced
with a reference in the main tuple. In order to get at that bbox on the
header, we have to .... go to the toast tables and re-assemble the thing.

"Worse", since we use compression on large objects, the object is
compressed in the side tables and is pulled back in its entirety, then
decompressed, then just the header is read.

Changing that would involve storing objects uncompressed and implementing
any compression we wanted internal to the postgis library. So we'd have
some logic to compress anything larger than N vertices, but would always
leave the header uncompressed so we could peak at it for free.

That would *still* leave the toasting problem, which I'd have to re-confirm
whether a pg_datum_slice cleanly pulls out just one piece of the toasted
object or actually pulls the whole thing back first.

P.


>
> I did a quick test of my states table - 52 records (granted they aren't
> huge, but toast size is about 14 MB)
>
> When I do something like below without forcing enable_seqscan = false; it
> doesn't use an index (well sometimes it does even without urging)
>
> SELECT stusps
> from tiger_data.state_all
> WHERE the_geom &&
> ST_Buffer('0101000020AD100000242F4BF9462754C0BD0F7DA33F524340'::geometry,
> 5);
>
> But whether it uses an index or not, the timing is still about the same -
> under 10ms.
>
> Though interestingly switching to
>
> SELECT stusps
> FROM tiger_data.state_all
> WHERE ST_Intersects(the_geom,
> ST_Buffer('0101000020AD100000242F4BF9462754C0BD0F7DA33F524340'::geometry,
> 5));
>
> Does always force an index scan.
>
>
> Thanks,
> Regina
>
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20170630/61f9f043/attachment.html>


More information about the postgis-devel mailing list