<div dir="ltr"><div><br></div>When I implemented the st_getgeometrytype version [1] (that Paul finally merged into postgis master) it was faster but Paul is right. My proposal here would be to change the pglz_decompress [2] function, called from PG_DETOAST_DATUM_SLICE (which is the macro we use to get a slice of the data). <div><br></div><div>The algorithm seems easy enough to change and it might get into postgres 10.</div><div><br></div><div>In any case, there are two things we could do to improve this:<br>1) use the same pglz_compress postgres has but calling it from postgis source and have a HEADER + COMPRESSED_GEOMETRY<br>2) if we do that, we could encode the geometry by component so it compress better<span style="font-size:12px;white-space:pre;color:rgb(106,115,125);font-family:SFMono-Regular,Consolas,"Liberation Mono",Menlo,Courier,monospace"> </span></div><div><div><br></div><div>Cheers!</div><div><br></div><div>[1] <a href="https://github.com/javisantana/postgis/commit/fbc45b0486be558f77b8c89de0f3f30274be2b90">https://github.com/javisantana/postgis/commit/fbc45b0486be558f77b8c89de0f3f30274be2b90</a></div><div>[2] <a href="https://github.com/postgres/postgres/blob/master/src/common/pg_lzcompress.c#L682">https://github.com/postgres/postgres/blob/master/src/common/pg_lzcompress.c#L682</a></div></div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, Jun 30, 2017 at 5:18 PM Paul Ramsey <<a href="mailto:pramsey@cleverelephant.ca">pramsey@cleverelephant.ca</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Jun 29, 2017 at 10:31 PM, Regina Obe <span dir="ltr"><<a href="mailto:lr@pcorp.us" target="_blank">lr@pcorp.us</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><a href="https://postgis.net/docs/performance_tips.html#small_tables_large_objects" rel="noreferrer" target="_blank">https://postgis.net/docs/performance_tips.html#small_tables_large_objects</a><br>
<br>
This is the last bit of business before I can release PostGIS 2.3.3.<br>
<br>
I'd like to get rid of this whole discussion of TOAST as I don't think it's<br>
relevant anymore.<br>
<br>
Am I correct in assuming that?<br>
<br>
My assumption of why I think it's not relevant anymore is that we store<br>
cached boxes for most geometries except for Points and 2 point lines.<br>
So isn't it so that even if there is no index on a geometry, we do not need<br>
to fully detoast (or possibly not to detoast at all) the geometry to read<br>
the bounding box.<br>
That it is part of the head of the geometry?<br></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>No, that is not so. If the object is large enough that it requires toasting, the whole thing is stuffed off to the side table and replaced with a reference in the main tuple. In order to get at that bbox on the header, we have to .... go to the toast tables and re-assemble the thing. </div><div><br></div><div>"Worse", since we use compression on large objects, the object is compressed in the side tables and is pulled back in its entirety, then decompressed, then just the header is read.</div><div><br></div><div>Changing that would involve storing objects uncompressed and implementing any compression we wanted internal to the postgis library. So we'd have some logic to compress anything larger than N vertices, but would always leave the header uncompressed so we could peak at it for free.</div><div><br></div><div>That would *still* leave the toasting problem, which I'd have to re-confirm whether a pg_datum_slice cleanly pulls out just one piece of the toasted object or actually pulls the whole thing back first.</div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div>P.</div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
I did a quick test of my states table - 52 records (granted they aren't<br>
huge, but toast size is about 14 MB)<br>
<br>
When I do something like below without forcing enable_seqscan = false; it<br>
doesn't use an index (well sometimes it does even without urging)<br>
<br>
SELECT stusps<br>
from tiger_data.state_all<br>
WHERE the_geom &&<br>
ST_Buffer('0101000020AD100000242F4BF9462754C0BD0F7DA33F524340'::geometry,<br>
5);<br>
<br>
But whether it uses an index or not, the timing is still about the same -<br>
under 10ms.<br>
<br>
Though interestingly switching to<br>
<br>
SELECT stusps<br>
FROM tiger_data.state_all<br>
WHERE ST_Intersects(the_geom,<br>
ST_Buffer('0101000020AD100000242F4BF9462754C0BD0F7DA33F524340'::geometry,<br>
5));<br>
<br>
Does always force an index scan.<br>
<br>
<br>
Thanks,<br>
Regina<br>
<br>
<br>
_______________________________________________<br>
postgis-devel mailing list<br>
<a href="mailto:postgis-devel@lists.osgeo.org" target="_blank">postgis-devel@lists.osgeo.org</a><br>
<a href="https://lists.osgeo.org/mailman/listinfo/postgis-devel" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/postgis-devel</a></blockquote></div></div></div>
_______________________________________________<br>
postgis-devel mailing list<br>
<a href="mailto:postgis-devel@lists.osgeo.org" target="_blank">postgis-devel@lists.osgeo.org</a><br>
<a href="https://lists.osgeo.org/mailman/listinfo/postgis-devel" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/postgis-devel</a></blockquote></div><div dir="ltr">-- <br></div><div data-smartmail="gmail_signature"><div dir="ltr"><span style="font-family:"helvetica neue",helvetica,arial,sans-serif">-- CARTO CTO </span></div></div>