<div dir="ltr"><div class="gmail_default" style="font-family:monospace,monospace">Hi,</div><div class="gmail_default" style="font-family:monospace,monospace">just a quick 2 cents.</div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace">Index are not going to help because you read the full table, thus no index is used.</div><div class="gmail_default" style="font-family:monospace,monospace">There are many parameters that can affect the speed of this query</div><div class="gmail_default" style="font-family:monospace,monospace">(just for fun, there is a whole wiki page on postgres to explain what information to provide for "slow query" questions.)</div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace">On of this parameter is the version of pgpointcloud you are using.</div><div class="gmail_default" style="font-family:monospace,monospace">Another is the max size of a row, because basically you patch is mainly stored in the postgres TOAST table,</div><div class="gmail_default" style="font-family:monospace,monospace">only the beginning is stored in the actual row of your table.</div><div class="gmail_default" style="font-family:monospace,monospace">(by the way, TOAST compression should be de-activated on this table).</div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace">Bottom line is, even when using toast, you are trying to access</div><div class="gmail_default" style="font-family:monospace,monospace">8KB (one row) * 50k (nb of rows), that is about 400 MB (worse case scenario), it is going to take time.</div><div class="gmail_default" style="font-family:monospace,monospace">By the way 1.5sec for 50k rows, and 5sec for 160k rows seems like very nice scaling to me (linear).</div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace">A good solution (depending on your workflow), would be to use a "proxy/miror/shadow" table<br></div><div class="gmail_default" style="font-family:monospace,monospace">that mirrors the one with patch.<br></div><div class="gmail_default" style="font-family:monospace,monospace">In proxy table, you would maintain (with triggers for instance) one row per patch in your patch table,</div><div class="gmail_default" style="font-family:monospace,monospace">and you would store all kind of useful statistics,</div><div class="gmail_default" style="font-family:monospace,monospace">such as bounding boxes (min max etc, ), geom cast to postgis, number of points, etc etc.</div><div class="gmail_default" style="font-family:monospace,monospace">This proxy table should also be heavily indexed;</div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div><div style="font-family:monospace,monospace" class="gmail_default">I used this approach on Billion-range point cloud without trouble.</div><br></div><div><br></div><div><div style="font-family:monospace,monospace" class="gmail_default">Cheers,</div><div style="font-family:monospace,monospace" class="gmail_default">Remi-C</div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">2018-06-01 8:17 GMT-04:00 Lars <span dir="ltr"><<a href="mailto:laasunde@hotmail.com" target="_blank">laasunde@hotmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
The query being used is "select id, PC_PatchMin(pa, 'x'), PC_PatchMax(pa, 'x'),PC_PatchMin(pa, 'y'), PC_PatchMax(pa, 'y') from importdata;".</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
This information is used to make bounding boxes for <u><strong>all patches</strong></u> in a table. There is no where clause in our query, we want all rows. Does the index expression help "pre-compute" the PC_PatchMin/Max values for quicker access or does it
only work when using a where clause?</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<hr style="width:98%;display:inline-block">
<div id="m_4664306334511018512divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>Fra:</b> Karl Pietrzak <<a href="mailto:kap4020@gmail.com" target="_blank">kap4020@gmail.com</a>><br>
<b>Sendt:</b> fredag 1. juni 2018 12.44<div><div class="h5"><br>
<b>Til:</b> <a href="mailto:laasunde@hotmail.com" target="_blank">laasunde@hotmail.com</a><br>
<b>Kopi:</b> <a href="mailto:pgpointcloud@lists.osgeo.org" target="_blank">pgpointcloud@lists.osgeo.org</a><br>
<b>Emne:</b> Re: [pgpointcloud] Patch min and max</div></div></font>
<div> </div>
</div><div><div class="h5">
<div>
<div dir="ltr">Based on the "Seq Scan on public.importdata", it doesn't look like the index is being used. :(
<div><br>
</div>
<div>It's almost impossible to replicate this, because we don't have any specifics. If you could show us the exact, complete statements you used to create the table, create the index, etc. and the SELECT you are doing, that would be very helpful.</div>
</div>
<br>
<div class="m_4664306334511018512x_gmail_quote">
<div dir="ltr">On Fri, Jun 1, 2018 at 4:10 AM Lars <<a href="mailto:laasunde@hotmail.com" target="_blank">laasunde@hotmail.com</a>> wrote:<br>
</div>
<blockquote class="m_4664306334511018512x_gmail_quote" style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left-color:rgb(204,204,204);border-left-width:1px;border-left-style:solid">
<div dir="ltr">
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
Karl,</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
Interesting proposal.</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
The inital tests did however <u><strong>not </strong></u>yield any improved performance.</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
The table index is described like this (\d+ importdata)</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
"patch_min_max_idx" btree (pc_patchmin(pa, 'x'::text), pc_patchmax(pa, 'x'::text),pc_patchmin(pa, 'y'::text), pc_patchmax(pa, 'y'::text))</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
The explain query looks like this;</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
Seq Scan on public.importdata (cost=0.00..1402.50 rows=52321 width=132) (actual time=3.063..1546.160 rows=52321 loops=1)</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
Output: id, _pc_patchstat(pa, 0, 'x'::text), _pc_patchstat(pa, 0, 'y'::text), _pc_patchstat(pa, 1, 'x'::text), _pc_patchstat(pa, 1, 'y'::text)</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
Buffers: shared hit=795053 read=57146</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
Planning time= 0.600ms</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
Execution time= 1548.494 ms</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
(5 rows)</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
The "analyze importdata" command has also been invoked but did not make any difference.</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
Any ideas on why the expression does not improve performance? Viewing the explain output I notice "_pc_patchstat" which make me think that the query is using the index but it might be wishfull thinking on my part.</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
Kind regards, Lars </div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<hr style="width:98%;display:inline-block">
<div id="m_4664306334511018512x_m_-6325196892831888957divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>Fra:</b> Karl Pietrzak <<a href="mailto:kap4020@gmail.com" target="_blank">kap4020@gmail.com</a>><br>
<b>Sendt:</b> 31. mai 2018 17:18<br>
<b>Til:</b> <a href="mailto:laasunde@hotmail.com" target="_blank">laasunde@hotmail.com</a><br>
<b>Kopi:</b> <a href="mailto:pgpointcloud@lists.osgeo.org" target="_blank">pgpointcloud@lists.osgeo.org</a><br>
<b>Emne:</b> Re: [pgpointcloud] Patch min and max</font>
<div> </div>
</div>
<div>
<div dir="ltr">Postgres supports indices on expressions. <a href="https://www.postgresql.org/docs/current/static/indexes-expressional.html" target="_blank">https://www.postgresql.org/<wbr>docs/current/static/indexes-<wbr>expressional.html</a>
<div><br>
</div>
<div>Would this feature support your use case?</div>
<div><br>
</div>
<div>Something like this:</div>
<div><br>
</div>
<div>
<pre class="m_4664306334511018512x_m_-6325196892831888957x_gmail-programlisting" style="padding:2ex;border-radius:8px;border:1px solid rgb(207,207,207);color:rgb(0,0,0);overflow:auto;font-size:1.3em;margin-top:2ex;margin-bottom:2ex;margin-left:2ex;background-color:rgb(247,247,247)">CREATE INDEX patch_min_and_max_idx ON importdata (PC_PatchMin(pa, 'x'), PC_PatchMax(pa, 'x'),PC_PatchMin(pa, 'y'), PC_PatchMax(pa, 'y'));</pre>
</div>
<div><br>
</div>
</div>
<br>
<div class="m_4664306334511018512x_m_-6325196892831888957x_gmail_quote">
<div dir="ltr">On Thu, May 31, 2018 at 9:46 AM Lars <<a href="mailto:laasunde@hotmail.com" target="_blank">laasunde@hotmail.com</a>> wrote:<br>
</div>
<blockquote class="m_4664306334511018512x_m_-6325196892831888957x_gmail_quote" style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left-color:rgb(204,204,204);border-left-width:1px;border-left-style:solid">
<div dir="ltr">
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
Hello,</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
The below query takes 1,46 seconds using table with 50 000 patches and at least 5 seconds using table with 160 000 patches.</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
"select id, PC_PatchMin(pa, 'x'), PC_PatchMax(pa, 'x'),PC_PatchMin(pa, 'y'), PC_PatchMax(pa, 'y') from importdata;"</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
The table size has a significant effect on the query execution speed which means the query does not scale well.</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
How can we make the query more scalable and/or improve query performance?</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
One approach is to create an additional table that contains min/max values for each patch to prevent the "expensive" PC_PatchXXX function. This is obviously duplicate information that adds complexity to the system.</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
Using Windows 10 and PostgreSQL 10.3</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
<br>
</div>
<div style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">
Kind regard, Lars<br>
</div>
</div>
______________________________<wbr>_________________<br>
pgpointcloud mailing list<br>
<a href="mailto:pgpointcloud@lists.osgeo.org" target="_blank">pgpointcloud@lists.osgeo.org</a><br>
<a href="https://lists.osgeo.org/mailman/listinfo/pgpointcloud" rel="noreferrer" target="_blank">https://lists.osgeo.org/<wbr>mailman/listinfo/pgpointcloud</a></blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div class="m_4664306334511018512x_m_-6325196892831888957x_gmail_signature" dir="ltr">
<div dir="ltr">
<div>Karl</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div class="m_4664306334511018512x_gmail_signature" dir="ltr">
<div dir="ltr">
<div>Karl</div>
</div>
</div>
</div>
</div></div></div>
<br>______________________________<wbr>_________________<br>
pgpointcloud mailing list<br>
<a href="mailto:pgpointcloud@lists.osgeo.org">pgpointcloud@lists.osgeo.org</a><br>
<a href="https://lists.osgeo.org/mailman/listinfo/pgpointcloud" rel="noreferrer" target="_blank">https://lists.osgeo.org/<wbr>mailman/listinfo/pgpointcloud</a><br></blockquote></div><br></div>