<div dir="ltr">I think the biggest part of it is about collecting real-world datasets and workloads.<div><br></div><div>I was thinking about making a system that would serve as an error handler for things like unexpected GEOS errors and would send the offending geometries, along with the versions of all software involved, to some (separate) issue tracker.</div><div><br></div><div>A lot of software asks "do you want to share usage statistics with us?" - can we do something similar? If enabled, it would log the number of invocations and typical characteristics of calls (number of points in geometries, their types, K for KMeans, whatever seems reasonable for the case) and send them somewhere we can pick them up once a day.</div></div><br><div class="gmail_quote"><div dir="ltr">Wed, May 2, 2018 at 10:08, Regina Obe &lt;<a href="mailto:lr@pcorp.us">lr@pcorp.us</a>&gt;:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div class="m_-7399594280537154099WordSection1"><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">That sounds like a good start. We could also have a folder on <a href="http://postgis.net" target="_blank">postgis.net</a> hosting some data files we can use. <u></u><u></u></span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">I was wondering if pgbench would be of any value. I honestly haven't explored it enough to know how much flexibility we have in feeding it custom queries. 
It seems it's possible.<u></u><u></u></span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><a href="https://www.postgresql.org/docs/10/static/pgbench.html" target="_blank">https://www.postgresql.org/docs/10/static/pgbench.html</a><u></u><u></u></span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p><p class="MsoNormal" style="margin-left:.5in"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"> postgis-devel [mailto:<a href="mailto:postgis-devel-bounces@lists.osgeo.org" target="_blank">postgis-devel-bounces@lists.osgeo.org</a>] <b>On Behalf Of </b>Daniel Baston<br><b>Sent:</b> Tuesday, May 01, 2018 4:53 PM<br><b>To:</b> PostGIS Development Discussion &lt;<a href="mailto:postgis-devel@lists.osgeo.org" target="_blank">postgis-devel@lists.osgeo.org</a>&gt;</span></p></div></div><div lang="EN-US" link="blue" vlink="purple"><div class="m_-7399594280537154099WordSection1"><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"><br><b>Subject:</b> Re: [postgis-devel] performance test suite<u></u><u></u></span></p></div></div><div lang="EN-US" link="blue" vlink="purple"><div class="m_-7399594280537154099WordSection1"><p class="MsoNormal" style="margin-left:.5in"><u></u> <u></u></p><div><p class="MsoNormal" style="margin-left:.5in">I agree that this would be very useful, not only for catching regressions but also to help us promote the performance improvements that make it into each release. We itemize performance improvements in the changelog, but we don't generally quantify what they mean for typical use cases. 
<span style="font-family:"Arial",sans-serif;color:#222222;background:white">It would be nice to say that, by upgrading to release 2.5, typical point-in-polygon queries improve by 20%, K-means improves by X%, etc.</span><u></u><u></u></p><div><p class="MsoNormal" style="margin-left:.5in"><u></u> <u></u></p></div><div><p class="MsoNormal" style="margin-left:.5in">To keep the Perl/Python to a minimum, could we rely on pg_stat_statements to do the bulk of the work for us? So it could be something as simple as:<u></u><u></u></p><div><p class="MsoNormal" style="margin-left:.5in"><u></u> <u></u></p></div><div><p class="MsoNormal" style="margin-left:.5in">1) a script that loads or generates test data<u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:.5in">2) a SQL file that runs a bunch of queries capturing typical usages of PostGIS<u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:.5in">3) something that parses the output of pg_stat_statements<u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:.5in"><u></u> <u></u></p></div><div><p class="MsoNormal" style="margin-left:.5in">Dan <u></u><u></u></p></div></div></div><div><p class="MsoNormal" style="margin-left:.5in"><u></u> <u></u></p><div><p class="MsoNormal" style="margin-left:.5in">On Tue, May 1, 2018 at 4:42 PM, Regina Obe &lt;<a href="mailto:lr@pcorp.us" target="_blank">lr@pcorp.us</a>&gt; wrote:<u></u><u></u></p><blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt"><div><div><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Bjorn,</span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span 
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Oh you are a man after my own heart. Yes, definitely. Performance testing is a very weak spot in our testing. I hate finding out about this when users complain </span><span style="font-size:11.0pt;font-family:Wingdings;color:#1f497d">J</span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">I think starting it off as a separate project is a good idea, but I'd really love to see it eventually become part of PostGIS core, as something we can flip on and have enabled for some bots or when we are about to release. How we keep a record of timings etc. seems to me a bot-end thing: the testing bot reporting to some mothership database.</span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">As to whether it should be done in Perl or something else: to be honest, Perl scares the shit out of me. Python, sadly, I haven't warmed up to either. I always feel like I'm fumbling through a minefield with both. Okay, that's an exaggeration.</span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">But then again, Perl is a dependency we are used to having, so whatever you do ideally shouldn't add any crazy dependencies, and if it does add dependencies, they should run on all platforms. 
It's okay to have extra dependencies as long as they are not required for regular testing. I think Komzpa already put in some logic for code coverage testing via lcov for example, which is fine since it's not a requirement.</span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Thanks,</span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Regina</span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span><u></u><u></u></p><p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span><u></u><u></u></p><div><div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0in 0in 0in"><p class="MsoNormal" style="margin-left:1.0in"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"> postgis-devel [mailto:</span><a href="mailto:postgis-devel-bounces@lists.osgeo.org" target="_blank"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">postgis-devel-bounces@lists.osgeo.org</span></a><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">] <b>On Behalf Of </b>Paul Ramsey<br><b>Sent:</b> Tuesday, May 01, 2018 1:08 PM<br><b>To:</b> Björn Harrtell <</span><a href="mailto:bjorn@wololo.org" target="_blank"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">bjorn@wololo.org</span></a><span 
style="font-size:11.0pt;font-family:"Calibri",sans-serif">&gt;; PostGIS Development Discussion &lt;</span><a href="mailto:postgis-devel@lists.osgeo.org" target="_blank"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">postgis-devel@lists.osgeo.org</span></a><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">&gt;<br><b>Subject:</b> Re: [postgis-devel] performance test suite</span><u></u><u></u></p></div></div><div><div><p class="MsoNormal" style="margin-left:1.0in"> <u></u><u></u></p><p class="MsoNormal" style="margin-left:1.0in">I think perhaps “do it as a separate project”. It’s going to be complex, it’s going to be brittle, it’s going to eventually break and I’d rather not have it sitting around broken inside the main source tree. The only way to find the regressions is going to be longitudinal testing and keeping track of numbers over time, so it’ll be quite a complex piece of work.<u></u><u></u></p><div><p class="MsoNormal" style="margin-left:1.0in"> <u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in">P<u></u><u></u></p><div><p class="MsoNormal" style="margin-bottom:12.0pt;margin-left:1.0in"><u></u> <u></u></p><blockquote style="margin-top:5.0pt;margin-bottom:5.0pt"><div><p class="MsoNormal" style="margin-left:1.0in">On May 1, 2018, at 10:05 AM, Björn Harrtell &lt;<a href="mailto:bjorn.harrtell@gmail.com" target="_blank">bjorn.harrtell@gmail.com</a>&gt; wrote:<u></u><u></u></p></div><p class="MsoNormal" style="margin-left:1.0in"> <u></u><u></u></p><div><div><p class="MsoNormal" style="margin-left:1.0in">Hi devs,<u></u><u></u></p><div><p class="MsoNormal" style="margin-left:1.0in"> <u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in">Lately I've been pondering how to make a sensible test suite specifically for performance. 
Hacking/extending <a href="http://run_test.pl/" target="_blank">run_test.pl</a> to accommodate this has been the only suggested path forward, but to me it's a dead end, mostly because of Perl (sorry).<u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in"> <u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in">The reasons this gap has become apparent to me are:<u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in"> <u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in">1. My own work on <a href="https://trac.osgeo.org/postgis/ticket/4076" target="_blank">https://trac.osgeo.org/postgis/ticket/4076</a>.<u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in"> <u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in">2. The recently discovered large performance regression of ST_Union tracked by <a href="https://trac.osgeo.org/postgis/ticket/4075" target="_blank">https://trac.osgeo.org/postgis/ticket/4075</a>. Even if it's in GEOS and could perhaps be performance tested there, I think it would not be wrong to also performance test ST_Union without consideration of the underlying implementation.<u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in"> <u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in"><span style="font-family:"Arial",sans-serif;color:#222222;background:white">Any additional thoughts on the subject? Do it in Perl or don't do it? 
:)</span><u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in"> <u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in">Regards,<u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in"> <u></u><u></u></p></div><div><p class="MsoNormal" style="margin-left:1.0in">/Björn<u></u><u></u></p></div></div><p class="MsoNormal" style="margin-left:1.0in">_______________________________________________<br>postgis-devel mailing list<br><a href="mailto:postgis-devel@lists.osgeo.org" target="_blank">postgis-devel@lists.osgeo.org</a><br><a href="https://lists.osgeo.org/mailman/listinfo/postgis-devel" target="_blank">https://lists.osgeo.org/mailman/listinfo/postgis-devel</a><u></u><u></u></p></div></blockquote></div><p class="MsoNormal" style="margin-left:1.0in"> <u></u><u></u></p></div></div></div></div></div><p class="MsoNormal" style="margin-left:.5in"><br>_______________________________________________<br>postgis-devel mailing list<br><a href="mailto:postgis-devel@lists.osgeo.org" target="_blank">postgis-devel@lists.osgeo.org</a><br><a href="https://lists.osgeo.org/mailman/listinfo/postgis-devel" target="_blank">https://lists.osgeo.org/mailman/listinfo/postgis-devel</a><u></u><u></u></p></blockquote></div><p class="MsoNormal" style="margin-left:.5in"><u></u> <u></u></p></div></div></div>_______________________________________________<br>
</blockquote></div>
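<div><br></div><div>Dan's three-step outline quoted above ends with "something that parses the output of pg_stat_statements". Here is a minimal Python sketch of what that last step might look like. It rests on several assumptions that are not from the thread: the view has been exported to CSV once per release with at least the <code>query</code>, <code>calls</code> and <code>mean_time</code> columns (the column is called <code>mean_time</code> in PostgreSQL 10 and <code>mean_exec_time</code> from PostgreSQL 13 on), the sample queries and table names are invented, the 20% threshold is arbitrary, and the function names are hypothetical.</div>

```python
import csv
import io


def load_mean_times(csv_text):
    """Map query text to mean time in ms from a pg_stat_statements CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["query"]: float(row["mean_time"]) for row in reader}


def find_regressions(baseline, candidate, threshold=0.20):
    """Return (query, old_ms, new_ms, relative_change) tuples for queries
    whose mean time grew by more than `threshold`, worst first."""
    slower = []
    for query, old_ms in baseline.items():
        new_ms = candidate.get(query)
        # Skip queries missing from the new run or with no usable baseline.
        if new_ms is None or old_ms <= 0:
            continue
        change = (new_ms - old_ms) / old_ms
        if change > threshold:
            slower.append((query, old_ms, new_ms, change))
    return sorted(slower, key=lambda r: r[3], reverse=True)


# Invented sample exports for two releases (assumption: not real measurements).
BASELINE = """query,calls,mean_time
"SELECT ST_Union(geom) FROM parcels",10,120.0
"SELECT count(*) FROM pts p JOIN polys g ON ST_Contains(g.geom, p.geom)",10,80.0
"""

CANDIDATE = """query,calls,mean_time
"SELECT ST_Union(geom) FROM parcels",10,300.0
"SELECT count(*) FROM pts p JOIN polys g ON ST_Contains(g.geom, p.geom)",10,76.0
"""

if __name__ == "__main__":
    regressions = find_regressions(load_mean_times(BASELINE),
                                   load_mean_times(CANDIDATE))
    for query, old_ms, new_ms, change in regressions:
        print(f"{change:+.0%}  {old_ms:.1f} ms to {new_ms:.1f} ms  {query}")
```

<div>Matching on query text only works if both runs execute the same SQL file, which is what step 2 of the outline implies. Keeping one export per release would also give the longitudinal record Paul asks for, though where that record should live is left open here.</div>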