<div dir="ltr"><div>I would make some assertions (some just restating what Even wrote):</div><div><br></div><div>* The cost of heap allocation is a fair question (but I would suggest we defer the work until required).</div><div><br></div><div>* Any work done on performance-critical subroutines should be done carefully.</div><div><br></div><div>* It would help to have a standard comment that people can add to empirically detected performance-critical sections so that we reduce the chance of performance regressions.</div><div><br></div><div>* It would be nice to have a very simple performance check system, but that seems like a separate discussion that will open the whole benchmarking can-o-worms.</div><div><br></div><div>* It would be great to have a set of instructions on how to instrument GDAL to collect metrics of runs and detect hot spots with open source tools.  It's been a long time since I've done anything like that with generally available tools, for example <a href="https://sourceware.org/binutils/docs/gprof/" target="_blank">https://sourceware.org/binutils/docs/gprof/</a>  And I need to look more at the gcov output.  
I see that MapServer has a start:  <a href="https://trac.osgeo.org/mapserver/wiki/PerformanceTesting" target="_blank">https://trac.osgeo.org/mapserver/wiki/PerformanceTesting</a></div><div><br></div><div>* The GDAL wiki needs to have more of <a href="http://erouault.blogspot.com/2016/01/software-quality-improvements-in-gdal.html" target="_blank">http://erouault.blogspot.com/2016/01/software-quality-improvements-in-gdal.html</a> blended in.</div><div><br></div><div>* <a href="https://en.wikiquote.org/wiki/Donald_Knuth" target="_blank">https://en.wikiquote.org/wiki/Donald_Knuth</a> "<span style="color:rgb(37,37,37);font-family:sans-serif;font-size:14px;line-height:22.4px">premature optimization is the root of all evil (or at least most of it) in programming" :)</span></div><div><br></div>* I would suggest that with any guideline there will almost always be the occasional location where that guideline does not make sense.  In that case, I would expect a comment to go along with the exception explaining why the exception is important.<div><br></div><div>* The cost of heap versus stack is not well measured by a standalone test (but there is value in standalone examples).  Pressures on caches, TLBs, etc. are difficult to replicate in simple examples.  Moving a large allocation to the heap helps keep the rest of the stack fast.  Think about the pages required for the stack with multiple cores and lots of threads.</div><div><a href="http://unix.stackexchange.com/questions/128213/how-is-page-size-determined-in-virtual-address-space" target="_blank">http://unix.stackexchange.com/questions/128213/how-is-page-size-determined-in-virtual-address-space</a></div><div><br></div><div>* There are also places where heap allocation is just flat out banned after a process is initialized, for example daemons like gpsd or long-running realtime systems (your car's engine controllers, spacecraft control systems, etc.).  
I would argue that it is never going to be a good idea to use GDAL inside that kind of system / process. <br><div><br></div><div>In terms of GDAL, I haven't seen any places where heap allocation of large objects would likely be an important fraction of runtime.  I'm sure they are there, but I would bet they are rare.</div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jun 2, 2016 at 3:29 AM, Even Rouault <span dir="ltr"><<a href="mailto:even.rouault@spatialys.com" target="_blank">even.rouault@spatialys.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">On Thursday 2 June 2016 at 00:46:06, Kurt Schwehr wrote:<br>

> <a href="https://docs.google.com/document/d/1O1B7LY13L532kXcYcB2EdO65m5LOCsqaqn5R9iJ" rel="noreferrer" target="_blank">https://docs.google.com/document/d/1O1B7LY13L532kXcYcB2EdO65m5LOCsqaqn5R9iJ</a><br>

<span>> fSPU/pub<br>

><br>

> The optimized stack .o is 1248 bytes and the on-the-heap vector is 1600<br>

> bytes with gcc 4.8.  The cost of either is pretty small.  So, if there are<br>

> 100 of these in gdal, we are talking about 30-60K of extra object file size<br>

> for all the cases in GDAL.<br>

<br>

</span>Code size is one thing, but I'd be curious to know about the speed impacts due<br>

to the heap allocation. For example, time several million iterations of each way (and<br>

make sure to do something non-trivial with the array so that the compiler<br>

doesn't optimize the code away). If there's no significant difference, then fine.<br>

Otherwise we'd have to be careful when the replacements are done in a<br>

performance sensitive routine (my feeling is that there are not so many such<br>

places)<br>

<div><div><br>

><br>

> For stack usage, start thinking about tons of threads spread out over lots<br>

> of cores and think about how systems are built...  Google Compute Engine<br>

> has 32-core machines.  If you want to get the most out of your machines at<br>

> scale, you want a small stack.<br>

><br>

> int anVals[256] = {};  Does initialize everything to the default value<br>

> (zeros), but doesn't solve the stack issue.<br>

><br>

> Making a little class is yet another thing for people to learn when vector<br>

> looks to me to work quite well.<br>

><br>

> On Mon, May 9, 2016 at 1:37 PM, Andrew Bell <<a href="mailto:andrew.bell.ia@gmail.com" target="_blank">andrew.bell.ia@gmail.com</a>><br>

><br>

> wrote:<br>

> > On Mon, May 9, 2016 at 2:49 PM, Mateusz Loskot <<a href="mailto:mateusz@loskot.net" target="_blank">mateusz@loskot.net</a>> wrote:<br>

> >> Point taken.<br>

> >><br>

> >> Although the proposal looks OK, I'd suggest checking what<br>

> >> assembler code your favourite C++ toolkit generates,<br>

> >> or at least measuring times for<br>

> >><br>

> >> int anVals[256];<br>

> >> memset(anVals, 0, 256*sizeof(int));<br>

> ><br>

> > Are "we" doing memset anymore in these cases?<br>

> ><br>

> > int anVals[256] = {};<br>

> ><br>

> > seems preferable<br>

> ><br>

> >> vs<br>

> >><br>

> >> std::vector<int> oVals(256, 0);<br>

> >><br>

> >> and compare with:<br>

> >><br>

> >> std::vector<char> oVals(256, 0);<br>

> ><br>

> > Think vector is a bad solution for something that's fixed.  Just write<br>

> > something.  But I already suggested that and wrote something... :)<br>

> ><br>

> > Do you know why they are wedded to a 16K stack?<br>

> ><br>

> > --<br>

> > Andrew Bell<br>

> > <a href="mailto:andrew.bell.ia@gmail.com" target="_blank">andrew.bell.ia@gmail.com</a><br>

> ><br>


<br>

--<br>

</div></div><div><div>Spatialys - Geospatial professional services<br>

<a href="http://www.spatialys.com" rel="noreferrer" target="_blank">http://www.spatialys.com</a><br>

</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div data-smartmail="gmail_signature">--<div><a href="http://schwehr.org" target="_blank">http://schwehr.org</a></div></div>

</div></div></div>
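As a rough illustration of the timing comparison Even asks for in the quoted message (several million iterations of each approach, doing something non-trivial with the array), here is a minimal C++ sketch. The names FillAndSum, UseStack, UseHeap and TimeIt are hypothetical and are not part of GDAL. As noted above, a standalone test like this does not reproduce cache or TLB pressure, so treat any numbers as a first hint only; an optimizing compiler may also still simplify the loops, so checking the generated assembly is worthwhile.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical workload: fill a 256-entry table from a seed and sum it,
// so that the buffer contents feed into an observable result.
template <typename Buffer>
static std::uint64_t FillAndSum(Buffer &anVals, std::uint32_t nSeed)
{
    std::uint64_t nSum = 0;
    for (std::size_t i = 0; i < 256; ++i)
    {
        anVals[i] = static_cast<int>((nSeed + i) & 0xFF);
        nSum += static_cast<std::uint64_t>(anVals[i]);
    }
    return nSum;
}

// Each call creates one zero-initialized array on the stack.
std::uint64_t UseStack(std::uint32_t nSeed)
{
    int anVals[256] = {};
    return FillAndSum(anVals, nSeed);
}

// Each call creates one std::vector, i.e. one heap allocation.
std::uint64_t UseHeap(std::uint32_t nSeed)
{
    std::vector<int> oVals(256, 0);
    return FillAndSum(oVals, nSeed);
}

// Calls fn nIters times and returns the elapsed time in seconds.
// The accumulated checksum is written to *pnChecksum so the caller
// can print or compare it.
template <typename Fn>
double TimeIt(Fn fn, std::uint32_t nIters, std::uint64_t *pnChecksum)
{
    const auto start = std::chrono::steady_clock::now();
    std::uint64_t nTotal = 0;
    for (std::uint32_t i = 0; i < nIters; ++i)
        nTotal += fn(i);
    const auto stop = std::chrono::steady_clock::now();
    *pnChecksum = nTotal;
    return std::chrono::duration<double>(stop - start).count();
}
```

A caller would run something like TimeIt(UseStack, 5000000, &nChecksum) and TimeIt(UseHeap, 5000000, &nChecksum), print both times and both checksums, and repeat the runs a few times before drawing any conclusion.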