<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div dir="ltr"></div><div dir="ltr"><br></div><div dir="ltr"><br><blockquote type="cite">On Sep 12, 2024, at 05:09, Charles Karney via PROJ &lt;proj@lists.osgeo.org&gt; wrote:<br><br></blockquote></div><blockquote type="cite"><div dir="ltr"><span>I recommend against unrolling the loops. This makes the code longer and</span><br><span>harder to read. You also lose the flexibility of adjusting the number</span><br><span>of terms in the expansion at runtime.</span><br><span></span><br><span>…But</span><br><span>remember that compilers can do the loop unrolling for you. Also,</span><br><span>doesn't the smaller code size with the loops result in fewer cache</span><br><span>misses?</span><br></div></blockquote><br><div><font face="Georgia">I think Charles is spot-on here. Any modern compiler will make these optimizations, tuned to the target architecture. Different architectures prefer different amounts of unrolling, so it’s best not to second-guess the compiler by hard-coding the unrolled form. The overhead of a simple loop counter is normally zero: short loops get unrolled by the compiler, and in longer loops branch prediction favors continuation. Meanwhile, the loop counting happens in parallel in one of the ALUs while the FPUs do their thing.</font></div><div><font face="Georgia"><br></font></div><div><font face="Georgia">— daan Strebe</font></div></body></html>