[Benchmarking] Proposal: add standard deviation to the graphical result
Martin Desruisseaux
martin.desruisseaux at geomatys.fr
Wed Aug 3 12:21:35 EDT 2011
Hello all,
Given that the execution time may vary slightly every time a team runs a suite
of tests (maybe more in Java than in C/C++), we are supposed to run the same
suite of tests many times in order to gather meaningful statistics. In his
"/Performance Anxiety/" talk
(http://www.devoxx.com/display/Devoxx2K10/Performance+Anxiety), Joshua Bloch
suggests that 40 executions is a minimum. I'm somewhat neutral on the number of
runs. However, I would like every team to save the execution time of each
individual run, so we can do statistics. More specifically, I suggest that the
curve to be shown at FOSS4G be the average of all execution times (minus the
first executions for Java applications, because of JVM "warm up" time),
together with the standard deviation. The standard deviation was missing from
last year's graphics. I can take care of producing the graphics at FOSS4G once
I have the data (every spreadsheet has these basic statistical tools). The
reasons why I want the standard deviation are:
* It shows whether the execution time of an application is rather stable or
  varies a lot.
* If an application appears faster than another one, the standard deviation
  tells us the probability that the first application is really faster, i.e.
  that the difference is not a matter of luck caused by random variations in
  execution time (see the sketch after this list). This point assumes that the
  execution times have a Gaussian distribution, but this is probably the case,
  and we can verify that from the raw data.
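To make the second point concrete, here is a minimal sketch in Python of the
statistics I have in mind: discard the warm-up runs, compute the mean and
standard deviation, and use Welch's t statistic to estimate whether an observed
difference between two applications is real. The warm-up count of 5 and the
function names are placeholders of mine, not something we have agreed on yet:

    import math
    from statistics import mean, stdev

    WARMUP_RUNS = 5  # placeholder: initial runs discarded for JVM warm-up

    def summarize(times, warmup=WARMUP_RUNS):
        """Mean, standard deviation and count of the post-warm-up runs."""
        steady = times[warmup:]
        return mean(steady), stdev(steady), len(steady)

    def welch_t(times_a, times_b, warmup=WARMUP_RUNS):
        """Welch's t statistic for the difference between two applications.

        Assumes roughly Gaussian execution times (to be verified from the
        raw data); a large |t| means the difference is unlikely to be luck.
        """
        m_a, s_a, n_a = summarize(times_a, warmup)
        m_b, s_b, n_b = summarize(times_b, warmup)
        return (m_a - m_b) / math.sqrt(s_a**2 / n_a + s_b**2 / n_b)

A spreadsheet gives the same mean and standard deviation through its AVERAGE
and STDEV functions; the only requirement is that the raw per-run times are
kept.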
I can take care of the stats. I basically just ask that we agree on how many
times each suite of tests shall be run, and that each team record all their raw
data (the execution time of each individual run), for example as sketched
below.
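For the recording itself, one CSV row per run would be enough. The file name,
column layout and run_test callback below are only a suggestion:

    import csv
    import time

    def record_runs(run_test, n_runs, out_path="raw_times.csv"):
        """Run the test suite n_runs times, recording each duration in seconds."""
        with open(out_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["run", "seconds"])
            for i in range(1, n_runs + 1):
                start = time.perf_counter()
                run_test()  # the team's own test suite goes here
                writer.writerow([i, time.perf_counter() - start])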
Regards,
Martin