[Benchmarking] Congratulations or panic --- things fall apart.

Adrian Custer adrian.custer at geomatys.fr
Tue Sep 7 07:34:38 EDT 2010


Hello all,

It is too bad that the Windows server is down at this critical
juncture. I can imagine that adds stress (if not yet panic) to an
already exhausting effort. Unfortunate.

Regardless of this hiccup, my thanks go out to Mike Smith and the Army
Corps for putting up the machines and trusting us with credentials to
alter them at will. It was very cool of you all to give us the run of
your servers.




We will discuss this evening where we go from here. If this were to be a
scientific publication, we would likely abandon any ambition to publish
at this point. Certainly, from my point of view, the numbers shift too
much between identical runs for me to trust the tools, the test design,
and the metrics. 

If Mike is right and the main success of the effort is the work that we
have poured into our servers, then that is probably the story we should
tell rather than posting the problematic, competitive, and partial
numbers we may have tonight.

Also, I hope we can begin discussing how those who are interested
might collaborate towards a robust set of tests for a future
benchmarking suite. Ideally these would be designed to minimize the
time they take, to ensure that the machines are in a well understood
state prior to each measurement, and to stress different aspects of
WMS performance. Such a test suite should probably start with the OGC
CITE tests, to ensure that all the WMS instances being tested are
fully conformant.
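
To give that discussion something concrete to shoot at, here is a rough
sketch (Python, entirely hypothetical --- the endpoint, layer names, and
warm-up policy are made up for illustration, not taken from this year's
setup) of what a single isolated measurement could look like: put the
server into a known state with a warm-up pass, then time a fixed batch
of GetMap requests:

    #!/usr/bin/env python
    # Hypothetical sketch of one isolated WMS benchmark measurement:
    # bring the server to a known state with a warm-up pass, then time
    # a fixed batch of GetMap requests against a single layer.
    import time
    import urllib.request

    # Placeholder values, not the real benchmark configuration.
    WMS_URL = "http://example.org/wms"
    LAYER = "benchmark:roads"
    BBOXES = ["-10,40,0,50", "0,40,10,50", "10,40,20,50"]

    def getmap_url(bbox):
        return (WMS_URL + "?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap"
                "&LAYERS=" + LAYER + "&STYLES=&SRS=EPSG:4326&BBOX=" + bbox +
                "&WIDTH=800&HEIGHT=600&FORMAT=image/png")

    def fetch(url):
        with urllib.request.urlopen(url) as resp:
            return len(resp.read())

    # Warm-up pass: settle JITs, connection pools, and caches into a
    # repeatable state before anything is measured.
    for bbox in BBOXES:
        fetch(getmap_url(bbox))

    # Measured pass: time only the requests we care about.
    start = time.perf_counter()
    total = sum(fetch(getmap_url(bbox)) for bbox in BBOXES)
    elapsed = time.perf_counter() - start
    print("%d requests, %d bytes, %.2f s" % (len(BBOXES), total, elapsed))

With a structure like that, each test in the suite would only have to
supply its own request set and its own notion of "known state".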


On Tue, 2010-09-07 at 00:32 +0200, Michael Smith wrote:
> All,
> 
> I think the most important thing from this effort is what we all learn about
> our respective software each year. All of these exercises are a little
> contrived to actually make it possible.

Actually, 'well contrived' is exactly what the tests should be. The
tests should be able to isolate a particular functionality and stress
it absent all other effects. We are not trying to get 'realistic'
numbers but numbers that both characterize the performance of
particular aspects of the servers and discriminate effectively between
the different server implementations.
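
For instance (again a purely hypothetical illustration, with made-up
URLs and layer names), one could hold the layer, bounding box, and
output format fixed and vary only the image size; any spread between
servers on that sweep then speaks to raw rendering and encoding cost
rather than to data access:

    # Hypothetical sketch: isolate rendering/encoding cost by varying
    # only the output image size; layer, bbox, and format stay fixed.
    import time
    import urllib.request

    BASE = ("http://example.org/wms?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap"
            "&LAYERS=benchmark:roads&STYLES=&SRS=EPSG:4326"
            "&BBOX=-10,40,10,50&FORMAT=image/png")

    for size in (256, 512, 1024, 2048):
        url = BASE + ("&WIDTH=%d&HEIGHT=%d" % (size, size))
        start = time.perf_counter()
        urllib.request.urlopen(url).read()
        print("%dx%d: %.3f s" % (size, size, time.perf_counter() - start))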

> 
> We could have a 1 TB or 10 TB dataset but then we would never be able to
> distribute that data set back out for others to run. That's one point I
> think we forget in trying to have better numbers in this or that test. We
> want these tests and benchmarks to be available for others to run outside of
> our environment. 

Was that ever an intent? Were we developing a test suite for others? It
would be a great goal, but this is the first I have heard that we were
actually trying to do that this year.

> 
> We are also running on servers that have been provided by the US Army Corps
> for this effort. If people want to have better servers to test with, then
> please, contribute your hardware, your bandwidth, your funds.

The servers were actually great. In a real production environment the
disk bottleneck would surely be addressed since it would be such a clear
performance win. But this does not matter for our testing since we could
easily 'contrive' tests either to exploit the bottleneck (and assess
reading efficiency) or to avoid the bottleneck (and assess code
efficiency).
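
A hedged sketch of how those two 'contrived' variants might look,
assuming the harness runs on (or has root shell access to) the Linux
box --- the drop_caches step is Linux-specific, and everything here is
illustrative rather than this year's actual procedure:

    # Hypothetical sketch: the same request set measured in two states.
    # "Cold" drops the Linux page cache first, so timings include disk
    # reads; "hot" repeats the requests immediately afterwards, so
    # timings mostly reflect code paths. Needs root on the server host.
    import subprocess
    import time
    import urllib.request

    REQUESTS = [
        # Placeholder request list; the real suite would enumerate these.
        "http://localhost:8080/wms?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap"
        "&LAYERS=benchmark:roads&STYLES=&SRS=EPSG:4326&BBOX=-10,40,0,50"
        "&WIDTH=800&HEIGHT=600&FORMAT=image/png",
    ]

    def drop_page_cache():
        # Linux-only: flush dirty pages, then drop clean page caches.
        subprocess.run(["sync"], check=True)
        subprocess.run(["sh", "-c", "echo 3 > /proc/sys/vm/drop_caches"],
                       check=True)

    def timed_pass(label):
        start = time.perf_counter()
        for url in REQUESTS:
            urllib.request.urlopen(url).read()
        print("%s: %.2f s" % (label, time.perf_counter() - start))

    drop_page_cache()
    timed_pass("cold (disk-bound, reading efficiency)")
    timed_pass("hot  (cached,     code efficiency)")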





On Tue, 2010-09-07 at 10:39 +0200, Anne-Sophie Collignon wrote:
> Hi Martin,
> 
> It's an issue for us as well that the Windows server has been down
> since Sunday...
> 
> Also, I'm not sure the Windows server results will be comparable with
> the Linux ones in the end, due to the RHEL OS caching. So, is running
> the tests not already a waste of time? The benefit of participating
> is, for sure, the nice improvements we've made in the product, and
> also the experience and the lessons to retain from such an exercise.


Yeah, maybe this should be our single message to the world: we worked
hard and improved our servers, but did not land useful, reliable
numbers to present to the public.

Then perhaps we can come up with a better design for next year.

cheers,

--adrian custer


