<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Hi,<br>
This is just my opinion on how to improve future benchmarks. I think
that for the tests to be more like the real world, we need to make
sure the total amount of hot data that goes through the rendering
pipe is much bigger than the amount of available physical memory.
So, for instance, on an 8GB box we should be rendering 16GB of raw
data for a full run. My guesstimate for this year's vector working
set (from those 2152 query windows) is around 2GB, which, as one
team discovered, can be coerced into the memory cache (at the OS level)
with some conscious effort from its map server. <br>
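<br>
As a rough illustration of the sizing arithmetic above, here is a
minimal Python sketch. The function names, the 2x factor (taken from
the 8GB/16GB example) and the 1GB set aside for the OS and the map
server are assumptions for illustration only, not measured values.<br>
<pre>
```python
# Sizing sketch; all names and default values here are illustrative assumptions.

def min_hot_data_gb(physical_ram_gb, oversubscription=2.0):
    """Total hot data (GB) a full run should touch so it cannot all sit in RAM."""
    return physical_ram_gb * oversubscription


def fits_in_os_cache(working_set_gb, physical_ram_gb, reserved_gb=1.0):
    """True when the working set could be held entirely in the OS memory cache.

    reserved_gb is a rough allowance for the OS and the map server itself.
    """
    return physical_ram_gb - reserved_gb >= working_set_gb


if __name__ == "__main__":
    ram_gb = 8.0
    print(min_hot_data_gb(ram_gb))         # target data volume for a full run
    print(fits_in_os_cache(2.0, ram_gb))   # this year's estimated vector set
    print(fits_in_os_cache(16.0, ram_gb))  # the proposed larger data volume
```
</pre>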
<br>
Maybe in addition to this large working set, we can devise a smaller
one so that every team is encouraged to cache it in memory, thereby
eliminating any disk-bound behavior. This would be a good indication of
the raw 'map rendering' performance, but it probably would not say
much about scalability in a real-world situation.<br>
<br>
thanks<br>
LJ<br>
<br>
<br>
On 9/6/2010 11:47 AM, Luc Donea wrote:
<blockquote
cite="mid:C4981DA0DB5E59488746E77B0DD31301ACBE41@belex02.lggm.llc"
type="cite">
<title>RE : [Benchmarking] data block caching technique</title>
<p><font size="2">Hi Adrian and all,<br>
<br>
Thanks for your quick answer Adrian. I like your idea to
generate newer, shorter scripts.<br>
<br>
        I would not limit it to only two threads; I think it remains
        important to show the server scalability at 64 users. We could
        do 1, 4 and 64 users to keep it short. The important thing to me
        is to create a different csv for the 2nd and 3rd runs. This
        would ensure a more realistic result and still allow servers
        to warm up. I also think we should rerun not only the imagery
        tests but also the vector tests. Do you think you would be able
        to create these tests? I must admit that I am unable to do that... :-)<br>
<br>
        It would indeed be really nice to run the benchmarks all
        together so everyone can follow what is happening, but I guess
        it could take quite a while to run 16 tests and start 8
        servers. I guess it could easily turn into a three-hour session.<br>
<br>
        I'd like to know what the others think about this and
        what everyone's availability would be on Tuesday.<br>
<br>
Thanks,<br>
<br>
Luc<br>
<br>
        -------- Original message --------<br>
        From: <a class="moz-txt-link-abbreviated" href="mailto:benchmarking-bounces@lists.osgeo.org">benchmarking-bounces@lists.osgeo.org</a> on behalf of Adrian
        Custer<br>
        Date: Mon. 06/09/2010 04:58<br>
        To: <a class="moz-txt-link-abbreviated" href="mailto:benchmarking@lists.osgeo.org">benchmarking@lists.osgeo.org</a><br>
        Subject: Re: [Benchmarking] data block caching technique<br>
<br>
Hey all,<br>
<br>
<br>
On Mon, 2010-09-06 at 09:38 +0200, Luc Donea wrote:<br>
><br>
        > We think that this combination of unrealistic test
        design and data<br>
        > block caching technique is unfair to other participants
        and will make<br>
        > their results look bad, while they might perform as well
        or even<br>
        > better in a real-world use case.<br>
><br>
We tried to raise this issue early on by saying that all those
in the<br>
benchmarking effort really needed to agree on what kind of a
setup we<br>
were trying to mimic in the benchmark so that we could then
build tests<br>
which reasonably represented that setup.<br>
<br>
Because we did not do that work, it seems we have stumbled
into an edge<br>
case for which some servers are able to work only from main
memory. When<br>
        we agreed to use the current tiny raster data set (compared to
        the 1.3 TB<br>
        full .ecw dataset for all of Spain), we realized that we would
not be<br>
benchmarking a real, industrial dataset. However, we did not
know that<br>
it would be just small enough that, coupled with repeated
request sets,<br>
some servers would be working from main memory.<br>
<br>
<br>
        > I think that everyone should publish all 3 run results
        and guarantee<br>
        > that these have been measured just after restarting the
        server. We would<br>
        > also like those using such a technique to rerun their
        tests after<br>
        > disabling it.<br>
<br>
The question of how to resolve this situation is more
difficult.<br>
<br>
<br>
We had a vote on which scripts to use, and the vote result was
in favour<br>
of switching. Seeing the results of the vote, our team started
all our<br>
runs with the newer scripts.<br>
<br>
However, the vote seems to have been totally ignored. I
personally do<br>
not like working through this voting process but would rather
work<br>
through the slower but more friendly and productive process of
getting<br>
        everyone to agree on a consensus position. Nonetheless, up
        until the<br>
script vote, everything in this benchmarking process was done
through<br>
voting. I am puzzled as to why, on this issue, the vote was
ignored.<br>
<br>
<br>
<br>
The proposal you make, Luc, would be difficult to follow. I
imagine few<br>
of the teams bothered to flush the memory cache before making
their<br>
runs. I have observed Constellation-SDI both after filling the
caches<br>
and after emptying them---the results are, unsurprisingly,
totally<br>
different. So your proposal boils down to every team
re-running a set of<br>
benchmarks.<br>
<br>
I am open to any productive suggestion of how to resolve the
issue. We<br>
could easily generate newer, shorter scripts, say with only
one or two<br>
thread combinations to test servers as if they were serving
the whole<br>
imagery dataset for Spain yet be able to complete runs for all
the teams<br>
in the remaining time. We might even be able to make the runs
for all<br>
servers during our meeting time on the 7th. It would seem
generally a<br>
good strategy anyhow to run the benchmarks all together so
everyone can<br>
follow what is happening.<br>
<br>
<br>
--adrian custer<br>
<br>
<br>
<br>
<br>
_______________________________________________<br>
Benchmarking mailing list<br>
<a class="moz-txt-link-abbreviated" href="mailto:Benchmarking@lists.osgeo.org">Benchmarking@lists.osgeo.org</a><br>
<a moz-do-not-send="true"
href="http://lists.osgeo.org/mailman/listinfo/benchmarking">http://lists.osgeo.org/mailman/listinfo/benchmarking</a><br>
<br>
</font>
</p>
</blockquote>
<br>
</body>
</html>