[Benchmarking] Testing setup (following up Adrian Custer questions)

Thu Aug 5 10:20:08 EDT 2010

Hi,
I've seen that there are a number of questions about how we're going to
run the tests this year.

I'd prefer to start discussing them at ease on the mailing list so that
we can start building the testing scripts sooner rather than later.

> Jmeter testing
>   # what output formats will be requested?

For raster I think we all agreed on plain JPEG.
That still says nothing about the compression factor: for some
servers, going for very low compression might prove to be an
advantage. If the servers have tunable compression rates, it might
be interesting to keep them the same in both the baseline and best
effort tests (otherwise, note down the differences in returned image
size using some agreed-upon WMS requests).

Vector-wise I'd go for PNG24, but some asked for PNG8 to be
tested as well.
If we do PNG8, I'd say we run just one test with it, maybe the one
with all the layers since it has more colors, so that we can
compare it with PNG24
(and show both the speed and the resulting size?).

PNG8 + antialiasing normally requires dynamic palette computation
and scale down.
Is it OK to use a precomputed palette?
I propose to force dynamic computation for the baseline test
(and allow a precomputed palette for best effort, where we have a
pretty free hand).

>  # what projections will be requested? (Plate carrée is not conformal)

I don't think we should care about the geometric properties of
the resulting map. The interest of the test is seeing how much
reprojection slows down the server when enabled.

But I agree we should do a test with some reprojection, possibly
a simple one without labels, so that the reprojection engine differences
show up.

So how about we do all vector tests in EPSG:4326, and redo the polygon
one, which has simple styling, in EPSG:3857 as well?
Pretty much everyone has to deal with that code these days in order
to overlay with Google/Bing/Yahoo layers.
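
If the request generator only produces lon/lat envelopes, the EPSG:3857
variant of the polygon test needs the same envelopes expressed in
spherical Mercator metres. A minimal sketch of that conversion follows;
the function names are made up for illustration and are not part of any
existing script.

```python
import math

# Sphere radius used by EPSG:3857 (equal to the WGS84 semi-major axis), in metres.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_3857(lon, lat):
    """Convert one lon/lat pair (degrees) to spherical Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def bbox_4326_to_3857(minx, miny, maxx, maxy):
    """Convert a lon/lat bbox; two corners are enough because in Mercator
    x depends only on longitude and y only on latitude."""
    x0, y0 = lonlat_to_3857(minx, miny)
    x1, y1 = lonlat_to_3857(maxx, maxy)
    return x0, y0, x1, y1
```

Note that the same lon/lat envelope does not keep its aspect ratio in
3857, so the image width/height may need adjusting if square pixels
matter for the test.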

>   # what layer combinations will be requested? (All vector layers or various mixed subsets?)

This was discussed lightly but I don't remember if there was an
agreement. Let me try to propose something:

* all polygon layers together (4326 and 3857, PNG24)
* the contour layers, labeled (4326, PNG24)
* all the vector layers together (4326, PNG24 and PNG8)
* the raster layers (in their native SRS and in 3857)
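
To see how many separate runs the list above implies, it can be written
down as data and expanded. This is only a sketch: the group names and
the "native" placeholder for the raster SRS are labels invented here,
and JPEG for raster is taken from the output format discussion earlier
in this mail.

```python
from itertools import product

# (layer group, SRS list, output format list), as proposed in the list above.
TEST_PLAN = [
    ("all polygon layers", ["EPSG:4326", "EPSG:3857"], ["PNG24"]),
    ("contour layers, labeled", ["EPSG:4326"], ["PNG24"]),
    ("all vector layers", ["EPSG:4326"], ["PNG24", "PNG8"]),
    ("raster layers", ["native", "EPSG:3857"], ["JPEG"]),
]


def expand_plan(plan):
    """Return one (group, srs, format) tuple per individual test run."""
    runs = []
    for group, srs_list, format_list in plan:
        for srs, fmt in product(srs_list, format_list):
            runs.append((group, srs, fmt))
    return runs


if __name__ == "__main__":
    for run in expand_plan(TEST_PLAN):
        print("%-25s %-10s %s" % run)
```

With this proposal each server would go through seven runs, each of
them repeated for every thread count.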

> # what envelopes will be issued? (Frank's 2009 query generation script forces a hard cutoff at the largest scale)

We have nothing better than Frank's 2009 generation script until
someone improves it or proposes another.

Let's talk scales and area instead.
Many vector layers become visible at 1:30,000, so I guess that defines
one of the boundaries. And then we go up to 1:1000?

Area-wise, it is tempting to take the whole of Spain for the vector
layers, but the tool will generate requests in empty areas as well.
See:
http://12.189.158.78:8080/geoserver/wms?service=WMS&version=1.1.0&request=GetMap&layers=shp:spain&styles=&bbox=-18.167,27.642,4.352,43.805&width=512&height=367&srs=EPSG:4326&format=application/openlayers

If we limit ourselves to
- lat between 37.5 and 42.7
- lon between -6 and -1
we'll basically get something for all vector requests
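
As an illustration of what restricting the area and the scale range
means for the generated envelopes, here is a rough sketch of a random
bbox generator. It is not Frank's script. It assumes the usual OGC
convention of a 0.28 mm pixel for turning a scale denominator into a
ground resolution, a fixed degrees-to-metres factor taken at the
equator, and a 512x367 image like the one in the URL above; all
function names are made up.

```python
import math
import random

PIXEL_SIZE_M = 0.00028                             # assumed OGC rendering pixel size
METERS_PER_DEGREE = math.pi * 6378137.0 / 180.0    # length of one degree at the equator

LON_RANGE = (-6.0, -1.0)    # proposed longitude limits
LAT_RANGE = (37.5, 42.7)    # proposed latitude limits


def random_bbox(rng, scale_denominator, width, height):
    """Pick a lon/lat bbox of the given pixel size and scale inside the area."""
    deg_per_px = scale_denominator * PIXEL_SIZE_M / METERS_PER_DEGREE
    w, h = width * deg_per_px, height * deg_per_px
    if w > LON_RANGE[1] - LON_RANGE[0] or h > LAT_RANGE[1] - LAT_RANGE[0]:
        raise ValueError("bbox does not fit in the test area at this scale")
    minx = rng.uniform(LON_RANGE[0], LON_RANGE[1] - w)
    miny = rng.uniform(LAT_RANGE[0], LAT_RANGE[1] - h)
    return minx, miny, minx + w, miny + h


def generate_requests(count, seed, min_scale=1000.0, max_scale=30000.0,
                      width=512, height=367):
    """Return `count` bboxes with scales spread evenly in log space."""
    rng = random.Random(seed)   # fixed seed so every server gets the same requests
    boxes = []
    for _ in range(count):
        scale = math.exp(rng.uniform(math.log(min_scale), math.log(max_scale)))
        boxes.append(random_bbox(rng, scale, width, height))
    return boxes
```

Seeding the generator keeps the request set identical across servers,
which matters more for the comparison than the exact distribution of
scales.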

For raster requests we can take the bbox of the raster layers. Yes, we
have a hole there, and the tool will generate requests there too.
Shall we just keep the empty requests?

> # what output metrics will be created?

JMeter running from the command line unfortunately does not generate
metrics. So we rolled a little Python script that generates
min/max/average time and req/s.
Last year we charted req/s only, to avoid cluttering the diagrams.
If people want to have more we can get
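
For reference, the kind of summary described above can be sketched in a
few lines. This is not last year's script: it assumes the samples have
already been pulled out of the JMeter results as (start time, elapsed
time) pairs in milliseconds, and how that extraction is done is left
out.

```python
def summarize(samples):
    """Summarize (start_ms, elapsed_ms) pairs into min/max/avg time and req/s."""
    samples = list(samples)
    if not samples:
        raise ValueError("no samples to summarize")
    elapsed = [e for _, e in samples]
    first_start = min(s for s, _ in samples)
    last_end = max(s + e for s, e in samples)
    wall_seconds = (last_end - first_start) / 1000.0
    return {
        "min_ms": min(elapsed),
        "max_ms": max(elapsed),
        "avg_ms": sum(elapsed) / float(len(elapsed)),
        # Throughput over the wall-clock span of the whole group, so that
        # requests overlapping across threads are counted correctly.
        "req_per_s": len(samples) / wall_seconds if wall_seconds > 0 else float("inf"),
    }
```

Computing req/s from the wall-clock span rather than from the average
response time is what makes the numbers comparable across thread
counts.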

By the way, I asked about running JMeter in graphical sessions a few
days ago but got no answer (see the "Graphical session on the jmeter
machine" thread).

> # how will the jmx file be designed? (need one .csv file per thread so all requests are different) 

I was planning to modify last year's jmx to have a progression
of 1 2 4 8 16 32 64 threads, each group making enough requests
to stabilize the average (I'd say 200, 200, 400, 400, 400, 400, 800,
800). As usual I'm open to suggestions, but I'd suggest avoiding
too many requests: we have many servers and we cannot afford to have
the total run take several hours.

As far as I know one csv is sufficient: all threads pick the next value
from the shared csv as if it were a shared input queue (and roll over to
the start if it ends, but we'll generate enough requests to make sure
no request is ever repeated in the same session).
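
The shared-queue behaviour described above can be mimicked outside
JMeter to check the reasoning. The sketch below is plain Python
threading, not JMeter code, and the class and function names are
invented: every thread takes the next row from one shared feed, and the
feed wraps around to the start when the rows run out.

```python
import itertools
import threading


class SharedRequestFeed(object):
    """One list of request rows shared by all threads, with rollover."""

    def __init__(self, rows):
        rows = list(rows)
        if not rows:
            raise ValueError("need at least one request row")
        self._rows = itertools.cycle(rows)   # restart from the top at the end
        self._lock = threading.Lock()

    def next_row(self):
        # The lock makes sure no two threads are handed the same row.
        with self._lock:
            return next(self._rows)


def run_threads(feed, thread_count, requests_per_thread):
    """Let each thread pull its requests from the feed; return what was taken."""
    taken = []
    taken_lock = threading.Lock()

    def worker():
        for _ in range(requests_per_thread):
            row = feed.next_row()
            with taken_lock:
                taken.append(row)

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return taken
```

As long as the file holds at least thread count times requests per
thread rows, no row is handed out twice; beyond that the rollover kicks
in and requests start repeating.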

That's it. Opinions, comments, contributions and, more importantly,
help setting up the scripts are all very much appreciated ;-)

Cheers
Andrea

-- 
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.