load balancing across multiple machines

Stephen Lime steve.lime at dnr.state.mn.us
Mon Aug 9 16:39:43 EDT 1999


Greetings: I thought I'd share a simple script we've used here to balance
map making across a number of machines. The script, balance.pl, is a
short Perl script that uses the UNIX 'rup' command, which in turn queries
the 'rpc.rstatd' daemon on each machine. The original script was made
available by Chris Stuber of the US Census Bureau. To use it you need:

  1) 2 or more machines set up identically with respect to a particular
      MapServer application. By identical I mean that each machine
      needs to respond to the same MapServer request in the same way.
      This is easiest if the machines are exact copies of each other.

  2) rpc.rstatd needs to be running on each machine in the pool.

  3) Perl 5 with the CGI module (CGI.pm), which the script loads.

Within the script you define a pool of machines, their web server ports,
and the domain and CGI path. The balance.pl script can be installed on
one or all of the machines depending on how you want to do things. In any
case, you replace calls to /cgi-bin/mapserv with /cgi-bin/balance.pl
and the script redirects each request to the MapServer on the least
busy machine (based on 1 minute load average). If you're using a
master machine for balance.pl then the HTML templates need to
explicitly refer to that machine for the next map:

<form name="mapserv" action="http://master.machine/cgi-bin/balance.pl">

If all machines have balance.pl then the following is fine:

<form name="mapserv" action="/cgi-bin/balance.pl">
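
For anyone who wants to see the "least busy machine" step on its own,
here is a minimal sketch that runs without rup. The sample lines only
imitate the format I'd expect rup to print, the load values are made up,
and the helper names parse_rup_line and least_busy are mine, not part of
balance.pl.

```perl
use strict;
use warnings;

# Extract (host, 1 minute load) from one line of rup-style output.
# Returns an empty list when the line carries no load average.
sub parse_rup_line {
    my ($line) = @_;
    if ($line =~ /^\s*(\S+)\s.*load average:\s*([\d.]+)/) {
        return ($1, $2);
    }
    return ();
}

# Return the host with the lowest load, or undef for an empty pool.
# Ties are broken alphabetically so the result is predictable.
sub least_busy {
    my (%pool) = @_;
    my ($best) = sort { $pool{$a} <=> $pool{$b} or $a cmp $b } keys %pool;
    return $best;
}

# Made-up sample input; the last line stands in for any non-status line.
my @sample = (
    "   pinchot    up  12 days,  3:04,    load average: 0.42, 0.30, 0.25\n",
    "   leopold    up  12 days,  3:01,    load average: 0.08, 0.12, 0.10\n",
    "no response from capra\n",
);

my %pool = map { parse_rup_line($_) } @sample;
my $best = least_busy(%pool);
print "$best\n";
```

With the sample above the script prints leopold, since 0.08 is the lowest
1 minute load and the last line is skipped.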

You can see it working with the following application:

http://www.ra.dnr.state.mn.us/bwca/

Map/image building is spread across 4 machines, each with the balancing
script locally. Seems to work really well. Load averages are nearly identical.
Some machines are busier than others so the number of maps varies, but
that's what we're after. This approach makes more sense, to me anyway, than
a DNS round-robin approach, and is certainly cheaper than load balancing
hardware.

Bugs: If a machine in the pool goes down then 'rup' will wait 60 seconds for
it to respond. 
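
One possible workaround for the wait, sketched here and not tested
against rup itself, is to run the command under Perl's alarm() and give
up after a few seconds. The sub name run_with_timeout and the timeout
values are hypothetical. The demo line runs perl itself so the example
works anywhere.

```perl
use strict;
use warnings;

# Run a command (list form, no shell) and return its output lines.
# Returns an empty list if it does not finish within $timeout seconds.
sub run_with_timeout {
    my ($timeout, @cmd) = @_;
    my (@lines, $pid);
    my $ok = eval {
        # Kill the child before dying; otherwise the implicit close of
        # the pipe would still wait for the command to finish.
        local $SIG{ALRM} = sub {
            kill 'TERM', $pid if $pid;
            die "timeout\n";
        };
        alarm $timeout;
        $pid = open(my $fh, '-|', @cmd) or die "cannot run $cmd[0]: $!\n";
        @lines = <$fh>;
        close $fh;
        alarm 0;
        1;
    };
    alarm 0;    # make sure no alarm is left pending
    return $ok ? @lines : ();
}

# Demo with perl itself; in balance.pl the call might look like
#   run_with_timeout(5, '/bin/rup', '-l', split(' ', $machines));
print run_with_timeout(2, $^X, '-e', 'print "ok\n"');
```

Note that a timeout here throws away the whole reply, including hosts
that did answer, so querying each machine separately may be a better fit.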

To do: Adding some sort of server weighting scheme should be pretty easy.
The timeout issue on dead machines also needs fixing (Perl may have some
RPC modules that I haven't looked into yet).
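
As a rough idea of what server weighting could look like, the sketch
below divides each machine's load by a weight before comparing. The
%weight table, the sub name least_busy_weighted, and all the numbers are
hypothetical. This is one simple scheme among several, not something the
current script does.

```perl
use strict;
use warnings;

# Hypothetical weights: a machine with weight 2 is treated as able to
# carry twice the load of a machine with weight 1.
my %weight = (pinchot => 1, leopold => 1, darwin => 2, capra => 1);

# Return the host with the lowest load/weight score, or undef if the
# pool is empty. Missing or zero weights are treated as 1.
sub least_busy_weighted {
    my ($pool, $weight) = @_;    # two hash references
    my ($best, $best_score);
    for my $host (sort keys %$pool) {
        my $w     = $weight->{$host} || 1;
        my $score = $pool->{$host} / $w;
        if (!defined $best_score || $score < $best_score) {
            ($best, $best_score) = ($host, $score);
        }
    }
    return $best;
}

# Made-up 1 minute load averages.
my %pool = (pinchot => 0.40, leopold => 0.35, darwin => 0.60, capra => 0.50);
print least_busy_weighted(\%pool, \%weight), "\n";
```

With these numbers darwin is chosen: its raw load is the highest, but
its score of 0.30 after weighting is the lowest.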

Again, the credit for this goes to Chris - also any blame ;-). 

Steve


Stephen Lime
Internet Applications Analyst
MIS Bureau - MN DNR

(651) 297-2937
steve.lime at dnr.state.mn.us

-------------- next part --------------
#!/usr/local/bin/perl

use CGI qw(:standard);

my $rup = "/bin/rup -l";

my $machines = "pinchot leopold darwin capra";
my %ports = ("pinchot" => ":8080", 
             "leopold" => ":8080",
             "darwin" => ":8080",
             "capra" => ":8080",
            );
my $domain = "dnr.state.mn.us";

my $script = "/cgi-bin/mapserv";
my $query = '';

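# Rebuild the query string from the incoming request so it can be
# passed through unchanged to the MapServer CGI.
my $cgi = CGI->new;

if($cgi->self_url =~ /\?/) {
  $query = $cgi->self_url;
  $query =~ s/.*\?//;
}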

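# Load Balancing setup
#
# Ask rup for the status of every machine in the pool and record each
# machine's 1 minute load average (the third field from the end).
@hosts = `$rup $machines`;
foreach $line (@hosts) {
  next unless $line =~ /load average/;  # skip anything that is not a status line
  $line =~ s/^\s+//;
  $line =~ s/\,//g;
  @array = split(' ',$line);
  $mach = $array[0];
  pop(@array);            # 15 minute load average
  pop(@array);            # 5 minute load average
  $load_1 = pop(@array);  # 1 minute load average
  $pool{$mach} = $load_1;
}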

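# Pick the machine with the lowest 1 minute load average.
foreach $mach (keys(%pool)) {
  if (!defined($preferred) || $pool{$mach} < $cload) {
     $cload = $pool{$mach};
     $preferred = $mach;
   }
}

# Assumed fallback: if rup returned nothing usable, use the first
# machine listed so the redirect never has an empty host name.
($preferred) = split(' ', $machines) unless defined($preferred);

# Redirect the browser to the MapServer CGI on the chosen machine.
print "Location: http://$preferred.$domain". $ports{$preferred} ."$script?$query\n\n";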

exit;

