The cheap solution is mod_backhand, distributed under
the same licence as Apache. It originated in the Center for
Networking and Distributed Systems at Johns Hopkins University.
Its function is to keep track of the resources of the individual machines that run Apache and are connected in a cluster. It then diverts incoming requests to the machines with the most resources available. There is a small overhead in the redirection, but overall, the cluster works much better.
In the simplest arrangement, a single server has the
site’s IP address and farms the requests out to the
other servers, which are set up identically (apart from IP addresses)
and with identical mod_backhand directives. The
machines communicate with each other (once a second, by default, but
this can be changed), exchanging information on the resources each
currently has available. On the basis of this information, the
machine that catches a request can forward it to the machine best
able to deal with it. Naturally, there is a computing cost to this,
but it is small and predictable.
mod_backhand works like a proxy server, but one
that knows the capabilities of the servers behind it and how those
capabilities vary from moment to moment.
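The selection step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not mod_backhand's own code: the `ServerReport` fields, the `pick_server` function, and the five-second freshness limit are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ServerReport:
    """Resource snapshot one machine sends to its peers (hypothetical fields)."""
    name: str
    load: float         # run-queue load average; lower is better
    age_seconds: float  # time since this report was received


def pick_server(reports, max_age=5.0):
    """Return the name of the least-loaded server whose report is still fresh."""
    # Ignore machines we have not heard from recently; they may be down.
    fresh = [r for r in reports if r.age_seconds <= max_age]
    if not fresh:
        return None  # no usable information; the caller should serve locally
    return min(fresh, key=lambda r: r.load).name


reports = [
    ServerReport("web1", load=2.5, age_seconds=0.4),
    ServerReport("web2", load=0.3, age_seconds=0.9),
    ServerReport("web3", load=0.1, age_seconds=30.0),  # stale report
]
print(pick_server(reports))  # web2
```

Although web3 reports the lowest load, its report is too old to trust, so the request goes to web2.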
It is possible to vary this setup so that different machines do different things: for instance, you might have some 64-bit machines (DEC Alphas, for example) that specialize in running CGI scripts, while PCs are used to serve images.
A more complex setup is to use multiple servers fielding the incoming
requests and handing them off to each other. There are essentially
two ways of handling this. The first is to use standard
load-balancing hardware to distribute the requests among the servers
and then use mod_backhand to redistribute them
more intelligently. An alternative is to use round-robin
DNS, that is, to give each machine a different IP address but
to have the server name resolve to all of the addresses. This has the
advantage that you avoid the expense of the load balancer (and the
problems of single points of failure, too), but the problem is that
if a server dies, there’s no easy way to handle the
fact that its IP address is no longer being serviced. One answer to this
problem is Wackamole, also from CNDS, which builds on the rather
marvelous Spread toolkit to ensure that every IP address is always in
service on some machine.
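The goal of keeping every address in service can be illustrated with a small sketch. This is not Wackamole's algorithm or API. It is a hypothetical `reassign` function, using documentation-range addresses and invented server names, that hands the addresses of dead machines to whichever live machine currently holds the fewest.

```python
def reassign(addresses, owners, alive):
    """Return an address -> server map in which every address has a live owner."""
    live = sorted(alive)
    if not live:
        raise RuntimeError("no live servers left to take over addresses")
    counts = {server: 0 for server in live}
    result = {}
    # First pass: addresses whose current owner is still alive stay put.
    for ip in addresses:
        owner = owners.get(ip)
        if owner in counts:
            result[ip] = owner
            counts[owner] += 1
    # Second pass: orphaned addresses go to the live server holding the fewest.
    for ip in addresses:
        if ip not in result:
            target = min(live, key=lambda s: (counts[s], s))
            result[ip] = target
            counts[target] += 1
    return result


addresses = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
owners = {"192.0.2.1": "web1", "192.0.2.2": "web2", "192.0.2.3": "web3"}
new_owners = reassign(addresses, owners, alive={"web1", "web3"})
print(new_owners["192.0.2.2"])  # web1
```

Because the DNS name still resolves to all three addresses, clients that are handed 192.0.2.2 continue to get an answer after web2 dies.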
This is all well and good, and the idea behind
mod_backhand (choosing a lightly loaded
server to service a request on the fly) clearly seems a good
one. But there are problems. The main one is deciding which server to choose.
The operating system provides loading information in the form of a
one-minute rolling average of the length of the run queue, updated
every five seconds. Since a busy site could get 5,000 hits before the
next update, it is clear that just choosing the most lightly loaded
server each time will overwhelm it, because every one of those hits
would go to the same machine. The granularity of this data is
much too coarse. Consequently, mod_backhand has a
number of methods for picking a reasonably lightly loaded server.
Just which method is best involves a lot of real-world
experimentation, and the jury is still out.
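The stale-data problem is easy to reproduce in a toy simulation. The sketch below is an illustration under stated assumptions, not one of mod_backhand's actual methods: the load figures are invented, and `random_among_lightest` is a hypothetical strategy that picks at random among the few most lightly loaded servers instead of always taking the single lightest one.

```python
import random


def simulate(pick, reported, hits=5000, seed=0):
    """Send `hits` requests between two load updates; return hits per server."""
    rng = random.Random(seed)
    counts = [0] * len(reported)
    for _ in range(hits):
        # `reported` never changes here: no fresh load data arrives in time.
        counts[pick(reported, rng)] += 1
    return counts


def least_loaded(reported, rng):
    """Always choose the server with the lowest reported load."""
    return min(range(len(reported)), key=reported.__getitem__)


def random_among_lightest(reported, rng, k=3):
    """Choose at random among the k servers with the lowest reported load."""
    lightest = sorted(range(len(reported)), key=reported.__getitem__)[:k]
    return rng.choice(lightest)


reported = [0.9, 0.7, 0.2, 0.8]  # invented load figures from the last update
herd = simulate(least_loaded, reported)
spread = simulate(random_among_lightest, reported)
print(herd)  # [0, 0, 5000, 0]
print(spread)
```

With the naive strategy, all 5,000 hits land on the one server that looked idle at the last update. The randomized strategy spares the busiest machine and shares the hits among the other three.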