mod_backhand

The cheap solution is mod_backhand, distributed under the same licence as Apache. It originated in the Center for Networking and Distributed Systems (CNDS) at Johns Hopkins University.

Its function is to keep track of the resources of the individual machines that run Apache and are connected in a cluster. It then diverts incoming requests to the machines with the most resources available. The redirection adds a small overhead, but the cluster as a whole handles its load much better.

In the simplest arrangement, a single server has the site’s IP address and farms the requests out to the other servers, which are set up identically (apart from IP addresses) and with identical mod_backhand directives. The machines communicate with each other (once a second, by default, but this can be changed), exchanging information on the resources each currently has available. On the basis of this information, the machine that catches a request can forward it to the machine best able to deal with it. Naturally, there is a computing cost to this, but it is small and predictable.
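As a rough illustration of the selection step, the following Python sketch keeps the most recent resource report from each machine and picks a target for a request. It is not mod_backhand's own code: the field names, the staleness cutoff, and the "lowest load wins" rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Latest resource report received from one machine (hypothetical fields)."""
    host: str
    load: float         # load figure the machine reported about itself
    age_seconds: float  # time since that report arrived


MAX_AGE = 5.0  # assumed cutoff: ignore machines that have gone quiet


def pick_server(reports):
    """Return the host with the lowest reported load among fresh reports."""
    fresh = [r for r in reports if r.age_seconds <= MAX_AGE]
    if not fresh:
        return None  # nobody has reported recently; the caller serves locally
    return min(fresh, key=lambda r: r.load).host


reports = [
    Report("web1", load=2.5, age_seconds=0.4),
    Report("web2", load=0.7, age_seconds=0.9),
    Report("web3", load=0.1, age_seconds=30.0),  # stale, so it is skipped
]
chosen = pick_server(reports)
print(chosen)
```

Discarding stale reports matters here: a machine that has stopped reporting may look idle simply because it has crashed.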

mod_backhand works like a proxy server, but one that knows the capabilities of the servers behind it and how those capabilities vary from moment to moment.

It is possible to vary this setup so that different machines do different things. For instance, you might have some 64-bit machines (DEC Alphas, for example) that specialize in running CGI scripts, while PCs serve the images.
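The idea of specialization can be sketched as a classification step that runs before any load-based choice is made. The Python below is only an illustration: the pool names, host names, and the rules for recognizing CGI and image requests are hypothetical, and mod_backhand expresses this kind of policy through its own configuration directives rather than code like this.

```python
import posixpath

# Hypothetical pools; the host names and groupings are assumptions.
POOLS = {
    "cgi": ["alpha1", "alpha2"],
    "static": ["pc1", "pc2", "pc3"],
}
IMAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png"}


def pool_for(path):
    """Classify a request path and return the machines allowed to serve it."""
    if path.startswith("/cgi-bin/"):
        return POOLS["cgi"]
    extension = posixpath.splitext(path)[1].lower()
    if extension in IMAGE_EXTENSIONS:
        return POOLS["static"]
    # Anything else may go to any machine in the cluster.
    return POOLS["cgi"] + POOLS["static"]


print(pool_for("/cgi-bin/search.pl"))
print(pool_for("/images/logo.GIF"))
```

The load-based choice would then be made only among the machines in the returned pool.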

A more complex setup is to use multiple servers fielding the incoming requests and handing them off to each other. There are essentially two ways of handling this. The first is to use standard load-balancing hardware to distribute the requests among the servers and then use mod_backhand to redistribute them more intelligently. The alternative is to use round-robin DNS: give each machine a different IP address, but have the server name resolve to all of the addresses. This avoids the expense of the load balancer (and the single point of failure it introduces), but the problem is that if a server dies, there is no easy way to handle the fact that its IP address is no longer being serviced. One answer to this problem is Wackamole, also from CNDS, which builds on the rather marvelous Spread toolkit to ensure that every IP address is always in service on some machine.
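The weakness of plain round-robin DNS can be seen in a toy simulation. The sketch below assumes a strict rotation over three addresses, one of which belongs to a machine that has crashed. The addresses are placeholders from the documentation range, and real resolvers and client-side caches do not rotate this neatly, so treat the numbers as illustrative only.

```python
from itertools import cycle

# Placeholder addresses from the 192.0.2.0/24 documentation range.
ADDRESSES = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
DEAD = {"192.0.2.11"}  # assume this machine has crashed


def simulate(requests):
    """Hand out addresses in strict rotation and count failed connections."""
    rotation = cycle(ADDRESSES)
    failures = 0
    for _ in range(requests):
        if next(rotation) in DEAD:
            failures += 1
    return failures


failed = simulate(300)
print(f"{failed} of 300 requests were sent to the dead address")
```

With one of three machines down, a third of the requests go nowhere until the DNS records are changed or some other machine takes over the orphaned address.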

This is all very fine and good, and the idea behind mod_backhand (choosing a lightly loaded server to service a request on the fly) seems a good one. But there are problems. The main one is deciding which server to choose. The operating system provides loading information in the form of a one-minute rolling average of the length of the run queue, updated every five seconds. Since a busy site could get 5,000 hits before the next update, always choosing the server that looked most lightly loaded at the last update will overwhelm it. The granularity of this data is much too coarse. Consequently, mod_backhand has a number of methods for picking a reasonably lightly loaded server. Working out which method is best takes a lot of real-world experimentation, and the jury is still out.
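The effect of coarse load data is easy to demonstrate. The sketch below sends 5,000 requests to four servers using a load snapshot that never refreshes during the run. It compares always picking the lowest figure with picking at random among the lighter half of the servers. Both the load values and the second policy are invented for illustration; the second is not claimed to be one of mod_backhand's methods.

```python
import random

# Made-up load figures captured at the last update; they do not change
# while the burst of requests arrives.
STALE_SNAPSHOT = [1.8, 0.4, 1.2, 0.9]


def simulate(policy, snapshot=STALE_SNAPSHOT, hits=5000, seed=1):
    """Return the number of requests each server receives under `policy`."""
    rng = random.Random(seed)
    count = len(snapshot)
    # Server indexes ordered from lightest to heaviest reported load.
    ranked = sorted(range(count), key=lambda i: snapshot[i])
    lighter_half = ranked[: max(1, count // 2)]
    received = [0] * count
    for _ in range(hits):
        if policy == "least_loaded":
            target = ranked[0]
        else:  # "random_of_lightest"
            target = rng.choice(lighter_half)
        received[target] += 1
    return received


greedy = simulate("least_loaded")
spread = simulate("random_of_lightest")
print("least loaded:      ", greedy)
print("random of lightest:", spread)
```

Under the first policy every request lands on a single machine, which is exactly the overload described above. Adding some randomness spreads the burst, at the cost of sometimes choosing a machine that is not the lightest.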