We are now in a position to start creating real(ish) web sites, which can be found in the sample code at the web site for the book, http://oreilly.com/catalog/apache3/. For the sake of a little extra realism, we will base the site loosely round a simple web business, Butterthlies, Inc., that creates and sells picture postcards. We need to give it some web addresses, but since we don’t yet want to venture into the outside world, they should be variants on your own network ID. This way, all the machines in the network realize that they don’t have to go out on the Web to make contact. For instance, we edited the \windows\hosts file on the Windows 95 machine running the browser and the /etc/hosts file on the Unix machine running the server to read as follows:
127.0.0.1 localhost
192.168.123.2 www.butterthlies.com
192.168.123.2 sales.butterthlies.com
192.168.123.3 sales-IP.butterthlies.com
192.168.124.1 www.faraway.com
localhost is obligatory, so we left it in, but you should not make any server requests to it since the results are likely to be confusing.
You probably need to consult your network manager to make similar arrangements.
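Before going further, it can be worth confirming that the hosts file really maps each name to the address you intended. The following is a minimal sketch and not part of the book's sample code: lookup_host is a hypothetical helper name, and it assumes a hosts file in the usual layout of an IP address followed by one or more names on each line.

```shell
# lookup_host FILE NAME
# Print the IP address that the hosts file FILE assigns to NAME.
# Prints nothing if NAME is not listed. (Hypothetical helper for illustration.)
lookup_host() {
    awk -v name="$2" '
        $1 ~ /^#/ { next }                 # skip comment lines
        {
            # field 1 is the address; every later field is a name or alias
            for (i = 2; i <= NF; i++) {
                if ($i == name) { print $1; exit }
            }
        }
    ' "$1"
}
```

For example, lookup_host /etc/hosts www.butterthlies.com should print 192.168.123.2 if the file has been edited as shown above. The comparison is an exact string match, so it is case-sensitive, which real name resolution is not.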
site.simple is site.toddle with a few small changes. The script go will work anywhere. To get started, do the following, depending on your operating environment:
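# Create the logs directory if it does not already exist
test -d logs || mkdir logs
# Start Apache with the current directory as the server root and its conf/httpd.conf as the Config file
httpd -d `pwd` -f `pwd`/conf/httpd.conf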
On Win32, open an MS-DOS window and, from the command line, type:
c>cd \program files\apache group\apache
c>apache -k start
c>Apache/1.3.26 (Win32) running ...
To stop Apache, open a second MS-DOS window:
c>apache -k stop
c>cd logs
c>edit error.log
This will be true of each site in the demonstration setup, so we will not mention it again.
From here on, there will be minimal differences between the server setups necessary for Win32 and those for Unix. Unless one or the other is specifically mentioned, you should assume that the text refers to both.
It would be nice to have a log of what goes on. In the first edition of this book, we found that a file access_log was created automatically in ...site.simple/logs. In a rather bizarre move since then, the Apache Group has broken backward compatibility and now requires you to mention the log file explicitly in the Config file using the TransferLog directive.
The ... /conf/httpd.conf file now contains the following:
User webuser
Group webgroup
ServerName www.butterthlies.com
DocumentRoot /usr/www/APACHE3/APACHE3/site.simple/htdocs
TransferLog logs/access_log
In ... /htdocs we have, as before, 1.txt :
hullo world from site.simple again!
Type ./go on the server. Become the client, and retrieve http://www.butterthlies.com. You should see:
Index of /
. Parent Directory
. 1.txt
Click on 1.txt for an inspirational message as before.
This all seems satisfactory, but there is a hidden mystery. We get the same result if we connect to http://sales.butterthlies.com. Why is this? Why, since we have not mentioned either of these URLs or their IP addresses in the configuration file on site.simple, do we get any response at all?
The answer is that when we configured the machine on which the server runs, we told the network interface to respond to any of these IP addresses:
192.168.123.2
192.168.123.3
By default, Apache listens to all IP addresses belonging to the machine and responds in the same way to all of them. If there are virtual hosts configured (which there aren’t, in this case), Apache runs through them, looking for one whose IP address or hostname corresponds to the incoming connection. Apache uses that configuration if it is found, or the main configuration if it is not. Later in this chapter, we look at more definite control with the directives BindAddress, Listen, and <VirtualHost>.
It has to be said that working like this (that is, switching rapidly between different configurations) seemed to get Netscape or Internet Explorer into a rare muddle. To be sure that the server was functioning properly while using Netscape as a browser, it was usually necessary to reload the file under examination by holding down the Control key while clicking on Reload. In extreme cases, it was necessary to disable caching by going to Edit → Preferences → Advanced → Cache. Set memory and disk cache to 0, and set cache comparison to Every Time. In Internet Explorer, set Cache Compares to Every Time. If you don’t, the browser tends to display a jumble of several different responses from the server. This occurs because we are doing what no user or administrator would normally do, namely, flipping around between different versions of the same site with different versions of the same file. Whenever we flip from a newer version to an older version, Netscape is led to believe that its cached version is up-to-date.
Back on the server, stop Apache with ^C, and look at the log files. In ... /logs/access_log, you should see something like this:
192.168.123.1 - - [<date-time>] "GET / HTTP/1.1" 200 177
200 is the response code (meaning “OK, cool, fine”), and 177 is the number of bytes transferred. In ... /logs/error_log, there should be nothing because nothing went wrong. However, it is a good habit to look there from time to time, though you have to make sure that the date and time logged correspond to the problem you are investigating. It is easy to fool yourself with some long-gone drama.
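Once the access log holds more than a handful of lines, a quick summary by response code can be easier to read than the raw file. The following is a rough sketch, assuming each log line ends with the response code followed by the byte count, as in the line shown above. The name summarize_log is hypothetical and is not part of Apache or of the book's sample code.

```shell
# summarize_log FILE
# For each response code in an access log, print the code, the number of
# requests that returned it, and the total bytes transferred.
# Assumes the last two whitespace-separated fields of every line are the
# response code and the byte count. (Hypothetical helper for illustration.)
summarize_log() {
    awk '
        NF >= 2 {                          # ignore blank or truncated lines
            count[$(NF-1)]++               # next-to-last field: response code
            bytes[$(NF-1)] += $NF          # last field: bytes ("-" counts as 0)
        }
        END {
            for (code in count) print code, count[code], bytes[code]
        }
    ' "$1" | sort
}
```

Running summarize_log logs/access_log from the site directory would print one line per response code seen. Any code other than 200 is a hint to go and look at the error log for the same date and time.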
Life being what it is, things can go wrong, and the client can ask for something the server can’t provide. It makes sense to allow for this with the ErrorDocument directive.