One of the most important rules when developing networked code is that your program should never trust the connected peer. Your code should never assume that the connected peer sends data in a particular format. This is especially vital for server code that may communicate with multiple clients at once.
If your code doesn't carefully check for errors and unexpected conditions, then it will be vulnerable to exploits.
Consider the following code which receives data into a buffer until a space character is found:
char buffer[1028] = {0};
char *p = buffer;
while (!strstr(buffer, " "))
    p += recv(client, p, 1028, 0);
The preceding code is simple. It reserves 1,028 bytes of buffer space and then uses recv() to write received data into that space. The p pointer is updated on each read to indicate where the next data should be written. The code then loops until the strstr() function detects a space character.
That code could be useful to read data from a client until an HTTP verb is detected. For example, it could receive data until GET and the space that follows it are received, at which point the server can begin to process a GET request.
One problem with the preceding code is that recv() could write past the end of the allocated space for buffer. This is because 1028 is passed to recv() on every call, even after p has advanced past data that has already been written. If a network client can cause your code to write past the end of a buffer, then that client may be able to completely compromise your server. This is because both data and executable code are stored in your server's memory. A malicious client may be able to write executable code past the buffer array and cause your program to execute it. Even if the malicious code isn't executed, the client could still overwrite other important data in your server's memory.
The preceding code can be fixed by passing to recv() only the amount of buffer space remaining:
char buffer[1028] = {0};
char *p = buffer;
while (!strstr(buffer, " "))
    p += recv(client, p, 1028 - (p - buffer), 0);
In this case, recv() is not able to write more than 1,028 bytes total into buffer. You may think that the memory errors are resolved, but you would be wrong. Consider a client that sends 1,028 bytes, but no space characters. Your code then calls strstr() looking for a space character. Because buffer is now completely full, strstr() finds neither a space character nor a null terminating character inside it. In that case, strstr() continues to read past the end of buffer into unallocated memory.
So, you fix this issue by only allowing recv() to write 1,027 bytes total. This reserves one byte to remain as the null terminating character:
char buffer[1028] = {0};
char *p = buffer;
while (!strstr(buffer, " "))
    p += recv(client, p, 1027 - (p - buffer), 0);
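Now your code won't write or read past the array bounds for buffer, but the code is still very broken. Consider a client that sends 1,027 characters, none of them a space. Or consider a client that sends a single null character. In the first case, the buffer is full and no room is left for recv() to store anything more; in the second, strstr() treats the null byte as the end of the string and never sees any data that follows it. In either case, the preceding code continues to loop forever, thus locking up your server and preventing other clients from being served.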
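One way to close these remaining gaps is sketched below. It is only an illustration under stated assumptions, not code taken from web_server.c: the helper name recv_until_space() and its return convention are hypothetical, and the headers are the POSIX ones (a Winsock build would include winsock2.h instead). The sketch gives up when recv() reports a closed connection or an error, stops when the buffer is full, and searches with memchr() over the bytes actually received so that an embedded null byte cannot hide later data.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: read from sock into buf (capacity size) until a
 * space arrives. Returns the number of bytes stored, or -1 if the peer
 * disconnects, recv() fails, or the buffer fills up first. */
long recv_until_space(int sock, char *buf, size_t size)
{
    size_t used = 0;

    if (size == 0) return -1;

    while (used < size - 1) {          /* keep one byte for the terminator */
        ssize_t n = recv(sock, buf + used, size - 1 - used, 0);
        if (n <= 0) return -1;         /* 0: peer closed; <0: error */

        /* Search only the new bytes. memchr() is bounded by a length,
         * so it neither overruns buf nor stops at an embedded null. */
        int found = memchr(buf + used, ' ', (size_t)n) != NULL;
        used += (size_t)n;
        buf[used] = '\0';

        if (found) return (long)used;
    }
    return -1;                         /* buffer full, no space seen */
}
```

A caller would treat -1 as a reason to drop the client rather than retry. Note that recv() can still block while waiting for a slow client, so a real server would also want a timeout or non-blocking sockets.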
Hopefully, the previous examples illustrate the care needed to implement a server in C. Indeed, it's easy to create bugs in any programming language, but in C special care needs to be taken to avoid memory errors.
Another issue with server software is that the server wants to allow access to some files on the system, but not others. A malicious client could send an HTTP request that tries to download arbitrary files from your server system. For example, if an HTTP request such as GET /../secret_file.c HTTP/1.1 were sent to a naive HTTP server, that server might send secret_file.c to the connected client, even though it exists outside of the public directory!
Our code in web_server.c detects the most obvious attempts at this by searching for requests containing .. and denying those requests.
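The substring test described above can be sketched as follows. The function name path_has_dot_dot() is hypothetical and chosen for illustration; this is not a listing of the check in web_server.c.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the request path contains "..",
 * which is the substring test described in the text, and 0 otherwise. */
int path_has_dot_dot(const char *path)
{
    return strstr(path, "..") != NULL;
}
```

The test is deliberately blunt. It also rejects harmless names such as /notes..txt, and it only sees the characters it is given, so if a server decodes percent-escapes it should run the check after decoding.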
A robust server should use operating system features to verify that requested files exist as actual files in the permitted directory. Unfortunately, there is no cross-platform way to do this, and the platform-dependent options are somewhat complicated.
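As one example of a platform-dependent option, POSIX systems provide realpath(), which resolves symbolic links and .. components into a canonical absolute path. The following is a minimal sketch that assumes a POSIX.1-2008 system; the helper name path_within_root() is hypothetical, and Windows would need a different API.

```c
#define _XOPEN_SOURCE 700   /* expose realpath() under strict -std=c11 */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (POSIX only): returns 1 if `requested` resolves to
 * an existing path at or below `root`, and 0 otherwise. */
int path_within_root(const char *root, const char *requested)
{
    int ok = 0;

    /* Passing NULL asks realpath() to allocate the result. It returns
     * NULL if the path does not exist or cannot be resolved. */
    char *real_root = realpath(root, NULL);
    char *real_req  = realpath(requested, NULL);

    if (real_root && real_req) {
        size_t n = strlen(real_root);
        if (n > 0 && real_root[n - 1] == '/') n--;   /* root is "/" */

        /* The prefix must match and end on a component boundary, so
         * "/srv/public_old" is not accepted for root "/srv/public". */
        if (strncmp(real_root, real_req, n) == 0 &&
            (real_req[n] == '/' || real_req[n] == '\0'))
            ok = 1;
    }

    free(real_root);
    free(real_req);
    return ok;
}
```

For instance, path_within_root("/tmp", "/tmp/..") returns 0 because the resolved path lies above the root. The result only describes the file system at the moment of the call, so it complements rather than replaces restrictive file permissions.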
Please understand that these are not purely theoretical concerns, but actual exploitable bugs. For example, if you run our web_server.c program on Windows and a client sends the request GET /this_will_be_funny/PRN HTTP/1.1, what do you suppose happens?
The this_will_be_funny directory doesn't exist, and the PRN file certainly doesn't exist in that non-existent directory. These facts may lead you to think that the server simply returns a 404 Not Found error, as expected. However, that's not what happens. Under Windows, PRN is a special filename. When your server calls fopen() on this special name, Windows doesn't look for a file, but rather it connects to a printer interface! Other special names include COM1 (connects to serial port 1) and LPT1 (connects to parallel port 1), although there are others. Even if these filenames have an extension, such as PRN.txt, Windows still redirects instead of looking for a file.
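A server that must run on Windows could refuse such names before calling fopen(). The sketch below is an illustration only: the helper name is_reserved_device_name() is hypothetical, it expects a single path component with no slashes, and its list covers the commonly documented device names without any guarantee of being complete.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if one path component (no slashes) is a
 * reserved Windows device name, ignoring case and any extension. */
int is_reserved_device_name(const char *component)
{
    static const char *const reserved[] = {
        "CON", "PRN", "AUX", "NUL",
        "COM1", "COM2", "COM3", "COM4", "COM5",
        "COM6", "COM7", "COM8", "COM9",
        "LPT1", "LPT2", "LPT3", "LPT4", "LPT5",
        "LPT6", "LPT7", "LPT8", "LPT9"
    };
    char stem[5] = {0};
    size_t len = strcspn(component, ".");   /* length before extension */
    size_t i;

    if (len < 3 || len > 4) return 0;       /* every entry is 3 or 4 chars */

    for (i = 0; i < len; i++)
        stem[i] = (char)toupper((unsigned char)component[i]);

    for (i = 0; i < sizeof reserved / sizeof reserved[0]; i++)
        if (strcmp(stem, reserved[i]) == 0) return 1;

    return 0;
}
```

A caller would split the request path on / and reject the request if any component, such as PRN or prn.txt, makes the function return 1.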
One generally applicable piece of security advice is this: run your networked programs under non-privileged accounts that have access to only the minimum resources needed to function. In other words, if you are going to run a networked server, create a new account to run it under. Give that account read access to only the files that server needs to serve. This is not a substitute for writing secure code; rather, running as a non-privileged user creates one final barrier. It is advice you should apply even when running hardened, industry-tested server software.
Hopefully, the previous examples illustrate that programming is complicated, and safe network programming in C can be difficult. It is best approached with care. Oftentimes, it is not possible to know that you have all the loopholes covered. Operating systems don't always have adequate documentation. Operating system APIs often behave in non-obvious and non-intuitive ways. Be careful.