Chapter 6. Securing your Application: Assume they’re all out to get you

Your parents were right: don’t talk to strangers. Or at least don’t trust them. If nothing else, don’t give them the keys to your application data, assuming they’ll do the right thing. It’s a cruel world out there, and you can’t count on everyone to be trustworthy. In fact, as a web application developer, you have to be part cynic, part conspiracy theorist. Yes, people are generally bad, and they’re definitely out to get you! OK, maybe that’s a little extreme, but it’s very important to take security seriously and design your applications so that they’re protected against anyone who might choose to do harm.

Uh oh, our young virtual rock prodigy’s moment in the limelight has been short-lived, as Jacob’s top Guitar Wars score is somehow missing, along with all the other scores. It seems a diabolical force is at work to foil the high score application and prevent Guitar Warriors from competing online. Unhappy virtual guitarists are unhappy users, and that can only lead to unhappy application developers... you!

We know that the main Guitar Wars page is empty, but does that mean the database is empty too? A SELECT query can answer that question:

Somehow all of the high score rows of data have been deleted from the Guitar Wars database. Could someone out there be using our Remove Score script to do evil? We need to protect the scores!

A simple and straightforward way to quickly secure the Guitar Wars high scores is to use HTTP authentication to password protect the Admin page. This technique actually involves both a user name and a password, but the idea is to require a piece of secret information from an administrator before they have access to restricted application features, such as the score removal links.

When a page is secured using HTTP authentication, a window pops up requesting the user name and password before access is allowed to the protected page. In the case of Guitar Wars, you can limit access to the Admin page to as few people as you want, potentially just you!

HTTP authentication works like this: when a user tries to access a page protected by authentication, such as our Admin page, they are presented with a window that asks them for a user name and password.

PHP enters the picture through its access to the user name and password entered by the user. They are stored in the $_SERVER superglobal, which is similar to other superglobals you’ve used ($_POST, $_FILES, etc.). A PHP script can analyze the user name and password entered by the user and decide if they should be allowed access to the protected page. Let’s say we only allow access to the Admin page if the user name is “rock” and the password is “roll.” Here’s how the Admin page is unlocked:
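
As a minimal sketch of that check: PHP places the entered user name and password in $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW']. The helper name is_admin_login() is hypothetical, and passing $_SERVER in as a parameter is only an assumption that makes the sketch easy to try on its own.

```php
<?php
// Hypothetical helper: returns true only when the credentials the browser
// sent match the expected user name and password.
function is_admin_login(array $server, string $username, string $password): bool
{
    // Both keys are absent until the user has filled in the authentication window.
    if (!isset($server['PHP_AUTH_USER'], $server['PHP_AUTH_PW'])) {
        return false;
    }
    return $server['PHP_AUTH_USER'] === $username
        && $server['PHP_AUTH_PW'] === $password;
}

// In the Admin script this would be called with the real superglobal:
// $unlocked = is_admin_login($_SERVER, 'rock', 'roll');
```

A missing user name or password is treated the same as a wrong one, so a first-time visitor simply counts as "not yet unlocked."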

The idea behind HTTP authentication is that the server withholds a protected web page, and then asks the browser to prompt the user for a user name and password. If the user enters these correctly, the server goes ahead and sends along the page. This dialog between browser and server takes place through headers, which are little text messages with specific instructions on what is being requested or delivered.

Headers are actually used every time you visit a web page, not just when authentication is required. Here’s how a normal, unprotected web page is delivered from the server to the browser with the help of headers:

Using PHP, you can carefully control the headers sent by the server to the browser, opening up the possibilities for performing header-driven tasks such as HTTP authentication. The built-in header() function is how a header is sent from the server to the browser from within a PHP script.

header('Content-Type: text/html');

The header() function sends a header from the server to the browser and must be called before any actual content is sent to the browser. This is a very strict requirement: if even a single character or space is output ahead of a header, PHP will reject the header() call with a “headers already sent” error. For this reason, calls to the header() function should precede any HTML code in a PHP script:

Authenticating the Guitar Wars Admin page using headers involves crafting a very specific set of headers, two in fact, that let the browser know to prompt the user for a user name and password before delivering the page. These two headers are generated by PHP code in the Admin script, and control the delivery of the page to the browser.

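The two headers required to initiate authentication do two very specific things: the first, an HTTP/1.1 401 Unauthorized status, tells the browser that the user is not yet authorized to view the page, and the second, a WWW-Authenticate header, tells the browser to prompt the user for a user name and password.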

After processing the authentication headers, the browser waits for the user to take action via the authentication window. The browser takes a dramatically different action in response to what the user does...
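
The two authentication headers might be built like this. Treat it as a sketch: the helper name authentication_headers() and the realm text "Guitar Wars" are assumptions, and the header() calls are left in comments so you can see the header strings without a web server.

```php
<?php
// Hypothetical helper: builds the two headers that make the browser show
// its user name/password window. The realm text is an assumption.
function authentication_headers(string $realm): array
{
    return [
        // 1. Tell the browser the user is not yet authorized to see the page.
        'HTTP/1.1 401 Unauthorized',
        // 2. Ask the browser to prompt for a user name and password.
        'WWW-Authenticate: Basic realm="' . $realm . '"',
    ];
}

// In a protected script, before any output, when the login is missing or wrong:
// foreach (authentication_headers('Guitar Wars') as $line) {
//     header($line);
// }
// exit('Sorry, you must enter a valid user name and password.');
```

If the user cancels the authentication window, the browser shows whatever the script output after the headers, which is why the sketch exits with a short message instead of the protected page.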

Indeed it is... headers aren’t just for security

Although authentication presents the immediate need for headers, they are quite flexible and can do lots of other interesting things. Just call the header() function with the appropriate name/value pair, like this:

The header is called a location header and redirects the current page to a page called about.php on the same Guitar Wars site. Here we use a similar header to redirect to the about.php page after five seconds:

This header is called a refresh header since it refreshes a page after a period of time has elapsed. You often see the URL in such headers reference the current page so that it refreshes itself.

One last header is called a content type header because it controls the type of the content being delivered by the server. As an example, you can force a page to be plain text, as opposed to HTML, by using the following header when calling the header() function:

In this example, the text echoed to the browser is displayed exactly as shown with no special formatting. In other words, the server is telling the browser not to render the echoed content as HTML, so the HTML tags are displayed literally as text.
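
Here is one way the location, refresh, and content type headers could look as strings passed to header(). The helper names and the example.com URL are placeholders, not part of Guitar Wars. Note that Refresh is widely honored by browsers but is not part of the official HTTP standard, so treat it as a convenience.

```php
<?php
// Hypothetical helpers that build the three kinds of headers discussed above.
// Each returns the exact string you would pass to header().

// Location header: redirect the browser to another page right away.
function location_header(string $url): string
{
    return 'Location: ' . $url;
}

// Refresh header: load the given URL after a delay in seconds.
function refresh_header(int $seconds, string $url): string
{
    return 'Refresh: ' . $seconds . '; url=' . $url;
}

// Content type header: tell the browser what kind of content is coming.
function content_type_header(string $type): string
{
    return 'Content-Type: ' . $type;
}

// Each of these is an alternative, called before any output
// (the URL is a placeholder):
// header(location_header('https://www.example.com/about.php'));
// header(refresh_header(5, 'https://www.example.com/about.php'));
// header(content_type_header('text/plain'));
```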

Talk about short-lived success. It didn’t take long at all for villainy to strike again, blitzing the scores from Guitar Wars and yet again frustrating hordes of competitive gamers. It seems that securing the Admin page alone wasn’t enough since the Remove Score script can still be accessed directly... if you know what you’re doing.

Write down how you think we can solve this latest attack, and prevent high scores from being deleted:

__________________________________________

__________________________________________

__________________________________________

Joe: That makes sense. I mean, it worked fine for the Admin page.

Frank: That’s true. So all we have to do is put the same header authorization code in the Remove Score script, and we’re good to go, right?

Jill: Yes, that will certainly work. But I worry about duplicating all that authorization code in two places. What happens if later on we add another page that needs to be protected? Do we duplicate the code yet again?

Joe: Code duplication is definitely a problem. Especially since there is a user name and password that all the scripts need to share. If we ever wanted to change those, we’d have to make the change in every protected script.

Frank: I’ve got it! How about putting the $username and $password variables into their own include file, and then sharing that between the protected scripts. We could even put it in an appvars.php include file for application variables.

Joe: I like where you’re headed but that solution only deals with a small part of the code duplication. Remember, we’re talking about a decent sized little chunk of code.

Jill: You’re both right, and that’s why I think we need a new include file that stores away all of the authorization code, not just the $username and $password variables.

Frank: Ah, and we can just include that script in any page we want to protect with HTTP authorization.

Joe: That’s right! We just have to make sure we always include it first thing since it relies on headers for all the HTTP authorization stuff.

We already have all the code we need for a new Authorize script; it’s just a matter of moving the code from admin.php to a new script file (authorize.php), and replacing the original code with a require_once statement.
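
A sketch of what authorize.php might contain follows. The wrapper function authorize_or_challenge() and the realm text are assumptions made so the logic can be tried in isolation; the original script may well run the same steps as top-level code.

```php
<?php
// authorize.php (sketch). Include it first thing in every protected script:
//     require_once('authorize.php');

// User name and password shared by all protected scripts.
$username = 'rock';
$password = 'roll';

// Hypothetical wrapper: returns true when the login is valid; otherwise
// sends the two authentication headers and returns false.
function authorize_or_challenge(array $server, string $username, string $password): bool
{
    if (isset($server['PHP_AUTH_USER'], $server['PHP_AUTH_PW'])
        && $server['PHP_AUTH_USER'] === $username
        && $server['PHP_AUTH_PW'] === $password) {
        return true;
    }
    header('HTTP/1.1 401 Unauthorized');
    header('WWW-Authenticate: Basic realm="Guitar Wars"');
    return false;
}

// At the bottom of authorize.php the check would actually run:
// if (!authorize_or_challenge($_SERVER, $username, $password)) {
//     exit('Sorry, you must enter a valid user name and password.');
// }
```

Because the file sends headers, the require_once line has to come before any HTML or other output in the script that includes it.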

Sadly, happiness in the Guitar Wars universe didn’t last for long, as bogus scores are showing up in the application in place of legitimate scores... and still inciting rage throughout the Guitar Wars universe. Apparently it’s entirely possible to disrupt the Guitar Wars high score list without removing scores. But how?

Until now we’ve operated under the assumption that any high score submitted with a screen shot image is considered verified. It’s now reasonably safe to say this is not the case! And it’s pretty clear who the culprit is...

Write down how you would solve the problem of people being able to post bogus high scores to the Guitar Wars application:

__________________________________________

__________________________________________

__________________________________________

Even in this modern world we live in, sometimes you can’t beat a real live thinking, breathing human being. In this case, it’s hard to beat a real person when it comes to analyzing a piece of information and assessing whether or not it is valid. We’re talking about moderation, where a human is put in charge of approving content posted to a web application before it is made visible to the general public.

Guitar Wars could really use some human moderation. Sure, it’s still possible that someone could carefully doctor a screen shot and maybe still sneak a score by a human moderator. But it wouldn’t be easy, and it doesn’t change the fact that moderation is a great deterrent. Keep in mind that securing a PHP application is largely about prevention.

Adding a human moderation feature to Guitar Wars is significant because it affects several parts of the application. The database must change, a new script must be created to carry out an approval, the Admin page must add an “Approve” link to each score, and finally, the main page must change to only show approved scores. With this many changes involved, it’s important to map out a plan and carry out each change one step at a time.

1. Use ALTER to add an approved column to the table.

   Let’s start with the database, which needs a new column for keeping up with whether or not a score has been approved.

2. Create an Approve Score script that handles approving a new high score (sets the approved column to 1).

   With the database ready to accommodate high score approvals, you need a script to actually handle approving a score. This Approve Score script is responsible for looking up a specific score in the database and changing the approved column for it.

3. Modify the Admin page to include an “Approve” link for scores that have yet to be approved.

   The Approve Score script is a back-end script that shouldn’t normally be accessed directly. Instead, it is accessed through “Approve” links generated and displayed on the Admin page—only unapproved scores have the “Approve” link next to them.

4. Change the query on the main page to only show approved scores.

   The last step is to make sure all this approval stuff gets factored into the main high score view. So the main page of the application changes to only show high scores that have been approved—without this change, all the other approval modifications would be pointless.
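
Steps 2 and 3 of the plan could be sketched roughly as follows. The script name approvescore.php, the helper names, and the row layout (an array with id and approved keys) are all assumptions for illustration.

```php
<?php
// Hypothetical helpers for steps 2 and 3 of the plan.

// Step 3: the Admin page shows an "Approve" link only for unapproved scores.
function approve_link(array $row): string
{
    if ((int) $row['approved'] === 1) {
        return '';
    }
    return '<a href="approvescore.php?id=' . (int) $row['id'] . '">Approve</a>';
}

// Step 2: the Approve Score script marks one score as approved.
// The int parameter type keeps stray text out of the query.
function approve_query(int $id): string
{
    return "UPDATE guitarwars SET approved = 1 WHERE id = $id";
}
```

Since the Approve Score script changes data, it would presumably need the same authorization include as the other protected scripts.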

Adding the new approved column to the guitarwars table involves a one-time usage of the ALTER TABLE statement, which is an SQL statement we’ve used before.

The new approved column is a TINYINT that uses 0 to indicate an unapproved score, or 1 to indicate an approved score. So all new scores should start out with a value of 0 to indicate that they are initially unapproved.
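
The one-time table change might look like the statement below, held in a PHP string the way the other Guitar Wars queries are. How you run it (a MySQL tool, or mysqli_query() with your own connection) is up to you; the $query variable name is just a convention assumed here.

```php
<?php
// One-time change to the table: add the approved column as a TINYINT.
// Run it once, for example with mysqli_query($dbc, $query), where $dbc
// is assumed to be an open connection created elsewhere.
$query = "ALTER TABLE guitarwars ADD COLUMN approved TINYINT";
```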

It’s true, a new column means a new value in the INSERT query in the Add Score script.

It’s important to not lose sight of the fact that a PHP application is a careful orchestration of several pieces and parts: a database consisting of tables with rows and columns, PHP code, HTML code, and usually CSS code. It’s not always immediately apparent that changing one part requires changing another. Adding the new approved column in the guitarwars table for the sake of the new Approve Score script also requires modifying the INSERT query in the Add Score script:

All the infrastructure is now in place for the moderation feature in the Guitar Wars high score application. All that’s missing is the final step, which is altering the main page to only show approved scores. This involves tweaking the SQL SELECT query so that it only plucks out scores whose approved column is set to 1 (approved). This is accomplished with a WHERE clause.

The addition of the WHERE clause to this query eliminates any scores that haven’t been approved, which includes all new scores. This gives the moderator a chance to look them over and decide whether they should be removed or made visible to the public (approved).
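
A sketch of the filtered main page query is shown here. The WHERE part is what matters; the ORDER BY part is an assumption about how the scores are sorted.

```php
<?php
// Main page query: only rows whose approved column is 1 are returned.
// The ORDER BY part is an assumption about how the scores are sorted.
$query = "SELECT * FROM guitarwars WHERE approved = 1 ORDER BY score DESC";
```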

The moderated version of Guitar Wars represents a significant security improvement, but it’s far from bulletproof. It seems our wily infiltrator has managed to find another weakness in the high score system and somehow sneak her high scores past the moderator. Ethel must be stopped, permanently, in order to restore trust throughout the Guitar Wars universe.

Even though the moderator knows without a doubt that he never approved Ethel’s high score submission, it nevertheless is there in plain view with the approved column set to 1. We know the Add Score script sets the approved column to 0 for new high scores because we just modified the INSERT query in that script. Something just doesn’t add up!

In order to understand what’s happening with this clever form attack, let’s trace the flow of form data as it travels through the Add Score script.

The Score form field expects a single numeric value, such as 1000000, but instead it holds several values enclosed in single quotes and separated by commas, with a double-hyphen at the end. Very strange.

This strange data first gets stored in the $score variable, after which it gets incorporated into the INSERT query. This just results in a meaningless score, right? Or is something more sinister taking place here?

The real culprit in Ethel’s million-point attack is, strangely enough, SQL comments. A double-hyphen (--) is used in SQL to comment out the remainder of a line of SQL code. In MySQL the double-hyphen must be followed by a space for it to work (-- ), and everything after it on that line is ignored. Now take a look at Ethel’s full query with that little nugget of wisdom in mind.

Is it making more sense? The comment effectively erased the remaining SQL code so that it wouldn’t generate an error, allowing Ethel’s version of the query to slip through without a snag. The end result is an instantly approved new high score that the moderator never got a chance to catch.

Ethel’s attack is known as an SQL injection, and involves an extremely sneaky trick where form data is used as a means to change the fundamental operation of a query. So instead of a form field just supplying a piece of information, such as a name or score, it meddles with the underlying SQL query itself. In the case of Guitar Wars, Ethel’s SQL injection used the Score field as a means of not only providing the score, but also the screen shot filename, the approval value, and a comment at the end to prevent the original SQL code from generating an error.
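
To see the mechanics, here is a hypothetical reconstruction of a vulnerable query builder. The column order (id, date, name, score, screenshot, approved), the function name, and the attacker's made-up values are assumptions chosen to match the description above, not the actual Guitar Wars code.

```php
<?php
// Hypothetical reconstruction of the vulnerable query: form data is dropped
// straight into the SQL string with no validation or escaping.
function build_insert_query(string $name, string $score, string $screenshot): string
{
    return "INSERT INTO guitarwars VALUES (0, NOW(), '$name', '$score', '$screenshot', 0)";
}

// What an attacker might type into the Score field (values are made up).
$score = "1000000', 'fake.gif', 1) -- ";

$query = build_insert_query('Ethel', $score, '');
// $query now holds:
// INSERT INTO guitarwars VALUES (0, NOW(), 'Ethel', '1000000', 'fake.gif', 1) -- ', '', 0)
```

Everything after the double-hyphen, including the intended 0 for the approved column, is treated as a comment, so the 1 supplied by the attacker lands in approved.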

The real weakness that SQL injections capitalize on is form fields that aren’t validated for dangerous characters. “Dangerous characters” are any characters that could potentially change the nature of an SQL query, such as commas, quotes, or -- comment characters. Even spaces at the end of a piece of data can prove harmful. Leading or trailing spaces are easy enough to eliminate with the built-in PHP function trim()—just run all form data through the trim() function before using it in an SQL query.

But leading and trailing spaces aren’t the whole problem. You still have the commas, quotes, comment characters, and on, and on. So in addition to trimming form fields of extra spaces, we also need a way to find and render harmless other problematic characters. PHP comes to the rescue with another built-in function, mysqli_real_escape_string(), which escapes potentially dangerous characters so that they can’t adversely affect how a query executes. These characters can still appear as data in form fields, they just won’t interfere with queries.

Putting the trim() and mysqli_real_escape_string() functions together provides a solid line of defense against SQL injections.
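
Combining the two functions might look like the sketch below. The helper name clean_field(), the $dbc connection variable, and the form field names in the usage comments are assumptions. Keep in mind that escaping only protects a value that ends up inside quotes in the query.

```php
<?php
// Hypothetical helper: cleans one form field before it goes into a query.
// $dbc is assumed to be an open mysqli connection created elsewhere.
function clean_field(mysqli $dbc, string $value): string
{
    // trim() strips leading and trailing whitespace; the escape function
    // then neutralizes quotes and other special characters so the value is
    // safe to place inside a quoted SQL string.
    return mysqli_real_escape_string($dbc, trim($value));
}

// Typical use in the Add Score script (field names are assumptions):
// $name  = clean_field($dbc, $_POST['name']);
// $score = clean_field($dbc, $_POST['score']);
```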

Processing the three Guitar Wars form fields with the trim() and mysqli_real_escape_string() functions greatly reduces the chances of another SQL injection attack. But these two functions aren’t enough—maybe there’s a way to make the query itself less vulnerable...

Aside from exploiting weak form field protection, Ethel’s SQL injection also relied on the fact that the approved column followed the screenshot column in the database structure. That’s how she was able to get away with just adding 1 onto the end of INSERT and have it go into the approved column. The problem is that the INSERT query is structured in such a way that it has to insert data into all columns, which adds unnecessary risk.

When data is inserted into a table like this, the order of the data must line up with the order of the columns in the table structure. So the fifth piece of data will go into the screenshot column because it’s the fifth column in the table. But it really isn’t necessary to explicitly insert the id or approved columns since id is auto-incremented and approved should always be 0. A better approach is to focus on inserting only the data explicitly required of a new high score. The id and approved columns can then be allowed to default to AUTO_INCREMENT and 0, respectively.

We need a restructured INSERT query that expects a list of columns prior to the list of data, with each matching one-to-one. This eliminates the risk of the approved column being set—it’s no longer part of the query. If this kind of query looks familiar, it’s because you’ve used it several times in other examples.

INSERT INTO guitarwars (date, name, score, screenshot)
VALUES (NOW(), '$name', '$score', '$screenshot')

This version of the INSERT query spells out exactly which column each piece of data is to be stored in, allowing you to insert data without having to worry about the underlying table structure. In fact, it’s considered better coding style to use this kind of INSERT query so that data is inserted exactly where you intend it to go, as opposed to relying on the structural layout of the table.

Not only is it possible, but it’s a very good idea to specify DEFAULT column values whenever possible.

The SQL DEFAULT keyword is what allows you to specify a default value for a column. If a column has a default value, you can forgo setting it in an INSERT query and relax in the confidence of knowing that it will automatically take on the default value. This is perfect for the approved column in the guitarwars table. Now we just need to modify the table one more time to set the default value for approved to 0 (unapproved).
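
One way to express that table change in MySQL is sketched here, again held in a PHP string. The $query variable name and the way you choose to run the statement are assumptions.

```php
<?php
// One-time change: give the approved column a default of 0 (unapproved).
// Run it once, for example with mysqli_query($dbc, $query), where $dbc
// is assumed to be an open connection created elsewhere.
$query = "ALTER TABLE guitarwars MODIFY COLUMN approved TINYINT DEFAULT 0";
```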

With the approved column now altered to take on a default value, the new and improved INSERT query in the Add Score script can insert high scores without even mentioning the approved column. This is good design since there’s no need to explicitly insert a value that can be defaulted, and it adds a small extra degree of security by not exposing the approved column to a potential attack.

One last step in minimizing the risks of SQL injection attacks involves the form validation in the Add Score script. Before checking to see if the screen shot file type or size is within the application-defined limits, the three Add Score form fields are checked to make sure they aren’t empty.

There is nothing wrong with this code as-is, but securing an application is often about going above and beyond the call of duty. Since the Score field expects a number, it makes sense to not just check for a non-empty value but for a numeric value. The PHP is_numeric() function does just that by returning true if a value passed to it is a number, or false otherwise. It’s consistently doing the little things, like checking for a number when you’re expecting a number, that will ultimately make your application as secure as possible from data attacks.
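
A minimal sketch of the stricter validation follows. The helper name is_valid_score_form() is hypothetical, and the three parameters stand in for the already-trimmed form fields.

```php
<?php
// Hypothetical validation helper for the three Add Score form fields.
function is_valid_score_form(string $name, string $score, string $screenshot): bool
{
    // The name and screen shot must be filled in, and the score must be a
    // number (an empty score is not numeric, so it is rejected too).
    return $name !== '' && $screenshot !== '' && is_numeric($score);
}
```

With this check in place, a Score value stuffed with quotes and commas fails validation before it ever reaches a query. Note that is_numeric() also accepts negative numbers, decimals, and exponent notation, so a real application might want an even tighter test.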

It seems Ethel’s will to interfere with the Guitar Wars high scores has finally been broken thanks to the improvements to the application that harden it against SQL injections. The reigning Guitar Wars champion has responded by posting a new top score.

In addition to taking the Guitar Wars high score application to a new level, you’ve acquired several new tools and techniques. Let’s revisit the most important ones.
