Your parents were right: don’t talk to strangers. Or at least don’t trust them. If nothing else, don’t give them the keys to your application data, assuming they’ll do the right thing. It’s a cruel world out there, and you can’t count on everyone to be trustworthy. In fact, as a web application developer, you have to be part cynic, part conspiracy theorist. Yes, people are generally bad, and they’re definitely out to get you! OK, maybe that’s a little extreme, but it’s very important to take security seriously and design your applications so that they’re protected against anyone who might choose to do harm.
Uh oh, our young virtual rock prodigy’s moment in the limelight has been short-lived, as Jacob’s top Guitar Wars score is somehow missing, along with all the other scores. It seems a diabolical force is at work to foil the high score application and prevent Guitar Warriors from competing online. Unhappy virtual guitarists are unhappy users, and that can only lead to unhappy application developers... you!
We know that the main Guitar Wars page is empty, but does that mean the database is empty too? A SELECT query can answer that question:
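A minimal sketch of such a query, assuming the high scores live in the guitarwars table used throughout this chapter and that you run it from whatever MySQL tool you normally use:

```sql
-- List every row in the high score table; an empty result means the scores are gone.
SELECT * FROM guitarwars;
```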
Somehow all of the high score rows of data have been deleted from the Guitar Wars database. Could it be that maybe someone out there is using our Remove Score script to do evil? We need to protect the scores!
A simple and straightforward way to quickly secure the Guitar Wars high scores is to use HTTP authentication to password protect the Admin page. This technique actually involves both a user name and a password, but the idea is to require a piece of secret information from an administrator before they have access to restricted application features, such as the score removal links.
When a page is secured using HTTP authentication, a window pops up requesting the user name and password before access is allowed to the protected page. In the case of Guitar Wars, you can limit access to the Admin page to as few people as you want, potentially just you!
HTTP authentication provides a simple way to secure a page using PHP.
HTTP authentication works like this: when a user tries to access a page protected by authentication, such as our Admin page, they are presented with a window that asks them for a user name and password.
PHP enters the picture through its access to the user name and password entered by the user. They are stored in the $_SERVER superglobal, which is similar to other superglobals you’ve used ($_POST, $_FILES, etc.). A PHP script can analyze the user name and password entered by the user and decide if they should be allowed access to the protected page. Let’s say we only allow access to the Admin page if the user name is “rock” and the password is “roll.” Here’s how the Admin page is unlocked:
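As a rough sketch of that check: when the browser sends HTTP Basic credentials, PHP typically exposes them as $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW']. The function name is_authorized() and the hard-coded credentials below are illustrative assumptions, not necessarily the exact code the Admin page uses.

```php
<?php
// Hypothetical helper: returns true only if the submitted credentials match.
function is_authorized(array $server, string $username, string $password): bool
{
    // PHP places Basic authentication credentials in these two $_SERVER keys.
    if (!isset($server['PHP_AUTH_USER'], $server['PHP_AUTH_PW'])) {
        return false; // nothing has been entered yet
    }
    return $server['PHP_AUTH_USER'] === $username
        && $server['PHP_AUTH_PW'] === $password;
}

// Example credentials from the text.
$username = 'rock';
$password = 'roll';

// True only when the browser supplied "rock" and "roll".
$unlocked = is_authorized($_SERVER, $username, $password);
```

Passing $_SERVER in as an argument, rather than reading it inside the function, is just a design choice that makes the check easy to try out with sample data.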
The idea behind HTTP authentication is that the server withholds a protected web page, and then asks the browser to prompt the user for a user name and password. If the user enters these correctly, the server goes ahead and sends along the page. This dialog between browser and server takes place through headers, which are little text messages with specific instructions on what is being requested or delivered.
Headers are actually used every time you visit a web page, not just when authentication is required. Here’s how a normal, unprotected web page is delivered from the server to the browser with the help of headers:
All web pages are delivered with the help of headers.
Using PHP, you can carefully control the headers sent by the server to the browser, opening up the possibilities for performing header-driven tasks such as HTTP authentication. The built-in header() function is how a header is sent from the server to the browser from within a PHP script.
The header() function lets you create and send a header from a PHP script.
header('Content-Type: text/html');
The header() function sends a header from the server to the browser and must be called before any actual content is sent to the browser. This is a very strict requirement: if even a single character or space is output ahead of a header, PHP cannot send the header and reports a “headers already sent” warning instead. For this reason, calls to the header() function should precede any HTML code in a PHP script:
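A small sketch of that ordering, assuming a script that builds its page as a string (the page content here is made up for illustration):

```php
<?php
// Send headers first: nothing, not even a blank line, may come before the opening PHP tag.
header('Content-Type: text/html');

// Only now is it safe to produce page content.
$page = '<html><body><h2>Guitar Wars - High Scores Administration</h2></body></html>';
echo $page;
```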
Authenticating the Guitar Wars Admin page using headers involves crafting a very specific set of headers, two in fact, that let the browser know to prompt the user for a user name and password before delivering the page. These two headers are generated by PHP code in the Admin script, and control the delivery of the page to the browser.
Two specific headers are required to request the authentication of a web page.
The two headers required to initiate authentication do two very specific things:
After processing the authentication headers, the browser waits for the user to take action via the authentication window. The browser takes a dramatically different action in response to what the user does...
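For HTTP Basic authentication, the two headers are commonly a 401 status line and a WWW-Authenticate header. The following is a minimal sketch under that assumption; the function names and the realm value "Guitar Wars" are illustrative, not taken from the Admin script itself.

```php
<?php
// Hypothetical helper: builds the two headers that trigger the browser's login window.
function authentication_headers(string $realm): array
{
    return [
        'HTTP/1.1 401 Unauthorized',                      // the page is withheld for now
        'WWW-Authenticate: Basic realm="' . $realm . '"', // ask the browser to prompt the user
    ];
}

// Hypothetical helper: sends the headers, then stops so the protected page is never delivered.
function challenge_user(string $realm): void
{
    foreach (authentication_headers($realm) as $line) {
        header($line);
    }
    // This message is what the user sees after cancelling the login window.
    exit('<h2>Guitar Wars</h2>Sorry, you must enter a valid user name and password to access this page.');
}
```

A protected script would call challenge_user('Guitar Wars') whenever the credentials in $_SERVER are missing or wrong, before producing any other output.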
Indeed it is... headers aren’t just for security
Although authentication presents the immediate need for headers, they are quite flexible and can do lots of other interesting things. Just call the header() function with the appropriate name/value pair, like this:
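For example, a redirect might be sketched as follows, assuming about.php sits alongside the current script:

```php
<?php
// A location header: the browser is told to load about.php instead of this page.
$location = 'Location: about.php';
header($location);
```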
This header is called a location header and redirects the current page to a page called about.php on the same Guitar Wars site. Here we use a similar header to redirect to the about.php page after five seconds:
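A sketch of the delayed redirect, again assuming about.php is in the same directory. Refresh is not part of the core HTTP standard, although browsers widely honor it.

```php
<?php
// A refresh header: wait 5 seconds, then load about.php.
$refresh = 'Refresh: 5; url=about.php';
header($refresh);

// Content shown while the user waits (made-up wording).
echo 'Taking you to the About page in five seconds...';
```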
This header is called a refresh header since it refreshes a page after a period of time has elapsed. You often see the URL in such headers reference the current page so that it refreshes itself.
One last header is called a content type header because it controls the type of the content being delivered by the server. As an example, you can force a page to be plain text, as opposed to HTML, by using the following header when calling the header() function:
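A minimal sketch, with made-up sample text that contains an HTML tag:

```php
<?php
// A content type header: deliver the response as plain text rather than HTML.
header('Content-Type: text/plain');

// The <strong> tag is not interpreted; it appears literally in the browser.
$text = 'This <strong>text</strong> will not actually be bold.';
echo $text;
```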
In this example, the text echoed to the browser is displayed exactly as shown with no special formatting. In other words, the server is telling the browser not to render the echoed content as HTML, so the HTML tags are displayed literally as text.
Headers must be the very first thing sent to the browser in a PHP file.
Because headers must be sent before any content, it is extremely important to not allow even a single space to appear outside of PHP code before calling the header() function in a PHP script.
Talk about short-lived success. It didn’t take long at all for villainy to strike again, blitzing the scores from Guitar Wars and yet again frustrating hordes of competitive gamers. It seems that securing the Admin page alone wasn’t enough since the Remove Score script can still be accessed directly... if you know what you’re doing.
Write down how you think we can solve this latest attack, and prevent high scores from being deleted:
__________________________________________
__________________________________________
__________________________________________
Joe: That makes sense. I mean, it worked fine for the Admin page.
Frank: That’s true. So all we have to do is put the same header authorization code in the Remove Score script, and we’re good to go, right?
Jill: Yes, that will certainly work. But I worry about duplicating all that authorization code in two places. What happens if later on we add another page that needs to be protected? Do we duplicate the code yet again?
Joe: Code duplication is definitely a problem. Especially since there is a user name and password that all the scripts need to share. If we ever wanted to change those, we’d have to make the change in every protected script.
Frank: I’ve got it! How about putting the $username and $password variables into their own include file, and then sharing that between the protected scripts. We could even put it in an appvars.php include file for application variables.
Joe: I like where you’re headed but that solution only deals with a small part of the code duplication. Remember, we’re talking about a decent sized little chunk of code.
Jill: You’re both right, and that’s why I think we need a new include file that stores away all of the authorization code, not just the $username and $password variables.
Frank: Ah, and we can just include that script in any page we want to protect with HTTP authorization.
Joe: That’s right! We just have to make sure we always include it first thing since it relies on headers for all the HTTP authorization stuff.
We already have all the code we need for a new Authorize script; it’s just a matter of moving the code from admin.php to a new script file (authorize.php), and replacing the original code with a require_once statement.
Never underestimate the ability of determined people to reverse-engineer your PHP scripts and exploit weaknesses.
Sadly, happiness in the Guitar Wars universe didn’t last for long, as bogus scores are showing up in the application in place of legitimate scores... and still inciting rage throughout the Guitar Wars universe. Apparently it’s entirely possible to disrupt the Guitar Wars high score list without removing scores. But how?
Until now we’ve operated under the assumption that any high score submitted with a screen shot image is considered verified. It’s now reasonably safe to say this is not the case! And it’s pretty clear who the culprit is...
Write down how you would solve the problem of people being able to post bogus high scores to the Guitar Wars application:
__________________________________________
__________________________________________
__________________________________________
Even in this modern world we live in, sometimes you can’t beat a real live thinking, breathing human being. In this case, it’s hard to beat a real person when it comes to analyzing a piece of information and assessing whether or not it is valid. We’re talking about moderation, where a human is put in charge of approving content posted to a web application before it is made visible to the general public.
Human moderation is an excellent way to improve the integrity of user-submitted content.
Guitar Wars could really use some human moderation. Sure, it’s still possible that someone could carefully doctor a screen shot and maybe still sneak a score by a human moderator. But it wouldn’t be easy, and it doesn’t change the fact that moderation is a great deterrent. Keep in mind that securing a PHP application is largely about prevention.
Adding a human moderation feature to Guitar Wars is significant because it affects several parts of the application. The database must change, a new script must be created to carry out an approval, the Admin page must add an “Approve” link to each score, and finally, the main page must change to only show approved scores. With this many changes involved, it’s important to map out a plan and carry out each change one step at a time.
Let’s start with the database, which needs a new column for keeping up with whether or not a score has been approved.
Adding the new approved column to the guitarwars table involves a one-time usage of the ALTER TABLE statement, which is an SQL statement we’ve used before.
The new approved column is a TINYINT that uses 0 to indicate an unapproved score, or 1 to indicate an approved score. So all new scores should start out with a value of 0 to indicate that they are initially unapproved.
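The one-time table change described above might look like this, assuming MySQL and the table and column names given in the text:

```sql
-- One-time change: add a column that flags whether a score has been approved.
-- 0 = unapproved, 1 = approved.
ALTER TABLE guitarwars ADD COLUMN approved TINYINT;
```

Under this sketch, rows that already exist in the table receive NULL in the new column, so they would not count as approved until someone sets the value to 1.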
It’s true, a new column means a new value in the INSERT query in the Add Score script.
It’s important to not lose sight of the fact that a PHP application is a careful orchestration of several pieces and parts: a database consisting of tables with rows and columns, PHP code, HTML code, and usually CSS code. It’s not always immediately apparent that changing one part requires changing another. Adding the new approved column in the guitarwars table for the sake of the new Approve Score script also requires modifying the INSERT query in the Add Score script:
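A sketch of the modified query, assuming the column order id, date, name, score, screenshot, approved, and using made-up sample values in place of real form data:

```php
<?php
// Sample values standing in for the submitted form data (hypothetical).
$name = 'Jacob';
$score = '389740';
$screenshot = 'jacobsscore.gif';

// The first 0 lets the auto-incremented id take the next value (MySQL's default behavior);
// the trailing 0 marks the new score as unapproved.
$query = "INSERT INTO guitarwars VALUES (0, NOW(), '$name', '$score', '$screenshot', 0)";
```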
All the infrastructure is now in place for the moderation feature in the Guitar Wars high score application. All that’s missing is the final step, which is altering the main page to only show approved scores. This involves tweaking the SQL SELECT query so that it only plucks out scores whose approved column is set to 1 (approved). This is accomplished with a WHERE clause.
Use WHERE to select rows based on the value of a certain column.
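A sketch of what the filtered query on the main page could look like. The ORDER BY part is an assumption about how the page sorts its scores; only the WHERE part comes from the text.

```php
<?php
// Only rows whose approved column is 1 reach the public high score list.
// The ORDER BY clause is assumed: highest score first, earliest date breaking ties.
$query = "SELECT * FROM guitarwars WHERE approved = 1 ORDER BY score DESC, date ASC";
```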
The addition of the WHERE clause to this query eliminates any scores that haven’t been approved, which includes all new scores. This gives the moderator a chance to look them over and decide whether they should be removed or made visible to the public (approved).
The moderated version of Guitar Wars represents a significant security improvement, but it’s far from bulletproof. It seems our wily infiltrator has managed to find another weakness in the high score system and somehow sneak her high scores past the moderator. Ethel must be stopped, permanently, in order to restore trust throughout the Guitar Wars universe.
Even though the moderator knows without a doubt that he never approved Ethel’s high score submission, it nevertheless is there in plain view with the approved column set to 1. We know the Add Score script sets the approved column to 0 for new high scores because we just modified the INSERT query in that script. Something just doesn’t add up!
In order to understand what’s happening with this clever form attack, let’s trace the flow of form data as it travels through the Add Score script.
The Score form field expects a single numeric value, such as 1000000, but instead it has several values enclosed in single quotes, separated by commas, and then with a strange double-hyphen at the end. Very strange.
This strange data first gets stored in the $score variable, after which it gets incorporated into the INSERT query. This just results in a meaningless score, right? Or is something more sinister taking place here?
The real culprit in Ethel’s million-point attack is, strangely enough, SQL comments. A double-hyphen (--) is used in SQL to comment out the remainder of a line of SQL code. In MySQL, you must follow the double-hyphen with a space for it to work (-- ), but everything after the space is ignored. Now take a look at Ethel’s full query with that little nugget of wisdom.
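To see the effect, here is a sketch that rebuilds the query the way the Add Score script would, assuming the six-value INSERT described earlier. The name and both screen shot file names are hypothetical stand-ins; the shape of the Score value follows the description above.

```php
<?php
// What the attacker types into the form (file names are made up for illustration).
$name = 'Ethel';
$score = "1000000', 'fake.gif', 1) -- ";  // closes the quote, adds two values, then comments out the rest
$screenshot = 'real.gif';

// The script drops the raw form data straight into the query.
$query = "INSERT INTO guitarwars VALUES (0, NOW(), '$name', '$score', '$screenshot', 0)";

// Everything after "-- " is ignored by MySQL, so the trailing 0 never takes effect.
echo $query;
```

The resulting string reads INSERT INTO guitarwars VALUES (0, NOW(), 'Ethel', '1000000', 'fake.gif', 1) followed by the commented-out remainder, so the row is stored with approved set to 1.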
Is it making more sense? The comment effectively erased the remaining SQL code so that it wouldn’t generate an error, allowing Ethel’s version of the query to slip through without a snag. The end result is an instantly approved new high score that the moderator never got a chance to catch.
Ethel’s attack is known as an SQL injection, and involves an extremely sneaky trick where form data is used as a means to change the fundamental operation of a query. So instead of a form field just supplying a piece of information, such as a name or score, it meddles with the underlying SQL query itself. In the case of Guitar Wars, Ethel’s SQL injection used the Score field as a means of not only providing the score, but also the screen shot filename, the approval value, and a comment at the end to prevent the original SQL code from generating an error.
Form fields are a security weak point for web applications because they allow users to enter data.
The real weakness that SQL injections capitalize on is form fields that aren’t validated for dangerous characters. “Dangerous characters” are any characters that could potentially change the nature of an SQL query, such as commas, quotes, or -- comment characters. Even spaces at the end of a piece of data can prove harmful. Leading or trailing spaces are easy enough to eliminate with the built-in PHP function trim(). Just run all form data through the trim() function before using it in an SQL query.
SQL injections can be prevented by properly processing form data.
But leading and trailing spaces aren’t the whole problem. You still have the commas, quotes, comment characters, and on, and on. So in addition to trimming form fields of extra spaces, we also need a way to find and render harmless other problematic characters. PHP comes to the rescue with another built-in function, mysqli_real_escape_string(), which escapes potentially dangerous characters so that they can’t adversely affect how a query executes. These characters can still appear as data in form fields, they just won’t interfere with queries.
Putting the trim() and mysqli_real_escape_string() functions together provides a solid line of defense against SQL injections.
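One way to combine the two functions is a small helper, sketched here. The helper name clean_form_value() and the form field names in the usage comments are assumptions; $dbc is assumed to be an open connection returned by mysqli_connect().

```php
<?php
// Hypothetical helper: trims a form value, then escapes characters that are special in SQL.
// $dbc must be an open MySQL connection, because escaping depends on the connection's character set.
function clean_form_value(mysqli $dbc, string $value): string
{
    return mysqli_real_escape_string($dbc, trim($value));
}

// Usage inside the Add Score script, once $dbc is connected (field names assumed):
// $name       = clean_form_value($dbc, $_POST['name']);
// $score      = clean_form_value($dbc, $_POST['score']);
// $screenshot = clean_form_value($dbc, $_FILES['screenshot']['name']);
```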
Processing the three Guitar Wars form fields with the trim() and mysqli_real_escape_string() functions greatly reduces the chances of another SQL injection attack. But these two functions aren’t enough; maybe there’s a way to make the query itself less vulnerable...
Aside from exploiting weak form field protection, Ethel’s SQL injection also relied on the fact that the approved column followed the screenshot column in the database structure. That’s how she was able to get away with just adding 1 onto the end of INSERT and have it go into the approved column. The problem is that the INSERT query is structured in such a way that it has to insert data into all columns, which adds unnecessary risk.
An INSERT query can be written so that it nails down exactly what values go in what columns.
When data is inserted into a table like this, the order of the data must line up with the order of the columns in the table structure. So the fifth piece of data will go into the screenshot column because it’s the fifth column in the table. But it really isn’t necessary to explicitly insert the id or approved columns since id is auto-incremented and approved should always be 0. A better approach is to focus on inserting only the data explicitly required of a new high score. The id and approved columns can then be allowed to default to AUTO_INCREMENT and 0, respectively.
We need a restructured INSERT query that expects a list of columns prior to the list of data, with each matching one-to-one. This eliminates the risk of the approved column being set, because it’s no longer part of the query. If this kind of query looks familiar, it’s because you’ve used it several times in other examples.
INSERT INTO guitarwars (date, name, score, screenshot)
VALUES (NOW(), '$name', '$score', '$screenshot')
This version of the INSERT query spells out exactly which column each piece of data is to be stored in, allowing you to insert data without having to worry about the underlying table structure. In fact, it’s considered better coding style to use this kind of INSERT query so that data is inserted exactly where you intend it to go, as opposed to relying on the structural layout of the table.
Not only is it possible, but it’s a very good idea to specify DEFAULT column values whenever possible.
The SQL DEFAULT keyword is what allows you to specify a default value for a column. If a column has a default value, you can forgo setting it in an INSERT query and relax in the confidence of knowing that it will automatically take on the default value. This is perfect for the approved column in the guitarwars table. Now we just need to modify the table one more time to set the default value for approved to 0 (unapproved).
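That second table change could be sketched like this, assuming MySQL and the same TINYINT column type as before:

```sql
-- Give approved a default of 0 so new rows start out unapproved
-- whenever an INSERT leaves the column out.
ALTER TABLE guitarwars MODIFY COLUMN approved TINYINT DEFAULT 0;
```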
With the approved column now altered to take on a default value, the new and improved INSERT query in the Add Score script can insert high scores without even mentioning the approved column. This is good design since there’s no need to explicitly insert a value that can be defaulted, and it adds a small extra degree of security by not exposing the approved column to a potential attack.
One last step in minimizing the risks of SQL injection attacks involves the form validation in the Add Score script. Before checking to see if the screen shot file type or size is within the application-defined limits, the three Add Score form fields are checked to make sure they aren’t empty.
There is nothing wrong with this code as-is, but securing an application is often about going above and beyond the call of duty. Since the Score field expects a number, it makes sense to not just check for a non-empty value but for a numeric value. The PHP is_numeric() function does just that by returning true if a value passed to it is a number or a numeric string, or false otherwise. It’s consistently doing the little things, like checking for a number when you’re expecting a number, that will ultimately make your application as secure as possible from data attacks.
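A sketch of the stricter validation, assuming the three form values have already been read into strings. The helper name is_valid_submission() and the sample values are hypothetical.

```php
<?php
// Hypothetical helper: name and screen shot must be present, and the score must be numeric.
function is_valid_submission(string $name, string $score, string $screenshot): bool
{
    return !empty($name) && is_numeric($score) && !empty($screenshot);
}

// A legitimate score passes; an injection attempt in the Score field does not.
$ok = is_valid_submission('Jacob', '389740', 'jacobsscore.gif');
$blocked = !is_valid_submission('Ethel', "1000000', 'fake.gif', 1) -- ", 'real.gif');
```

Note that is_numeric() also accepts decimals and exponent notation such as 1e6, so an application that wants whole-number scores only may need a tighter check.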
Whenever possible, insist on form data being in the format you’ve requested.
It seems Ethel’s will to interfere with the Guitar Wars high scores has finally been broken thanks to the improvements to the application that make it far more resistant to SQL injections. The reigning Guitar Wars champion has responded by posting a new top score.