Seven Deadliest Web Application Attacks

Reporting limits are not the only type of policy that attackers will attempt to circumvent. In 2008, a man was convicted of a scam that defrauded Apple out of more than 9,000 iPod Shuffles (www.sfgate.com/cgi-bin/article.cgi?f=/n/a/2009/08/26/state/n074933D73.DTL). Apple set up an advance replacement program for iPods so that a customer could quickly receive a replacement for a broken device before the device was received and processed by Apple. The policy states, “You will be asked to provide a major credit card to secure the return of the defective accessory. If you do not return the defective accessory to Apple within 10 days of when we ship the replacement part, Apple will charge you for the replacement.”^[1] Part of the scam involved using credit cards past their limit when requesting replacement devices. The cards and card information were valid. Thus, they passed initial antifraud mechanisms such as verification that the mailing address matched the address on file with the card's issuer. So, at this point, the cards were considered valid by the system. However, the cards were over their limit and therefore couldn't be used for any new charges. The iPods were shipped and received well before the 10-day return limit expired, and when that limit passed, the charge to the card failed because the limit problem was only then detected. Through this scheme and another that swapped out-of-warranty devices with in-warranty serial numbers, the scammers collected $75,000 by selling the fraudulently obtained iPods (http://arstechnica.com/apple/news/2008/07/apple-sues-ipodmechanic-owner-for-massive-ipod-related-fraud.ars).

No technical vulnerabilities were exploited in the execution of this scam. It didn't rely on hacking Apple's Web site with XSS or SQL injection, nor did it break an authentication scheme or otherwise submit unexpected data to Apple. The credit-card numbers, though not owned by the scammers, and all other submitted values followed valid syntax rules that would bypass a validation filter and Web application firewall. The scam relied on the ability to use credit cards that would be authorized, but not charged – otherwise the owner of the card might detect unexpected activity. The return policy had a countermeasure to prevent someone from asking for a replacement without returning a broken device. The scammers used a combination of tactics, but the important one was choosing cards that appeared valid at one point in the workflow (putting a card on record), but were invalid at another, more important point in the workflow (charging the card for a failed return).
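The gap between the two workflow steps can be illustrated with a minimal sketch. The `Card` record, its field names, and the 50.0 replacement price are hypothetical and are not drawn from Apple's systems; the point is only that the check made when the card is recorded is weaker than the check made when the card is charged.

```python
from dataclasses import dataclass


@dataclass
class Card:
    """Hypothetical card record; field names are illustrative only."""
    number_is_well_formed: bool
    address_matches_issuer: bool
    available_credit: float


def passes_initial_checks(card: Card) -> bool:
    # Step 1 (putting the card on record): syntax and address checks only.
    return card.number_is_well_formed and card.address_matches_issuer


def can_charge(card: Card, amount: float) -> bool:
    # Step 2 (charging for a failed return): the first time credit matters.
    return passes_initial_checks(card) and card.available_credit >= amount


# An over-limit card with otherwise valid details.
over_limit_card = Card(True, True, available_credit=0.0)
accepted_at_request = passes_initial_checks(over_limit_card)
chargeable_later = can_charge(over_limit_card, 50.0)  # hypothetical price
print(accepted_at_request, chargeable_later)
```

In this sketch the card is accepted when the replacement is requested but cannot be charged later, which is the condition the scam depended on.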

Apple's iTunes and Amazon.com's music store faced a different type of fraudulent activity in 2009. This section opened with a brief discussion of how criminals overcome the difficulty of turning stolen credit cards into real money without leaving an obvious or easily detectable trail from crime to currency. In the case of iTunes and Amazon.com, a group of fraudsters uploaded music tracks to the Web sites. The music didn't need to be high quality or have an appeal to music fans of any genre because the fraudsters used stolen credit cards to buy the tracks, thus earning a profit from royalties (www.theregister.co.uk/2009/06/10/amazon_apple_online_fraudsters/). The scheme allegedly earned the crew $300,000 from 1,500 credit cards.

In the case of iTunes and Amazon.com's music store, neither Web site was compromised or attacked via some technical vulnerability. In all ways but one, the sites were used as intended; musicians uploaded tracks, customers purchased those tracks, and royalties were paid to the content's creators. The exception was that stolen credit cards were being used to purchase the music. Once again, no network device, Web-application firewall, or amount of secure coding could have prevented this type of attack because the site was just used as a conduit for money laundering. The success of the two retailers in stopping the criminals was based on policies and techniques for identifying fraudulent activity and coordinating with law enforcement to reach the point where, instead of writing off $10 downloads as expected losses due to virtual shoplifting, the complete scheme was exposed and the ringleaders identified.

Not all Web site manipulation boils down to money laundering or financial gain. In April 2009, hackers modified Time Magazine's online poll of the top 100 most influential people in government, science, and technology. Any online poll should immediately be treated with skepticism regarding its accuracy. Polls and online voting attempt to aggregate the opinions and choices of individuals. The greatest challenge is ensuring that one vote equals one person. Attackers attempt to bend a poll one way or another by voting multiple times under a single or multiple identities.^[A] In the case of the Time poll, hackers stuffed the virtual ballot box using nothing more than brute force voting to create an elegant acrostic from the first letter of the top 21 candidates (http://musicmachinery.com/2009/04/15/inside-the-precision-hack/).

A YouTube is rife with accounts being attacked by “vote bots” to suppress channels or videos with which the attackers disagree. Look for videos about them by searching for “vote bots” or start with this link, www.youtube.com/watch?v=AuhkERR0Bnw, to learn more about such attacks.

Reading down the list, the attackers managed to create the phrase, “Marblecake also the game.” They accomplished this through several iterations of attack. First, the poll did not have any mechanisms to rate limit, authenticate, or otherwise validate votes. These failings put the poll at the mercy of even the most unsophisticated attacker. Eventually, Time started to add countermeasures. The developers enforced a rate limit of one vote per IP address per candidate every 13 seconds. The per-candidate restriction enabled the attackers to throw in one positive vote for their candidate and negative votes for other candidates within each 13-second window. The developers also attempted to protect URIs by appending a hash used to authenticate each vote. The hash was based on the URI used to submit a vote and a secret value, referred to as a salt, intended to obfuscate how the hash was generated. (The utility of salts with cryptographic hash functions is discussed in Chapter 3, “Structured Query Language Injection.”) Without knowledge of the salt included in the hash generation, attackers could not forge votes. A bad vote would receive the message, “Missing validation key.”
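To see why the per-candidate restriction was weak, consider a minimal sketch of such a rate limiter. The class, the candidate identifiers, and the IP address are hypothetical; only the 13-second window and the per-IP, per-candidate keying come from the description above.

```python
RATE_WINDOW_SECONDS = 13  # window described in the text


class PerCandidateLimiter:
    """Allows one vote per (IP address, candidate) pair per window."""

    def __init__(self, window=RATE_WINDOW_SECONDS):
        self.window = window
        self.last_vote = {}  # (ip, candidate_id) -> time of last accepted vote

    def allow(self, ip, candidate_id, now):
        key = (ip, candidate_id)
        previous = self.last_vote.get(key)
        if previous is not None and now - previous < self.window:
            return False
        self.last_vote[key] = now
        return True


limiter = PerCandidateLimiter()
candidates = ["favorite", "rival_1", "rival_2"]  # hypothetical ids

# One client, one instant: a vote for every candidate gets through.
accepted = [limiter.allow("203.0.113.5", c, now=0.0) for c in candidates]
# A second vote for the same candidate inside the window is rejected.
repeat = limiter.allow("203.0.113.5", "favorite", now=5.0)
# Once the window has passed, voting resumes.
later = limiter.allow("203.0.113.5", "favorite", now=13.0)
print(accepted, repeat, later)
```

Because the limiter is keyed on the pair rather than on the IP address alone, a single client can touch every candidate in each window, which is what let the attackers raise one candidate and lower the others simultaneously.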

This secret value, the salt, turned an easily guessed URI into one with a parameter that at first glance appears hard to reverse engineer, as shown below. Note that the salt itself does not appear in the URI, but the result of the hash function that used the salt appears in the key parameter:

/contentpolls/Vote.do?pollName=time100_2009&id=1885481&rating=100&key=9279fbf4490102b824281f9c7b8b8758

The key was generated by an MD5 hash, as in the following pseudocode:

salt = ?

key = MD5(salt + '/contentpolls/Vote.do?pollName=time100_2009&id=1885481&rating=100')

Without a correct salt, the key parameter could not be updated to accept arbitrary values for the id and rating, which is what needed to be manipulated. If an attacker submitted a URI such as the following (note the rating has been changed from 100 to 1), the server could easily determine that the key value doesn't match the hash that should have been generated. This is how the application would be able to verify that the URI had been generated from a legitimate vote rather than a spoofed one. Only legitimate votes, that is, voting links created by the Time Web site, would have knowledge of the salt to create correct key values.

/contentpolls/Vote.do?pollName=time100_2009&id=1885481&rating=1&key=9279fbf4490102b824281f9c7b8b8758
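The server-side check described above can be sketched as follows. This is an illustration under stated assumptions rather than Time's actual code: the `SALT` value and the function names are hypothetical, and the key is assumed to be the last parameter in the URI, as in the examples.

```python
import hashlib

# Hypothetical secret; stands in for the value only the server should know.
SALT = "example-secret"


def make_key(path_and_query: str, salt: str = SALT) -> str:
    # Mirrors the pseudocode: MD5 over the salt followed by the URI.
    return hashlib.md5((salt + path_and_query).encode("utf-8")).hexdigest()


def is_valid_vote(uri: str, salt: str = SALT) -> bool:
    # Split off the trailing key parameter, then recompute over the rest.
    base, sep, supplied = uri.rpartition("&key=")
    if not sep:
        return False  # no key parameter present at all
    return make_key(base, salt) == supplied


base = "/contentpolls/Vote.do?pollName=time100_2009&id=1885481&rating=100"
legit = base + "&key=" + make_key(base)
tampered = legit.replace("rating=100", "rating=1")
print(is_valid_vote(legit), is_valid_vote(tampered), is_valid_vote(base))
```

The legitimate URI verifies, while the URI with the altered rating and the URI with no key are both rejected, because the recomputed hash no longer matches what was supplied.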

The brute force approach to guess the salt would start iterating through potential values until it produced an MD5 hash that matched the key within the URI. The following Python code shows a brute force attack, albeit one with suboptimal efficiency:

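    #!/usr/bin/env python3
    import hashlib

    # The key value observed in the voting URI.
    key = "9279fbf4490102b824281f9c7b8b8758"
    # The URI that was hashed, without its key parameter.
    uri = "/contentpolls/Vote.do?pollName=time100_2009&id=1885481&rating=100"
    # Candidate salts; a real attack would iterate over far more values.
    guesses = ["lost", "for", "words"]

    for salt in guesses:
        # Hash the candidate salt prepended to the URI, as in the pseudocode.
        digest = hashlib.md5((salt + uri).encode("utf-8")).hexdigest()
        if digest == key:
            print(salt, digest)
            break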

Brute force takes time and there was no hint whether the salt might be one character, eight characters, or more. A secret value that might contain eight mixed-case alphanumeric and punctuation characters could be any one of roughly 10¹⁶ values. One dedicated computer might be able to test approximately 14,000 guesses per second. An exhaustive brute force attack wouldn't be feasible without several hundred thousand computers dedicated to the task (or a lucky guess, of course).
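The arithmetic behind that estimate can be worked through in a few lines. The alphabet size of 94 (the printable ASCII characters excluding the space) and the fleet of 100,000 machines are assumptions chosen for illustration; the length of eight and the rate of 14,000 guesses per second come from the text.

```python
ALPHABET_SIZE = 94           # assumed: 52 letters, 10 digits, 32 punctuation
LENGTH = 8                   # salt length considered in the text
GUESSES_PER_SECOND = 14_000  # per-machine rate quoted in the text
FLEET_SIZE = 100_000         # assumed number of machines
SECONDS_PER_DAY = 86_400

# Number of distinct eight-character values over the assumed alphabet.
keyspace = ALPHABET_SIZE ** LENGTH

# Time to exhaust the keyspace on one machine, in years.
one_machine_years = keyspace / GUESSES_PER_SECOND / SECONDS_PER_DAY / 365

# Time to exhaust the keyspace across the whole fleet, in days.
fleet_days = keyspace / (GUESSES_PER_SECOND * FLEET_SIZE) / SECONDS_PER_DAY

print(f"{keyspace:.2e} values")
print(f"{one_machine_years:,.0f} years on one machine")
print(f"{fleet_days:.0f} days on {FLEET_SIZE:,} machines")
```

Under these assumptions the keyspace is about 6 × 10¹⁵ values, which a single machine would need well over ten thousand years to exhaust, while a fleet of 100,000 machines would need roughly fifty days.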

The problem for Time was that the salt was embedded in the client-side Flash application used for voting. The client is always an insecure environment in terms of the data received from it and, in this example, the data sent to it. Disassembling the Flash application led the determined hackers to the salt: lego rules. With this in hand, it was once again possible to create URIs with arbitrary values and bypass the key-based authentication mechanism. Note that adding a salt in this case was a step in the right direction; the problem was that the security of the voting mechanism depended on the salt remaining secret, which was impossible because it had to be part of a client-side object.

Tip

If you're interested in open-source brute force tools, check out John the Ripper at www.openwall.com/john/. It supports many algorithms and, being open source, is easily customized by a programmer with C experience. The site also provides various word lists useful for dictionary-based tests. At the very least, you might be interested in seeing the wide range of guesses per second for different password schemes.

The Time poll hack made news not only because it was an entertaining misuse of a site's functionality but also because it highlighted the problem with trying to establish identity on the Internet. The attacks only submitted valid data (with the exception of situations where ratings were outside the expected range of 1–100, but those were not central to the success of the attack). The attacks bypassed inadequate rate-limiting policies and an obfuscated key generation scheme.

Don't dismiss these examples as irrelevant to your Web site. They share a few themes that apply more universally than just to banks, music sites, and online polls.

Loophole is just a synonym for vulnerability. Tax laws have loopholes, and Web sites have vulnerabilities. In either case, the way a policy is intended to work is different from how it works in practice. A policy's complexity may introduce contradictions or ambiguity that translates to mistakes in the way that a feature is implemented or features that work well with expected state transitions from honest users, but fail miserably in the face of misuse.

Determined attackers will probe monitoring and logging limits. This might be accomplished by staying below assumed thresholds, generating traffic that overwhelms the monitors so that the actual attack is hidden deep within the noise, bribing developers to obtain source code, using targeted phishing attacks against developers to obtain source code, and more steps that are limited only by creativity.

Security is an emergent property of a Web application. Individual countermeasures may address specific threats, but may have no effect or a detrimental effect on the site's overall security due to false assumptions or mistakes that arise from complexity.

Attacks do not need to submit invalid data or malicious characters to succeed. Abusing a site's functionality usually means the attacker is skipping an expected step or circumventing a policy by exploiting a loophole.

The site may be a conduit for an attack rather than a direct target of the attack. In Chapter 2, “Cross-Site Request Forgery,” we discussed how one site might contain a booby-trapped page that executes sensitive commands in the browser to another site without the victim's knowledge. In other cases, the site may be a tool for extracting hard currency from a stolen credit card, such as an auction or e-commerce application.

Attackers have large, distributed technical and information resources. Organized crime has shown coordinated ATM withdrawals using stolen account information across dozens of countries in a time window measured in minutes. Obviously, this required virtual access to steal bank information but physical presence to act upon it. In other situations, attackers may use discussion forums to anonymously share information and collaborate.

Induction

Information is a key element of logic-based attacks. One aspect of information regards the site itself, answering questions such as “What does this do?” or “What are the steps to accomplish an action?” Other types of information might be leaked by the Web site that lead to questions such as “What does this mean?” We'll first discuss an example of using induction to leverage information leaks against a Web site.

The Macworld Expo gathers Apple fanatics, press, and industry insiders in San Francisco each year. Prices to attend the event range from restricted passes for the lowly peon to extended privileges and treatment for those with expensive VIP passes. In 2007, the Expo's Web site leaked the access code to obtain a $1,695 platinum pass for free (http://news.cnet.com/2100-1002_3-6149994.html). The site used client-side JavaScript to push some validation steps off the server into the Web browser. This is a common technique that isn't insecure if server-side validation is still performed; it helps off-load bulk processing into the browser to ease resource utilization on the server. In the case of the Macworld registration page, an array of possible codes was included in the HTML. These codes ranged from small reductions in price to the aforementioned free VIP passes.

The site's developers, knowing that HTML is not a secure medium for storing secret information, obfuscated the codes with MD5 hashes. So, the code submitted by a user is converted to an MD5 hash, checked against an array of precalculated hashes, and accepted as valid if a match occurs. This is a common technique for matching a user-supplied string against a store of values that must remain secret. Consider the case where the site merely compares a value supplied by the user, VIPCODE, with an expected value, PC0602. The comparison will fail, and the site will inform the user to please try again. If the site uses the Web browser to perform the initial comparison, then a quick peek at the JavaScript source reveals the correct discount code. On the other hand, if the client-side JavaScript compared the MD5 hash of the user's discount code with a list of precalculated hashes, then the real discount code isn't immediately revealed.

However, hashes are always prone to brute force attacks. Because the conversion is performed fully within the browser, adding a salt to the hash function does not provide any incremental security – the salt must be available to, and therefore visible within, the browser as well. The next step was to dump the hashes into a brute force attack. In 9 seconds, this produced a match of ADRY (http://grutztopia.jingojango.net/2007/01/your-free-macworld-expo-platinum-pass_11.html). In far less than a day's worth of work, the clever researcher obtained a free $1,695 pass – a pretty good return if you break down the value and effort into an hourly rate.
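A brute force search of this kind is short to write. The following is a minimal sketch, not the researcher's actual tool: it assumes the codes are at most four uppercase letters, and it computes its target hash from the known answer rather than copying any value from the real registration page.

```python
import hashlib
import string
from itertools import product


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def brute_force(targets, alphabet=string.ascii_uppercase, max_length=4):
    """Return {hash: code} for every target hash matched by a short code."""
    remaining = set(targets)
    found = {}
    for length in range(1, max_length + 1):
        for letters in product(alphabet, repeat=length):
            candidate = "".join(letters)
            digest = md5_hex(candidate)
            if digest in remaining:
                found[digest] = candidate
                remaining.discard(digest)
                if not remaining:
                    return found
    return found


# Stand-in for the hash array embedded in the page (assumed, not the real data).
leaked_hashes = [md5_hex("ADRY")]
recovered = brute_force(leaked_hashes)
print(recovered[leaked_hashes[0]])
```

With an alphabet of 26 letters and a maximum length of four, the search space is under half a million candidates, which is why short codes fall almost immediately regardless of the hash function used.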

Epic Fail

In 2005, an online gaming site called Paradise Poker suffered from an issue in which observers could passively monitor the time delay between the site's virtual Black Jack dealer showing an ace and offering players insurance (http://haacked.com/archive/2005/08/29/online-games-written-by-humans.aspx). Knowing whether the dealer had 21 gave alert players an edge in minimizing their losses. This advantage led to direct financial gain based on nothing more than the virtual analog of watching a dealer's eyes light up when holding a pocket ten. (This is one of the reasons casino dealers offer insurance before determining if they're holding an ace and a ten.) This type of passive attack would be impossible for the site to detect. Only the consequence of the exploit, a player or players taking winnings far greater than the expected average, would start to raise suspicions. Even under scrutiny, the players would be seen as doing nothing more than making very good decisions when faced with a dealer who might have 21.

The Macworld Expo registration example demonstrated developers who were not remiss in security. If the codes had all been nine alphanumeric characters or longer, then the brute force attack would have taken considerably longer than a few seconds to succeed. Yet, brute force would have still been an effective, valid attack, and longer codes might have been more difficult to distribute to legitimate users. The more secure solution would have moved the code validation entirely to server-side functions.^[B] This example also shows how it was necessary to understand the business purpose of the site (register attendees), a workflow (select a registration level), and the purpose of the code (an array of MD5 hashes). Human ingenuity and induction led to the discovery of the vulnerability. No automated tool could have revealed this problem, nor would auditing the site against a security checklist have fully exposed the problem.

B As an aside, this is an excellent example where cloud computing, or computing on demand, might have been a positive aid in security. The Macworld registration system must be able to handle spikes in demand as the event nears but doesn't require the same resources year-round. An expensive hardware investment would have been underutilized the rest of the year. Because code validation was potentially a high-cost processing function, the Web site could have used an architecture that moved processing into a service-based model that would provide scalability on demand only at times when the processing was actually needed.
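Returning to the recommendation that code validation move entirely to the server, a minimal sketch might look like the following. The code values, discount amounts, and function name are hypothetical; the essential property is that neither the codes nor their hashes are ever sent to the browser.

```python
import hmac

# Hypothetical codes and discounts; on a real site these would live in
# server-side storage and never be sent to the browser.
DISCOUNT_CODES = {"EXAMPLE10": 0.10, "EXAMPLEVIP": 1.00}


def discount_for(submitted_code: str) -> float:
    """Return the discount fraction for a code, or 0.0 if it is unknown."""
    normalized = submitted_code.strip().upper()
    discount = 0.0
    for code, value in DISCOUNT_CODES.items():
        # compare_digest compares in a way that resists timing analysis.
        if hmac.compare_digest(normalized.encode("utf-8"), code.encode("utf-8")):
            discount = value
    return discount


print(discount_for(" examplevip "), discount_for("ADRY"))
```

With validation on the server, an attacker can still guess codes, but each guess requires a request that the site can count, throttle, and log, rather than an offline computation the site never sees.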

Player collusion in gambling predates the Internet, but like many scams, the Internet serves as a useful amplifier for fraudsters. These types of scams don't target the application or try to learn internal information about the card deck as in the case of Paradise Poker. Instead, a group of players attempt to join the same virtual gaming table to trade information about cards received and collude against the one or few players who are playing without secret partners. Normally, the policy for a game is that any two or more players caught sharing information are to be labeled cheating and at the very least they should be ejected from the game. This type of policy is easier to enforce in a casino or other situation where all the players are physically present and can be watched. Some cheaters might have a handful of secret signals to indicate good or bad hands, but the risks of being caught are far greater under direct scrutiny.

On the other hand, virtual tabletops have no mechanism for enforcing such a policy. Two players could sit in the same room or be separated by continents and easily use instant messaging or something similar to discuss strategy. Some sites may take measures to randomize the players at a table to reduce the chances of colluding players joining the same game. This solution mitigates the risk, but doesn't remove it. Players can still be at risk from other information-based attacks. Other players might record a player's betting pattern and store the betting history in a database. Over time, these virtual tells might become predictable enough that they provide an advantage to the ones collecting and saving the data. Online games not only make it easy to record betting patterns but also enable collection on a huge scale. No longer would one person be limited to tracking a single game at a time. These are interesting challenges that arise from the type of Web application and have nothing to do with choice of programming language, software patches, configuration settings, or network controls.

Attacks against policies and procedures come in many guises. They also manifest themselves outside of Web applications (attackers also adapt fraud to Web applications). Attacks against business logic can harm Web sites, but attackers can also use Web sites as the intermediary. Consider a common scam among online auctions and classifieds. A buyer offers a cashier's check in excess of the final bid price, including a brief apology and explanation why the check is more. If the seller would only give the buyer a check in return for the excess balance, then the two parties can supposedly end the transaction on fair terms. The catch is that the buyer needs the refund soon, probably before the cashier's check can be sent or before the seller realizes the check won't be arriving. Another scam skips the artifice of buying an item. The grifter offers a check and persuades the victim to deposit it, stressing that the victim can keep a percentage, but the grifter really needs an advance on the deposited check. The check, of course, bounces.

These scams aren't limited to checks, and they exploit a loophole in how checks are handled – along with appealing to the inner greed, or misplaced trust, of the victim. Checks do not instantly transfer funds from one account to another. Even though a bank may make funds immediately available, the value of the check must clear before the recipient's account is officially updated. Think of this as a Time of Check, Time of Use (TOCTOU) problem that was mentioned in Chapter 1, “Cross-Site Scripting.”

Tip

Craigslist provides several tips on how to protect yourself from scams that try to take advantage of its site and others: www.craigslist.org/about/scams.

So, where's the Web site in this scam? That's the point. Logic-based attacks do not need a technical component to exploit a vulnerability. The problems arise from assumptions, unverified assertions, and inadequate policies. A Web site might have such a problem or simply be used as a conduit for the attacker to reach a victim.

Using induction to find vulnerabilities from information leaks falls squarely into the realm of manual methodologies. Many other vulnerabilities, from XSS to SQL injection, benefit from experienced analysis. In Chapter 3, “Structured Query Language Injection,” we discussed inference-based attacks (so-called blind SQL injection) that used variations of SQL statements to extract information from the database one bit at a time. This technique didn't rely on explicit error messages, but on differences in observed behavior of the site – differences that ranged from the time required to return an HTTP response to the amount or type of content with the response.

Denial of Service

Denial of Service (DoS) attacks consume a Web site's resources to such a degree that the site becomes unusable to legitimate users. In the early days (relatively speaking, let's consider the 1990s as early) of the Web, DoS attacks could rely on techniques as simple as generating traffic to take up bandwidth. These attacks are still possible today, especially in the face of coordinated traffic from botnets.^[C] The countermeasures to network-based DoS largely fall out of the purview of the Web application. On the other hand, other DoS techniques will target the business logic of the Web site and may or may not rely on high bandwidth.

C Botnets have been discovered that range in size from a few thousand compromised systems to a few million. Their uses range from spam to DoS to stealing personal information. One top 10 list of botnets can be found at www.networkworld.com/news/2009/072209-botnets.html.

For example, think of an e-commerce application that desires to fight fraud by running simple verification checks (usually based on matching a zip code) on credit cards before a transaction is made. This verification step might be attacked by repeatedly going through a checkout process without completing the transaction. Even if the attack does not generate enough requests to impede the Web site's performance, the number of queries might incur significant costs for the Web site – costs that aren't recouped because the purchase was canceled after the verification step but before it was fully completed.

Warning

DoS need not always target bandwidth or server resources. More insidious attacks can target actions with direct financial consequence for the site. Paying for bandwidth is already a large concern for many site operators, so malicious traffic of any nature is likely to incur undesirable costs. Attacks can also target banner advertising by using click fraud to drain money out of the site's advertising budget. Or attacks might target back-end business functions such as credit-card verification systems that charge per request. This type of malicious activity doesn't make the site less responsive for other users, but it has a negative impact on the site's financial status.
