Security is always a trade-off. Often it’s security versus convenience, but sometimes it’s security versus features or security versus performance. That we prefer all of those things over security is most of the reason why computers are insecure, but it’s also true that securing computers is actually hard.
In 1989, Internet security expert Gene Spafford famously said: “The only truly secure system is one that is powered off, cast in a block of concrete and sealed in a lead-lined room with armed guards—and even then I have my doubts.” Almost 30 years later, that’s still true.
It’s true for stand-alone computers, and it’s true for the Internet-connected embedded computers that are everywhere. More recently, former National Cybersecurity Center director Rod Beckstrom summarized it this way: (1) anything connected to the Internet can be hacked; (2) everything is being connected to the Internet; (3) as a result, everything is becoming vulnerable.
Yes, computers are so hard to secure that every security researcher has his own pithy saying about it. Here’s mine from 2000: “Security is a process, not a product.”
There are many reasons why this is so.
I play Pokémon Go on my phone, and the game crashes all the time. Its instability is extreme, but not exceptional. We’ve all experienced this. Our computers and smartphones crash regularly. Websites don’t load. Features don’t work. We’ve all learned how to compensate. We compulsively save our data and back up our files, or use systems that do it for us automatically. We reboot our computers when things start behaving weirdly. We occasionally lose important data. And we don’t expect our computers to work as well as the typical consumer products in our lives, even though we get continually frustrated when they don’t.
Software is poorly written because, with only a few exceptions, the market doesn’t reward good-quality software. “Good, fast, cheap—pick any two”; inexpensive and quick to market is more important than quality. For most of us most of the time, poorly written software has been good enough.
This philosophy has permeated the industry at all levels. Companies don’t reward software quality in the same way they reward delivering products ahead of schedule and under budget. Universities focus more on code that barely works than on code that’s reliable. And most of us consumers are unwilling to pay what doing better would cost.
Modern software is riddled with bugs. Some of them are inherent in the complexity of the software—more on that later—but most are programming mistakes. These bugs were not fixed during the development process; they remain in the software after it has been finished and shipped. That any of this software functions at all is a testament to how well we can engineer around buggy software.
Of course, not all software development processes are created equal. Microsoft spent the decade after 2002 improving its software development process to minimize the number of security vulnerabilities in shipped software. Its products are by no means perfect—that’s beyond the capabilities of the technologies right now—but they’re a lot better than average. Apple is known for its quality software. So is Google. Some very small and critical pieces of software are high quality. Aircraft avionics software is written to a much more rigorous quality standard than just about everything else. And NASA had a famous quality control process for its space shuttle software.
The reasons why these are exceptions vary from industry to industry, company to company. Operating system companies spend a lot of money; small pieces of code are easy to get right; airplane software is highly regulated. NASA still has crazily conservative quality assurance standards. And even for relatively high-quality software systems like Windows, macOS, iOS, and Android, you’re still installing patches all the time.
Some bugs are also security vulnerabilities, and some of those security vulnerabilities can be exploited by attackers. An example is something called a buffer overflow bug. It's a programming mistake that allows an attacker, in some cases, to force the program to run arbitrary commands and take control of the computer. There are many classes of mistakes like this, some easier to make than others.
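To make that concrete, here is a deliberately simplified sketch in C of the kind of mistake involved. The program, the 16-byte buffer, and the "authorized" flag are all invented for illustration; real exploits are considerably more involved, and modern compilers and operating systems add defenses that this toy ignores.

```c
#include <stdio.h>
#include <string.h>

/* A deliberately unsafe sketch of a buffer overflow.
   The 16-byte name buffer is filled with no length check, so input
   longer than 15 characters writes past its end and can clobber
   whatever sits next to it in memory -- here, the authorized flag. */
struct session {
    char name[16];
    int  authorized;   /* 0 = no, 1 = yes */
};

int main(void) {
    struct session s = { "", 0 };
    char input[256];

    if (fgets(input, sizeof(input), stdin) == NULL)
        return 1;

    strcpy(s.name, input);   /* the bug: no bounds check */

    if (s.authorized)
        printf("Access granted\n");
    else
        printf("Access denied\n");
    return 0;
}
```

Feed this program a name longer than 15 characters and, depending on how the compiler lays out memory, the extra bytes spill into the authorized flag. That is the essence of the attack: input that was supposed to be data ends up changing what the program does.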
Here, numbers are hard to pin down. We don’t know what percentage of bugs are also vulnerabilities and what percentage of vulnerabilities are exploitable, and there is legitimate academic debate about whether these exploitable bugs are sparse or plentiful. I come down firmly on the side of plentiful: large software systems have thousands of exploitable vulnerabilities, and breaking into these systems is a matter—sometimes simple, sometimes not—of finding one of them.
But while vulnerabilities are plentiful, they're not uniformly distributed. There are easier-to-find ones, and harder-to-find ones. Tools that automatically find and fix entire classes of vulnerabilities, and coding practices that eliminate many easy-to-find ones, greatly improve software security. And when one person finds a vulnerability, it is more likely that another person soon will, or recently has, found the same vulnerability. Heartbleed is a vulnerability in OpenSSL, an encryption library that secures much of the traffic on the web. It remained undiscovered for two years, and then two independent researchers found it within days of each other. The Spectre and Meltdown vulnerabilities in computer chips existed for at least ten years before multiple researchers discovered them in 2017. I have seen no good explanation for this parallel discovery other than it just happens; but it will matter when we talk about governments stockpiling vulnerabilities for espionage and cyberweapons in Chapter 9.
The explosion of IoT devices means more software, more lines of code, and even more bugs and vulnerabilities. Keeping IoT devices cheap means less-skilled programmers, sloppier software development processes, and more code reuse—and hence a greater impact from a single vulnerability if it is widely replicated.
The software we depend on—that’s running on our computers and phones, in our cars and medical devices, on the Internet, in systems controlling our critical infrastructure—is insecure in multiple ways. This isn’t simply a matter of finding the few vulnerabilities and fixing them; there are too many for that. It’s a software fact of life that we’re going to have to live with for the foreseeable future.
In April 2010, for about 18 minutes, 15% of all Internet traffic suddenly passed through servers in China on the way to its destination. We don’t know if this was the Chinese government testing an interception capability or it was an honest mistake, but we know how the attackers did it: they abused the Border Gateway Protocol.
The Border Gateway Protocol, or BGP, is how the Internet physically routes traffic through the various cables and other connections between service providers, countries, and continents. Because there's no authentication in the system, and every network implicitly trusts the routing information that other networks announce, BGP can be manipulated. We know from documents disclosed by government-contractor-turned-leaker Edward Snowden that the NSA uses this inherent insecurity to make certain data streams easier to eavesdrop on. In 2013, one company reported 38 different instances where Internet traffic was diverted to routers at Belarusian or Icelandic service providers. In 2014, the Turkish government used this technique to censor parts of the Internet. In 2017, traffic to and from several major US ISPs was briefly routed to an obscure Russian Internet provider. And don't think this kind of attack is limited to nation-states; a 2008 talk at the DefCon hackers conference showed how anyone can do it.
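To see why the lack of authentication matters, consider how a router chooses among the routes it hears. Here is a drastically simplified sketch in C; real BGP weighs many attributes, and the addresses and announcer names are made up for illustration. The one rule kept here is the one that matters for this example: the most specific matching prefix wins, regardless of who announced it.

```c
#include <stdio.h>
#include <stdint.h>

/* A drastically simplified model of BGP route selection.
   No announcement is authenticated -- the router believes them all,
   and among matching routes it prefers the longest (most specific) prefix. */
struct route {
    uint32_t prefix;      /* network address */
    int      prefix_len;  /* bits in the prefix */
    const char *origin;   /* who announced it */
};

static int matches(uint32_t addr, struct route r) {
    uint32_t mask = r.prefix_len ? 0xFFFFFFFFu << (32 - r.prefix_len) : 0;
    return (addr & mask) == (r.prefix & mask);
}

int main(void) {
    uint32_t destination = 0xCB007119;   /* 203.0.113.25, a documentation address */

    struct route table[] = {
        { 0xCB007100, 24, "legitimate provider" },   /* 203.0.113.0/24 */
        { 0xCB007100, 25, "hijacker" },              /* 203.0.113.0/25: more specific */
    };

    struct route best = { 0, -1, "no route" };
    for (int i = 0; i < 2; i++)
        if (matches(destination, table[i]) && table[i].prefix_len > best.prefix_len)
            best = table[i];

    printf("Traffic to the destination is sent toward: %s\n", best.origin);
    return 0;
}
```

Because the router believes every announcement, the bogus, more specific route captures the traffic. That is one of the simplest ways incidents like the ones above can happen.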
When the Internet was developed, what security there was focused on physical attacks against the network. Its fault-tolerant architecture can handle servers and connections failing or being destroyed. What it can’t handle is systemic attacks against the underlying protocols.
The base Internet protocols were developed without security in mind, and many of them remain insecure to this day. There's no security in the "From" line of an e-mail: anyone can pretend to be anyone. There's no security in the Domain Name System that translates Internet addresses from human-readable names to computer-readable numeric addresses, or the Network Time Protocol that keeps everything in sync. There's no security in HTTP, the original protocol that underlies the World Wide Web, and the more secure HTTPS still has lots of vulnerabilities. All of these protocols can be subverted by attackers.
These protocols were invented in the 1970s and early 1980s, when the Internet was limited to research institutions and not used for anything critical. David Clark, an MIT professor and one of the architects of the early Internet, recalls: “It’s not that we didn’t think about security. We knew that there were untrustworthy people out there, and we thought we could exclude them.” Yes, they really thought they could limit Internet usage to people they knew.
As late as 1996, the predominant thinking was that security would be the responsibility of the endpoints—that’s the computers in front of people—and not the network. Here’s the Internet Engineering Task Force (IETF), the body that sets industry standards for the Internet, in 1996:
It is highly desirable that Internet carriers protect the privacy and authenticity of all traffic, but this is not a requirement of the architecture. Confidentiality and authentication are the responsibility of end users and must be implemented in the protocols used by the end users. Endpoints should not depend on the confidentiality or integrity of the carriers. Carriers may choose to provide some level of protection, but this is secondary to the primary responsibility of the end users to protect themselves.
This is not obviously stupid. In Chapter 6, I’ll talk about the end-to-end networking model, which means that the network shouldn’t be responsible for security, as the IETF outlined. But people were too rigid about that for too long, and even aspects of security that only make sense to include inside the network were not being adopted.
Fixing this has been hard, and sometimes impossible. Since as far back as the 1990s, the IETF has offered proposals to add security to BGP to prevent attacks, but these proposals have always suffered from a collective action problem. Adopting the more secure system provides benefits only when enough networks do it; early adopters receive minimal benefit for their hard work. This situation results in a perverse incentive. It makes little sense for a service provider to be the first to adopt this technology, because it pays the cost and receives no benefit. It makes much more sense to wait and let others go first. The result, of course, is what we’re seeing: 20 years after we first started talking about the problem, there’s still no solution.
There are other examples like this. DNSSEC is an upgrade that would solve the security problems with the Domain Name System protocol. As with BGP, there's no security in the existing protocol, and there are all sorts of ways the system can be attacked. And as with BGP, the tech community developed a solution 20 years ago that still hasn't been widely deployed, because it requires most sites to adopt it before anyone sees the benefits.
Recall an old-style telephone, the kind your parents or grandparents would have had in their homes. That object was designed and manufactured as a telephone, and that’s all it did and all it could do. Compare that to the telephone in your pocket right now. It’s not really a telephone; it’s a computer running a telephone app. And, as you know, it can do much, much more. It can be a telephone, a camera, a messaging system, a book reader, a navigation aid, and a million other things. “There’s an app for that” makes no sense for an old-style telephone, but is obvious for a computer that makes phone calls.
Similarly, in the centuries after Johannes Gutenberg invented the printing press around 1440, the technology improved considerably, but it was still basically the same mechanical—and then electromechanical—device. Throughout those centuries, a printing press was only ever a printing press. No matter how hard its operator tried, it couldn’t be made to perform calculus or play music or weigh fish. Your old thermostat was an electromechanical device that sensed the temperature, and turned a circuit on and off in response. That circuit was connected to your furnace, which gave the thermostat the ability to turn your heat on and off. That’s all it could do. And your old camera could only take pictures.
These are now all computers, and as such, they can be programmed to do almost anything. Recently, hackers demonstrated this by programming a Canon Pixma printer, a Honeywell Prestige thermostat, and a Kodak digital camera to play the computer game Doom.
When I tell that anecdote from the stage at tech conferences, everyone laughs at these new IoT devices playing a 25-year-old computer game—but no one is surprised. They’re computers; of course they can be programmed to play Doom.
It’s different when I tell the anecdote to a nontechnical audience. Our mental model of machines is that they can only do one thing—and if they’re broken, they don’t do it. But general-purpose computers are more like people; they can do almost anything.
Computers are extensible. As everything becomes a computer, this extensibility property will apply to everything. This has three ramifications when it comes to security.
One: extensible systems are hard to secure, because designers can’t anticipate every configuration, condition, application, use, and so on. This is really an argument about complexity, so we’ll take it up again in a bit.
Two: extensible systems can’t be externally limited. It’s easy to build a mechanical music player that only plays music from magnetic tapes stored in a particular physical housing, or a coffee maker that only uses disposable pods shaped a certain way, but those physical constraints don’t translate to the digital world. What this means is that copy protection—also known as digital rights management, or DRM—is basically impossible. As we’ve learned from the experiences of the music and movie industries over the past two decades, we can’t stop people from making and playing unauthorized copies of digital files.
More generally, a software system cannot be constrained, because the software used for constraining can be repurposed, rewritten, or revised. Just as it’s impossible to create a music player that refuses to play pirated music files, it’s impossible to create a 3D printer that refuses to print gun parts. Sure, it’s easy to prevent the average person from doing any of these things, but it’s impossible to stop an expert. And once that expert writes software to bypass whatever controls are in place, everyone else can do it, too. And this doesn’t take much time. Even the best DRM systems don’t last 24 hours. We’ll talk about this again in Chapter 11.
Three: extensibility means that every computer can be upgraded with additional features in software. These can accidentally add insecurities, both because the new features will contain new vulnerabilities, and because the new features probably weren’t anticipated in the original design. But, more importantly, new features can be added by attackers as well. When someone hacks your computer and installs malware, they’re adding new features. They’re features you didn’t ask for and didn’t want, and they’re features acting against your interest, but they are features. And they can, at least in theory, be added to every single computer out there.
“Backdoors” are also additional features in a system. I’ll be using this term a lot in the book, so it’s worth pausing to define it. It’s an old term from cryptography, and generally refers to any purposely designed access mechanism that bypasses a computer system’s normal security measures. Backdoors are often secret—and added without your knowledge and consent—but they don’t have to be. When the FBI demands that Apple provide a way to bypass the encryption in an iPhone, what the agency is demanding is a backdoor. When researchers spot a hard-coded extra password in Fortinet firewalls, they’ve found a backdoor. When the Chinese company Huawei inserts a secret access mechanism into its Internet routers, it has installed a backdoor. We’ll talk more about these in Chapter 11.
All computers can be infected with malware. All computers can be commandeered with ransomware. All computers can be dragooned into a botnet—a network of malware-infected devices that is controlled remotely. All computers can be remotely wiped clean. The intended function of the embedded computer, or the IoT device into which the computer is built, makes no difference. Attackers can exploit IoT devices in all the ways they currently exploit desktop and laptop computers.
Today, on the Internet, attackers have an advantage over defenders.
This is not inevitable. Historically, the advantage has seesawed between attack and defense over periods of decades and centuries. The history of warfare illustrates that nicely, as different technologies like machine guns and tanks shifted the advantage one way or another. But today, in computers and on the Internet, attack is easier than defense—and it’s likely to remain that way for the foreseeable future.
There are many reasons for this, but the most important is the complexity of these systems. Complexity is the worst enemy of security. The more complex a system is, the less secure it is. And our billions of computers, each with tens of millions of lines of code, connected into an Internet with its trillions of webpages and unknown zettabytes of data, together comprise the most complex machine humankind has ever built.
More complexity means more people involved, more parts, more interactions, more layers of abstraction, more mistakes in the design and development process, more difficulty in testing, more nooks and crannies in the code where insecurities can hide.
Computer security experts like to speak about the attack surface of a system: all the possible points that an attacker might target and that must be secured. A complex system means a large attack surface, and that means a huge advantage for a would-be attacker. The attacker just has to find one vulnerability—one unsecured avenue for attack—and gets to choose the time and method of attack. He can also attack constantly until successful. At the same time, the defender has to secure the entire attack surface from every possible attack all the time. And while the defender has to win every time, the attacker only has to get lucky once. It’s simply not a fair battle—and the cost to attack a system is only a fraction of the cost to defend it.
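A little arithmetic shows how lopsided this is. Suppose, purely for illustration, that each component of a system independently has a 1 percent chance of containing an exploitable flaw. The defender needs every component to be clean; the attacker needs just one to be flawed. Here is a short C sketch of that calculation; the 1 percent figure is an assumption, not a measurement:

```c
#include <stdio.h>
#include <math.h>

int main(void) {
    /* Illustrative only: assume each component independently has a
       1% chance of containing an exploitable flaw. */
    double p_flaw = 0.01;
    int sizes[] = { 10, 100, 1000 };

    for (int i = 0; i < 3; i++) {
        int n = sizes[i];
        /* Probability that at least one of n components is flawed. */
        double p_attacker_wins = 1.0 - pow(1.0 - p_flaw, n);
        printf("%4d components: %.0f%% chance that at least one flaw exists\n",
               n, 100.0 * p_attacker_wins);
    }
    return 0;
}
```

With ten components, the attacker has roughly a 10 percent chance of finding a way in; with a hundred, about 63 percent; with a thousand, it is a near certainty. Growing the system helps the attacker far more than it helps the defender.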
Complexity goes a long way to explaining why computer security is still so hard, even as security technologies improve. Every year, there are new ideas, new research results, and new products and services. But at the same time, every year, increasing complexity results in new vulnerabilities and attacks. We’re losing ground even as we improve.
Complexity also means that users often get security wrong. Complex systems often have lots of options, making them hard to use securely. Users regularly fail to change default passwords, or misconfigure access control on data in the cloud. In 2017, Stanford University blamed “misconfigured permissions” for exposing thousands of student and staff records. There are lots of these stories.
There are other reasons, aside from complexity, why attack is easier than defense. Attackers have a first-mover advantage, along with a natural agility that defenders often lack. They often don’t have to worry about laws, or about conventional morals or ethics, and can more quickly make use of technical innovations. Because of the current disincentives to improve, we’re terrible at proactive security. We rarely take preventive security measures until an attack happens. Attackers also have something to gain, while defense is typically a cost of doing business that companies are looking to minimize—and many executives still don’t believe they could be a target. More advantages go to the attacker.
This doesn’t mean that defense is futile, only that it’s difficult and expensive. It’s easier, of course, if the attacker is a lone criminal who can be persuaded to switch to an easier target. But a sufficiently skilled, funded, and motivated attacker will always get in. Talking about nation-state cyber operations, former NSA deputy director Chris Inglis was quoted as putting it this way: “If we were to score cyber the way we score soccer, the tally would be 462–456 twenty minutes into the game, i.e., all offense.” That’s about right.
Of course, just because attack is technically easy doesn’t mean it’s pervasive. Murder is easy, too, but few actually do it, because of all the social systems around identifying, condemning, and prosecuting murderers. On the Internet, prosecution is more difficult because attribution is difficult—a topic we’ll discuss in Chapter 3—and because the international nature of Internet attacks results in difficult jurisdictional issues.
The Internet+ will make these trends worse. More computers, and especially more different kinds of computers, means more complexity.
The Internet is filled with emergent properties and unintended consequences. That is, even experts don’t understand how the different parts of the Internet interact with each other as well as they think they do, and we are regularly surprised by how things actually work. This is also true for vulnerabilities.
The more we network things together, the more vulnerabilities in one system will affect other systems.
Systems can affect other systems in unforeseen, and potentially harmful, ways. What might seem benign to the designers of a particular system becomes harmful when it’s combined with some other system. Vulnerabilities on one system cascade into other systems, and the result is a vulnerability that no one saw coming. This is how things like the Three Mile Island nuclear disaster, the Challenger space shuttle explosion, or the 2003 blackout in the US and Canada could happen.
Unintended effects like these have two ramifications. One: the interconnections make it harder for us to figure out which system is at fault. And two: it’s possible that no single system is actually at fault. The cause might be the insecure interaction of two individually secure systems. In 2012, someone compromised reporter Mat Honan’s Amazon account, which allowed them to gain access to his Apple account, which gave them access to his Gmail account, which allowed them to take over his Twitter account. The particular trajectory of the attack is important; some of the vulnerabilities weren’t in the individual systems, but became exploitable only when used in conjunction with each other.
There are other examples. A vulnerability in Samsung smart refrigerators left users’ Gmail accounts open to attack. The gyroscope on your iPhone, put there to detect motion and orientation, is sensitive enough to pick up acoustic vibrations and therefore can eavesdrop on conversations. The antivirus software sold by Kaspersky accidentally (or purposefully) steals US government secrets.
If 100 systems are all interacting with each other, that’s about 5,000 interactions and 5,000 potential vulnerabilities resulting from those interactions. If 300 systems are all interacting with each other, that’s 45,000 interactions. One thousand systems means half a million interactions. Most of them will be benign or uninteresting, but some of them will have very damaging consequences.
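Those figures come from simple counting: n systems, each potentially interacting with every other, produce n(n-1)/2 pairs. A minimal C program reproduces the numbers above:

```c
#include <stdio.h>

int main(void) {
    /* Number of distinct pairs among n systems: n * (n - 1) / 2. */
    long sizes[] = { 100, 300, 1000 };

    for (int i = 0; i < 3; i++) {
        long n = sizes[i];
        long pairs = n * (n - 1) / 2;
        printf("%5ld systems -> %7ld pairwise interactions\n", n, pairs);
    }
    return 0;
}
```

And that count covers only pairs; interactions involving three or more systems at once push the number higher still.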
Computers don’t fail in the same way “normal” things do. They’re vulnerable in three different and important ways.
One: distance doesn’t matter. In the real world, we’re concerned about security against the average attacker. We don’t buy a door lock to keep out the world’s best burglar. We buy one to keep out the average burglars that are likely to be wandering around our neighborhoods. I have a home in Cambridge, and if there’s a super-skilled burglar in Canberra, I don’t care. She’s not going to fly around the world to rob my house. On the Internet, though, a Canberra hacker can just as easily hack my home network as she can hack a network across the street.
Two: the ability to attack computers is decoupled from the skill to attack them. Software encapsulates skill. That super-skilled hacker in Canberra can encapsulate her expertise into software. She can automate her attack and have it run while she sleeps. She can then give it to everyone else in the world. This is where the term “script kiddie” comes from: someone with minimal skill but powerful software. If the world’s best burglar could freely distribute a tool that allowed the average burglar to break into your house, you would be more concerned about home security.
Free distribution of potentially dangerous hacking tools happens all the time on the Internet. The attacker who created the Mirai botnet released his code to the world, and within a week a dozen attack tools had incorporated it. This is an example of what we call malware: worms and viruses and rootkits that give even unskilled attackers enormous capabilities. Hackers can buy rootkits on the black market. They can hire ransomware-as-a-service. European companies like HackingTeam and Gamma Group sell attack tools to smaller governments around the globe. The Russian Federal Security Service hired a 21-year-old Kazakh Canadian citizen, Karim Baratov, to run phishing attacks connected to the massive Yahoo breach disclosed in 2016; the breach itself was the work of the skilled hacker Alexsey Belan.
Three: computers fail all at once or not at all. “Class break” is a concept from computer security. It’s a particular kind of security vulnerability that breaks not just one system, but an entire class of systems. An example might be an operating system vulnerability that allows an attacker to take remote control of every computer that runs that operating system. Or a vulnerability in Internet-enabled digital video recorders and webcams that allows an attacker to conscript those devices into a botnet.
The Estonian national ID card suffered a class break in 2017. A cryptographic flaw forced the government to suspend 760,000 cards used for all sorts of government services, some in high-security settings.
The risks are exacerbated by software and hardware monoculture. Nearly all of us use one of three computer operating systems and one of two mobile operating systems. More than half of us use the Chrome web browser; the other half use one of five others. Most of us use Microsoft Word for word processing and Excel for spreadsheets. Nearly all of us read PDFs, look at JPEGs, listen to MP3s, and watch AVI video files. Nearly every device in the world communicates using the same TCP/IP Internet protocols. And basic computer standards are not the only source of monocultures. According to a 2011 DHS study, GPS is essential to 11 out of 15 critical infrastructure sectors. Class breaks in these, and countless other common functions and protocols, can easily affect many millions of devices and people. Right now, the IoT is showing more diversity, but that won’t last unless some pretty basic economic policies change. In the future, there will only be a few IoT processors, a few IoT operating systems, a few controllers, and a few communications protocols.
Class breaks lead to worms, viruses, and other malware. Think “attack once, impact many.” We’ve conceived of voting fraud as unauthorized individuals trying to vote, not as the remote manipulation by a single person or organization of Internet-connected voting machines or online voter rolls. But this is how computer systems fail: someone hacks the machines.
Consider a pickpocket. Her skill took time to develop. Each victim is a new job, and success at one theft doesn’t guarantee success with the next. Electronic door locks, like the ones you now find in hotel rooms, have different vulnerabilities. An attacker can find a flaw in the design that allows him to create a key card that opens every door. If he publishes his attack software, then it’s not just the attacker, but anyone, who can now open every lock. And if those locks are connected to the Internet, attackers could potentially open door locks remotely—they could open every door lock remotely at the same time. That’s a class break.
In 2012, this happened to Onity, a company that makes electronic locks fitted on over four million hotel rooms for chains like Marriott, Hilton, and InterContinental. A homemade device enabled hackers to open the locks in seconds. Someone figured that out, and instructions on how to build the device quickly spread. It took months for Onity to realize it had been hacked, and—because there was no way to patch the system (I’ll talk about this in Chapter 2)—hotel rooms were vulnerable for months and even years afterwards.
Class breaks are not a new concept in risk management. It’s the difference between home burglaries and house fires, which happen occasionally to different houses in a neighborhood over the course of the year, and floods and earthquakes, which either happen to everyone in the neighborhood or to no one. But computers have aspects of both at the same time, while also having aspects of a public health risk model.
This nature of computer failures changes the nature of security failures, and completely upends how we need to defend against them. We’re not concerned about the threat posed by the average attacker. We’re concerned about the most extreme individual who can ruin it for everyone.
The Data Encryption Standard, or DES, is an encryption algorithm from the 1970s. Its security was deliberately designed to be strong enough to resist then-feasible attacks, but just barely. In 1976, cryptography experts estimated that building a machine to break DES would cost $20 million. In my 1995 book Applied Cryptography, I estimated that the cost had dropped to $1 million. In 1998, the Electronic Frontier Foundation built a custom machine for $250,000 that could break DES encryption in less than a day. Today, you can do it on your laptop.
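The arithmetic behind that decline is simple. DES uses a 56-bit key, so a brute-force attack has at most 2^56 keys to try, about 72 quadrillion; the only question is how quickly you can try them. Here is a rough C sketch; the key-search rates are illustrative round numbers, not benchmarks of any particular machine:

```c
#include <stdio.h>
#include <math.h>

int main(void) {
    /* DES has 56-bit keys: 2^56 possible keys. */
    double keys = pow(2.0, 56.0);

    /* Illustrative key-search rates, in keys per second. */
    double rates[] = { 1e6, 1e9, 1e12 };
    const char *labels[] = { "1 million keys/sec", "1 billion keys/sec",
                             "1 trillion keys/sec" };

    printf("DES keyspace: %.0f keys\n", keys);
    for (int i = 0; i < 3; i++) {
        double seconds = keys / rates[i];
        printf("%-22s -> %12.1f days to try them all\n",
               labels[i], seconds / 86400.0);
    }
    return 0;
}
```

At a million keys per second the search takes millennia; at a trillion per second it takes less than a day. Hardware keeps marching along that axis, which is why an algorithm that was marginal in 1976 is trivially breakable today.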
In another realm, in the 1990s, cell phones were designed to automatically trust cell towers without any authentication systems. This was because authentication was hard, and it was hard to deploy fake cell phone towers. Fast-forward a half decade, and stingray fake cell towers became an FBI secret surveillance tool. Fast-forward another half decade, and setting up a fake cell phone tower became so easy that hackers demonstrate it onstage at conferences.
Similarly, the increasing speed of computers has made them exponentially faster at brute-force password guessing: trying every possible password until one works. Meanwhile, the typical length and complexity of passwords that the average person is willing and able to remember has remained constant. The result is passwords that were secure ten years ago but are insecure today.
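The same arithmetic applies here. A password’s resistance to guessing is roughly the size of its character set raised to its length, divided by how fast the attacker can guess. A small C sketch with illustrative numbers; the guessing rate is a round figure for comparison, not a measurement of any particular cracking setup:

```c
#include <stdio.h>
#include <math.h>

int main(void) {
    /* Illustrative offline guessing rate: 10 billion guesses per second.
       Real rates vary enormously with hardware and with how the
       passwords are stored; this is a round number for comparison only. */
    double guesses_per_sec = 1e10;

    struct { const char *desc; double charset; double length; } pw[] = {
        { "8 lowercase letters",              26.0,  8.0 },
        { "8 mixed-case letters and digits",  62.0,  8.0 },
        { "12 mixed-case letters and digits", 62.0, 12.0 },
    };

    for (int i = 0; i < 3; i++) {
        double combos = pow(pw[i].charset, pw[i].length);
        double days   = combos / guesses_per_sec / 86400.0;
        printf("%-33s: %.2e possibilities, about %g days to try them all\n",
               pw[i].desc, combos, days);
    }
    return 0;
}
```

At that rate, eight lowercase letters fall in seconds, eight mixed characters in a matter of hours, and only the twelve-character password holds out for thousands of years. The guessing rate keeps climbing with hardware; the passwords people can remember do not.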
I first heard this aphorism from an NSA employee: “Attacks always get better; they never get worse.” Attacks get faster, cheaper, and easier. What is theoretical today becomes practical tomorrow. And because our information systems stay around far longer than we plan for, we have to plan for attackers with future technology.
Attackers also learn and adapt. This is what makes security different from safety. Tornadoes are a safety issue, and we could talk about different defenses against them and their relative effectiveness, and wonder about how future technological advances might better protect us from their destructiveness. But whatever we choose to do or not do, we know that tornadoes will never adapt to our defenses and change their behavior. They’re just tornadoes.
Human adversaries are different. They’re creative and intelligent. They change tactics, invent new things, and adapt all the time. Attackers examine our systems, looking for class breaks. And once one of them finds one, they’ll exploit it again and again until the vulnerability is fixed. A security measure that protects networks today might not work tomorrow because the attackers will have figured out how to get around it.
All this means that expertise flows downhill. Yesterday’s top-secret military capabilities become today’s PhD theses and tomorrow’s hacking tools. Differential cryptanalysis was such a capability, discovered by the NSA sometime before 1970. In the 1970s, IBM mathematicians discovered it again when they designed DES. The NSA classified IBM’s discovery, but the technique was rediscovered by academic cryptographers in the late 1980s.
Defense is always in flux. What worked yesterday might not work today and almost certainly won’t work tomorrow.