Chapter 15

REAL AND ARTIFICIAL INTELLIGENCE

Whoever becomes the leader [in AI] will become the ruler of the world.

—VLADIMIR PUTIN

We headed for the Eiffel Tower to see the future of cyberspace. It was a hot August day in 2016, and we were not in Paris, but in the Paris Hotel and Casino, with its replica of the Eiffel Tower, in Las Vegas. After a briefing in a small white-and-gold ballroom meant to be reminiscent of eighteenth-century Versailles, we were ushered into a hall next door, a vast twenty-first-century space filled with computer racks and purple lights. We had one question in mind: Was artificial intelligence (AI) the revolutionary tool that would finally give cyber defense the advantage?

We will return to Las Vegas in a bit, but first let’s put in perspective the introduction of AI into cyber-war. The ongoing low-grade cyber wars around the world are not static. Both offense and defense are forever developing tools or, as in the case of AI, entirely new classes of weapons. Having new technologies hit the battlefield during a period of conflict is not unusual.

During World War II, warring nations spent six years engaged in the most destructive campaigns in human history, killing millions and wiping out hundreds of cities. They did so with technology such as tanks and bomber aircraft that had evolved significantly since their introduction only twenty years earlier, when the same nations had engaged in the same kind of mass folly in Europe.

Those evolved weapons were far more destructive the second time around, but they were merely more advanced versions of the same kinds of technologies that had emerged in the Great War. While WWII was going on, however, some of the participants were engaged in frantic work back in the labs. The Germans made advances in rockets and weaponized them, creating ballistic missiles (the V-2) that were impervious to air defenses and could destroy entire city blocks. The Americans, with considerable help from immigrant European scholars, invented nuclear weapons. The Americans had been using large formations of B-29 Superfortress aircraft to drop thousands of conventional explosive bombs on Japanese cities. They then used one B-29, dropping only one bomb, to obliterate an entire city. A few days later, they did it again.

Nuclear weapons are a technology whose effects are so qualitatively and quantitatively different from those of any weapon ever employed before that experts agreed they created an entirely new era in war. They also created a form of peace. Almost seventy-five years after the first two nuclear weapons were used, no further nuclear combat detonations have occurred. The weapons are, however, used every day to deter and prevent certain kinds of warfare. Significant diplomatic and intelligence-agency efforts have been expended trying to prevent other nations from acquiring nuclear weapons, but eight nations followed the Americans in successfully acquiring them, at considerable financial cost.

Just as nuclear weapons were developed during the war, AI and quantum cyber weapons are being developed during the present period of cyber hostilities. The more important historical similarity at work here, however, is that the qualitative and quantitative changes in cyber warfare that these two new classes of weapons can bring about could be as significant as the difference between what one conventional bomb dropped by a single B-29 aircraft could do compared with what one nuclear weapon dropped from that same aircraft actually did.

That may seem like hyperbole, especially given what AI has been used to do thus far in cybersecurity, but the use of AI in cyber war has barely started, and quantum capabilities have yet to be employed at all in the cybersecurity realm. Moreover, few analysts have begun to examine what the combination of the two new technologies could bring to the effort to secure cyberspace.

Both AI and quantum have been the subject of a lot of hype, venture-capital investment, and fearmongering about an arms race of sorts with China. So, in this and the next chapter we are going to explore these two new computer science technologies and specifically what they mean to cyber war.

The Reality of the Artificial

When Vladimir Putin told an audience of Russian students in September 2017, “Whoever becomes the leader [in AI] will become the ruler of the world,” he sounded a bit like a pseudo-technologist McKinsey consultant. China’s President Xi seems to agree with Putin, however, because China has set a national goal of being the dominant country in AI by 2030. These and other statements kicked off a round of punditry focused on the new “arms race in AI,” and what America must do to win it.

In the nonmilitary, nonsecurity arena, America seems to be doing quite well deploying AI. It is already widely in use in fields as diverse as banking and finance, logistics, advertising, and even medicine. To see what AI means for cybersecurity, we are going to take a bit of a digression to discuss the AI field in general, beginning by defining what we mean by the term. It is often the case in the field of information technology generally, and in the subset of IT that is cybersecurity specifically, that terms are thrown around loosely and commonly accepted definitions are hard to find.

Buzzword bingo is a common parlor game in cybersecurity. At least since 2012 at cybersecurity conferences, such as the huge RSA convention, the term “AI” has been used with wanton abandon and imprecision. AI, it would seem, is like bacon: it makes everything taste better. Thus, it is now alleged to be incorporated in many cybersecurity products. The frequent use and misuse of the term “AI,” and especially of the subsets of AI, can be confusing, especially to policy wonks.

Historians point to the 1956 Dartmouth Summer Research Project on Artificial Intelligence as the birthplace of AI, a meeting at which computer scientists agreed that it could be possible someday to have computational machines do things that hitherto had only been done by humans. AI was, thus, originally meant to be the simulation by machines of certain human cerebral activity. Many of the current uses of AI are still, in fact, attempts to have machines do things that humans do. What is important for us, however, is that AI has moved on to do things that no individual human could do, indeed what even groups of highly trained humans could not reliably do in any reasonable amount of time.

Using AI, machines can now have meaningful visual capacity, so-called computer vision. They can see things by translating images into code and classifying or identifying what appears in the image or video. Cars can now see other cars, view and understand certain traffic signs, and use the knowledge they gain from their visual capacity to make and implement decisions such as braking to avoid an accident. AI can “see” someone doing something on a digital video feed and recognize that the action requires an alarm: a package has been left unattended on a train platform, alert a guard. AI can “see” a face and, perhaps, recognize and associate it with a name, and maybe even correlate it with a police be-on-the-lookout (BOLO) notice. These types of facial recognition systems have been deployed in concert with CCTV systems in China on a massive scale in order to apprehend wanted persons with unparalleled speed.

Of course, many life-forms have visual capacity, so AI is not doing something uniquely human when it sees things and reacts to them, but humans were unique in their ability to engage in speech and conversation with other humans, until AI. Now machines can speak, not merely playing back recorded messages, but thinking about what it is appropriate to say in a context and then doing so. Moreover, they can then react to what is spoken back at them by a human, and sometimes, within the limits of their programming, even do so with an appropriate and humanlike response. (If you are thinking about Siri and its limitations, be assured that there are far more powerful programs working today in research labs.)

AI is also being used to allow machines to walk and perform other movements, identifying what is an obstacle and determining what to do to get around it. Machines created by Boston Dynamics have demonstrated remarkable dexterity using AI programs to guide their decision making as they traverse real-world obstacles outside the laboratory.

The field of AI gained greatest acceptance in the corporate world when it began processing the sea of data that the rest of information technology was producing. The subset of AI that is data mining proved that software could far exceed human capabilities. It could sift through great volumes of data in seconds to find what a large team of humans would have needed weeks to find. Moreover, it could read and correlate data from multiple databases, each formatted in a different way. Advanced forms of data mining could look at and pull information from both structured data, such as organized databases or Microsoft Excel spreadsheets, and unstructured data, such as a photograph, audio recording, or text document.

The smarter cousin of data mining and the most powerful type of AI is machine learning (ML). ML applications began with categorization. Computer scientists input data with labels assigning inputs to one category or another: this is a dog, this is a cat. The software “learned” what the distinctions were in the data that led to the labels, and was then able to do the sorting itself without them. This form of ML that requires training the application is called supervised learning. Other ML applications can sift through oceans of data and detect commonalities and patterns without being told what to look for. This unsupervised learning actually does somewhat simulate human learning and thought, most of which comes down to noting differences and patterns.
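
To make the distinction concrete, below is a minimal sketch in Python using the open-source scikit-learn library; the tiny “weight and ear length” numbers are invented purely for illustration. The first model is handed labels and learns to reproduce them (supervised); the second is handed the same data with no labels and finds the groupings on its own (unsupervised).

    # A minimal sketch of supervised vs. unsupervised learning with scikit-learn.
    # The (weight in kilograms, ear length in centimeters) values are invented.
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.cluster import KMeans

    # Supervised learning: every example arrives with a label ("cat" or "dog"),
    # and the model learns the distinctions in the data that lead to the labels.
    features = [[4.0, 7.5], [4.5, 8.0], [25.0, 12.0], [30.0, 11.0]]
    labels = ["cat", "cat", "dog", "dog"]
    classifier = DecisionTreeClassifier().fit(features, labels)
    print(classifier.predict([[5.0, 7.8]]))   # -> ['cat']

    # Unsupervised learning: the same data with no labels at all.
    # KMeans detects two natural groupings without being told what to look for.
    clusterer = KMeans(n_clusters=2, n_init=10).fit(features)
    print(clusterer.labels_)                  # e.g. [0 0 1 1]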

Within ML (got those acronyms now?) are two other subsets, deeper levels, literally, of machine learning. The first is what is called an artificial neural network. ANNs somewhat simulate in software design the way human cerebral wiring is structured, the way neurons interact with other neurons to send messages with charges of varying strength to cause thoughts within the human brain or to cause body parts to take actions. ANNs can adjust their own “wiring” and weighting based upon patterns in data, in order to improve themselves and their predictive abilities. For instance, an ANN that classifies pictures of cats or dogs will, over time, learn the distinguishing factors of the two types of animals and adjust its “wiring” so that its cat or dog predictions are more accurate. The second type of ML you will also hear about is deep learning, which is in turn a type of ANN that uses multiple layers of “neurons” to analyze data, allowing it to perform very complex analysis. Enough with the definitions. What can AI do defensively or offensively in security and warfare?
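
First, though, a concrete look at the idea just described: the sketch below, again in Python with scikit-learn, builds a small network with two hidden layers of artificial “neurons” and trains it on a handful of invented, labeled examples. Training adjusts the connection weights until the predictions improve, which is the essence of an ANN; deep learning simply stacks many more layers.

    # A minimal sketch of an artificial neural network (ANN).
    # hidden_layer_sizes=(8, 8) stacks two small layers of "neurons";
    # deep learning takes this stacking much further.
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Invented two-feature examples labeled 0 ("cat") or 1 ("dog").
    X = [[4.0, 7.5], [4.5, 8.0], [5.0, 7.0], [25.0, 12.0], [30.0, 11.0], [28.0, 13.0]]
    y = [0, 0, 0, 1, 1, 1]

    # Training repeatedly adjusts the network's connection weights ("wiring")
    # so that its predictions on the labeled data become more accurate.
    net = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(8, 8), max_iter=2000, random_state=0),
    )
    net.fit(X, y)
    print(net.predict([[4.8, 7.6], [27.0, 12.5]]))   # likely [0 1]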

Artificially Intelligent About Security? AI/ML for the Defense

On a large corporate network today, there are between three and six dozen separate cybersecurity software applications in use, each contributing to the security of the overall network in a specific capacity. Pity the poor chief information security officer, who has to make the best-of-breed selection for each of those dozens of tools and then integrate them. She will likely select products from more than twenty different vendor companies. In the last three years, many of those vendors have claimed that they have woven AI into their products, and in fact many have done so to one degree or another. This trend toward AI-enabled, single-function, or “one-trick” security applications is one of the reasons that the balance is shifting away from offense and toward defense.

The most widely deployed cybersecurity products incorporating AI today are endpoint protection systems. In fact, this kind of software has begun to replace the traditional antivirus software packages, which were the first cybersecurity products in widespread use beginning thirty years ago. Unlike traditional antivirus or intrusion detection systems, which check network packets against “known bad” signatures (a blacklist approach), the endpoint protection systems using AI ask whether the user is trying to do something they have never done before. Is the activity unneeded or even unauthorized for the user’s role? Would the activity being attempted damage the network or security safeguards on the network? These products are learning patterns, rather than applying blacklists, and modifying their behavior as they learn. This software learns not just from what it sees on its own endpoint, and not just from what happens on other endpoints on the network, but, in a classic example of Metcalfe’s law, from every endpoint on every network on which it is deployed.
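
As a rough illustration of the difference between the blacklist approach and the learned-behavior approach, here is a minimal sketch, again in Python with scikit-learn; the file hash and the activity numbers are invented. A blacklist can only catch what it has already seen, while the learned model flags behavior that departs from the endpoint’s normal pattern.

    # A minimal sketch contrasting a blacklist with learned behavior.
    # The hash and the (files touched per minute, bytes sent) values are invented.
    from sklearn.ensemble import IsolationForest

    # Traditional approach: compare against a list of "known bad" signatures.
    KNOWN_BAD_HASHES = {"aaaabbbbccccdddd"}   # hypothetical signature
    def blacklist_check(file_hash):
        return file_hash in KNOWN_BAD_HASHES

    # Behavioral approach: learn what normal activity on this endpoint looks
    # like, then flag anything the user has never done before.
    normal_activity = [[3, 200], [4, 180], [2, 220], [5, 250], [3, 190]]
    model = IsolationForest(contamination=0.1, random_state=0).fit(normal_activity)

    suspicious = [[900, 50_000]]              # sudden mass file access and upload
    print(blacklist_check("not-on-the-list")) # -> False: the blacklist misses it
    print(model.predict(suspicious))          # -> [-1], i.e., anomalous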

A second widespread use of AI today is in applications known as vulnerability managers. AI can intake machine-readable intelligence reports on new threats and can automatically prioritize those threats based upon what it already knows or can quickly find out about your network. For example, an AI-driven vulnerability manager could check to see whether you have already addressed the weakness the new attack exploits. Have you installed a patch from the software manufacturer, or is the newly reported exploit attacking something you have not yet fixed on your network?
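
A stripped-down sketch of that prioritization logic might look like the following; the CVE identifiers, severity scores, and host inventory are all hypothetical.

    # A minimal sketch of automated threat prioritization: cross-reference a
    # machine-readable threat feed against what is already patched on the
    # network. All CVE identifiers and hosts below are hypothetical.
    incoming_threats = [
        {"cve": "CVE-2025-0001", "severity": 9.8},
        {"cve": "CVE-2025-0002", "severity": 7.5},
    ]

    host_patches = {
        "web-server-01": {"CVE-2025-0001"},   # this weakness is already fixed here
        "file-server-02": set(),              # nothing patched yet
    }

    # Raise the loudest alarm for exploits that still have somewhere to land.
    for threat in sorted(incoming_threats, key=lambda t: -t["severity"]):
        exposed = [host for host, patched in host_patches.items()
                   if threat["cve"] not in patched]
        if exposed:
            print(f"{threat['cve']} (severity {threat['severity']}): exposed on {exposed}")
        else:
            print(f"{threat['cve']}: already mitigated everywhere")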

A third use of AI/ML is in cybersecurity software products known as identity and access management (IAM) and privileged access management (PAM). The software determines if the user is who they claim to be by checking multiple databases simultaneously. The AI would ask: Where physically in the real world is the user? Did the user already badge out of the building? Does travel data show that the user is not in the building, but actually in London today? Is the user originating on the appropriate computer? Is the user accessing applications and databases they normally use? Is the user attempting to do something with the data that is unusual, such as encrypting it, compressing it, downloading it, or transmitting it to an inappropriate destination? Are the mouse movements and stylistic keyboard usage of the actor consistent with their previous patterns? Or is the user acting too quickly and smoothly, like a bot?

The answer to those questions is not binary; rather, each of them is likely to produce a score reflecting the confidence that the AI has in that answer, based upon what data is available and how accurate it has been in the past. The combination of the weighted scores from the several questions will result in a decision: the user will be allowed in, kept out, permitted only to perform limited functions, asked for further proof of identity, or placed under ongoing observation.
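
A greatly simplified sketch of that weighting, with invented signal names, confidence values, weights, and thresholds, might look like this:

    # A minimal sketch of combining weighted confidence scores into an access
    # decision. The signal names, confidence values, weights, and thresholds
    # are invented; a real IAM/PAM product would learn and tune them from data.
    signals = {
        "badge_location_matches": 0.9,   # user badged in, no conflicting travel data
        "usual_device":           0.8,
        "usual_applications":     0.7,
        "typing_pattern_matches": 0.4,   # keystroke dynamics look unfamiliar
    }
    weights = {
        "badge_location_matches": 0.35,
        "usual_device":           0.25,
        "usual_applications":     0.20,
        "typing_pattern_matches": 0.20,
    }

    score = sum(signals[name] * weights[name] for name in signals)

    if score >= 0.85:
        decision = "allow"
    elif score >= 0.60:
        decision = "allow limited functions and ask for further proof of identity"
    else:
        decision = "block and place under observation"

    print(f"confidence score {score:.2f}: {decision}")   # middle band: limited access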

When conventional security software does detect possible malicious activity, it sends a message to a security-monitoring console being watched by a human. Unfortunately, today in most corporate or government networks, alarm messages are coming into security monitoring systems at such a high rate that triage must be performed by the human(s) in a security operations center, or SOC.

The humans might get distracted in ways any of us do at work, eating, nodding off, visiting a bathroom, or texting with a partner. AI, however, is always on, continuously at the same level of attentiveness. AI could prioritize threats for itself, or for the humans in the SOC. If trusted and authorized by the network owner to do so, AI could take action to block malicious activity, or to delay execution of a dubious command until a responsible human can review it.

Thus, there was a lot of hope for the new companies that rushed into the market claiming to be using AI to stop cyberattacks. When NSA Director Keith Alexander retired from the military, he quickly started an AI cybersecurity firm, IronNet, and gained more than $100 million in backing. A British firm, Darktrace, claimed its ML software could learn about an attack while it was ongoing and alert network operators. Darktrace soon became a “unicorn,” a company worth more than a billion dollars. Critics of IronNet and Darktrace claimed that the AI actually still needed a lot of human assistance.

Indeed, no cybersecurity firm has at this time deployed what we envision as the full promise of AI/ML, what we call the Network Master, the one AI to rule them all. A Network Master AI/ML program could correlate billions of actions on tens of thousands of machines over weeks of activity, drawing on all the data logs from the dozens of security software programs and network management systems. Within milliseconds of coming to a conclusion based on correlating scores of diverse data sources, the Network Master could quarantine suspicious activity, create honeypots, modify firewall rules, increase authentication requirements, isolate subnets, or even in an extreme emergency disconnect the corporate network from the internet. Such capacity does not yet exist.

If all of that sounds like AI could give the defender the decisive upper hand in dealing with cyberattacks, the reality is that we are not yet there. Yes, some previously existing one-trick security applications like those mentioned above have added AI/ML techniques and are now better at some specific tasks. However, the ML that runs the network, spotting and swatting attacks that no human would detect, is not yet a reality for a variety of technical and procedural reasons.

For example, AI/ML works best when it has a lot of data to analyze, particularly when that data is in a variety of databases and formats. Then AI/ML can do what humans manifestly can’t do, quickly cross-correlate billions of seemingly unrelated pieces of information to infer a conclusion. For most AI/ML programs to work well, that data all needs to be swimming in the same place, in what big corporations call their data lake. It seldom is. It’s scattered. Or sometimes it is not even collected or stored, or not stored for very long, or not stored in the right format. Then it has to be converted into a usable format through what data scientists politely call “manicuring” (and behind the scenes call “data mangling”) for the AI/ML engine to perform its work.
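
A toy example of that conversion step: two security tools record the same event in different formats, and both records have to be reshaped into one common schema before any cross-correlation can happen. The field names, addresses, and log lines below are invented.

    # A minimal sketch of "manicuring" security data into a common schema.
    # The log formats, field names, and addresses are invented.
    import json
    from datetime import datetime, timezone

    firewall_line = "1723305600,10.0.0.5,198.51.100.7,DENY"   # CSV with epoch time
    endpoint_line = '{"ts": "2024-08-10T16:00:00Z", "host": "10.0.0.5", "action": "blocked"}'

    def normalize_firewall(line):
        epoch, src, dst, action = line.split(",")
        return {"time": datetime.fromtimestamp(int(epoch), tz=timezone.utc),
                "host": src, "event": action.lower()}

    def normalize_endpoint(line):
        record = json.loads(line)
        return {"time": datetime.fromisoformat(record["ts"].replace("Z", "+00:00")),
                "host": record["host"], "event": record["action"]}

    # Only once every record shares the same fields can an AI/ML engine
    # cross-correlate them in one data lake.
    lake = [normalize_firewall(firewall_line), normalize_endpoint(endpoint_line)]
    for record in lake:
        print(record)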

Capturing all the data and storing it for six weeks or more to catch the “low-and-slow” attacks (ones that take each step in the attack days or weeks apart so as not to be noticed) would be a very expensive proposition for any company. Only the wealthy and highly risk-averse corporations would do that. In our discussions with network operators at major corporations, we have found that very few have such vast and complete data lakes readily accessible.

Most network operators do not yet trust AI/ML programs to wander around their live databases. Maybe their fears are irrational, but they are real. How do they know if that semi-sentient ML program is going to morph its software and then do something it has not done before, like wipe software off a server or disconnect a key router from the network? While such nightmares have never happened, there is always a first time, and what paranoid network security officer is going to take that risk?

If, however, instead of allowing the AI/ML access to your live databases, you only permit it to rummage through a near-real-time copy, that may mean that you have to replicate your vast data lake at considerable expense by adding many physical or cloud servers. If you do that, the AI/ML cannot really tell you about a live security event, only about something that happened before you mirrored the databases. Even then, while ML programs are learning, they often return so many false-positive alarms that they actually make the monitoring problem worse for all the humans watching panes of glass in the SOC.

For a real Network Master to be created, two conditions are necessary beyond the enormous software challenges. First, someone or some group would have to be willing to fund the development. Most venture-capital or private-equity firms would not normally be willing to place a bet on that kind of development. It will likely require a well-heeled tech company willing to fund a “moon shot,” or possibly a government agency willing to sponsor and fund it. Second, after the AI/ML had been proven on simulated networks, it would require a large network whose owner-operator was willing to be the first mover to employ it on a live network. It is difficult to imagine who that might be.

If (and as you can see it is a big IF) those problems can be overcome, AI/ML could make it very difficult for today’s attackers to succeed. Meanwhile, however, today’s bad guys are not standing still. They are also playing around with AI/ML. Of course they are.

Weaponizing Nonhuman Sentience? AI/ML on the Offense

When most people think about AI, they’re not thinking about cybersecurity. They’re thinking about killer robots. Given all of the existing applications of AI that are in everyday use behind the scenes doing good things (such as instantly making a decision on whether a credit-card purchase made online is actually fraud), it is disappointing to data scientists that when (and if) the average citizen thinks about artificial intelligence, the Terminator comes to mind. Many people have already internalized the idea of weaponized AI, even though the kind of thing they fear does not really exist except in some experimental Russian programs.

In the field of kinetic war, the Russians, the Chinese, and the Pentagon do seem to have a fascination with semi-sentient swarms of fire-and-forget drones that could fly around, talk to one another, decide what things look like targets, divide up which drone would go after what target, and so forth. Think of a bunch of smart, angry hornets, then give them explosives. A DoD directive restricts autonomous weapons that would use AI to determine on their own whether something or someone should be attacked: there must still be a human in the loop authorizing a weapon to attempt a kill. While that is mildly comforting, with a few keystrokes, autonomous killing AI software not used in peacetime might be activated on a drone or missile in wartime. In a shooting war with a near peer, it would be hard to resist such a tool, especially if the opponent were the first to use autonomous AI weapons to find, swarm, and kill.

All of those fears involve drones and kinetic kills in the physical world in the future, but for cybersecurity experts, the reality is more immediate. The same kind of defensive AI/ML software we discussed above can be used in a slightly modified way to perform a cyberattack on a network.

Cyber criminals and malicious actors are no longer, if they ever were, the archetypical acne-plagued boys alone in their parents’ basement hacking for fun and profit on a Red Bull–induced, pizza-fueled jag. Malicious actors in cyberspace today can be military or intelligence officers or well-funded criminals. Both types are staffed and supported by computer science graduates with advanced degrees and AI/ML proficiency. The criminals can afford the highly qualified help because cybercrime pays, and it pays well.

Using AI/ML on the offensive is not just theoretical. In August 2016, DARPA, the Defense Advanced Research Projects Agency, commissioned six teams from universities to develop attack AI programs to attempt to break into and steal information from a highly defended network, with no human in the loop after the attacks were launched. The teams convened at the Paris Hotel and Casino in Las Vegas. So, back to our visit to Vegas.

After DARPA officials explained the competition to a select group of observers, and us, we all walked into a vast event space, with six stacks of servers on a stage. On signal, the competing teams launched their attack machines and then walked away. For hours there was no human participating in or overseeing their activity. The AI/ML software programs scanned the defended network for faults and defenses, learning and trying tools and techniques to break in until they succeeded, then climbing up through layers of protection software, acting on their own, until they captured the flag (the data they sought) and exfiltrated it back to their own computers.

If you had been there expecting to see something dramatic happen, the DARPA event was somewhat anticlimactic. The servers just sat there and blinked at us. Their fans whirred softly. For hours. Finally, the DARPA judges announced that the attack AI software created by the team from Carnegie Mellon University had broken through all the defenses and extracted the target data. We celebrated their win at one of the casino bars and concluded there were not many networks that could have successfully defended against that CMU attack bot. We said to each other then that we had just seen the future. Well, that future is now.

Computer scientists at Cornell University showed in a 2017 paper that one could use a “generative adversarial network” to fool software defending against attacks; in other words, software that generates an attack AI perfectly suited to beat the defensive software it is up against. At the Black Hat cybersecurity conference in 2018, also in Las Vegas, an IBM team demonstrated an AI attack program called DeepLocker. As they described it, DeepLocker is highly targeted and evasive malware powered by AI, trained to reason about its environment and able to unleash its malicious behavior only when it recognizes its target. DeepLocker learns to recognize a specific target, concealing its attack payload in benign carrier applications until the intended target is identified.

In other words, like the fire-and-forget killer drones that would fly around until they saw something that looked like the kind of target they were supposed to blow up, DeepLocker would scan the internet looking for the kinds of networks it was supposed to attack. DeepLocker would do so in a camouflaged manner, perhaps while looking like a legitimate service used by the internet service provider. Then, DeepLocker would adaptively use multiple attack tools and techniques, learning about the defenses in use, until it got in.

It occurred to us that what IBM was showing off to the public in Vegas was probably something like what the United States and some other governments had already come up with on their own, and probably already used. Indicators of what may be happening behind closed doors often come from what cybersecurity experts not involved in classified programs are discussing publicly.

For example, an expert at the cybersecurity firm Endgame has publicly demonstrated how offensive AI can be used to “poison” defensive AI software, essentially fooling the defensive technology during the learning phase of ML. Think of it this way: ML could be fooled by creating a flood of false-positive alarms, which could cause the detection system to disregard a type of attack. Then the real attack, looking sufficiently like the false positives, could be launched successfully.

Alternatively, defensive systems could be attacked repeatedly and, after each time the attack was defeated, AI could alter the attack a little and try again. In that way, the attacker could persistently change its signature just enough so that it no longer matched what the defense was looking to block, but it would have sufficient functionality for the attack package to remain effective. Hackers have been doing this manually, testing their attack tools against antivirus software, but AI/ML would do this so much faster and more effectively.

Hackers have a big data problem too: lots of personally identifiable information stolen from hundreds of companies. An expert at McAfee has publicly discussed how AI could be used to plow through the troves of data that hackers have already stolen from numerous databases. Data in only one database may not be sufficient for the hacker to successfully impersonate you, but by using AI, they could scan multiple databases they had hacked and compile a sufficient amount of information to successfully impersonate your online identity. Maybe they have discovered your password on one website, and then they use the same password successfully to get into a secure network, pretending to be you, because you made the mistake of using the same password on multiple applications. You don’t do that, do you?

Neglected Defensive Potential

In the next three to five years, we are likely to see continued growth in the use of AI/ML as a part of applications that will do specific defensive tasks better than they are now being done. AI/ML use will become more sophisticated in one-trick security applications doing identity and access management, privileged access management, endpoint protection, and vulnerability scanning. Some vendors will incorporate the technology into their defensive applications more successfully than others. Already companies like Illumio are beginning to apply AI/ML to assist in network management orchestration. Some network operators will buy the better products, others will fall for flawed defensive applications, and some will neglect to buy cybersecurity products incorporating AI/ML technology altogether. Many cyber vendors will claim that they have an AI/ML application that will do everything, but they will be, to put it kindly, exaggerating.

The Network Master controller AI/ML for cybersecurity is unlikely to emerge soon for the reasons we have discussed. A time frame of three to six years from now is more likely. On the other hand, AI/ML for the offense, something like the weaponization of the 2016 DARPA Grand Challenge, is probably already in use to some degree and is likely to grow steadily in its application by a small set of nation-state actors. If recent history is any guide, such technology will eventually make its way into the hands of the second-tier nations and to nonstate actors. This second tier of players may not use AI/ML attack tools in as sophisticated ways as the U.S. and Chinese governments will, but AI/ML will still make them much more capable of successful attacks than they are currently.

On balance, in the near term the growing use of AI/ML in defensive software is likely to increase the ability of the defenders, unless they are being attacked by a sophisticated and determined state actor using AI/ML offensively.