This chapter examines an important and exciting new area of security innovation within Cisco Digital Network Architecture—namely, Encrypted Traffic Analytics (ETA).
Cisco ETA provides two very important capabilities for enterprises wishing to enhance their security posture, as depicted in Figure 22-1:
Encrypted malware detection: Today, more threats than ever that lurk on the enterprise network are leveraging encryption in an effort to avoid detection. ETA provides the capability to distinguish malicious from benign flows in encrypted traffic streams, without having to decrypt the traffic flows.
Cryptographic compliance: Many enterprises have policies regarding how encryption is supposed to be deployed within their organizations. However, assessing how cryptography is actually rolled out in applications running across the network is difficult. ETA addresses this shortcoming and allows enterprises to more easily audit their deployment of encryption policies.
This chapter explores these capabilities in detail by
Defining the challenges that ETA helps to solve
Outlining how ETA functions
Explaining the benefits that ETA provides
When examining encrypted traffic to determine whether it contains malware, several important considerations apply. End-to-end confidentiality must be preserved (thus, no decryption of data can occur at a midpoint in the network). The integrity of the confidential encrypted channel must be maintained. The approach must also be able to adapt as encryption standards themselves change, as they do from time to time. When examining cryptographic compliance, it is necessary to determine how much of the organization’s digital business is using strong encryption, to audit for Transport Layer Security (TLS) policy violations, and to passively detect possible cipher suite vulnerabilities as these are identified. All of these are capabilities that ETA provides and areas of operation that it supports.
In short, there is a lot to what ETA offers. Let’s dive in to examine the issue of malware detection in encrypted traffic flows, and see how ETA provides a powerful, innovative new solution to this critical requirement for enterprise network security.
In today’s network, many types of malware lurk. These malicious applications are looking to perform any number of deleterious actions, including (but not limited to) the following:
Encrypting the hard disk storage of a user or server, then demanding a ransom to unlock and recover the encrypted data (for which only the malware owner holds the key)
Wiping the hard disk storage of a user or server, permanently deleting the data and causing havoc
Executing a denial of service (DoS) attack and crippling an organization’s network infrastructure, possibly leveraging a distributed set of internal or external compromised resources (botnets)
Surreptitiously exfiltrating confidential data from an organization, and subsequently leveraging this data for competitive advantage, financial gain, or invasion of privacy
Of course, this only scratches the surface of what malicious software, deployed across the network, is capable of. However, even this short list of possible malicious activities is enough to keep a network manager up at night, considering the various ways that such malicious activities impact any organization, public or private.
A troubling trend in recent times, as outlined in Figure 22-2, is that many such malware threats are beginning to leverage encryption as a way to hide their activities—and even their presence—within enterprise networks.
Many organizations are concerned that such threats exist, and are growing in terms of both scale and potential impact to the organization. And yet, what can be done to address such threats? If the malware is encrypted, how can you spot it?
Disabling encryption across the enterprise network is not a viable response. Many, if not most, legitimate enterprise applications use encryption for their own purposes, such as to protect confidential data in flight and to provide privacy and authentication for users and things, and their associated data flows.
And yet, encryption is encryption. Its purpose is to hide data by effectively scrambling it—making data appear as noise. Unfortunately, the same encryption methods that protect sensitive data in transit for legitimate purposes are also leveraged by malicious actors to ensure that traditional methods of malware detection are rendered inoperable.
So how do you solve this problem? How can you tell “good noise” (benign encrypted applications) from “bad noise” (encrypted malware) within the network?
Until recently, this was considered an impossible problem to solve. However, the efforts of a few key Cisco engineers began the movement toward solving this critical problem—a solution that ultimately became the Cisco Encrypted Traffic Analytics capability.
An important contribution was a paper submitted to the Association for Computing Machinery (ACM) in 2016 by two Cisco engineers, Blake Anderson and David McGrew, entitled “Identifying Encrypted Malware Traffic with Contextual Flow Data.”1 This paper outlined methods that could be used to determine which flows in a network might be malicious in intent and which are benign, even when both types of flows are encrypted, and without having to decrypt any of the data.
1 https://dl.acm.org/citation.cfm?id=2996768
Expanding on this idea, Cisco crafted the Encrypted Traffic Analytics solution. Let’s explore how the solution works and the benefits it provides.
To begin with, let’s tackle the obvious question up front: Why not just decrypt the traffic in the network and run traditional methods of malware detection on the then-decrypted flows?
Two major challenges render this option impractical.
The first challenge is performance and resource requirements. Encryption of data is computationally demanding to begin with, and decryption of data without access to the encryption key is massively difficult and time consuming, and requires extensive amounts of computing power. This makes such an approach completely impractical at the network edge (the place closest to the user, where you would typically like to detect this issue first), and very expensive and difficult at any place in the network where high data rates are involved, as they almost always are in networking.
The second challenge is perhaps more subtle, but no less real. Users, devices, and applications leverage encryption for legitimate purposes, to encode data and protect confidential assets as well as individual privacy. Midstream decryption of data to search for encrypted malware exposes all of the other benign encrypted applications involved to an additional risk of compromise of their legitimate data, and may be prohibited by corporate policies or legal protections for such data flows.
But without decrypting the data in flight, you are back to the same problem. How can you tell the difference between legitimate encrypted data flows and encrypted malware? How can you tell “good noise” from “bad noise”?
Enter two key capabilities of Encrypted Traffic Analytics: Initial Data Packet (IDP) inspection, and examination of the Sequence of Packet Lengths and Times (SPLT). Each of these is examined in turn in the following text.
Encrypted Traffic Analytics takes advantage of the fact that the setup of any new encrypted flow always includes a few unencrypted packets as the encrypted session is established.
By inspecting these Initial Data Packets (IDP), it is possible to determine, with a fairly high degree of accuracy, many important characteristics of the flow. These include (but are not limited to) the application or browser types in use, the TLS versions being used, the cipher suites being requested and agreed upon by the two endpoints involved, whether a self-signed certificate or a certificate signed by a known root certificate authority is being employed, and many other aspects of the subsequent encrypted session that is being set up. This is illustrated in Figure 22-3.
By examining these IDPs in the cryptographic exchange, a set of reference points can be determined to begin to assist with the identification of the encrypted flow that follows. By itself, this is not indicative of the presence of malware—however, when used in conjunction with additional data, as you will see, identification of malicious encrypted flows versus benign flows begins to emerge. Let’s examine this further to see how this is accomplished.
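To make the idea of reading unencrypted setup packets more concrete, the following is a minimal sketch of pulling two fields (the client's protocol version and the list of offered cipher suites) out of a TLS ClientHello. It is an illustration only and not how the switch implements IDP collection. The function names are hypothetical, the sample message is constructed in the code itself rather than captured from a real flow, and the parser assumes the whole ClientHello fits in a single TLS record.

```python
import struct


def parse_client_hello(record: bytes) -> dict:
    """Extract unencrypted fields from a single TLS ClientHello record."""
    # TLS record header: content type (1 byte), version (2), length (2).
    content_type, _record_version, record_len = struct.unpack("!BHH", record[:5])
    if content_type != 22:  # 22 = handshake
        raise ValueError("not a TLS handshake record")
    body = record[5:5 + record_len]
    if body[0] != 1:  # handshake message type 1 = ClientHello
        raise ValueError("not a ClientHello")
    # body[1:4] is the 3-byte handshake length; the hello fields start at 4.
    pos = 4
    (client_version,) = struct.unpack("!H", body[pos:pos + 2])
    pos += 2 + 32  # skip the version field and the 32-byte random
    session_id_len = body[pos]
    pos += 1 + session_id_len
    (suites_len,) = struct.unpack("!H", body[pos:pos + 2])
    pos += 2
    suites = [
        struct.unpack("!H", body[i:i + 2])[0]
        for i in range(pos, pos + suites_len, 2)
    ]
    return {"client_version": client_version, "cipher_suites": suites}


def build_sample_client_hello() -> bytes:
    """Build a tiny synthetic ClientHello (no extensions) for the example."""
    suites = [0xC02F, 0x009C, 0x0005]  # 0x0005 is an RC4-based suite
    hello = (
        struct.pack("!H", 0x0303)        # client version field: TLS 1.2
        + bytes(32)                      # random, zeroed for the example
        + b"\x00"                        # empty session ID
        + struct.pack("!H", 2 * len(suites))
        + b"".join(struct.pack("!H", s) for s in suites)
        + b"\x01\x00"                    # one compression method: null
    )
    handshake = b"\x01" + len(hello).to_bytes(3, "big") + hello
    return b"\x16" + struct.pack("!HH", 0x0301, len(handshake)) + handshake


info = parse_client_hello(build_sample_client_hello())
```

Fields like these are visible to any observer on the path because they are exchanged before encryption begins, which is why they can serve as reference points without touching the protected payload.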
By combining the data extracted from observing the IDP exchange (which takes place within the first several packets of a given encrypted flow) with an examination of the subsequent packets in a flow—specifically, the sequence of encrypted packets, their length in terms of bytes, and the spacing of the packets in terms of time delays and inter-packet gaps—it is possible to further “fingerprint” the flows, identifying the patterns that begin to emerge that can signal the presence of encrypted malware.
The use of SPLT (Sequence of Packet Lengths and Times) is outlined in Figure 22-4.
The combination of IDP and SPLT data provides sufficient indicators for ETA to detect encrypted malware flows and differentiate them from benign encrypted flows. Effectively, this provides a “fingerprint” for the encrypted flow within the network. When combined with knowledge of many thousands of encrypted flow types supplied via a cloud-based database, this fingerprint allows a highly accurate determination to be made as to whether a given encrypted flow contains malware, all without decrypting the traffic flows involved.
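As a rough illustration of what an SPLT observation might look like as data, the sketch below reduces the first few packets of a flow to two parallel lists: packet lengths and inter-packet gaps. The `Packet` class, the `splt` function, the cutoff of ten packets, and the convention of using a negative length for server-to-client packets are all assumptions made for this example; they are not taken from the ETA record format.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    timestamp: float  # arrival time in seconds
    length: int       # payload length in bytes
    outbound: bool    # True for client-to-server, False for server-to-client


def splt(packets, max_packets=10):
    """Return (signed_lengths, gaps_ms) for the first packets of a flow.

    Lengths are positive for client-to-server packets and negative for
    server-to-client packets. Each gap is the time in milliseconds since
    the previous packet; the first packet has a gap of 0.
    """
    first = sorted(packets, key=lambda p: p.timestamp)[:max_packets]
    if not first:
        return [], []
    lengths = [p.length if p.outbound else -p.length for p in first]
    gaps = [0] + [
        round((later.timestamp - earlier.timestamp) * 1000)
        for earlier, later in zip(first, first[1:])
    ]
    return lengths, gaps


# Synthetic flow: a request, two large responses, then a later small request.
flow = [
    Packet(0.000, 517, True),
    Packet(0.020, 1400, False),
    Packet(0.025, 1400, False),
    Packet(0.500, 120, True),
]
lengths, gaps = splt(flow)
```

Note that nothing in this representation depends on the packet contents, only on sizes, directions, and timing, all of which remain observable when the payload is encrypted.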
Sounds intriguing? Let’s review an example of how this works to gain more insight into the operation of ETA.
To begin with, let’s examine an encrypted transaction over the network that is benign in nature. For this example, let’s review a user executing a Google search.
As a common practice, Google now encrypts all transactions using HTTPS, rendering them opaque to the network infrastructure, as well as to anyone in between the client using a web browser for the search and Google’s own servers. As you examine what this Google search looks like on the wire (again, without decrypting any of it), the pattern shown in Figure 22-5 emerges—essentially, a “digital fingerprint” of the Google search itself.
In the figure, the source for the Google search (the user) is shown on the left and the destination (a Google server) is shown on the right. The horizontal line running across the middle denotes the direction of traffic flow—everything above the line is user-to-server traffic and everything below the line is server-to-user traffic.
Initially, you see a certificate exchange (the line on the far left), followed quickly by a burst of data as Google serves up the initial Google search splash page. Then you see a number of small back-and-forth packet flows as the user types her query one character at a time, which is transmitted back in real time to Google’s servers, which attempt to provide auto-completion for the query as it is typed (every character typed by the user is a back-and-forth packet exchange with the Google server). Finally, the user submits her query, and a further burst of data signals the arrival of Google’s query results to the user.
You cannot decrypt any of the data—nor do you necessarily need to. Any Google query can differ in terms of what is looked for; however, they all produce a characteristic fingerprint that resembles the pattern shown in Figure 22-5, simply by the nature of how such a query operates. The data varies, and it is all encrypted and secure end to end, but the basic packet exchange pattern remains the same, and can be observed via ETA.
Now, compare this benign encrypted transaction to one that is more threatening.
For this example, let’s examine the fingerprint shown in Figure 22-6. This fingerprint is associated with a banking Trojan known as Bestafera. The purpose of this malware is to extract data from the user’s machine without the user’s knowledge or consent—possibly including confidential data related to the user’s banking account access or financial transactions.
In this fingerprint, notice several elements that may trigger suspicion. First, a self-signed certificate is in use. This is not necessarily incriminating in and of itself, but serves as a red flag that may warrant further inspection. Second, observe that almost no data is sent from the server to the user; instead, a large amount of data is very quickly extracted from the client to the server (this might include the user’s contact information or other important data, such as user keystrokes). Finally, after this data is exfiltrated, an ongoing command and control (C2) channel is established between the malicious actor’s remote server and the local client, possibly allowing further data to be exfiltrated in the future, or allowing other methods of compromise to be undertaken. Again, ETA observes and records all of this data at the switch level.
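The contrast between the two fingerprints can be expressed as simple features. The sketch below is a toy heuristic, not ETA's detection logic: it computes how much of a flow's volume travels from client to server and collects the two warning signs described above. The function names, the 0.8 threshold, and the synthetic packet-length sequences are assumptions chosen for illustration (positive lengths are client-to-server, negative are server-to-client).

```python
def direction_profile(signed_lengths):
    """Summarize how many bytes travel in each direction of a flow."""
    up = sum(n for n in signed_lengths if n > 0)
    down = -sum(n for n in signed_lengths if n < 0)
    total = up + down
    return {
        "bytes_up": up,
        "bytes_down": down,
        "upload_share": up / total if total else 0.0,
    }


def red_flags(signed_lengths, self_signed, upload_threshold=0.8):
    """Return the warning signs present; an empty list means none were seen."""
    flags = []
    if self_signed:
        flags.append("self-signed certificate")
    if direction_profile(signed_lengths)["upload_share"] > upload_threshold:
        flags.append("upload-heavy flow")
    return flags


# Synthetic examples: a search-like flow is dominated by server responses,
# while an exfiltration-like flow is dominated by client uploads.
search_like = [300, -1400, -1400, -1400, 80, -120, 85, -130, 400, -1400, -1400]
exfil_like = [300, -200, 1400, 1400, 1400, 1400, 60, -60]

search_flags = red_flags(search_like, self_signed=False)
exfil_flags = red_flags(exfil_like, self_signed=True)
```

As the text notes, no single flag is incriminating on its own; the value comes from combining such indicators with the rest of the flow data.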
After recording all of this data for a given encrypted flow on the Catalyst 9000 Series switch, the data is packaged up in a NetFlow record associated with that encrypted flow, and exported to the Stealthwatch system. Stealthwatch is a Cisco security management platform that, among its other uses, is used to receive these NetFlow records containing ETA information, extract the necessary data, and compare this against a cloud-based database of known malicious traffic flow fingerprints.
The use of NetFlow is key because this not only allows the ETA-specific data as noted (IDP and SPLT) to be extracted for a given flow, but also allows the complete IP information about the flow to be included. This in turn allows the destination IP addresses involved to be compared against known malicious hosts, or specific regions or countries, which may serve as higher risk points for malware. In addition, the ability to inspect the initial data packets for a given flow allows for certain fields commonly used by malware in self-signed certificates to be inspected and this data exported for analysis. By leveraging NetFlow to extract this data, the maximum amount of information possible from the switch can be provided to Stealthwatch for analysis.
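One way to picture the benefit of carrying IP information alongside the ETA data is a single record that holds both, so that a collector can check the destination address as well as the fingerprint. The sketch below is hypothetical: the `EtaFlowRecord` fields are not the actual NetFlow field layout, and the listed network is a documentation address range standing in for a real threat-intelligence feed.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class EtaFlowRecord:
    """Hypothetical container for one exported flow (not a real NetFlow layout)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    idp: bytes = b""                                   # initial data packet bytes
    splt_lengths: list = field(default_factory=list)   # packet lengths
    splt_gaps_ms: list = field(default_factory=list)   # inter-packet gaps


def destination_is_listed(record, listed_networks):
    """Return True if the record's destination falls inside any listed network."""
    addr = ipaddress.ip_address(record.dst_ip)
    return any(addr in ipaddress.ip_network(net) for net in listed_networks)


# Placeholder list using a reserved documentation range.
LISTED_NETWORKS = ["203.0.113.0/24"]

record = EtaFlowRecord(src_ip="10.1.1.20", dst_ip="203.0.113.25", dst_port=443)
listed = destination_is_listed(record, LISTED_NETWORKS)
```

Keeping the addressing and the ETA-specific fields in one record means the analysis does not have to correlate two separate data sources after the fact.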
In both cases examined here by ETA’s IDP and SPLT capabilities, none of the data is decrypted—nor does it need to be. When looked at from this perspective, the differences between the benign Google search flow and the Bestafera banking Trojan flow are starkly apparent. These differences serve as a foundational analytical component of the Cisco Encrypted Traffic Analytics solution.
It is worth noting that ETA, by its nature of comparing encrypted traffic fingerprints to a cloud-based database system, is inherently focused on traffic flows from inside the organization going to locations outside the organizational boundaries (typically, to locations on the Internet). In other words, ETA focuses on flows that can be broadly observed and for which a malicious traffic fingerprint database can be constructed and maintained, rather than monitoring encrypted flows of traffic within the organization itself.
In ETA, each element within the network has an important and distinct role to play.
First, the flow must be detected, and the traffic within it gathered for analysis. This is done in the Catalyst 9300 and 9400 switches, platforms that sit directly adjacent to the users within the network infrastructure. They are ideally placed to gather this data because they always see both directions of the traffic flow and are not subject to the asymmetrical routing that might otherwise interfere with the traffic analysis.
This is illustrated in Figure 22-7, which outlines the complete sequence of events for the ETA analysis of a flow.
Leveraging the UADP ASIC in the Catalyst 9300/9400 switch platforms, along with the powerful, multicore Intel CPU that these platforms also employ, Encrypted Traffic Analytics gathers the IDP and SPLT data for any new encrypted flow. This data is then exported via Flexible NetFlow up to the Cisco Stealthwatch collector. Again, the use of NetFlow from the access switch is key because this allows a number of important additional elements to be extracted concerning the flow, along with the critical elements of data gathered by ETA itself (IDP and SPLT).
In turn, Stealthwatch compares all of this data against Cisco’s cloud-based Cognitive Threat Analytics solution, which leverages an extensive data set from Cisco’s vast Talos security database. By applying machine-learning techniques to this supplied data, it is possible to identify malicious encrypted flows with a detection accuracy exceeding 99 percent and a very low false-positive rate.
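The text does not describe the models that Cognitive Threat Analytics uses, so the following is only a toy illustration of the general idea of turning flow features into a single score. It uses a logistic function over three features; the feature names, weights, and bias are invented for this example and carry no real-world meaning. An actual classifier would learn its parameters from large labeled data sets.

```python
import math

# Invented parameters for illustration only; a real model learns these.
HYPOTHETICAL_WEIGHTS = {
    "self_signed": 1.5,
    "upload_share": 3.0,
    "listed_destination": 2.5,
}
HYPOTHETICAL_BIAS = -4.0


def malware_score(features):
    """Map a feature dictionary to a score between 0 and 1 (logistic function)."""
    z = HYPOTHETICAL_BIAS + sum(
        weight * float(features.get(name, 0.0))
        for name, weight in HYPOTHETICAL_WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


benign = malware_score(
    {"self_signed": 0, "upload_share": 0.1, "listed_destination": 0}
)
suspicious = malware_score(
    {"self_signed": 1, "upload_share": 0.95, "listed_destination": 1}
)
```

The point of the sketch is that several weak indicators, none conclusive alone, can be combined into one value that separates the two kinds of flow.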
And so, ETA enables you to achieve what seemed impossible at first: without needing to decrypt anything, you are able to positively identify malicious traffic flows within the network, in close to real time, and differentiate them from benign encrypted flows.
Can you tell “good noise” from “bad noise”? It turns out that you can. By leveraging the unique and powerful capabilities of Cisco’s premier access switching platforms, and combining them with Cisco’s industry-leading cloud-based security capabilities and throwing in a dash of machine learning, Encrypted Traffic Analytics enables a new generation of sophisticated threat detection.
Stealthwatch analysis also enables organizations to monitor their compliance with cryptographic security policies by reporting even on benign flows via the Stealthwatch management console, and allowing network and security managers to be alerted (for example) if older, less-secure cipher suites are in use. Because encryption standards change periodically (with new ciphers being introduced and older, less-secure methods being deprecated), being able to passively audit the security protocols in use within the organization for non-malicious flows, and report on any deviations from the organization’s defined security best practices, is an important additional use case for ETA.
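A passive compliance audit of this kind can be sketched as a simple check of each observed flow against a policy. The example below is an assumption-laden illustration, not the Stealthwatch report logic: the policy (a minimum of TLS 1.2 and two disallowed cipher suites), the dictionary keys, and the sample flows are all hypothetical. The numeric values are the standard wire codes for the TLS versions and cipher suites named in the comments.

```python
from collections import Counter

# Hypothetical policy for illustration; each organization defines its own.
MIN_TLS_VERSION = 0x0303  # TLS 1.2 (0x0301 is TLS 1.0, 0x0302 is TLS 1.1)
DISALLOWED_SUITES = {
    0x0005: "TLS_RSA_WITH_RC4_128_SHA",
    0x000A: "TLS_RSA_WITH_3DES_EDE_CBC_SHA",
}


def audit(flows):
    """Count policy violations across observed flows.

    Each flow is a dict with the negotiated "tls_version" and
    "selected_suite" given as their numeric wire codes.
    """
    violations = Counter()
    for flow in flows:
        if flow["tls_version"] < MIN_TLS_VERSION:
            violations["TLS version below policy minimum"] += 1
        suite_name = DISALLOWED_SUITES.get(flow["selected_suite"])
        if suite_name is not None:
            violations[suite_name] += 1
    return dict(violations)


observed = [
    {"tls_version": 0x0303, "selected_suite": 0xC02F},  # compliant
    {"tls_version": 0x0301, "selected_suite": 0x0005},  # old version and RC4
    {"tls_version": 0x0303, "selected_suite": 0x000A},  # 3DES
]
report = audit(observed)
```

Because the policy lives in data rather than in the checking logic, it can be updated as ciphers are deprecated without changing how flows are observed.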
And so, ETA serves as an important new weapon in the arsenal of network and security managers everywhere in the ongoing battle to secure the enterprise network. While ETA is just one component in a much broader array of security tools and capabilities (which typically include firewalls, intrusion detection and prevention systems, and much more), it serves as an important complement to these systems, and helps to solve an important problem that is otherwise very difficult to address.
Enterprises everywhere can benefit from the powerful new, industry-leading, network-integrated security capability that Encrypted Traffic Analytics offers—an excellent proof point of the power of Cisco Digital Network Architecture.
This chapter introduced and explored the following:
The challenges that ETA helps to solve
Addressing encrypted malware in the network, without decrypting it
Allowing organizations to provide auditing for cryptographic compliance
How ETA functions
IDP analysis of initial packet exchanges between encryption endpoints
SPLT analysis of packet sequences, lengths, and times
NetFlow export of these and other data points to Stealthwatch for further analysis
The benefits that ETA provides
Keep your eye on this space! ETA is an evolving solution, and one that shows the power of network-integrated security to solve challenges that are otherwise difficult, if not impossible, to address. As you explore the possible uses of ETA within your own organization, you will no doubt gain insight into how to better secure your own network environment, now and into the future.