Strong cryptography is a critical piece of information security that can be applied at many levels, from data storage to network communication. One of the most common classes of security problems people introduce is the misapplication of cryptography. It's an area that can look deceptively easy, when in reality there are an overwhelming number of pitfalls. Moreover, it is likely that many classes of cryptographic pitfalls are still unknown.
It doesn't help that cryptography is a huge topic, complete with its own subfields, such as public key infrastructure (PKI). Many books cover the algorithmic basics; one example is Bruce Schneier's classic, Applied Cryptography (John Wiley & Sons). Even that classic doesn't quite live up to its name, however: it focuses on implementing cryptographic primitives from the developer's point of view and spends relatively little time discussing how to integrate cryptography into an application securely. As a result, we have seen numerous developers pick up a reasonable understanding of cryptographic algorithms from that book and then go on to build their own cryptographic protocols into their applications. Those protocols are often insecure.
Over the next three chapters, we focus on the basics of symmetric cryptography. With symmetric cryptography, any parties who wish to communicate securely must share a piece of secret information. That shared secret (usually an encryption key) must be communicated over a secure medium. In particular, sending the secret over the Internet is a bad idea, unless you're using some sort of channel that is already secure, such as one properly secured using public key encryption (which can be tough to do correctly in itself). In many cases, it's appropriate to use some type of out-of-band medium for communication, such as a telephone or a piece of paper.
In these three chapters, we'll cover everything most developers need to use symmetric cryptography effectively, up to the point when you need to choose an actual network protocol. Applying cryptography on the network is covered in Chapter 9.
To ensure that you choose the right cryptographic protocols for your application, you need an understanding of these basics. However, you'll very rarely need to go all the way back to the primitive algorithms we discuss in these chapters. Instead, you should focus on out-of-the-box protocols that are believed to be cryptographically strong. While we therefore recommend that you thoroughly understand the material in these chapters, we advise you to go to the recipes in Chapter 9 to find something appropriate before you come here and build something yourself. Don't fall into the same trap that many of Applied Cryptography's readers have fallen into!
There are two classes of symmetric primitives, both of utmost importance. First are symmetric encryption algorithms, which provide for data secrecy. Second are message authentication codes (MACs), which can ensure that if someone tampers with data while in transit, the tampering will be detected. Recently, a third class of primitives has started to appear: encryption modes that provide for both data secrecy and message authentication. Such primitives can help make the application of cryptography less prone to disastrous errors.
In this chapter, we will look at how to generate, represent, store, and distribute symmetric-key material. In Chapter 5, we will look at encryption using block ciphers such as AES, and in Chapter 6, we will examine cryptographic hash functions (such as SHA1) and MACs.
Towards the end of this chapter, we do occasionally forward-reference algorithms from the next two chapters. It may be a good idea to read Recipe 5.1 through Recipe 5.4 and Recipe 6.1 through Recipe 6.4 before reading Recipe 4.10 through Recipe 4.14.
You need to keep an internal representation of a symmetric key. You may want to save this key to disk, pass it over a network, or use it in some other way.
Simply keep the key as an ordered array of bytes. For example:
/* When statically allocated: an array of bytes, not an array of pointers */
unsigned char key[KEYLEN_BYTES];

/* When dynamically allocated */
unsigned char *key = (unsigned char *)malloc(KEYLEN_BYTES);
When you're done using a key, you should delete it securely to prevent local attackers from recovering it from memory. (This is discussed in Recipe 13.2.)
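Recipe 13.2 treats secure deletion properly. As a rough illustration only, the sketch below overwrites a key buffer through a volatile pointer, which discourages the compiler from discarding the writes as dead stores. The name wipe_key is hypothetical and is not taken from that recipe.

```c
#include <stddef.h>

/* Hypothetical helper (not from Recipe 13.2): overwrite key material
 * with zeros. Writing through a volatile pointer discourages the
 * compiler from optimizing the stores away as dead writes. */
static void wipe_key(unsigned char *key, size_t len) {
    volatile unsigned char *p = key;

    while (len--) {
        *p++ = 0;
    }
}
```

A caller would invoke wipe_key(key, KEYLEN_BYTES) just before the buffer goes out of scope or is freed.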
While keys in public key cryptography are represented as very large numbers (and often stored in containers such as X.509 certificates), symmetric keys are always represented as a series of consecutive bits. Algorithms operate on these binary representations.
Occasionally, people are tempted to use a single 64-bit unit to represent short keys (e.g., a long long when using GCC on most platforms). Similarly, we've commonly seen people use an array of word-size values. That's a bad idea because of byte-ordering issues. When representing integers, the bytes of the integer may appear most significant byte first (big-endian) or least significant byte first (little-endian). Figure 4-1 provides a visual illustration of the difference between big-endian and little-endian storage.
Endianness doesn't matter when performing integer operations, because the CPU implicitly knows how integers are supposed to be represented and treats them appropriately. However, a problem arises when we wish to treat a single integer or an array of integers as an array of bytes. Casting the address of the first integer to a pointer to char does not give the right results on a little-endian machine, because the cast does not cause bytes to be swapped to their "natural" order. If you always cast back to the appropriate integer type before using the data, and you never move the data between architectures, this may not be an issue, but that would defeat any possible reason to use a bigger storage unit than a single byte. For this reason, you should always represent key material as an array of one-byte elements. If you do so, your code and the data will always be portable, even if you send the data across the network.
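The byte-ordering problem can be sketched in a few lines of C. Both function names below are hypothetical. The first copies an integer's in-memory bytes directly, which is the pattern to avoid; the second extracts bytes by value, so its output does not depend on the host.

```c
#include <stdint.h>
#include <string.h>

/* Non-portable: copies the integer's in-memory bytes, so the resulting
 * byte order depends on the host CPU. */
static void key_from_word_native(uint64_t word, unsigned char out[8]) {
    memcpy(out, &word, 8);
}

/* Portable: extracts bytes by value using shifts, most significant byte
 * first, so the result is the same on every architecture. */
static void key_from_word_be(uint64_t word, unsigned char out[8]) {
    int i;

    for (i = 0; i < 8; i++) {
        out[i] = (unsigned char)(word >> (56 - 8 * i));
    }
}
```

For the value 0x0102030405060708, key_from_word_be always produces the bytes 01 02 03 04 05 06 07 08, whereas key_from_word_native produces them in reverse order on a little-endian machine. Keeping the key as a byte array from the start avoids the conversion entirely.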
You should also avoid using signed data types, simply to avoid potential printing oddities due to sign extension. For example, let's say that you have a signed 32-bit value, 0xFF000000, and you want to shift it right by one bit. You might expect the result 0x7F800000, but you'd typically get 0xFF800000, because most compilers shift signed values arithmetically: the sign bit gets shifted, and the result also maintains the same sign.[1]
[1] To be clear on semantics, note that shifting right eight bits will always give the same result as shifting right one bit eight times. That is, when shifting right an unsigned value, the leftmost bits always get filled in with zeros. But with a signed value, they always get filled in with the original value of the most significant bit.
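The printing oddity mentioned above can be reproduced with a single key byte. This is a minimal sketch that assumes a 32-bit int; the function names are hypothetical. A byte with its high bit set, held in a signed type, is sign-extended when it is widened for printing.

```c
#include <stdio.h>

/* Key byte held in a signed type: widening a negative value to
 * unsigned int sign-extends it, so the hex output grows to the
 * full width of int (eight digits, assuming a 32-bit int). */
static void hex_signed(signed char b, char *out, size_t outlen) {
    snprintf(out, outlen, "%02x", (unsigned int)b);
}

/* Key byte held in an unsigned type: widening zero-extends,
 * so the output is always two hex digits. */
static void hex_unsigned(unsigned char b, char *out, size_t outlen) {
    snprintf(out, outlen, "%02x", (unsigned int)b);
}
```

With the bit pattern 0x80, hex_unsigned yields "80", while hex_signed called with (signed char)-128, which is the same bit pattern, yields "ffffff80" under the 32-bit int assumption.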