Appendix B

Some Elementary Tips
for Frequency Analysis



1. Begin by counting up the frequencies of all the letters in the ciphertext. About five of the letters should have a frequency of less than 1 per cent, and these probably represent j, k, q, x and z. One of the letters should have a frequency greater than 10 per cent, and it probably represents e. If the ciphertext does not obey this distribution of frequencies, then consider the possibility that the original message was not written in English. You can identify the language by analyzing the distribution of frequencies in the ciphertext. For example, typically in Italian there are three letters with a frequency greater than 10 per cent, and nine letters have frequencies less than 1 per cent. In German, the letter e has the extraordinarily high frequency of 19 per cent, so any ciphertext containing one letter with such a high frequency is quite possibly German. Once you have identified the language you should use the appropriate table of frequencies for that language for your frequency analysis. It is often possible to unscramble ciphertexts in an unfamiliar language, as long as you have the appropriate frequency table.

2. If the correlation is sympathetic with English but the plaintext does not reveal itself immediately, which is often the case, then focus on pairs of repeated letters. In English the most common repeated letters are ss, ee, tt, ff, ll, mm and oo. If the ciphertext contains any repeated characters, you can assume that they represent one of these.

3. If the ciphertext contains spaces between words, then try to identify words containing just one, two or three letters. The only one-letter words in English are a and i. The commonest two-letter words are of, to, in, it, is, be, as, at, so, we, he, by, or, on, do, if, me, my, up, an, go, no, us, am. The most common three-letter words are the and and.

4. If possible, tailor the table of frequencies to the message you are trying to decipher. For example, military messages tend to omit pronouns and articles, and the loss of words such as i, he, a and the will reduce the frequency of some of the commonest letters. If you know you are tackling a military message, you should use a frequency table generated from other military messages.

5. One of the most useful skills for a cyptanalyst is the ability to identify words, or even entire phrases, based on experience or sheer guesswork. Al-Khalīl, an early Arabian cryptanalyst, demonstrated this talent when he cracked a Greek ciphertext. He guessed that the ciphertext began with the greeting “In the name of God”. Having established that these letters corresponded to a specific section of ciphertext, he could use them as a crowbar to pry open the rest of the ciphertext. This is known as a crib.