Chapter 6

Encryption Fundamentals

Chapter Objectives

After reading this chapter and completing the exercises, you will be able to do the following:

images Explain encryption concepts.

images Describe the history of encryption and modern encryption methods.

images Use some simple decryption techniques.

Introduction

Encryption is a vital part of any network security strategy. No matter how secure the network is, if the data is not encrypted at rest or during transmission, then that data is vulnerable. Even most basic wireless routers for home users now offer encryption.

This chapter offers a basic overview of encryption and an explanation of how it works to help you make good decisions for your organization. A complete examination is beyond the scope of this book, but this chapter provides a “manager’s understanding” of cryptography to help you ask the right questions about your organization’s encryption needs.

The History of Encryption

Encrypting communications is a very old idea. People have found the need to send private communications for most of the history of civilization. The need for privacy originally arose from military and political needs, but has expanded beyond that. Businesses need to keep data private to maintain a competitive edge. People want to keep certain information, such as their medical records and financial records, private.

For much of human history private communications meant encrypting written communiqués. Over the past century that has expanded to radio transmission, telephone communications, and computer/Internet communications. In the past several decades the encryption of computerized transmissions has actually become commonplace. In fact you can find computer/Internet communications encrypted more often than phone or radio. The digital environment makes implementing a particular type of encryption much easier.

Whatever the nature of the data you are encrypting, or the mode of transmitting the data, the basic concept is actually quite simple. Messages must be changed in such a way that they cannot be read easily by any party that intercepts them but can be decoded easily by the intended recipient. In this section you will examine a few historical methods of encryption. Note that these are very old methods, and they cannot be used for secure communication today. An amateur could easily crack the methods discussed in this section. However, they are wonderful examples for conveying the concept of encryption without having to incorporate a great deal of math, which is required of the more complex encryption methods.

FYI: False Claims About Encryption

As you will see in this chapter, the basic concepts of encryption are very simple. Anyone with even rudimentary programming skills can write a program that implements one of the simple encryption methods examined here. However, these methods are not secure and are included only to illustrate fundamental encryption concepts.

From time to time someone new to encryption discovers these basic methods, and in his enthusiasm attempts to create his own encryption method by making some minor modifications. Although this can be a stimulating intellectual exercise, it is only that. Users without training in advanced math or cryptography are extremely unlikely to stumble across a new encryption method that is effective for secure communications.

Amateurs frequently post claims that they have discovered the latest, unbreakable encryption algorithm on the Usenet newsgroup sci.crypt (if you are not familiar with Usenet, those groups are now accessible via the Groups link on www.google.com). Their algorithms are usually quickly broken. Unfortunately, some people implement such a method into a software product and market it as secure.

Some distributors of insecure encryption methods and software do so out of simple greed and are intentionally defrauding an unsuspecting public. Others do so out of simple ignorance, honestly believing that their method is superior. Methods for evaluating encryption claims are discussed later in this chapter.

The Caesar Cipher

One of the oldest recorded encryption methods is the Caesar cipher. This name is based on a claim that this method was used by ancient Roman emperors. This method is simple to implement, requiring no technological assistance. You choose some number by which to shift each letter of a text. For example, if the text is

A cat

and you choose to shift by two letters, then the message becomes

C ecv

Or, if you choose to shift by three letters, it becomes

D fdw

In this example, you can choose any shifting pattern you want. You can shift either to the right or left by any number of spaces you like. Because this is a simple method to understand, it makes a good place to start your study of encryption. It is, however, extremely easy to crack. You see, any language has a certain letter and word frequency, meaning that some letters are used more frequently than others. In the English language, the most common single-letter word is a. The most common three-letter word is the. Knowing these two characteristics alone could help you decrypt a Caesar cipher. For example, if you saw a string of seemingly nonsense letters and noticed that a three-letter word was frequently repeated in the message, you might easily surmise that this word was the—and the odds are highly in favor of this being correct.

Furthermore, if you frequently noticed a single-letter word in the text, it is most likely the letter a. You now have found the substitution scheme for a, t, h, and e. You can now either translate all of those letters in the message and attempt to surmise the rest or simply analyze the substitute letters used for a, t, h, and e and derive the substitution cipher that was used for this message. Decrypting a message of this type does not even require a computer. Someone with no background in cryptography could do it in less than ten minutes using pen and paper.

Caesar ciphers belong to a class of encryption algorithms known as substitution ciphers. The name derives from the fact that each character in the unencrypted message is substituted by one character in the encrypted text. The particular substitution scheme used (for example, 12 or 11) in a Caesar cipher is called a substitution alphabet (that is, b substitutes for a, u substitutes for t, etc.). Because one letter always substitutes for one other letter, the Caesar cipher is sometimes called a mono-alphabet substitution method, meaning that it uses a single substitution for the encryption.

The Caesar cipher, like all historical ciphers, is simply too weak for modern use. It is presented here just to help you understand the concepts of cryptography.

ROT 13

ROT 13 is another single alphabet substitution cipher. All characters are rotated 13 characters through the alphabet.

The phrase

A CAT

becomes

N PNG

ROT 13 is a single-substitution cipher.

Atbash Cipher

Hebrew scribes copying the book of Jeremiah used the Atbash cipher. Using it is simple; you just reverse the alphabet. This is, by modern standards, a primitive and easy-to-break cipher. However, it will help you get a feel for how cryptography works.

The Atbash cipher is a Hebrew code that substitutes the first letter of the alphabet for the last and the second letter for the second to the last, etc. It simply reverses the alphabet; for example, A becomes Z, B becomes Y, C becomes X, etc.

This, like the Caesar and ROT 13 ciphers, is also a single-substitution cipher.

Multi-Alphabet Substitution

Eventually, a slight improvement on the Caesar cipher was developed, called multi-alphabet substitution (also called polyalphabetic substitution). In this scheme, you select multiple numbers by which to shift letters (that is, multiple substitution alphabets). For example, if you select three substitution alphabets (12, 22, 13), then

A CAT

becomes

C ADV

Notice that the fourth letter starts over with another 12, and you can see that the first A was transformed to C and the second A was transformed to D. This makes deciphering the underlying text more difficult. Although this is harder to decrypt than a Caesar cipher, it is not overly difficult to decode. It can be done with simple pen and paper and a bit of effort. It can be cracked quickly with a computer. In fact, no one would use such a method today to send any truly secure message, for this type of encryption is considered very weak.

One of the most widely known multi-alphabet ciphers was the Vigenère cipher. This topic is discussed in detail later in this chapter. This cipher was invented in 1553 by Giovan Battista Bellaso. It is a method of encrypting alphabetic text by using a series of different mono-alphabet ciphers selected based on the letters of a keyword. This algorithm was later misattributed to Blaise de Vigenère, and so it is now known as the “Vigenère cipher,” even though Vigenère did not really invent it.

Multi-alphabet ciphers are more secure than single-substitution ciphers. However, they are still not acceptable for modern cryptographic usage. Computer-based cryptanalysis systems can crack historical cryptographic methods (both single alphabet and multi-alphabet) easily. The single-substitution and multi-substitution alphabet ciphers are discussed just to show you the history of cryptography, and to help you get an understanding of how cryptography works.

Rail Fence

All the preceding ciphers we examined are substitution ciphers. Another approach to classic cryptography is the transposition cipher. The rail fence cipher may be the most widely known transposition cipher. You simply take the message you wish to encrypt and alter each letter on a different row. So “attack at dawn” is written as

A      t      c      a      d      w

      t      a      k      t       a      n

Next, you write down the text reading from left to right as one normally would, thus producing

atcadwtaktan

In order to decrypt the message, the recipient must write it out on rows:

A      t      c      a      d      w

      t      a      k      t       a      n

Then the recipient reconstructs the original message. Most texts use two rows as examples; however, this can be done with any number of rows you wish to use.

Vigenère

As we previously discussed, a polyalphabetic cipher uses multiple substitutions in order to disrupt letter and word frequency. Let us consider a simple example. Remember a Caesar cipher has a shift, for example a shift of +2 (two to the right). A polyalphabetic substitution cipher would use multiple shifts. Perhaps a +2, –1, +1, +3. When you get to the fifth letter, you simply start over again. So, consider the word Attack, being encrypted

A (1) + 2 = 3 or C

T (20) –1 = 19 or S

T (20) +1 = 21 or U

A (1) +3 = 4 or D

C (3) +2 = 5 or E

K (11) –1 = 10 or J

So, the ciphertext is CSUDEJ. Given that each letter has four possible substitutions, the letter and word frequency is significantly disrupted.

Perhaps the most widely known polyalphabetic cipher is the Vigenère cipher. This cipher was actually invented in 1553 by Giovan Battista Bellaso, though it is named after Blaise de Vigenère. It is a method of encrypting alphabetic text by using a series of different mono-alphabet ciphers selected based on the letters of a keyword. Bellaso added the concept of using any keyword one might wish, thereby making the choice of substitution alphabets difficult to calculate.

Enigma

It is really impossible to have a discussion about cryptography and not talk about Enigma. Contrary to popular misconceptions, the Enigma is not a single machine but rather a family of machines. The first version was invented by German engineer Arthur Scherbius near the end of World War I. It was used by several different militaries, not just the Nazi Germans.

Some military texts encrypted using a version of Enigma were broken by Polish cryptanalysts Marian Rejewski, Jerzy Rozycki, and Henryk Zygalski. The three basically reverse engineered a working Enigma machine and used that information to develop tools for breaking Enigma ciphers, including one tool named the cryptologic bomb.

The core of the Enigma machine was the rotors, or disks, that were arranged in a circle with 26 letters on them. The rotors were lined up. Essentially, each rotor represented a different single substitution cipher. You can think of the Enigma as a sort of mechanical polyalphabetic cipher. The operator of the Enigma machine would be given a message in plaintext and then type that message into Enigma. For each letter that was typed in, Enigma would provide a different ciphertext based on a different substitution alphabet. The recipient would type in the ciphertext, getting out the plaintext, provided both Enigma machines had the same rotor settings.

There were actually several variations of the Enigma machine. The Naval Enigma machine was eventually cracked by British cryptographers working at the now famous Bletchley Park. Alan Turing and a team of analysts were able to eventually break the Naval Enigma machine. Many historians claim this shortened World War II by as much as two years. This story is the basis for the 2014 movie The Imitation Game.

Binary Operations

Part of modern symmetric cryptography ciphers involves using binary operations. Various operations on binary numbers (numbers made of only zeroes and ones) are well known to programmers and programming students. But for those readers not familiar with them, a brief explanation follows. When working with binary numbers, three operations are not found in normal math: AND, OR, and XOR operations. Each is illustrated next.

AND

To perform the AND operation, you take two binary numbers and compare them one place at a time. If both numbers have a one in both places, then the resultant number is a one. If not, then the resultant number is a zero, as you see here:

1 1 0 1
1 0 0 1
-------
1 0 0 1
OR

The OR operation checks to see whether there is a one in either or both numbers in a given place. If so, then the resultant number is one. If not, the resultant number is zero, as you see here:

1 1 0 1
1 0 0 1
-------
1 1 0 1
XOR

The XOR operation impacts your study of encryption the most. It checks to see whether there is a one in a number in a given place, but not in both numbers at that place. If it is in one number but not the other, then the resultant number is one. If not, the resultant number is zero, as you see here:

1 1 0 1
1 0 0 1
-------
0 1 0 0

XORing has a an interesting property in that it is reversible. If you XOR the resultant number with the second number, you get back the first number. And, if you XOR the resultant number with the first number, you get the second number.

0 1 0 0
1 0 0 1
-------
1 1 0 1

Binary encryption using the XOR operation opens the door for some rather simple encryption. Take any message and convert it to binary numbers and then XOR that with some key. Converting a message to a binary number is a simple two-step process. First, convert a message to its ASCII code, and then convert those codes to binary numbers. Each letter/number will generate an eight-bit binary number. You can then use a random string of binary numbers of any given length as the key. Simply XOR your message with the key to get the encrypted text, and then XOR it with the key again to retrieve the original message.

This method is easy to use and great for computer science students; however, it does not work well for truly secure communications because the underlying letter and word frequency remains. This exposes valuable clues that even an amateur cryptographer can use to decrypt the message. Yet, it does provide a valuable introduction to the concept of single-key encryption, which is discussed in more detail in the next section. Although simply XORing the text is not the method typically employed, single-key encryption methods are widely used today. For example, you could simply include a multi-alphabet substitution that was then XORed with some random bit stream—variations of which do exist in a few actual encryption methods currently used.

Modern cryptography methods, as well as computers, make decryption a rather advanced science. Therefore, encryption must be equally sophisticated in order to have a chance of success.

What you have seen so far regarding encryption is simply for educational purposes. As has been noted several times, you would not have a truly secure system if you implemented any of the previously mentioned encryption schemes. You might feel that this has been overstated in this text. However, having an accurate view of what encryption methods do and do not work is critical. It is now time to discuss a few methods that are actually in use today.

The following websites offer more information about cryptography:

images Cryptography I course on Coursera: https://www.coursera.org/course/crypto

images Applied cryptography course on Udacity: https://www.udacity.com/course/applied-cryptography--cs387

images Cypher Research Laboratories: www.cypher.com.au/crypto_history.htm

Understanding the simple methods described here and other methods provided by the aforementioned websites should give you a sense of how cryptography works as well as what is involved in encrypting a message. Regardless of whether you go on to study modern, sophisticated encryption methods, having some basic idea of how encryption works at a conceptual level is important. Having a basic grasp of how encryption works, in principle, will make you better able to understand the concepts of any encryption method you encounter in the real world.

FYI: Careers in Cryptography

Some readers might be interested in a career in cryptography. Basic knowledge of cryptography is enough to be a security administrator, but not enough to be a cryptographer. A strong mathematics background is essential for in-depth exploration of cryptography, particularly when pursuing a career in this field. An adequate background includes a minimum of the complete calculus sequence (through differential equations), statistics through basic probability theory, abstract algebra, linear algebra, and number theory. A double major in computer science and mathematics is ideal. A minimum of a minor in mathematics is required, and familiarity with existing encryption methods is critical.

Learning About Modern Encryption Methods

Not surprisingly, modern methods of encryption are more secure than the historical methods just discussed. All the methods discussed in this section are in use today and are considered reasonably secure. Note that DES is an exception, but only due to its short key length.

In some cases the algorithm behind these methods requires a sophisticated understanding of mathematics. Number theory often forms the basis for encryption algorithms. Fortunately for our purposes having the exact details of these encryption algorithms is not important; this means that you don’t require a strong mathematics background to follow this material. More important is a general understanding of how a particular encryption method works and how secure it is.

Symmetric Encryption

Symmetric encryption refers to those methods where the same key is used to encrypt and decrypt the plaintext.

Data Encryption Standard

Data Encryption Standard, or DES as it is often called, was developed by IBM in the early 1970s and made public in 1976. DES uses a symmetric key system. Recall from our earlier discussion that this means the same key is used to encrypt and to decrypt the message. DES uses short keys and relies on complex procedures to protect its information. The actual DES algorithm is quite complex. The basic concept, however, is as follows:

1. The data is divided into 64-bit blocks, and those blocks are then transposed.

2. Transposed data is then manipulated by 16 separate rounds of encryption, involving substitutions, bit-shifting, and logical operations using a 56-bit key.

3. Finally, the data is transposed one last time.

More information about DES is available at the Federal Information Processing Standards website at https://csrc.nist.gov/csrc/media/publications/fips/46/3/archive/1999-10-25/documents/fips46-3.pdf. A more detailed description of DES is given below, but you can skip this if you wish.

DES uses a 56-bit cipher key applied to a 64-bit block. There is actually a 64-bit key, but one bit of every byte is actually used for error detection, leaving just 56 bits for actual key operations.

DES is a Feistel cipher with 16 rounds and a 48-bit round key for each round. A round key is just a sub key that is derived from the cipher key each round, according to a key schedule algorithm. DES’s general functionality follows the Feistel method of dividing the 64-bit block into two halves (32 bits each; this is not an unbalanced Feistel cipher), applying the round function to one half, then XORing that output with the other half.

The first issue to address is the key schedule. How does DES generate a new sub key each round? The idea is to take the original 56-bit key and to slightly permute it each round, so that each round is applying a slightly different key, but one that is based on the original cipher key. To generate the round keys, the 56-bit key is split into two 28-bit halves and those halves are circularly shifted after each round by one or two bits. This will provide a different sub key each round. During the round key generation portion of the algorithm (recall that this is referred to as the key schedule) each round, the two halves of the original cipher key (the 56 bits of key the two endpoints of encryption must exchange) are shifted a specific amount.

Once the round key has been generated for the current round, the next step is to address the half of the original block that is going to be input into the round function. Recall that the two halves are each 32 bits. The round key is 48 bits. That means that the round key does not match the size of the half block it is going to be applied to. You cannot really XOR a 48-bit round key with a 32 bit half block, unless you simply ignore 16 bits of the round key. If you did so, you would basically be making the round key effectively shorter and thus less secure, so this is not a good option.

The 32-bit half needs to be expanded to 48 bits before it is XORed with the round key. This is accomplished by replicating some bits so that the 32-bit half becomes 48 bits.

This expansion process is actually quite simple. The 32 bits that is to be expanded is broken into 4-bit sections. The bits on each end are duplicated. If you divide 32 by 4 the answer is 8. So there are eight of these 4-bit groupings. If you duplicate the end bits of each grouping, that will add 16 bits to the original 32, thus providing a total of 48 bits.

It is also important to keep in mind that it was the bits on each end that were duplicated; this will be a key item later in the round function. Perhaps this example will help you to understand what is occurring at this point. Let us assume 32 bits as shown here:

11110011010111111111000101011001

Now divide that into eight sections each of 4 bits, as shown here:

1111 0011 0101 1111 1111 0001 0101 1001

Now each of these has its end bits duplicated, as you see here:

1111 becomes 111111

0011 becomes 000111

0101 becomes 001011

1111 becomes 111111

1111 becomes 111111

0001 becomes 000011

0101 becomes 001011

1001 becomes 110011

The resultant 48-bit string is now XORed with the 48-bit round key. That is the extent of the round key being used in each round. It is now dispensed with, and on the next round another 48-bit round key will be derived from the two 28-bit halves of the 56-bit cipher key.

Now we have the 48-bit output of the XOR operation. That is now split into eight sections of 6 bits each. For the rest of this explanation we will focus on just one of those 6-bit sections, but keep in mind that the same process is done to all eight sections.

The 6-bit section is used as the input to an s-box. An s-box is a table that takes input and produces an output based on that input. In other words, it is a substitution box that substitutes new values for the input. The s-boxes used in DES are published, the first of which is shown in Figure 6-1.

A table of a DES s-box is shown.

FIGURE 6-1 The first DES s-box

Notice this is simply a lookup table. The 2 bits on either end are shown in the left hand column and the 4 bits in the middle are shown in the top row. They are matched, and the resulting value is the output of the s-box. For example, with the previous demonstration numbers we were using, our first block would be 111111. So you find 1xxxx1 on the left and x1111x on the top. The resulting value is 13 in decimal or 1101 in binary.

At the end of this you have produced 32 bits that are the output of the round function. Then in keeping with the Feistel structure, they get XORed with the 32 bits that were not input into the round function, and the two halves are swapped. DES is a 16-round Feistel cipher, meaning this process is repeated 16 times.

There are only two parts still left to discuss regarding DES. The first is the initial permutation, called the IP, then the final permutation, which is an inverse of the IP.

One advantage that DES offers is efficiency. Some implementations of DES offer data throughput rates on the order of hundreds of megabytes per second. In plain English, what this means is that it can encrypt a great deal of data very quickly. You might assume that 16 steps would cause encryption to be quite slow; however, that is not the case using modern computer equipment. The problem with DES is the same problem that all symmetric key algorithms have: How do you transmit the key without it becoming compromised? This issue led to the development of public key encryption.

Another advantage of DES is the complexity with which it scrambles the text. DES uses 16 separate rounds to scramble the text. This yields a scrambled text that is very difficult to break. DES is no longer used, because the short key size is no longer adequate against brute force attacks. However, the overall structure, called a Feistel network or Feistel cipher, is the basis for many algorithms that are still used today, such as Blowfish.

As has been mentioned, DES uses a key that is no longer considered long enough. Modern computers can brute-force crack a 56-bit key. The algorithm used in DES is actually quite good. It was the first widely used Feistel structure, and that structure is still a good basis for block ciphers.

As computers became more powerful, the search began for a DES replacement. Ultimately, the Rijndael cipher would be used for the Advanced Encryption Standard (AES) and would replace DES. In the interim, the idea was to use multiple DES keys to encrypt. Ideally, three separate 56-bit keys were used, thus this interim solution was called triple-DES or 3DES. In some cases only two DES keys were used, and the algorithm alternated applying them.

Blowfish

Blowfish is a symmetric block cipher. This means that it uses a single key to both encrypt and decrypt the message and works on “blocks” of the message at a time. It uses a variable-length key ranging from 32 to 448 bits. This flexibility in key size allows you to use it in various situations. Blowfish was designed in 1993 by Bruce Schneier. It has been analyzed extensively by the cryptography community and has gained wide acceptance. It is also a non-commercial (that is, free of charge) product, thus making it attractive to budget-conscious organizations.

FYI: Block Ciphers and Stream Ciphers

A block cipher operates on blocks of fixed length, often 64 or 128 bits. In a block cipher, a cryptographic key and algorithm are applied to a block of data (for example, 64 contiguous bits) at once as a group rather than to one bit at a time. Stream ciphers simply take the text as an ongoing stream, encrypting each bit as it encounters it. Stream ciphers tend to be faster than block ciphers. A stream cipher generates a keystream (a sequence of bits used as a key). Encryption is accomplished by combining the keystream with the plaintext, usually with the bitwise XOR operation.

AES

Advanced Encryption Standard (AES) uses the Rijndael algorithm. The developers of this algorithm have suggested multiple alternative pronunciations for the name, including “reign dahl,” “rain doll,” and “rhine dahl.” This algorithm was developed by two Belgian researchers, Joan Daemen of Proton World International and Vincent Rijmen, a postdoctoral researcher in the Electrical Engineering Department of Katholieke Universiteit Leuven.

AES specifies three key sizes: 128, 192, and 256 bits. By comparison, DES keys are 56 bits long, and Blowfish allows varying lengths up to 448 bits. AES uses a block cipher. Interested readers can find detailed specifications for this algorithm, including a detailed discussion of the mathematics, at https://csrc.nist.gov/csrc/media/publications/fips/197/final/documents/fips-197.pdf.

This algorithm is widely used (as shown in the firewall discussion in Chapter 4, “Firewall Practical Applications”), considered very secure, and therefore a good choice for many encryption scenarios.

For those readers who want more detail, here is a general overview of the process used in AES. The algorithm consists of a few relatively simple steps that are used during various rounds. The steps are described here:

images AddRoundKey: Each byte of the state is combined with the round key using bitwise XOR. This is where Rijndael applies the round key generated from the key schedule.

images SubBytes: A nonlinear substitution step where each byte is replaced with another according to a lookup table. This is where the contents of the matrix are put through the s-boxes. Each of the s-boxes is 8 bits.

images ShiftRows: A transposition step where each row of the state is shifted cyclically a certain number of steps. In this step the first row is left unchanged. Every byte in the second row is shifted one byte to the left (with the far left wrapping around). Every byte of the third row is shifted two to the left, and every byte of the fourth row is shifted three to the left (again with wrapping around).

images MixColumns: A mixing operation which operates on the columns of the state, combining the 4 bytes in each column. In the MixColumns step, each column of the state is multiplied with a fixed polynomial.

With the aforementioned steps in mind, this is how those steps are executed in the Rijndael cipher. For 128-bit keys, there are 10 rounds. For 192-bit keys, there are 12 rounds. For 256-bit keys, there are 14 rounds.

images Key Expansion: The first step is that the round keys are derived from the cipher key using Rijndael’s key schedule. The key schedule is how a key is generated for each round, based on the cryptographic key that was exchanged between the sender and receiver.

images Initial Round: This initial round will only execute the AddRoundKey step. This is simply XORing with the round key. This initial round is executed once, then the subsequent rounds will be executed.

images Rounds: This phase of the algorithm executes several steps, in the following order:

images SubBytes

images ShiftRows

images MixColumns

images AddRoundKey

images Final Round: This round has everything the rounds phase has, except no MixColumns:

images SubBytes

images ShiftRows

images AddRoundKey

In the AddRoundKey step, the sub-key is XORed with the state. For each round, a sub-key is derived from the main key using Rijndael’s key schedule; each sub-key is the same size as the state.

IDEA

International Data Encryption Algorithm (IDEA) is another block cipher. This particular algorithm works with 64-bit blocks of data two at a time and uses a 128-bit key. The procedure is fairly complicated and uses sub-keys generated from the key to carry out a series of modular arithmetic and XOR operations on segments of the 64-bit plaintext block. The encryption scheme uses a total of 52 16-bit sub-keys. These are generated from the 128-bit sub-key with the following procedure:

images The 128-bit key is split into eight 16-bit keys, which are the first eight sub-keys.

images The digits of the 128-bit key are shifted 25 bits to the left to make a new key, which is then split into the next eight 16-bit sub-keys.

images The second step is repeated until the 52 sub-keys have been generated. The encryption consists of eight rounds of encrypting.

Serpent

This algorithm was invented by Ross Anderson, Eli Biham, and Lars Knudsen. It was submitted to the AES competition but was not selected, in large part due to the fact that its performance is slower than AES. However, in the ensuing years since the AES competition, computational power has increased dramatically. This has led some experts to reconsider the use of Serpent on modern systems.

Twofish

Twofish was one of the five finalists of the AES contest (which we will explore in more detail in Chapter 7, “Virtual Private Networks”). It is related to the block cipher Blowfish, and Bruce Schneier also was part of the team that worked on this algorithm. Twofish is a Feistel cipher that uses a 128-bit block size and key sizes of 128, 192, and 256 bits. It also has 16 rounds, like DES. Like Blowfish, Twofish is not patented and is in the public domain and can be used without restrictions by anyone who wishes to use it.

Selecting a Block Cipher

If you have decided on a block cipher, which one do you choose? We have examined several here, all with different strengths and weaknesses. No single answer is right for everyone. Several factors are involved in the selection of an encryption algorithm. Here are a few things to keep in mind:

images If you encrypt large amounts of data, then speed of the encryption might be almost as important as security.

images If you have standard business data, then almost any of the well-known, accepted encryption methods will probably be secure enough, and you can focus on things such as key length and speed in your decision-making process. However, if you are sending highly sensitive data, such as research or military data, you should be more concerned about security, even at the expense of speed.

images Variable-length keys are important only if you need them. If you have some encryption products used inside the United States and some outside, then at least two lengths are needed. If you have some data you want more strongly encrypted even if it means slower speed, and other data that needs to be fast but not as secure, then a variable-length key is also important.

Key Stretching

It is sometimes necessary to lengthen a key to make it stronger. This process is often called key stretching. The key is put through an algorithm that will stretch it, or make it longer. There are two widely used key stretching algorithms:

images PBKDF2 (Password-Based Key Derivation Function 2) is part of PKCS #5 v. 2.01. It applies some function (like a hash or HMAC) to the password or passphrase along with salt to produce a derived key. (The salt concept is discussed later, in the “Hashing” section.)

images bcrypt is used with passwords, and it essentially uses a derivation of the Blowfish algorithm, converted to a hashing algorithm, to hash a password and add salt to it.

PRNG

You have already seen that symmetric ciphers all need a cipher key. How are those generated? In fact, algorithms called pseudo-random number generators (PRNG) are used to generate these keys. Truly random numbers are only generated by natural phenomena such as radioactive decay. This is not convenient for encrypting data. So instead, we use algorithms that produce numbers that are “random enough,” and these algorithms are pseudo-random number generators. What makes a PRNG “good enough”? There are three properties that one desires:

images Uncorrelated sequences: The sequences are not correlated. You cannot take a given stretch of numbers (say 16 bits) and use that to predict subsequent bits.

images Long period: Ideally, the series of digits (usually bits) should never have any repeating patterns. However, the reality is that there will eventually be some repetition. The distance (in digits or bits) between repetitions is the period. The longer the period the better.

images Uniformity: Pseudo-random numbers are usually represented in binary format. There should be an equal number of 1s and 0s, though they need not be distributed in any discernible pattern. The sequence of random numbers should be uniform and unbiased.

The German Federal Office for Information Security (BSI) has established four criteria for quality of random number generators:

images K1: A sequence of random numbers with a low probability of containing identical consecutive elements.

images K2: A sequence of numbers that is indistinguishable from “true random” numbers according to specified statistical tests.

images K3: It should be impossible for any attacker to calculate, or otherwise guess, from any given sub-sequence, or from any previous or future values in the sequence.

images K4: It should be impossible for an attacker to calculate, or guess from an inner state of the generator, any previous numbers in the sequence or any previous inner generator states.

Public Key Encryption

Public key encryption is essentially the opposite of single-key encryption. With any public key encryption algorithm, one key is used to encrypt a message (called the public key) and another is used to decrypt the message (the private key). You can freely distribute your public key so that anyone can encrypt a message to send to you, but only you have the private key and only you can decrypt the message. The actual mathematics behind the creation and applications of the keys is a bit complex and beyond the scope of this book. Many public key algorithms are dependent, to some extent, on large prime numbers, factoring, and number theory.

RSA

The RSA method is a widely used encryption algorithm. You cannot discuss cryptography without at least some discussion of RSA. This public key method was developed in 1977 by three mathematicians: Ron Rivest, Adi Shamir, and Len Adleman. The name RSA is derived from the first letter of each mathematician’s last name .

One significant advantage of RSA is that it is a public key encryption method. That means there are no concerns with distributing the keys for the encryption. However, RSA is much slower than symmetric ciphers. In fact, in general, asymmetric ciphers are slower than symmetric ciphers.

The steps to create the key are as follow:

1. Generate two large random primes, p and q, of approximately equal size.

2. Pick two numbers so that when they are multiplied together the product will be the size you want (that is, 2048 bits, 4096 bits, etc).

3. Now multiply p and q to get n.

4. Let n = pq.

5. Multiply Euler’s totient for each of these primes. If you are not familiar with this concept, the Euler’s Totient is the total number of co-prime numbers. Two numbers are considered co-prime if they have no common factors. For example, if the original number is 7, then 5 and 7 would be co-prime. It just so happens that for prime numbers, this is always the number minus 1. For example, 7 has 6 numbers that are co-prime to it (if you think about this a bit you will see that 1, 2, 3, 4, 5, 6 are all co-prime with 7).

6. Let m = (p – 1)(q – 1).

7. Select another number; call this number e. You want to pick e so that it is co-prime to m.

8. Find a number d that when multiplied by e and modulo m would yield 1. (Note: Modulo means to divide two numbers and return the remainder. For example, 8 modulo 3 would be 2.)

9. Find d, such that de mod m ≡ 1.

Now you publish e and n as the public key and keep d and n as the secret key.

To encrypt you simply take your message raised to the e power and modulo n:

= Me % n

To decrypt you take the ciphertext, and raise it to the d power modulo n:

P = Cd % n

If all this seems a bit complex to you, you must realize that many people work in network security without being familiar with the actual algorithm for RSA (or any other cryptography for that matter). You can also get a better understanding of RSA by walking through the algorithm utilizing small integers.

Normally RSA is done with very large integers. To make the math easy to follow, this example uses small integers (Note: This example is from Wikipedia):

1. Choose two distinct prime numbers, such as p = 61 and q = 53.

2. Compute n = pq giving n = 61 × 53 = 3233.

3. Compute the totient of the product as Φ(n) = (p − 1)(q − 1) giving Φ(3233) = (61 − 1)(53 − 1) = 3120.

4. Choose any number 1 < e < 3120 that is co-prime to 3120. Choosing a prime number for e leaves us only to check that e is not a divisor of 3120. Let e = 17.

5. Compute d, as shown before such that de mod m ≡ 1; yielding d = 2753.

6. The public key is (n = 3233, e = 17). For a padded plaintext message m, the encryption function is m17 (mod 3233).

7. The private key is (n = 3233, d = 2753). For an encrypted ciphertext c, the decryption function is c2753 (mod 3233).

RSA is based on large prime numbers. You might think, “Couldn’t someone take the public key and use factoring to derive the private key?” Well, hypothetically, yes. However, it turns out that factoring really large numbers into their prime factors is pretty difficult. No efficient algorithm exists for doing it. By “large numbers,” we mean that RSA can use 1024-, 2048-, 4096-bit and larger keys. Those make for some huge numbers. Of course, if anyone ever invents an efficient algorithm that will factor a large number into its prime factors, RSA would be dead.

RSA has become a popular encryption method. It is considered quite secure and is often used in situations where a high level of security is needed.

Diffie-Hellman

Now that you have seen RSA, consider a few other asymmetric algorithms. Probably the most well known is Diffie-Hellman, which was the first publicly described asymmetric algorithm.

This cryptographic protocol allows two parties to establish a shared key over an insecure channel. In other words, Diffie-Hellman is often used to allow parties to exchange a symmetric key through some unsecure medium, such as the Internet. It was developed by Whitfield Diffie and Martin Hellman in 1976. An interesting factoid is that the method had actually been developed a few years earlier by Malcolm J. Williamson of the British Intelligence Service, but it was classified.

ElGamal

ElGamal is based on the Diffie-Hellman key exchange algorithm just described. It was first described by Taher Elgamal in 1984. It is used in some versions of Pretty Good Privacy (PGP).

MQV

Like ElGamal, MQV (Menezes-Qu-Vanstone) is a protocol for key agreement that is based on Diffie-Hellman. It was first proposed by Menezes, Qu, and Vanstone in 1995 and then modified in 1998. MQV is incorporated in the public key standard IEEE P1363.

Digital Signature Algorithm

Digital Signature Algorithm (DSA) is described in U.S. Patent 5,231,668, filed July 26, 1991, and attributed to David W. Kravitz. It was adopted by the U.S. government in 1993 with FIPS 186. Although any asymmetric algorithm can be used for digital signatures, this algorithm was designed for that purpose.

Elliptic Curve

The Elliptic Curve algorithm was first described in 1985 by Victor Miller (IBM) and Neil Koblitz (University of Washington).

The security of Elliptic Curve cryptography is based on the fact that finding the discrete logarithm of a random elliptic curve element with respect to a publicly known base point is difficult to the point of being impractical to do.

The size of the elliptic curve determines the difficulty of the finding the algorithm, and thus the security of the implementation. The level of security afforded by an RSA-based system with a large modulus can be achieved with a much smaller elliptic curve group. There are actually several ECC algorithms. There is an ECC version of Diffie-Hellman, an ECC version of DSA, and many others.

The U.S. National Security Agency has endorsed ECC (Elliptic Curve Cryptography) by including schemes based on it in its Suite B set of recommended algorithms and allows their use for protecting information classified up to top secret with 384-bit keys.

Digital Signatures

A digital signature is using asymmetric cryptography, in reverse order. Consider a situation wherein the concern is not data confidentiality, but rather verifying who sent the message. Perhaps you get an e-mail from your boss telling you that you should take next week off with pay. It would be a good idea for you to verify this message indeed came from your boss, and is not a spoofed message from a colleague playing a prank. Digital signatures accomplish this.

Some part of the message, often a hash of the message, is encrypted (or signed) with the user’s private key. Of course, because anyone can access that sender’s public key, this process does nothing for confidentiality. But any recipient can verify the signature using the sender’s public key, and be confident the sender really sent the message.

Identifying Good Encryption

Dozens of other encryption methods are released to the public for free or patented and sold for profit every year. However, this particular area of the computer industry is replete with frauds and charlatans. One need only search for “encryption” on any search engine to find a plethora of ads for the latest greatest “unbreakable” encryption. If you are not knowledgeable about encryption, how do you separate legitimate encryption methods from frauds?

Although no guaranteed way to detect fraud exists, the following guidelines should help you avoid most fraudulent encryption claims:

images “Unbreakable”: Anyone with any experience in cryptography knows that there is no such thing as an unbreakable code. Codes exist that have not yet been broken. Some codes are very hard to break. However, when someone claims that his method is completely unbreakable, you should be suspicious.

images “Certified”: No recognized certification process for encryption methods exists, so any “certification” the company has is totally worthless.

images Inexperienced vendors: Find out about the experience of any company marketing a new encryption method. What is the experience of the people working with it? Do they have a background in math, encryption, or algorithms? If not, have they submitted their method to experts in peer-reviewed journals? Are they at least willing to disclose how their method works so that it can be fairly judged?

Some experts claim you should only use widely known methods such as Blowfish. I disagree. Having a secure system using less well-known or even new encryption methods is certainly possible. All the widely used methods of today were once new and untested. However, taking extra precautions to ensure that you are not being misled when using a less well-known method is necessary. To be clear, I am not in any way suggesting untested algorithms. For example, when the NIST had the contest that ended with selecting the Rijndael cipher to be AES, there were four other finalist algorithms that had been rigorously tested, but were rejected, some for performance issues. Those might be good algorithms to consider.

Understanding Digital Signatures and Certificates

A digital signature is not used to ensure the confidentiality of a message, but rather to guarantee who sent the message. This is referred to as non-repudiation. Essentially, a digital signature proves who the sender is. Digital signatures are actually rather simple, but clever. They simply reverse the asymmetric encryption process. Recall that in asymmetric encryption, the public key (which anyone can have access to) is used to encrypt a message to the recipient, and the private key (which is kept secure, and private) can decrypt it. With a digital signature, the sender encrypts something with his or her private key. If the recipient is able to decrypt that with the sender’s public key, then it must have been sent by the person purported to have sent the message.

Digital Certificates

Remember from the asymmetric cryptography discussion that public keys are widely distributed and that getting someone’s public key is fairly easy to do. You have also seen in the preceding section that public keys are also needed to verify a digital signature. As to how public keys are distributed, probably the most common way is through digital certificates. The digital certificate contains a public key and some means to verify whose public key it is.

X.509 is an international standard for the format and information contained in a digital certificate. X.509 is the most used type of digital certificate in the world. It is a digital document that contains a public key signed by the trusted third party, which is known as a certificate authority (CA). The contents of an X.509 certificate are

images Version

images Certificate holder’s public key

images Serial number

images Certificate holder’s distinguished name

images Certificate’s validity period

images Unique name of certificate issuer

images Digital signature of issuer

images Signature algorithm identifier

A certificate authority issues digital certificates. The primary role of the CA is to digitally sign and publish the public key bound to a given user. It is an entity trusted by one or more users to manage certificates.

A registration authority (RA) is often used to take the burden off of a CA by handling verification prior to certificates being issued. RAs act as a proxy between users and CAs. RAs receive a request, authenticate it, and forward it to the CA.

A public key infrastructure (PKI) distributes digital certificates. This is a network of trusted CA servers that serves as the infrastructure for distributing digital certificates that contain public keys. A PKI is an arrangement that binds public keys with respective user identities by means of a CA.

What if a certificate is expired, or revoked? A certificate revocation list (CRL) is a list of certificates that have been revoked for one reason or another. Certificate authorities publish their own certificate revocation lists. A newer method for verifying certificates is Online Certificate Status Protocol (OSCP), a real-time protocol for verifying certificates.

There are several different types of X.509 certificates. They each have at least the elements listed at the beginning of this section, but are for different purposes. The most common certificate types are listed here.

images Domain validation certificates are among the most common. These are used to secure communication with a specific domain. This is a low-cost certificate that website administrators use to provide TLS for a given domain.

images Wildcard certificates, as the name suggests, can be used more widely, usually with multiple sub-domains of a given domain. So rather than have a different X.509 certificate for each sub-domain, you would use a wildcard certificate for all sub-domains.

images Code-signing certificates are X.509 certificates used to digitally sign some type of computer code. These usually require more validation of the person requesting the certificate, before they can be issued.

images Machine/computer certificates are X.509 certificates assigned to a specific machine. These are often used in authentication protocols. For example, in order for the machine to sign into the network, it must authenticate using its machine certificate.

images User certificates are used for individual users. Like machine/computer certificates, these are often used for authentication. The user must present his or her certificate to authenticate prior to accessing some resource.

images E-mail certificates are used for securing e-mail. Secure Multipurpose Internet Mail Extensions (S/MIME) uses X.509 certificates to secure e-mail communications. PGP, of course, uses PGP certificates.

images A Subject Alternative Name (SAN) is not so much a type of certificate as a special field in X.509. It allows you to specify additional items to be protected by this single certificate. These could be additional domains or IP addresses.

images Root certificates are used for root authorities. These are usually self-signed by that authority.

PGP Certificates

Pretty Good Privacy (PGP) is not a specific encryption algorithm, but rather a system. It offers digital signatures, asymmetric encryption, and symmetric encryption. It is often found in e-mail clients. PGP was introduced in the early 1990s, and it’s considered to be a very good system.

PGP uses its own certificate format. The main difference, however, is that PGP certificates are self-generated. They are not generated by any certificate authority.

Hashing

A hash function, H, is a function that takes a variable-size input m and returns a fixed-size string. The value that is returned is called the hash value h or the digest. This can be expressed mathematically as h = H(m). There are three properties a hash function should have:

images Variable length input with fixed length output. In other words, no matter what you put into the hashing algorithm, the same sized output is produced.

images H(x) is one-way; you cannot “un-hash” something.

images H(x) is collision-free. Two different input values do not produce the same output. A collision refers to a situation where two different inputs yield the same output. A hash function should not have collisions.

Hashing is how Windows stores passwords. For example, if your password is “password,” then Windows will first hash it, producing something like:

0BD181063899C9239016320B50D3E896693A96DF

It then stores that hash in the SAM (Security Accounts Manager) file in the Windows System directory. When you log on, Windows cannot “un-hash” your password, so what Windows does is take whatever password you type in, hash it, and then compare the result with what is in the SAM file. If they match (exactly) then you can log in.

Storing Windows passwords is just one application of hashing. There are others. For example, in computer forensics, hashing a drive before starting a forensic examination is common practice. Then later you can always hash it again to see whether anything was changed (accidently or intentionally). If the second hash matches the first, then nothing has been changed.

In relationship to hashing, the term salt refers to random bits that are used as one of the inputs to the hash. Essentially, the salt is intermixed with the message that is to be hashed. Salt data complicates dictionary attacks that use pre-encryption of dictionary entries. It also is effective against rainbow table attacks. For best security, the salt value is kept secret, separate from the password database/file.

MD5

MD5 is a 128-bit hash that is specified by RFC 1321. It was designed by Ron Rivest in 1991 to replace an earlier hash function, MD4. In 1996, a flaw was found with the design of MD5. Although it was not a clearly fatal weakness, cryptographers began recommending the use of other algorithms, such as SHA-1. The biggest problem with MD5 is that it is not collision resistant.

SHA

The Secure Hash Algorithm is perhaps the most widely used hash algorithm today. Several versions of SHA now exist. SHA (all versions) is considered secure and collision free. The versions include

images SHA-1: This 160-bit hash function resembles the MD5 algorithm. This was designed by the National Security Agency (NSA) to be part of the Digital Signature Algorithm.

images SHA-2: This is actually two similar hash functions, with different block sizes, known as SHA-256 and SHA-512. They differ in the word size; SHA-256 uses 32-byte (256 bits) words whereas SHA-512 uses 64-byte (512 bits) words. There are also truncated versions of each standard, known as SHA-224 and SHA-384. These were also designed by the NSA.

images SHA-3: This is the latest version of SHA. It was adopted in October of 2012.

RIPEMD

RACE Integrity Primitives Evaluation Message Digest is a 160-bit hash algorithm developed by Hans Dobbertin, Antoon Bosselaers, and Bart Preneel. There exist 128-, 256-, and 320-bit versions of this algorithm, called RIPEMD-128, RIPEMD-256, and RIPEMD-320, respectively. These all replace the original RIPEMD, which was found to have collision issues. The larger bit sizes make this far more secure than MD5 or RIPEMD.

RIPEMD-160 was developed in Europe as part of RIPE project and is recommended by the German Security Agency. The authors of the algorithm describe RIPEMD as follows: “RIPEMD-160 is a fast cryptographic hash function that is tuned towards software implementations on 32-bit architectures. It has evolved from the 256-bit extension of MD4, which was introduced in 1990 by Ron Rivest. Its main design features are two different and independent parallel chains, the result of which are combined at the end of every application of the compression function.” To read the authors’ full explanation of RIPEMD-160, see www.esat.kuleuven.be/cosic/publications/article-317.pdf.

HAVAL

HAVAL is a cryptographic hash function. Unlike MD5, but like most other modern cryptographic hash functions, HAVAL can produce hashes of different lengths. HAVAL can produce hashes in lengths of 128 bits, 160 bits, 192 bits, 224 bits, and 256 bits. HAVAL also allows users to specify the number of rounds (3, 4, or 5) to be used to generate the hash. HAVAL was invented by Yuliang Zheng, Josef Pieprzyk, and Jennifer Seberry in 1992.

Understanding and Using Decryption

Obviously, if one can encrypt a message, one can decrypt it as well. The preferred method is, of course, to have the key and the algorithm, and hopefully the software used to encrypt and then easily decrypt the message. However, people attempting to breach your security will not have the algorithm and key and will want to break your encrypted transmissions or data.

Decryption is a science much like encryption. It employs various mathematical methods to crack encryption. You can find a number of utilities on the web that will actually crack encryption for you. As discussed earlier, no such thing exists as unbreakable encryption. However, the more secure an encryption technique is, the longer it will take to crack. If it takes months or years of dedicated effort to crack, then your data is secure. By the time someone cracks it, the information will likely no longer be relevant or useful to them.

Security professionals and security-savvy network administrators frequently use the same tools to inspect their systems that hackers use to try to break into them. Using the tools of hackers to try to crack an encryption method is a practical and straightforward way of testing data security.

Cracking Passwords

Although not exactly the same as breaking encrypted transmissions, cracking passwords is similar to it. If someone is able to successfully crack a password, particularly the administrator password, then other security measures are rendered irrelevant.

John the Ripper

John the Ripper is a password cracker popular with both network administrators and hackers. You can download it for free from www.openwall.com/john/.

This product is completely command line–based and has no Windows interface. It enables the user to select text files for word lists to attempt cracking a password. Although John the Ripper is less convenient to use because of its command-line interface, it has been around for a long time and is well regarded by both the security and hacking communities. Interestingly, a tool is available at www.openwall.com/passwdqc/ that ensures your passwords cannot easily be cracked by John the Ripper.

John the Ripper works with password files rather than attempting to crack live passwords on a given system. Passwords are usually encrypted and in a file on the operating system. Hackers frequently try to get that file off of a machine and download it to their own system so they can crack it at will. They might also look for discarded media in your dumpster in order to find old backup tapes that might contain password files. Each operating system stores that file in a different place:

images In Linux, it is /etc/passwd.

images In Windows 95, it is in a .pwl file.

images In Windows 2000 and beyond, it is in a hidden .sam file.

After you have downloaded John the Ripper, you can run it by typing in (at a command line) the word john followed by the file you want it to try to crack:

john passwd

To make it use a wordlist with rules only, type:

john -wordfile:/usr/dict/words -rules passwd

Cracked passwords will be printed to the terminal and saved in a file called john.pot, found in the directory into which you installed John the Ripper.

Using Rainbow Tables

In 1980 Martin Hellman described a cryptanalytic time-memory trade-off that reduces the time of cryptanalysis by using pre-calculated data stored in memory. Essentially, these types of password crackers are working with pre-calculated hashes of all passwords available within a certain character space, be that a-z or a-zA-z or a-zA-Z0-9 etc. These files are called rainbow tables. They are particularly useful when trying to crack hashes. Because a hash is a one-way function, the way to break it is to attempt to find a match. The attacker takes the hashed value and searches the rainbow tables seeking a match to the hash. If one is found, then the original text for the hash is found. Popular hacking tools such as Ophcrack depend on rainbow tables.

Using Other Password Crackers

Many other password crackers are available, many of which you can find on the Internet and download for free. The following list of websites might be useful in that search:

images Russian password crackers: www.password-crackers.com/crack.html

images Password recovery: www.elcomsoft.com/prs.html

images LastBit password recovery: http://lastbit.com/mso/Default.asp

Password crackers should be used only by administrators to test their own systems’ defenses. Attempting to crack another person’s password and infiltrate her system has both ethical and legal ramifications.

General Cryptanalysis

Rainbow tables are a way to get around passwords; however, cryptanalysis is the science of trying to find alternate ways to break cryptography. In most cases, it is not terribly successful. If you have watched the news in the past year or two, you are aware that the U.S. FBI has been unable to break the AES encryption on the iPhone. Cryptanalysis can be quite tedious, and with no guarantee of success. However, some common methods are discussed here.

Brute Force

This method simply involves trying every possible key. It is guaranteed to work, but is likely to take so long that it is simply not useable. For example, to break a Caesar cipher there are only 26 possible keys, which you can try in a very short time. But consider AES, with the smallest key size of 128 bits. If you tried 1 trillion keys a second, it could take 112,527,237,738,405,576,542 years to try them all. That is a bit longer than I care to wait!

Frequency Analysis

Frequency analysis involves looking at blocks of an encrypted message to determine if any common patterns exist. Initially, the analyst doesn’t try to break the code but looks at the patterns in the message. In the English language, the letters e and t and words like the, and, that, it, and is are very common. Single letters that stand alone in a sentence are usually limited to a and I.

A determined cryptanalyst looks for these types of patterns and, over time, may be able to deduce the method used to encrypt the data. This process can sometimes be simple, or it may take a lot of effort. This method works only on the historical ciphers we discussed at the beginning of this chapter. It does not work on modern algorithms.

Known Plaintext

This attack relies on the attacker having pairs of known plaintext along with the corresponding ciphertext. This gives the attacker a place to start attempting to derive the key. With modern ciphers, it would still take many billions of such combinations to have a chance at cracking the cipher. This method was, however, successful at cracking the German Naval Enigma. The code breakers at Bletchley Park realized that all German Naval messages ended with Heil Hitler. They used this known plaintext attack to crack the key.

Chosen Plaintext

In this attack, the attacker obtains the ciphertexts corresponding to a set of plaintexts of their own choosing. This allows the attacker to attempt to derive the key used and thus decrypt other messages encrypted with that key. This can be difficult, but it is not impossible. Advanced methods such as differential cryptanalysis are types of chosen plaintext attacks.

Related Key Attack

This is like a chosen-plaintext attack, except the attacker can obtain ciphertexts encrypted under two different keys. This is actually a very useful attack if you can obtain the plaintext and matching ciphertext.

Birthday Attack

This is an attack on cryptographic hashes, based on something called the birthday theorem. The basic idea is this: How many people would you need to have in a room to have a strong likelihood that two people would have the same birthday (month and day, but not year)?

Obviously, if you put 367 people in a room, at least 2 of them must have the same birthday, since there are only 365 days in a year, plus one more in a leap year.

The paradox is not asking how many people you need to guarantee a match, just how many you need to have a strong probability.

Even with 23 people in the room, you have a 50 percent chance that 2 will have the same birthday.

The probability that the first person does not share a birthday with any previous person is 100 percent, because there are no previous people in the set. That can be written as 365/365.

The second person has only one preceding person, and the odds that the second person has a birthday different from the first are 364/365.

The third person might share a birthday with two preceding people, so the odds of having a birthday from either of the two preceding people are 363/365. Because each of these are independent, we can compute the probability as follows:

365/365 × 364/365 × 363/365 * 362/365 … × 342/365

(342 is the probability the 23rd person shares a birthday with a preceding person.) When we convert these to decimal values, it yields (truncating at the third decimal point):

1 × 0.997 × 0.994 × 0.991 × 0.989 × 0.986 × … 0.936 = 0.49, or 49 percent

This 49 percent is the probability that 23 people will not have any birthdays in common; thus, there is a 51 percent (better than even odds) chance that 2 of the 23 will have a birthday in common.

The math works out to about 1.7 √n to get a collision. Remember a collision is when two inputs produce the same output. So for an MD5 hash, you might think you need 2128 + 1 different inputs to get a collision. And for a guaranteed collision you do. That is an exceedingly large number: 3.4028236692093846346337460743177e+38. But the birthday paradox tells us that to just have a 51 percent chance of there being a hash, you only need 1.7 √n (n being 2128) inputs. That number is still very large: 31,359,464,925,306,237,747.2. But it is much smaller than the effort of trying every single input!

Differential Cryptanalysis

Differential cryptanalysis is a form of cryptanalysis applicable to symmetric key algorithms. This was invented by Eli Biham and Adi Shamir. Essentially it is the examination of differences in an input and how that affects the resultant difference in the output. It originally worked only with chosen plaintext. However, it could also work with known plaintext and ciphertext only.

The attack is based on seeing pairs of plaintext inputs that are related by some constant difference. The usual way to define the differences is via XOR operation, but other methods can be used. The attacker computes the differences in the resulting ciphertexts and is looking for some statistical pattern. The resulting differences are called the differential. Put another way, differential cryptanalysis focuses on finding a relationship between the changes that occur in the output bits as a result of changing some of the input bits.

Linear Cryptanalysis

This technique was invented by Mitsuru Matsui. It is a known plaintext attack and uses a linear approximation to describe the behavior of the block cipher. Given enough pairs of plaintext and corresponding ciphertext, bits of information about the key can be obtained. Obviously, the more pairs of plaintext and ciphertext one has, the greater the chance of success. Linear cryptanalysis is based on finding affine approximations to the action of a cipher. It is commonly used on block ciphers.

Remember cryptanalysis is an attempt to crack cryptography. For example, with the 56-bit DES key, brute force could take up to 256 attempts. Linear cryptanalysis will take 247 known plaintexts.1 This is better than brute force, but still impractical for most situations.

Steganography

Steganography is the art and science of writing hidden messages in such a way that no one, apart from the sender and intended recipient, suspects the existence of the message; this is a form of security through obscurity. The message is often hidden in some other file such as a digital picture or audio file, so as to defy detection.

The advantage of steganography, over cryptography alone, is that messages do not attract attention to themselves. If no one is aware the message is even there, then they won’t even try to decipher it. In many cases messages are encrypted and hidden via steganography.

The most common implementation of steganography utilizes the least significant bits in a file in order to store data. By altering the least significant bit, one can hide additional data without altering the original file in any noticeable way.

Here are some basic steganography terms you should know:

images Payload is the data to be covertly communicated. In other words, it is the message you want to hide.

images The carrier is the signal, stream, or data file into which the payload is hidden.

images The channel is the type of medium used. This might be still photos, video, or sound files.

Although the use of digital steganography is obviously rather recent, the concept of hiding messages is not. Here are some instances of historical hidden messages:

images The ancient Chinese wrapped notes in wax and swallowed them for transport.

images In ancient Greece a messenger’s head might be shaved, a message written on his head, then his hair was allowed to grow back.

images In the early 1500s Johannes Trithemius wrote a book on cryptography and described a technique where a message was hidden by having each letter taken as a word from a specific column.

In more recent times, but before the advent of computers, other methods were used to hide messages:

images During WWII the French Resistance sent messages written on the backs of couriers using invisible ink.

images Microdots are images/undeveloped film the size of a typewriter period, embedded on innocuous documents. These were said to be used by spies during the Cold War.

The most common way steganography is accomplished today is via least significant bits. Every file has a certain number of bits per unit of the file. For example, an image file in Windows is 24 bits per pixel. If you change the least significant of those bits, then the change is not noticeable with the naked eye. For example, one can hide information in the least significant bits of an image file. With least significant bit (LSB) replacement, certain bits in the carrier file are replaced.

Steganophony is a term for hiding messages in sound files. This can be done with the LSB method or other methods, such as echo hiding, which adds extra sound to an echo inside an audio file—that extra sound conceals information.

Information can also be hidden in video files. Various methods to accomplish this exist. Discrete Cosine Transform is often used for video steganography. This method alters values of certain parts of the individual frames. The usual method is to round up the values.

A number of tools are available for implementing steganography. Many are free or at least have a free trial version. A few of these tools are listed here:

images QuickStego: Easy to use but very limited

images Invisible Secrets: Much more robust with both a free and commercial version

images MP3Stego: Specifically for hiding payload in MP3 files

images Stealth Files 4: Works with sound files, video files, and image files

images SNOW: Hides data in whitespace

Steganalysis

Forensics examiners must be concerned with detecting steganography and extracting the hidden information. This task is usually done by software, but understanding what the software is doing is important. By analyzing changes in an image’s close color pairs, the steganalyst can determine whether LSB substitution was used. Close color pairs consist of two colors whose binary values differ only in the LSB.

Several methods exist for analyzing an image to detect hidden messages, one of which is the Raw Quick Pair (RQP) method. This is based on statistics of the numbers of unique colors and close-color pairs in a 24-bit image. RQP analyzes the pairs of colors created by LSB embedding.

Another option uses the chi-squared method from statistics. Chi-square analysis calculates the average LSB and builds a table of frequencies and pair of values. It then performs a chi-square test on these two tables. Essentially, it measures the theoretical versus the calculated population difference.

Steganalysis of audio files involves examining noise distortion in the carrier file. Noise distortion could indicate the presence of a hidden signal.

Quantum Computing and Quantum Cryptography

Innovations in quantum computing promise to significantly increase computing power. This will provide advances in numerous aspects of computing including data mining, artificial intelligence, and other applications. However, the increase in computing power will also present a challenge for cryptography. Computing relies on individual bits to store information. Quantum computing relies on qubits.

The essential issue with quantum computing is the ability to represent more than two states. Current computing technology, using classical bits, can only represent binary values. Qubits, or quantum bits, store data via the polarization of a single photon. The two basic states are horizontal or vertical polarization. However, quantum mechanics allows for a superposition of the two states at the same time. This is something simply not possible in a classical bit. The two states of a qubit are represented with quantum notation as |0> or |1>. These represent horizontal or vertical polarization. A qubit is the superposition of these two basis states. This superposition is represented as | ψ > = α|0> + β|1>.

Essentially a classical bit can represent a one or a zero. A qubit can represent a one, a zero, or any quantum superposition of those two qubit states. This superposition allows for much more powerful computing. The superposition allows the qubit to store a one, a zero, both a one and a zero, or an range of values between one and zero. This significantly increases data storage and data processing power.

There are already quantum based algorithms that are far superior at factoring large numbers than are classical algorithms. That is a critical issue because the widely used RSA algorithm is based on the difficulty of factoring a large number into its prime factors. When quantum computers become a reality, that factoring problem will no longer be difficult, and RSA will be obsolete. Various key exchange algorithms such as Diffie-Hellman depend on the difficulty in solving the discrete logarithm problem. The two most significant improvements to Diffie-Hellman, ElGamal, and MQV (Menezes-Qu-Vanstone) also depend on the discrete logarithm problem. Elliptic curve cryptography is based on the difficulty of solving the discrete logarithm problem with respect to an elliptic curve. Quantum algorithms will also make the discrete logarithm problem quite solvable, thus rendering these algorithms obsolete as well.

Essentially all current asymmetric cryptography is based on one of these two general classes of number theoretic problems: factoring or solving the discrete logarithm problem. Essentially, when quantum computers become a practical reality, rather than just a research interest, all modern asymmetric algorithms will become obsolete. This is a significant concern for cyber security because all modern e-commerce, encrypted e-mail, and secure communications over a network depend on these algorithms. Currently, the NIST is working on a multi-year study to determine standards for post-quantum cryptography. And there are several promising cryptographic systems. It is beyond the scope of this chapter to explore quantum cryptography in depth; however, lattice based cryptography and multi-variate cryptography look to be good candidates for post quantum computing cryptographic systems.

Summary

Encryption is a basic element of computer security. You should never send sensitive data that has not been encrypted. Encrypting your system’s hard drives is also a good idea, so that if they are stolen, the valuable data on the drives is less likely to be compromised. Reading this chapter won’t qualify you as a cryptographer, but the information it provides does offer a basic outline of how cryptography works. In the following exercises, you will practice using different cipher methods and learn more about a number of encryption methods.

Test Your Skills

MULTIPLE CHOICE QUESTIONS

1. Why is encryption an important part of security?

A. No matter how secure your network is, the data being transmitted is still vulnerable without encryption.

B. Encrypted transmissions will help stop denial of service attacks.

C. A packet that is encrypted will travel faster across networks.

D. Encrypted transmissions are only necessary with VPNs.

2. Which of the following is the oldest known encryption method?

A. PGP

B. Multi-alphabet

C. Caesar cipher

D. Cryptic cipher

3. Which of the following is the primary weakness in the Caesar cipher?

A. It does not disrupt letter frequency.

B. It does not use complex mathematics.

C. It does not use a public key system.

D. There is no significant weakness; the Caesar cipher is adequate for most encryption uses.

4. An improvement on the Caesar cipher that uses more than one shift is called a what?

A. DES encryption

B. Multi-alphabet substitution

C. IDEA

D. Triple DES

5. Which binary mathematical operation can be used for a simple encryption method?

A. Bit shift

B. OR

C. XOR

D. Bit swap

6. Why is the method described in Question 5 not secure?

A. It does not change letter or word frequency.

B. The mathematics are flawed.

C. It does not use a symmetric key system.

D. The key length is too short.

7. Which of the following is a symmetric key system using blocks?

A. RSA

B. DES

C. PGP

D. Diffie-Hellman

8. What is the primary advantage of the encryption algorithm described in Question 7?

A. It is complex.

B. It is unbreakable.

C. It uses asymmetric keys.

D. It is relatively fast.

9. What size key does the algorithm in question 7 use?

A. 255 bit

B. 128 bit

C. 56 bit

D. 64 bit

10. What type of encryption uses a different key to encrypt the message than it uses to decrypt the message?

A. Private key

B. Public key

C. Symmetric

D. Secure

11. Which of the following is an encryption method developed by three mathematicians in the 1970s?

A. PGP

B. DES

C. DSA

D. RSA

12. Which encryption algorithm uses a variable-length symmetric key?

A. RSA

B. Blowfish

C. DES

D. PGP

13. Which of the following encryption algorithms is a block cipher, and uses the Rijndael algorithm?

A. DES

B. RSA

C. AES

D. NSA

14. If you are using a block cipher to encrypt large amounts of data, which of the following would be the most important consideration when deciding which cipher to use (assuming all of your possible choices are well known and secure)?

A. Size of the keys used

B. Speed of the algorithm

C. Whether or not it has been used by any military group

D. Number of keys used

15. Which of the following has three different key sizes it can use?

A. AES

B. DES

C. Triple DES

D. IDEA

16. Which of the following is the most common legitimate use for a password cracker?

A. There is no legitimate use for a password cracker.

B. Military intelligence agents using it to break enemy communications.

C. Testing the encryption of your own network.

D. Trying to break the communications of criminal organizations in order to gather evidence.

17. What is a digital signature?

A. A piece of encrypted data added to other data to verify the sender

B. A scanned-in version of your signature, often in .jpg format

C. A signature that is entered via a digital pad or other device

D. A method for verifying the recipient of a document

18. What is the purpose of a certificate?

A. To verify that software is virus free

B. To guarantee that a signature is valid

C. To validate the sender of a digital signature or software

D. To validate the recipient of a document

19. Who issues certificates?

A. The UN encryption authority

B. The United States Department of Defense

C. A private certificate authority

D. The Association for Computing Machinery

EXERCISES

EXERCISE 6.1: Using the Caesar Cipher

Note: This exercise is well suited for group or classroom exercises.

1. Write a sentence in normal text.

2. Use a Caesar cipher of your own design to encrypt it.

3. Pass the encrypted sentence to another person in your group or class.

4. Time how long it takes that person to break the encryption.

5. (optional) Compute the mean time for the class to break Caesar ciphers.

EXERCISE 6.2: Using Binary Block Ciphers

1. Write a single sentence in normal text.

2. Convert the text to ASCII. You can find several websites with ASCII code tables, such as http://www.asciitable.com.

3. Convert each character to binary.

4. Create a random 16-bit key. You can literally simply write down a random string of 1s and 0s.

5. XOR that key with your text.

6. Pass the encrypted sentence to another student in class and give her a chance to decipher it.

7. When all students have had adequate opportunity to break their fellow students’ encryption, have them give each other the appropriate key.

EXERCISE 6.3: Certificate Authorities

1. Search the web for certificate authorities.

2. Compare two certificate authorities. Which of the two would you recommend?

3. What reasons would you give a client for recommending the certificate authority you chose?

EXERCISE 6.4: Password Cracking

1. Download a password cracker of your choice.

2. Attempt to crack the password on your own PC.

3. Describe the results of your experiment. Were you able to crack the password? If so, how long did it take?

4. How does changing your password to make it more difficult affect the time it takes to crack the password?

PROJECTS

PROJECT 6.1: RSA Encryption

Using the web or other resources, write a brief paper about quantum encryption. Of particular interest should be the current state of research in that field (as opposed to simple background/history). You should also address what significant impediments there are to implementing quantum encryption.

PROJECT 6.2: Programming Caesar Cipher

Note: This project is for those students with some programming background.

Write a simple program, in any language you prefer or in the language your instructor recommends, that can perform a Caesar cipher. This chapter explains how this cipher works and offers some ideas for how to use ASCII codes for encryption in any standard programming language.

PROJECT 6.3: Historical Encryption

Find an encryption method that has been used historically but is no longer used (such as the Enigma cipher of the Germans in World War II). Describe how that encryption method works, paying particular attention to how it contrasts with more modern methods.

PROJECT 6.4: Password Cracking

Follow the steps in Exercise 6.4 with at least two other password-cracking utilities, and then write a report comparing and contrasting the password crackers. Note which one you think is most efficient. Also explain how using such a utility can be beneficial to a network administrator.

 

1. Matsui, M. (1999) Linear Cryptanalysis Method for DES. https://www.cs.bgu.ac.il/~beimel/Courses/crypto2001/Matsui.pdf.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.22.51.103