Chapter 1
Introduction to Block Ciphers

1.1 Block Cipher in Cryptology

1.1.1 Introduction

Information includes our private data that we desire to protect from unwilling leakage depending on the application. Cryptology is a field of research that offers appropriate solutions for the data protection by exploring how to construct a secure communication for fair information exchange. Modern cryptology often deals with digitalized data rather than analog data that cannot be expressed simply with a series of 0s and 1s. In our daily life, information is exchanged by digital devices such as radio frequency identification (RFID) tags, smart cards, and smart phones, where a computational resource is limited. Therefore, it is one of the most important challenges in cryptology to realize an efficient implementation of cryptosystems.

c1f001

Figure 1.1 Basic model for a symmetric-key cryptosystem

1.1.2 Symmetric-Key Ciphers

There are various ways to realize encryption that is a kind of computational process for information to be protected. In a symmetric-key cipher, information is encrypted with a secret key, and it is expected that the owner of the secret key can decrypt the encrypted information correctly. For instance, let us see the situation, where Alice would like to send a message to Bob in a secure way. If the secret key, K, is shared only with Alice and Bob, only Bob can decrypt the message from the encrypted message. The original and the encrypted messages are called plaintext and ciphertext, respectively. Figure 1.1 illustrates the encryption and decryption processes.

The encryption by Alice can be written as

1.1 equation

The ciphertext is decrypted by Bob as

1.2 equation

Only Bob can decrypt and read the message, and Eve, who does not own the secret key, cannot decrypt it.

Alice and Bob need to compute the cryptographic operations based on the functions, c1-math-0004 and c1-math-0005. The simpler the functions are, the more efficiently they can compute. For instance, Vernam cipher, invented in 1917, uses just XOR operations as

1.3 equation

to convert plaintext and ciphertext. The XOR operation is explained in Section 1.2.1.

However, in order to guarantee the security, that is, in order that Eve cannot obtain any information of message from C, the secret key needs to be refreshed with a random number for each encryption/decryption. In other words, in order to communicate securely with the Vernam cipher, a very long key, which is the same size as M, is required. This is significantly inefficient. In general, encryption anddecryption processes are based on the trade-offs between cost, performance, and security.

1.1.3 Efficient Block Cipher Design

The fundamental idea to achieve an efficient encryption scheme is designing a fixed-input size encryption scheme, and iteratively applying this scheme to encrypt arbitrary length messages. Such a fixed-input size encryption scheme is called block cipher, and the group of bits with the fixed-input size is called block. If the unit of operation is small enough, for example, 1 bit or 1 byte, such a symmetric-key cipher is called stream cipher. As block ciphers are expected to compute encryption and decryption efficiently, they have an iterated structure, and repeat the same function several times. Such a function is called round function. The iterated structure contributes to achieving a small program code in software and implementing a compact circuit design in hardware.

Modern block ciphers are mainly categorized into two kinds: Feistel structure and substitution-permutation network (SPN) structure. Feistel structure was employed in data encryption standard (DES) block cipher proposed in 1977. Including FEAL and Camellia, the Feistel structure has been employed by many block ciphers.

On the contrary, Advanced Encryption Standard (AES) employed SPN structure. AES is the main target of this book as it is one of the most widely used block ciphers, and it contains fundamental ideas of SPN structure. The basic mathematics to understand SPN structure and AES specification will be explained later in this chapter.

1.2 Boolean Function and Galois Field

Boolean functions are used in most of the block ciphers including AES. A Boolean function, f, is described as

1.4 equation

where c1-math-0011 is called Boolean domain and c1-math-0012 is the set of all n-tuples c1-math-0014, where c1-math-0015 are all in Boolean domain.1

1.2.1 INV, OR, AND, and XOR Operators

The most simple Boolean function is inversion or the INV operation that is a bit complement. It operates as

1.5a, 1.5b equation

where c1-math-0018 is used for representing the INV operation. Alternatively, the logic symbol, c1-math-0019, is also used for INV. In this book, we allow both usage, that is, c1-math-0020.

For the case of c1-math-0021, representative Boolean functions are OR, AND, and XOR. OR is defined as

1.6a, 1.6b equation

Likewise, AND and XOR are defined, respectively, as

1.7a, 1.7b equation
1.8a, 1.8b equation

“c1-math-0024,” “c1-math-0025,” and “c1-math-0026” are used for representing OR, AND, and XOR operations.

The truth table for OR, AND, and XOR is described in Table 1.1.

Table 1.1 Truth table for basic operators

x y c1-math-0029 c1-math-0030 c1-math-0031
0 0 0 0 0
0 1 1 0 1
1 0 1 0 1
1 1 1 1 0

1.2.2 Galois Field

Finite filed or Galois field deals with a finite number of elements. Over a Galois filed, addition, subtraction, multiplication, and division are defined. Galois field with the smallest order is called a binary field or c1-math-0032. For instance, addition, multiplication, additive inverse, and multiplicative inverse over c1-math-0033 are defined in Table 1.2.

Table 1.2 Operations over c1-math-0034

x y c1-math-0037 c1-math-0038 c1-math-0039 c1-math-0040
0 0 0 0 0 –
0 1 1 0 0 –
1 0 1 0 1 1
1 1 0 1 1 1

As can be found from Tables 1.1 and 1.2, addition and multiplication over c1-math-0041 are realized, respectively, with XOR and AND.

Table 1.3 Operations over c1-math-0043

x y c1-math-0046 c1-math-0047 c1-math-0048 c1-math-0049
0 0
0 1
0 2
0 3
0 4
1 0
1 1
1 2
1 3
1 4
2 0
2 1
2 2
2 3
2 4
3 0
3 1
3 2
3 3
3 4
4 0
4 1
4 2
4 3
4 4

1.2.3 Extended Binary Field and Representation of Elements

Binary field, c1-math-0050, can be extended to a large field size called extended binary field, c1-math-0051, where n is a positive integer. Especially, in the case of AES, operations in c1-math-0053 are of special interest. The number of elements of c1-math-0054 is c1-math-0055. There are several different representations for the elements, which affect the cost and speed performance of software and hardware implementations.

1.2.3.1 Polynomial Basis Representation

As the number of elements of c1-math-0056 is a power of 2, each bit of the binary representation can be used for each coefficient of a polynomial whose degree is c1-math-0057. Any element in c1-math-0058 can be expressed with the so-called polynomial basis as

1.9 equation

where c1-math-0060. For instance, 16 elements in c1-math-0061 can be expressed with the binary representation, c1-math-0062. By assigning each bit to the coefficient of a polynomial of x, we have c1-math-0064. Addition of two field elements, for example, c1-math-0065, can be calculated as

1.10 equation

as c1-math-0067 over c1-math-0068.

Multiplication of the two field elements, for example, c1-math-0069, needs modular reduction with an irreducible polynomial, for example, c1-math-0070, which specifies the field.2 Therefore, the multiplication result becomes as

1.11 equation

1.2.3.2 Normal Basis Representation

Alternatively, elements in c1-math-0142 are described using normal basis as

where c1-math-0144 and c1-math-0145 are roots of an irreducible polynomial, c1-math-0146, that is,

1.13 equation

Furthermore,

1.14 equation

This can be confirmed by Fermat little theorem.

For the case of c1-math-0149, suppose that c1-math-0150, that is, c1-math-0151. Addition in the normal basis representation of c1-math-0152 can be calculated simply by XORing each coefficient of two elements in the form of Equation ((1.12) ). That is,

1.15 equation

where the normal basis representations of c1-math-0154 and c1-math-0155 can be found in Table 1.4.

Table 1.4 Representations of elements for irreducible polynomial c1-math-0073 in c1-math-0074

Binary c1-math-0075 Bit concatenation Hex. Polynomial basis Power of c1-math-0076 Normal basis c1-math-0077
c1-math-0078 c1-math-0079 0 c1-math-0080 c1-math-0081 (0, 0, 0, 0)
c1-math-0082 c1-math-0083 1 c1-math-0084 c1-math-0085 (1, 1, 1, 1)
c1-math-0086 c1-math-0087 2 x c1-math-0089 (0, 0, 0, 1)
c1-math-0090 c1-math-0091 4 c1-math-0092 c1-math-0093 (0, 0, 1, 0)
c1-math-0094 c1-math-0095 8 c1-math-0096 c1-math-0097 (1, 0, 1, 1)
c1-math-0098 c1-math-0099 9 c1-math-0100 c1-math-0101 (0, 1, 0, 0)
c1-math-0102 c1-math-0103 b c1-math-0104 c1-math-0105 (0, 1, 0, 1)
c1-math-0106 c1-math-0107 f c1-math-0108 c1-math-0109 (0, 1, 1, 1)
c1-math-0110 c1-math-0111 7 c1-math-0112 c1-math-0113 (1, 1, 0, 0)
c1-math-0114 c1-math-0115 e c1-math-0116 c1-math-0117 (1, 0, 0, 0)
c1-math-0118 c1-math-0119 5 c1-math-0120 c1-math-0121 (1, 1, 0, 1)
c1-math-0122 c1-math-0123 a c1-math-0124 c1-math-0125 (1, 0, 1, 0)
c1-math-0126 c1-math-0127 d c1-math-0128 c1-math-0129 (0, 1, 1, 0)
c1-math-0130 c1-math-0131 3 c1-math-0132 c1-math-0133 (1, 1, 1, 0)
c1-math-0134 c1-math-0135 6 c1-math-0136 c1-math-0137 (0, 0, 1, 1)
c1-math-0138 c1-math-0139 c c1-math-0140 c1-math-0141 (1, 0, 0, 1)

This is correct as c1-math-0156. By using the fact of c1-math-0157, multiplication in c1-math-0158, for example, c1-math-0159 is calculated as

1.16 equation

The most advantageous point to use the normal basis representation lies in the fact that squaring is easy to compute in c1-math-0161. As can be found in Table 1.4, squaring for c1-math-0162 is c1-math-0163. More precisely, in squaring, the elements in the normal basis representation are derived as

1.17 equation
1.18 equation
1.19 equation

This merit is often used in both software and hardware implementations. However, in general, implementing modular multiplication in the normal basis requires more computation than that in the polynomial basis. Hereafter, we mainly use polynomial basis representation.

1.3 Linear and Nonlinear Functions in Boolean Algebra

1.3.1 Linear Functions

Addition and multiplication by a constant are linear functions in c1-math-0167. Suppose that c1-math-0168 and c1-math-0169, where c1-math-0170. Addition of c1-math-0171 and c1-math-0172 is

1.20 equation

From the fact that c1-math-0174, it is confirmed that addition in c1-math-0175 is a linear function.

For multiplication by a constant B, there exist c1-math-0177 such that

1.21 equation

Therefore, we know that such multiplication in c1-math-0179 is also a linear function. It can be easily understood considering the fact that multiplication by a constant can be computed with multiple additions of c1-math-0180 in c1-math-0181.

 

1.3.2 Nonlinear Functions

On the contrary, (normal) modular multiplication and multiplicative inverse in c1-math-0193 are nonlinear functions. The AES block cipher uses a nonlinear function in a part of the design that is based on modular multiplicative inversion in c1-math-0194. The multiplicative inverse computation can be done with Fermat's (little) theorem as

1.22 equation

for c1-math-0196. In AES, multiplicative inverse of 0 is mapped to 0.

One of the most optimal ways to compute the inversion is to find addition chain. On the basis of the Itoh–Tsujii algorithm, the computation can be performed with four multiplications and seven modular squarings as

1.23a, 1.23b, 1.23c, 1.23d, 1.23d, 1.23e, 1.23f equation

Itoh–Tsujii algorithm utilizes the relationship of

1.24 equation
c1f002

Figure 1.2 Block cipher design strategy. Nonlinear operations and linear operations are alternately applied

1.4 Linear and Nonlinear Functions in Block Cipher

As discussed in Section 1.3, logical operations are classified into linear operations and nonlinear operations. Composition of linear operations is also linear. Hence, if all the cipher's operations are linear, the resulting cipher is also linear, which is insecure. In order to break the linearity of the cipher, nonlinear operations need to be introduced. However, in general, the cost of implementing nonlinear operations is more expensive than the one for linear operations.

The strategy of the block cipher design is alternately applying nonlinear and linear operations several times. To avoid the heavy cost, nonlinear operation is designed to be weak but its cost is small. In many cases, a nonlinear operation is designed to be operated on a smaller size than the block size, and the operation is applied in parallel to all the data. Then, in order to compensate the weak nonlinear computations, a linear operation mixes the entire block. The strategy is depicted in Figure 1.2. In the following, each of the nonlinear layer and linear layer is further detailed.

1.4.1 Nonlinear Layer

In order to reduce the implementation cost, a nonlinear operation is designed to work on a fraction of the data. Typical choices of the size are 64 bits, 32 bits, 8 bits (called byte), 4 bits (called nibble), and 1 bit. The size of the nonlinear operation is determined depending on the following two aspects.

  • type of nonlinear operation
  • target platform in which the cipher is implemented.

1.4.1.1 Modular Operation

When the cipher is designed for being used in high-end CPUs, the implementation cost is not a big issue but the operation should be optimized for instructions adopted in such a CPU. Currently, many CPUs operate on 64 or 32 bits, thus the size of the nonlinear operation is also adjusted to 64 or 32 bits. The high-end CPUs can perform the modular addition or subtraction efficiently. The nonlinearity is often introduced by addition or subtraction on modulo c1-math-0199 or c1-math-0200.

1.4.1.2 Substitution Table (S-box)

When the cipher is designed for more resource-constrained hardwares such as micro-controllers, the balance of the implementation cost and the computation efficiency is important. When the CPU register size is smaller than 32 bits, the 32- or 64-bit modular addition cannot be performed efficiently. The hardware implementation also faces some problems for those operations. Typical choices of the size of the nonlinear operation are 8 or 4 bits. Because the size is small, using the substitution table is a popular approach to introduce the nonlinearity. The substitution table, or S-box, is a pre-specified mapping from the input values to the output values. An example of 4-bit to 4-bit S-box is given in Table 1.5.

Table 1.5 An example of 4-bit to 4-bit S-box, c1-math-0201

Input x 0 1 2 3 4 5 6 7 8 9 a b c d e f
Output c1-math-0203 c 0 f a 2 b 9 5 8 3 d 7 1 e 6 4

All values are described in the hexadecimal format.

 

In this S-box, the input value 5 is transformed to b according to the table. A 4-bit to 4-bit S-box is implemented only with c1-math-0210 bits of memory, which is very small. An 8-bit to 8-bit S-box is implemented only with c1-math-0211 bits of memory, which is bigger than the 4-bit to 4-bit S-box but can mix the data faster than the 4-bit to 4-bit S-box.

1.4.1.3 Boolean Function

A Boolean function is the smallest tool to introduce the nonlinearity. By using an AND or OR operation, the nonlinearity is introduced in 1 bit. When the cipher is designed to be a very resource constraint environment such as RFID, a Boolean function is a typical choice as a source of the nonlinearity. A Boolean function can also fit the high-end CPUs. Thirty two-bit CPUs can operate bit-wise for each of the 32 bits in parallel. If this is combined with modular additions (not bit-wise), the nonlinearity can be introduced quickly.

It is also a popular approach to specify the input and output correspondence of some Boolean functions as an S-box. If the cipher is implemented with some memory, the S-box can be implemented, and the nonlinearity of several bits can be introduced with 1 table look-up. If the cipher is implemented with small hardware, the logic of the Boolean function is implemented to minimize the implementation cost.

1.4.1.4 Balanced Choice

Unfortunately, there is no obvious choice that shows the overwhelming performance in any implementation environment. When the cipher is designed in multi-platforms, that is, both the high- and low-end environment, an S-box maybe chosen as the source of nonlinearity that shows a relatively good performance in both the environments. The popular choices of the nonlinear operations are summarized in Figure 1.3.

c1f003

Figure 1.3 Substitution-permutation network. Popular choices of size and type of nonlinear operations

Note that the data is mixed by alternately applying a nonlinear operation and a linear operation. The choice of the nonlinear operation also depends on the choice of the linear operation.

1.4.2 Linear Layer

The purpose of the linear layer is mixing all the output data from the nonlinear layer in which the data is updated in a small part independently. The linear layer is required to be performed efficiently and implemented lightly.

One of the simplest linear operations is XOR. A part of the nonlinear layer output is XORed to another part to mix the data from different parts. The XOR operation can be performed several times between different parts to mix the data more.

The bit-rotation and bit-shift are also simple linear operations. For example, by applying the 1-bit rotation to the entire data, 1-bit from each part will be moved to the next part. The XOR, bit-shift, and bit-rotation can be implemented efficiently in various platforms, thus they are suitable for the block cipher design.

Another important example is a multiplication over a finite field or modular multiplication. Suppose that the size of the nonlinear operation is n bits and each bit of n-bit value represents each coefficient of a polynomial whose degree is c1-math-0214. As explained in Section 1.3, multiplication over a finite field with some irreducible polynomial c1-math-0215 is a linear function. Suppose that the entire data consists of m parts of n-bit data, that is, its size is c1-math-0218 bits. The purpose of the linear function is mixing m independent outputs from the nonlinear layer. In order to mix all the m outputs, c1-math-0221 matrices are often used.

For instance, when c1-math-0222, four n-bit values c1-math-0224, c1-math-0225, c1-math-0226, c1-math-0227 are updated to four n-bit values c1-math-0229, c1-math-0230, c1-math-0231, c1-math-0232 by the following matrix operation:

1.25 equation

where each c1-math-0234 is a constant number.

Any combination of linear operations is a linear operation. A popular design approach is combining different types of light linear operations to introduce a strong mixing effect. An example of the linear layer is depicted in Figure 1.4.

c1f004

Figure 1.4 An example of linear layer consisting of three linear operations. Nonlinear layer is supposed to update data in eight parts independently

1.4.2.1 Maximum Distance Separable Matrix (MDS Matrix)

A maximum distance separable matrix (in short MDS matrix) is a matrix with some special property useful for block cipher's design. Considering the usage in block cipher AES, only the case with the same input and output size is discussed here. Let x be the m-component input to the matrix, M, and y be the m-component output from the matrix, that is, c1-math-0240. The matrix M is called MDS if no distinct input-output pairs c1-math-0242 collide in m or more components.

For the application to cryptology, the fact that at least c1-math-0244 components differ in distinct pairs of c1-math-0245 is important. In other words, the MDS matrix guarantees a certain amount of change in different input and output values. For instance, suppose that the value of x is slightly modified to c1-math-0247, which differs only 1 bit from x, and the corresponding output value c1-math-0249 is computed. The multiplication by the MDS matrix can guarantee that all the m components of the outputs y and c1-math-0252 have different values, meaning that the 1-bit change of the input always changes all the m components of the output.

1.4.3 Substitution-Permutation Network (SPN)

Substitution-permutation network, which is often called SPN, is a design approach to mix a fixed-length input data. SPN is a special form of the iterative application of nonlinear and linear computations.

The substitution layer (or S-layer), which applies a nonlinear operation, is supposed to be an S-box application in a small size. The permutation layer (or P-layer) applies a linear operation to mix the results of the S-layer efficiently.

The SPN structure is adopted in many block ciphers. AES, which is a main target of this book, also adopts the SPN structure.

1.5 Advanced Encryption Standard (AES)

AES is the most widely used block cipher in present time in both governmental and commercial purposes. AES is standardized internationally, and a lot of academic researches and industrial developments have been proposed about AES. This section explains the specification of AES.

The block cipher AES supports three different key sizes: 128 bits, 192 bits, and 256 bits. The corresponding AES algorithms are called AES-128, AES-192, and AES-256, respectively. AES supports a fixed block size: 128 bits. That is to say, when the key is determined, AES provides a bijective map from 128-bit plaintext to 128-bit ciphertext, that is, for a key K, AES-128c1-math-0255, AES-192c1-math-0256, AES-256c1-math-0257:c1-math-0258 (Figure 1.5).

c1f005

Figure 1.5 Three algorithms of AES

1.5.1 Specification of AES-128 Encryption

In high level, the 128-bit key K is expanded to eleven 128-bit subkeys c1-math-0260 according to the key schedule function, or KSF.

  1. The 128-bit key K is set to the first 128-bit subkey c1-math-0262.
  2. The KSF is computed to update 128-bit subkey c1-math-0263 to another 128-bit subkey c1-math-0264.
  3. Similarly, the KSF is iterated nine times. In each time, 128-bit subkey c1-math-0265 is updated to another 128-bit subkey c1-math-0266 for c1-math-0267.

Then, a plaintext is encrypted to a ciphertext as follows:

  1. An XOR of the plaintext and the first subkey c1-math-0268 is computed, and this value is set to a 128-bit internal state value c1-math-0269. This operation is often called whitening.
  2. The 128-bit internal state value c1-math-0270 is updated to c1-math-0271 by computing a round function, which also takes as input subkey c1-math-0272. This operation is called round 1 or the first round.
  3. The round function is iterated nine times to update the internal state value c1-math-0273 to c1-math-0274. In round i, where c1-math-0276, the round function takes as input c1-math-0277 and outputs c1-math-0278. Note that the round function in the last round is slightly different from the other rounds. The last state that is c1-math-0279 is the ciphertext.

The computation structure of AES-128 in a function level is described in Figure 1.6.

c1f006

Figure 1.6 High-level computation structure of the encryption of AES-128. c1-math-0280 and c1-math-0281 denote the round function and KSF, respectively. c1-math-0282 is the last round function, which is different from the other rounds

In practice, it is not necessary to compute all the 11 subkeys at the very beginning. For example, the last subkey will not be used until the very end of the encryption process. Thus, generating the last subkey and keeping it in a register is a waste of computation resource. In order to minimize the computation resource, the KSF and the round function updates are computed in parallel round by round. The AES-128 encryption algorithm in the function level can be described as Algorithm 1.1.

1.5.1.1 Preliminaries to Describe Computation Details

In AES, byte represents 8-bit values. AES is a byte-oriented cipher. All operations are defined at byte level. Let v be a byte value and c1-math-0296 be its bit-wise representation, of which the corresponding vector representation is c1-math-0297. In AES, each bit of a byte represents coefficients of polynomial of c1-math-0298:

1.26 equation

A byte value can be represented in hexadecimal. For example, the byte c1-math-0300 represents the polynomial c1-math-0301.

Addition

Addition of two bytes, c1-math-0302 and c1-math-0303, returns

1.27 equation
Multiplication

Multiplication in c1-math-0305 corresponds with multiplication of polynomials modulo, an irreducible binary polynomial of degree 8. The irreducible polynomial of AES is defined as

1.28 equation

Because the multiplication by c1-math-0307 and c1-math-0308 is later introduced inside the round function, more details of the operation c1-math-0309 are explained here. c1-math-0310 is written as

1.29 equation
1.30 equation

When c1-math-0313, the result is c1-math-0314 according to the definition of byte. When c1-math-0315, the irreducible polynomial c1-math-0316 is subtracted from the result. Subtraction is the inverseof the addition. Because the addition is the XOR, the subtraction is also a simple application of the XOR operations. Hence, the result is

1.31 equation
1.32 equation

According to the definition of byte, the result is c1-math-0319.

1.5.1.2 S-box

AES uses a substitution-box (S-box) to mix the data. The S-box is used in both of the round function and the KSF, and thus is defined here. The S-box used in AES is a pre-determined bijective mapping from an 8-bit value to an 8-bit value. The definition of the AES S-box is shown in Table 1.6. Hereafter, the S-box transformation is described as c1-math-0320. For example, c1-math-0321 returns 2f, and c1-math-0322 returns 03.

Table 1.6 AES S-box

Lower four digits
0 1 2 3 4 5 6 7 8 9 a b c d e f
0 63 7c 77 7b f2 6b 6f c5 30 01 67 2b fe d7 ab 76
1 ca 82 c9 7d fa 59 47 f0 ad d4 a2 af 9c a4 72 c0
2 b7 fd 93 26 36 3f f7 cc 34 a5 e5 f1 71 d8 31 15
3 04 c7 23 c3 18 96 05 9a 07 12 80 e2 eb 27 b2 75
4 09 83 2c 1a 1b 6e 5a a0 52 3b d6 b3 29 e3 2f 84
5 53 d1 00 ed 20 fc b1 5b 6a cb be 39 4a 4c 58 cf
6 d0 ef aa fb 43 4d 33 85 45 f9 02 7f 50 3c 9f a8
Upper 7 51 a3 40 8f 92 9d 38 f5 bc b6 da 21 10 ff f3 d2
four digits 8 cd 0c 13 ec 5f 97 44 17 c4 a7 7e 3d 64 5d 19 73
9 60 81 4f dc 22 2a 90 88 46 ee b8 14 de 5e 0b db
a e0 32 3a 0a 49 06 24 5c c2 d3 ac 62 91 95 e4 79
b e7 c8 37 6d 8d d5 4e a9 6c 56 f4 ea 65 7a ae 08
c ba 78 25 2e 1c a6 b4 c6 e8 dd 74 1f 4b bd 8b 8a
d 70 3e b5 66 48 03 f6 0e 61 35 57 b9 86 c1 1d 9e
e e1 f8 98 11 69 d9 8e 94 9b 1e 87 e9 ce 55 28 df
f 8c a1 89 0d bf e6 42 68 41 99 2d 0f b0 54 bb 16

1 All the numbers in this table are in hexadecimal.

Note that the S-box and the inverse S-box transformations are not identical. As explained later, AES decryption algorithm requires the look-up table for the inverse of c1-math-0323, that is c1-math-0324.

1.5.1.3 State

The block size of AES is 128 bits. In AES, 128-bit data is called state. The 128-bit state consists of 16 bytes, and is represented as a c1-math-0325 two-dimensional array as depicted in Figure 1.7.

c1f007

Figure 1.7 AES state. Each cell denotes a byte

1.5.1.4 Key Schedule Function (KSF)

The 128-bit key K is loaded into a 128-bit subkey c1-math-0327. Then, c1-math-0328 is computed for c1-math-0329. The input c1-math-0330 is represented as a state. The state is further divided into four columns: c1-math-0331, c1-math-0332, c1-math-0333, and c1-math-0334. The output c1-math-0335 is computed column by column. At first, a temporary 4-byte value tmp is generated by using the value of c1-math-0336.

  1. c1-math-0337.
  2. Apply the S-box defined in Table 1.6 to each of the 4 bytes in tmp.
  3. Rotate tmp by 1 byte. Precisely, let c1-math-0338 be the 4 bytes of tmp. Then, tmp is updated to c1-math-0339.
  4. XOR the pre-specified 1-byte constant c1-math-0340 to the first byte of tmp.

Then, by using the 4-byte value tmp, the next subkey c1-math-0341 is generated as follows.

  1. c1-math-0342.
  2. c1-math-0343.
  3. c1-math-0344.
  4. c1-math-0345.

The key schedule procedure for AES-128 is depicted in Figure 1.8.

c1f008

Figure 1.8 Key schedule function of AES-128. The key schedule function is iterated for c1-math-0346

1.5.1.5 Round Function (RF)

The round function takes as input the previous 128-bit state c1-math-0347 and subkey c1-math-0348, and generates the next 128-bit state c1-math-0349. The round function consists of four transformations called SubBytes, ShiftRows, MixColumns, and AddRoundKey. It updates the state by following Algorithm 1.2.

SubBytes c1-math-0360

SubBytes is a byte-wise operation. It updates the state by applying the S-box defined in Table 1.6 to each of the 16 bytes of the state. It is worth noting that the c1-math-0361 operation is the only nonlinear one in the AES round function.

ShiftRows c1-math-0362

ShiftRows is a row-wise byte positions swap. The state consists of four rows: row 0, row 1, row 2, and row 3. Each row of the state consists of 4 bytes. The ShiftRows operation applies a left cyclic shift by i bytes to the 4 bytes of row i. The ShiftRows operation is depicted in Figure 1.9.

c1f009

Figure 1.9 ShiftRows operation

MixColumns c1-math-0365

MixColumns is a column-wise data mixing operation. It takes as input 4 bytes in a column and computes another 4-byte value. The same computation is applied to all of the four columns. Let c1-math-0366 and c1-math-0367 be the 4-byte input and 4-byte output, respectively. The c1-math-0368 is computed by the following matrix operation:

1.33 equation

Each element in the matrix is written in hexadecimal.

The c1-math-0370 operation was designed to satisfy the MDS property explained in Section 1.4.2.1. The impact of modifying 1 input byte always expands to all the 4 output bytes. More generally, the sum of the number of modified input bytes and the number of modified output bytes is always greater than or equal to 5.

AddRoundKey c1-math-0371

AddRoundKey updates the state by XORing the subkey c1-math-0372 to the state.

Last Round Function c1-math-0373

In the last round (Round 10 for AES-128), the round function is different from the middle rounds. The c1-math-0374 operation is not computed that is, only the c1-math-0375, and c1-math-0376 operations are performed.

1.5.2 AES-128 Decryption

To decrypt ciphertext C to P, the round function is applied in reverse order. The KSF is exactly the same. Eleven subkeys c1-math-0385 are generated from K. Different from the encryption algorithm, c1-math-0387 is firstly used, and then the decryption is processed with c1-math-0388, and finally with c1-math-0389.

Inside the round function, four operations are computed in reverse order. The inverse of the c1-math-0390 operation is exactly the same as the original c1-math-0391 operation because the XOR operation is involution.

For the inverse of the c1-math-0392 operation, the inversion matrix is required. Let c1-math-0393 and c1-math-0394 be the 4-byte input and 4-byte output to the inverse c1-math-0395 operation, respectively. The c1-math-0396 is computed by the following matrix operation:

1.34 equation

Each element in the matrix is again written in hexadecimal.

The inverse of the c1-math-0398 operation is relatively simple. It applies a right cyclic shift by i bytes to the 4 bytes of row i.

The inverse of the c1-math-0401 operation requires another table to substitute each byte value. The inverse S-box, denoted by c1-math-0402, is defined in Table 1.7.

Table 1.7 AES inverse S-box

Lower four digits
0 1 2 3 4 5 6 7 8 9 a b c d e f
0 52 09 6a d5 30 36 a5 38 bf 40 a3 9e 81 f3 d7 fb
1 7c e3 39 82 9b 2f ff 87 34 8e 43 44 c4 de e9 cb
2 54 7b 94 32 a6 c2 23 3d ee 4c 95 0b 42 fa c3 4e
3 08 2e a1 66 28 d9 24 b2 76 5b a2 49 6d 8b d1 25
4 72 f8 f6 64 86 68 98 16 d4 a4 5c cc 5d 65 b6 92
5 6c 70 48 50 fd ed b9 da 5e 15 46 57 a7 8d 9d 84
6 90 d8 ab 00 8c bc d3 0a f7 e4 58 05 b8 b3 45 06
Upper 7 d0 2c 1e 8f ca 3f 0f 02 c1 af bd 03 01 13 8a 6b
four digits 8 3a 91 11 41 4f 67 dc ea 97 f2 cf ce f0 b4 e6 73
9 96 ac 74 22 e7 ad 35 85 e2 f9 37 e8 1c 75 df 6e
a 47 f1 1a 71 1d 29 c5 89 6f b7 62 0e aa 18 be 1b
b fc 56 3e 4b c6 d2 79 20 9a db c0 fe 78 cd 5a f4
c 1f dd a8 33 88 07 c7 31 b1 12 10 59 27 80 ec 5f
d 60 51 7f a9 19 b5 4a 0d 2d e5 7a 9f 93 c9 9c ef
e a0 e0 3b 4d ae 2a f5 b0 c8 eb bb 3c 83 53 99 61
f 17 2b 04 7e ba 77 d6 26 e1 69 14 63 55 21 0c 7d

All the numbers in this table are in hexadecimal.

1.5.3 Specification of AES-192 and AES-256

AES supports not only the 128-bit key but also the 192-bit and the 256-bit keys. For all the key sizes, round function is identical. The differences are the number of rounds computed and the KSF.

  • AES-128 generates eleven 128-bit subkeys c1-math-0403 from 128-bit K, and the number of rounds is 10.
  • AES-192 generates thirteen 128-bit subkeys c1-math-0405 from 192-bit K, and the number of rounds is 12.
  • AES-256 generates fifteen 128-bit subkeys c1-math-0407 from 256-bit K, and the number of rounds is 14.

1.5.3.1 The Key Schedule Function for AES-192

The 192-bit key K is loaded into a c1-math-0410 array of bytes, which is denoted by c1-math-0411. Then, c1-math-0412 is computed for c1-math-0413. The state is further divided into six columns: c1-math-0414, c1-math-0415, c1-math-0416, c1-math-0417, c1-math-0418, and c1-math-0419. The output c1-math-0420 is computed column by column. At first, a temporary 4-byte value tmp is generated by using the value of c1-math-0421.

  1. c1-math-0422.
  2. Apply the S-box defined in Table 1.6 to each of the 4 bytes in tmp.
  3. Rotate tmp by 1 byte. Precisely, let c1-math-0423 be the 4 bytes of tmp. Then, tmp is updated to c1-math-0424.
  4. XOR the pre-specified 1-byte constant c1-math-0425 to the first byte of tmp.

Then, by using the 4-byte value tmp, the next subkey c1-math-0426 is generated as follows.

  1. c1-math-0427.
  2. c1-math-0428.
  3. c1-math-0429.
  4. c1-math-0430.
  5. c1-math-0431.
  6. c1-math-0432.

Among the 192-bit of the c1-math-0433, the first four columns (128 bits) are set to c1-math-0434, and the remaining two columns (64 bits) are set to the left half of c1-math-0435. Among the 192-bit of the c1-math-0436, the first two columns (64 bits) are set to the right half of c1-math-0437, and the remaining four columns (128 bits) are set to c1-math-0438. Similarly, c1-math-0439 are obtained.

Note that c1-math-0440 is the last four columns of c1-math-0441, and then c1-math-0442 is the first four columns of c1-math-0443. The last two columns of c1-math-0444 are never used. Thus, in order to omit the redundant computations, the KSF should be processed up to the first four columns of c1-math-0445.

The key schedule procedure for AES-192 is depicted in Figure 1.10.

c1f010

Figure 1.10 Key schedule function of AES-192. The key schedule function is iterated until 13 subkeys are generated

1.5.3.2 The Key Schedule Function for AES-256

The KSF for AES-256 can be similarly defined. The size of the key state is 256 bits consisting of c1-math-0446-byte array. Each key state produces two subkeys, and 15 subkeys c1-math-0447 are generated.

The update computation is very similar to the ones for AES-128 and AES-192. In order to mix the data quickly, another S-box layer is introduced between columns 3 and 4. The detailed procedure is omitted. The key schedule procedure for AES-256 is depicted in Figure 1.11.

c1f011

Figure 1.11 Key schedule function of AES-256. The key schedule function is iterated until 15 subkeys are generated

Note that c1-math-0448 is the first four columns of c1-math-0449. The last four columns of c1-math-0450 are never used. Thus, in order to omit the redundant computations, the KSF should be processed up to the first four columns of c1-math-0451.

 

c1f012

Figure 1.12 Notations for each state of AES-128

1.5.4 Notations to Describe AES-128

The computation of AES-128 with all the operations is described in Figure 1.12. The state after the first XOR of the plaintext and c1-math-0452 is denoted by c1-math-0453. Similarly in round i, where c1-math-0455,

  • the state at the beginning of the round is denoted by c1-math-0456;
  • the state after the c1-math-0457 operation is denoted by c1-math-0458;
  • the state after the c1-math-0459 operation is denoted by c1-math-0460;
  • the state after the c1-math-0461 operation is denoted by c1-math-0462;
  • the state after the c1-math-0463 operation is denoted by c1-math-0464, which is equivalent to c1-math-0465.

As explained before, the state is represented by a c1-math-0466-byte array. Using two subindices often causes misunderstanding, and thus each byte position is also denoted by a single sequence c1-math-0467. For state S, the byte c1-math-0469, where c1-math-0470 is converted to the byte c1-math-0471. The byte positions in the single sequence are shown in Figure 1.13.

c1f013

Figure 1.13 Notations for inside AES state

Byte values of state S in several different byte positions c1-math-0473 are often denoted by c1-math-0474. For example, the 4-byte value in the column 0 of state S is denoted by c1-math-0476.

  • The first column, or column 0, of state S is denoted by c1-math-0478, which is equivalent to c1-math-0479.
  • The second column, or column 1, of state S is denoted by c1-math-0481, which is equivalent to c1-math-0482.
  • The third column, or column 2, of state S is denoted by c1-math-0484, which is equivalent to c1-math-0485.
  • The fourth column, or column 3, of state S is denoted by c1-math-0487, which is equivalent to c1-math-0488.
  • The first row, or row 0, of state S is denoted by c1-math-0490, which is equivalent to c1-math-0491.
  • The second row, or row 1, of state S is denoted by c1-math-0493, which is equivalent to c1-math-0494.
  • The third row, or row 2, of state S is denoted by c1-math-0496, which is equivalent to c1-math-0497.
  • The fourth row, or row 3, of state S is denoted by c1-math-0499, which is equivalent to c1-math-0500.

State c1-math-0501 becomes state c1-math-0502 after the c1-math-0503 operation. During this process, 4 bytes in c1-math-0504 are moved to different byte positions in c1-math-0505. The moved positions are denoted by c1-math-0506.

  • For state S, 4 bytes of c1-math-0508 are equivalent to c1-math-0509.
  • For state S, 4 bytes of c1-math-0511 are equivalent to c1-math-0512.
  • For state S, 4 bytes of c1-math-0514 are equivalent to c1-math-0515.
  • For state S, 4 bytes of c1-math-0517 are equivalent to c1-math-0518.

Those 4-byte positions are called diagonal.

Similarly 4 bytes in c1-math-0519 are moved to different byte positions in c1-math-0520 through the inverse of the c1-math-0521 operation. The moved positions are denoted by c1-math-0522.

  • For state S, 4 bytes of c1-math-0524 are equivalent to c1-math-0525.
  • For state S, 4 bytes of c1-math-0527 are equivalent to c1-math-0528.
  • For state S, 4 bytes of c1-math-0530 are equivalent to c1-math-0531.
  • For state S, 4 bytes of c1-math-0533 are equivalent to c1-math-0534.

Those 4-byte positions are called inverse diagonal.

Further Reading

  1. Daemen J and Rijmen V June 1998 AES submission document on Rijndael.
  2. Daemen J and Rijmen V 2002 The Design of Rijndeal: AES—The Advanced Encryption Standard (AES). Springer-Verlag.
  3. Deschamps JP 2009 Hardware Implementation of Finite-Field Arithmetic 1st edn. McGraw-Hill, Inc., New York, NY.
  4. Paar C and Pelzl J 2010 Understanding Cryptography: A Textbook for Students and Practitioners. Springer-Verlag, New York.

 

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.134.17