Chapter 10

Secure Systems

Abstract

This chapter introduces the key concepts in implementing block ciphers such as the data encryption standard (DES). A stream cipher operates on a digital data stream one or more bits at a time. A block cipher operates on complete blocks of data at any one time and produces a ciphertext block of equal size. DES, in common with other block ciphers is based around a structure called a Feistel Lattice so it is useful to describe how this works.

Keywords

Block ciphers

Secure systems

DES

10.1 Introduction to Block Ciphers

The data encryption standard (DES) is a symmetric block cipher. A stream cipher operates on a digital data stream one or more bits at a time. A block cipher operates on complete blocks of data at any one time and produces a ciphertext block of equal size. DES is a block cipher that operates on data blocks of 64 bits in size. DES uses a 64-bit key 8 × 8 including 1 bit for parity, so the actual key is 56 bits. DES, in common with other block ciphers, is based around a structure called a Feistel Lattice so it is useful to describe how this works.

10.2 Feistel Lattice Structures

A block cipher operates on a plaintext block of n bits to produce a block of ciphertext of n bits. For the algorithm to be reversible (i.e., for decryption to be possible) there must be a unique mapping between the two sets of blocks. This can also be called a non singular transformation. For example, consider the following transformations as shown in Figure 10.1.

f10-01-9780080971292
Figure 10.1 Reversible and irreversible transformations.

Obviously, this is essentially a substitution cipher, which may be susceptible to the standard statistical analysis techniques used for simple cryptanalysis of text (such as frequency analysis). As the block size increases, then this becomes increasingly less feasible. An obvious practical difficulty with this approach is the number of transformations required as n increases. This mapping is essentially the key and the number of bits will determine the key size. Therefore, for an n-bit general substitution block cipher, the key size is calculated as follows:

key=n×2n

si1_e  (10.1)

For a specific case where n = 64, the key size becomes 64 × 264 = 1021.

In order to get around this complexity problem, Feistel proposed an approach called a product cipher whereby the combination of several simple steps leads to a much more cryptographically secure solution than any of the component ciphers used. His approach relies on the alternation of two types of function:

 Diffusion

 Confusion

These two concepts are grounded in an approach developed by Shannon used in most standard block ciphers in common use today. Shannon’s goal was to define cryptographic functions that would not be susceptible to statistical analysis. He therefore proposed two methods for reducing the ability of statistical cryptanalysis to find the original message, classified as diffusion and confusion.

In diffusion, the statistical structure of the plaintext is dissipated throughout the long-term statistics of the ciphertext. This is achieved by making each bit of the plaintext affect the value of many bits of the ciphertext. An example of this would be to add letters to a ciphertext such that the frequency of each letter is the same, regardless of the message. In binary block ciphers the technique uses multiple permutations and functions such that each bit of the ciphertext is affected by multiple bits in the plaintext.

Each block of plaintext is transformed into a block of ciphertext, and this depends on the key. Confusion aims to make the relationship between the ciphertext and the key as complex as possible to reduce the possibility of ascertaining the key. This requires a complex substitution algorithm, as a linear substitution would not protect the key.

Both diffusion and confusion are the cornerstones of successful block cipher design.

The result of these requirements is the Feistel Lattice (shown in Figure 10.2). This is the basic architecture that is used in block ciphers such as DES.

f10-02-9780080971292
Figure 10.2 Feistel lattice structure.

The inputs to the algorithm are the plaintext (of length 2w bits) and a key K. The plaintext is split into two halves L and R, and the data is then passed through n rounds of processing and then recombined to produce the ciphertext. Each round has an input Li−1 and Ri−1 derived from the previous round and a subkey Ki, derived from the overall key K. Each round has the same structure. The left half of the data has a substitution performed. This requires a round function F to be performed on the right half of the data and then XORed with the left half. Finally a permutation is performed that requires the interchange of the two halves of the data.

The implementation of a Feistel network has the following key parameters:

 Block size A larger block size generally means greater security, but reduced speed. 64-bit block sizes are very heavily used as being a reasonable trade-off—although AES now uses 128 bits.

 Key Size The same trade-off applies as for block size. Generally 64 bits is not now considered adequate and 128 bits is preferred.

 Number of rounds Each round adds additional security. A single round is inadequate, but 16 is considered standard.

 Subkey generation The more complex this algorithm is, the more secure the overall system will be.

 Round function Greater complexity again means greater resistance to cryptanalysis.

10.3 The Data Encryption Standard (DES)

10.3.1 Introduction

The Data Encryption Standard (DES) was adopted by the National Institute of Standards and Technology (NIST) in 1977 as the Federal Information Processing Standards 46 (FIPS PUB 46). As mentioned previously, the algorithm operates on plaintext blocks of 64 bits and the key size is 56 bits. By 1999, NIST had decreed that DES was no longer secure and should only be used for legacy systems and that triple DES should be used instead. As will be described later, DES has since been superceded by the Advanced Encryption Standards (AES). The coarse structure (overall architecture) of DES is shown in Figure 10.3.

f10-03-9780080971292
Figure 10.3 DES coarse structure.

The center section (where the main repetition occurs) is called the fine structure and is where the details of the encryption take place. This fine structure is detailed in Figure 10.4.

f10-04-9780080971292
Figure 10.4 DES fine structure.

The fine structure of DES consists of several important functional blocks:

 Initial permutation Fixed, known mapping 64-64 bits.

 Key transformations Circular L shift of keys by A(i) bits in round (A(i) is known and fixed).

 Compression Permutation Fixed known subset of 56-bit input mapped onto 48-bit output.

 Expansion permutation 32-bit data shuffled and mapped (both operations fixed and known) onto 48 bits by duplicating 16 input bits. This makes diffusion quicker.

Another significant section of the algorithm is the substitution or S-box. The nonlinear aspect of the cipher is vital in cryptography. In DES the eight S boxes each contain four different (fixed and known) 4:4 input maps. These are selected by the extra bits created in the expansion box. The S boxes are structured as shown in Figure 10.5.

f10-05-9780080971292
Figure 10.5 DES S-box architecture.

The final part of the DES structure is the key generation architecture for the individual round keys and this is given in Figure 10.6.

f10-06-9780080971292
Figure 10.6 DES round key generation.

The remaining functional block is the initial and final permutation. The initial permutation (P-Box) is a 32:32 fixed, known bit permutation. The final permutation is the inverse of the initial permutation. The initial permutation is defined using the following table:

585042342618102
605244362820124
625446383022146
645648403224168
57494133251791
595143352719113
615345372921135
635547393123157

t0010

10.3.2 DES VHDL Implementation

DES can be implemented in VHDL using a structural or a functional approach. As has been discussed previously, there are advantages to both methods; however, the DES algorithm is tied implicitly to the structure, so a structural approach will give an efficient implementation.

Implementing the initial permutation in VHDL requires a 64-bit input vector and a 64-bit output vector. We can create this in VHDL with an entity that defines an input and output std_logic vector as follows. The architecture is simply the assignment of bits from input to output according to the initial permutation table previously defined.

1 library   ieee;

2 use   ieee. std_logic_1164. all;

3

4 entity   des_ip   is   port

5 (

6 d   :   in  std_logic_vector(1 to 64);

7 y : out std_logic_vector(1 to 64)

8 );

9 end des_ip;

10

11 architecture behavior of des_ip is

12 begin

13 y(1)<=d(58); y(2)<=d(50); y(3)<=d(42); y(4)<=d(34);

14 y(5)<=d(26); y(6)<=d(18); y(7)<=d(10); y(8)<=d(2);

15 y(9)<=d(60); y(10)<=d(52); y(11)<=d(44); y(12)<=d(36);

16 y(13)<=d(28); y(14)<=d(20); y(15)<=d(12); y(16)<=d(4);

17 y(17)<=d(62); y(18)<=d(54); y(19)<=d(46); y(20)<=d(38);

18 y(21)<=d(30); y(22)<=d(22); y(23)<=d(14); y(24)<=d(6);

19 y(25)<=d(64); y(26)<=d(56); y(27)<=d(48); y(28)<=d(40);

20 y(29)<=d(32); y(30)<=d(24); y(31)<=d(16); y(32)<=d(8);

21 y(33)<=d(57); y(34)<=d(49); y(35)<=d(41); y(36)<=d(33);

22 y(37)<=d(25); y(38)<=d(17); y(39)<=d(9);  y(40)<=d(1);

23 y(41)<=d(59); y(42)<=d(51); y(43)<=d(43); y(44)<=d(35);

24 y(45)<=d(27); y(46)<=d(19); y(47)<=d(11); y(48)<=d(3);

25 y(49)<=d(61); y(50)<=d(53); y(51)<=d(45); y(52)<=d(37);

26 y(53)<=d(29); y(54)<=d(21); y(55)<=d(13); y(56)<=d(5);

27 y(57)<=d(63); y(58)<=d(55); y(59)<=d(47); y(60)<=d(39);

28 y(61)<=d(31); y(62)<=d(23); y(63)<=d(15); y(64)<=d(7);

29 end behavior;

As this function is purely combinatorial we don’t need to have a register (i.e., clocked input) on this model, although we could implement that if required using a simple process.

As shown in the previous description of the expansion function, we need to take a word consisting of 32 bits and expand it to 48 bits. This requires a translation table as shown below. Notice that there are duplicates in the cell which means that you only need 32 input bits to obtain 48 output bits.

3212345
456789
8910111213
121314151617
161718192021
202122232425
242526272829
28293031321

t0015

We can use a VHDL model similar to the initial permutation function, except that in this case there are 32 input bits and 48 output bits. Notice that some of the input bits are repeated, giving a straightforward expansion function.

1 library   ieee;

2 use   ieee. std_logic_1164. all;

3

4 entity   des_e   is   port

5 (

6 d   :   in   std_logic_vector(1 to 32);

7 y : out std_logic_vector(1 to 48)

8 );

9 end des_e;

The architecture is simply the assignment of bits from input to output according to the initial permutation table previously defined.

1 architecture behavior of des_e  is

2 begin

3 y (1)<= d (32);  4y(2)<=d(1);  y(3)<=d(2);  y(4)<=d(3);

4 y(5)<=d(4);  y(6)<=d(5);  y(7)<=d(4);  y(8)<=d(5);

5 y(9)<=d(6);  y(10)<=d(7);  y(11)<=d(8);  y(12)<=d(9);

6 y(13)<=d(8);  y(14)<=d(9);  y(15)<=d(10);  y(16)<=d(11);

7 y(17)<=d(12);  y(18)<=d(13); y(19)<=d(12); y(20)<=d(13);

8 y(21)<=d(14);  y(22)<=d(15); y(23)<=d(16); y(24)<=d(17);

9 y(25)<=d(16);  y(26)<=d(17); y(27)<=d(18); y(28)<=d(19);

10 y(29)<=d(20);  y(30)<=d(21); y(31)<=d(20); y(32)<=d(21);

11 y(33)<=d(22); y(34)<=d(23); y(35)<=d(24); y(36)<=d(25);

12 y(37)<=d(24); y(38)<=d(25); y(39)<=d(26); y(40)<=d(27);

13 y(41)<=d(28); y(42)<=d(29); y(43)<=d(28); y(44)<=d(29);

14 y(45)<=d(30); y(46)<=d(31); y(47)<=d(32); y(48)<=d(1);

15 end behavior;

The final permutation block is the permutation marked (P) on the fine structure after the key function. This is a straightforward bit substitution function with 32 bits input and 32 bits output. The bit translation table is shown in the following table:

1672021
29122817
1152326
5183110
282414
322739
1913306
2211425

t0020

This is implemented in VHDL using exactly the same approach as the previous expansion and permutation functions as follows:

1 library   ieee;

2 use   ieee. std_logic_1164. all;

3

4 entity   des_p   is   port

5 (

6 d   :   in   7std_logic_vector(1 to 32);

7 y : out std_logic_vector(1 to 32)

8 );

9 end des_p;

The architecture is simply the assignment of bits from input to output according to the initial permutation table previously defined.

1 architecture   behavior   of   des_p  is

2 begin

3 y (1)<= d (16);   4y(2)<=d(7);  y(3)<=d(20);  y(4)<=d(21);

4 y(5)<=d(29);  y(6)<=d(12);  y(7)<=d(28);  y(8)<=d(17);

5 y(9)<=d(1); y(10)<=d(15);  y(11)<=d(23);  y(12)<=d(26);

6 y(13)<=d(5);  y(14)<=d(18);  y(15)<=d(31);  y(16)<=d(10);

7 y(17)<=d(2);  y(18)<=d(8);  y(19)<=d(24);  y(20)<=d(14);

8 y(21)<=d(32);  y(22)<=d(27);  y(23)<=d(3);  y(24)<=d(9);

9 y(25)<=d(19); y(26)<=d(13); y(27)<=d(30); y(28)<=d(6);

10 y(29)<=d(22); y(30)<=d(11); y(31)<=d(4);  y(32)<=d(25);

11 end behavior;

The nonlinear part of the DES algorithm is the S box. This is a set of 6->4 bit transformations that reduce the 48 bits of the expanded word in the DES f function to the 32 bits for the next round. The required row and column are obtained from the data passed into the S box. The data into the S box is a 6-bit binary word. The row is obtained from 2.b1 + b6 and the column is obtained from b2b3b4b5. For example, S(011011) would give a row of 01 (1) and a column of 1101 (13). For S8 this would result in a value returning of 1110 (14).

The basic S-box entity can therefore be constructed using the following VHDL:

1 library   ieee;

2 use   ieee. std_logic_1164. all;

3 entity   des_sbox   is

4 port   (

5 d   :   in   std_logic_vector (1 to 6);

6 y : out std_logic_vector (1 to 4)

7 );

8 end entity des_sbox;

One approach is to define the row and column from the input D word and then calculate the output Y word from that using a lookup table approach or minimize the logic as a truth table. The basic architecture could then look something like this:

1 architecture   behavior   of   sbox   is

2 signal r : std_logic_vector (1 to 2);

3 signal c : std_logic_vector (3 to 6);

4 begin

5 r <= d ( 1 to 2);

6 c <= d (3 to 6 );

7 −− the look up table or logic goes here

8 end;

Another approach is to define a simple lookup table with the input d as the unique address and the output y stored in the memory; this is exactly the same as a ROM, so the input is defined as an unsigned integer to look up the required value. In this case the memory is defined in exactly the same way as the ROM separately in this book.

The S-box substitutions are given in the table following and the VHDL can either use the lookup table approach to store the address of each substitution, or logic can be used to decode the correct output.

RowColumn Number
[0][1][2][3][4][5][6][7][8][9][10][11][12][13][14][15]
S1
[0]1441312151183106125907
[1]0157414213110612119538
[2]4114813621115129731050
[3]1512824917511314100613
S2
[0]1518146113497213120510
[1]3134715281412011069115
[2]0147111041315812693215
[3]1381013154211671205149
S3
[0]1009146315511312711428
[1]1370934610285141211151
[2]1364981530111212510147
[3]1101306987415143115212
S4
[0]7131430691012851112415
[1]1381156150347212110149
[2]1069012117131513145284
[3]3150610113894511127214
S5
[0]2124171011685315130149
[1]1411212471315015103986
[2]4211110137815912563014
[3]1181271142136150910453
S6
[0]1211015926801334147511
[1]1015427129561131401138
[2]9141552812370410113116
[3]4321295151011141760813
S7
[0]4112141508133129751061
[1]1301174911014351221586
[2]1411131237141015680592
[3]6111381410795015142312
S8
[0]1328461511110931450127
[1]1151381037412561101492
[2]7114191214206101315358
[3]2114741081315129035611

t0025

In order to use this table, the appropriate S box is selected and then the 2 bits of the row select the appropriate row and the same for the column. For example, for S box S1, if the row is 3 (11) and the Column is 10 (1010) then the output can be read off as 3 (0011). This can be coded in VHDL using nested case statements as follows:

1 case row is

2 when 0 =>

3   case column is

4 when 0 => y <= 14;

5 when 1 => y <= 64;

 6−− and so on

7 end case; 

8 when 1 =>

9 case column is

10 −− and the lookup goes here

11 End case 

12 −− and so on for all the rows

13 end case;

Obviously this is quite cumbersome, but also very easy to code automatically using a simple code generator and offers the possibility of the synthesis tool carrying out logic optimization and providing a much more efficient implementation than a memory block.

10.3.3 DES Verilog Implementation

DES can also be implemented in Verilog using a structural or a functional approach. As has been discussed previously, there are advantages to both methods; however, the DES algorithm is tied implicitly to the structure, so a structural approach will give an efficient implementation.

Implementing the initial permutation in Verilog requires a 64-bit input vector and a 64-bit output vector. In Verilog this is very easy to create by simply specifying the size of the input and output accordingly. In this combinatorial example (the same as the VHDL example previously in this chapter), we have not used an enable signal, but this would be easy to add, and then of course we can implement the output with a high impedance (z) state for unknown input values.

1 module desip ( y , d );

2 // Define the IO

3 input [64:1] d;

4 output [64:1] y;

6 // Define the output as a register

7 reg [64:1] y;

8

9 // Assign the permutations on any change in d

10 always @ d

11 begin

12 y [1]<= d [58]; 13y[2]<=d[50];  y[3]<=d[42];  y[4]<=d[34];

13 y[5]<=d[26];  y[6]<=d[18];  y[7]<=d[10];  y[8]<=d[2];

14 y[9]<=d[60];  y[10]<=d[52];  y[11]<=d[44];  y[12]<=d[36];

15 y[13]<=d[28];  y[14]<=d[20];  y[15]<=d[12];  y[16]<=d[4];

16 y[17]<=d[62];  y[18]<=d[54];  y[19]<=d[46];  y[20]<=d[38];

17 y[21]<=d[30];  y[22]<=d[22];  y[23]<=d[14];  y[24]<=d[6];

18 y[25]<=d[64];  y[26]<=d[56];  y[27]<=d[48];  y[28]<=d[40];

19 y[29]<=d[32];  y[30]<=d[24];  y[31]<=d[16];  y[32]<=d[8];

20 y[33]<=d[57];  y[34]<=d[49];  y[35]<=d[41];  y[36]<=d[33];

21 y[37]<=d[25];  y[38]<=d[17];  y[39]<=d[9];  y[40]<=d[1];

22 y[41]<=d[59];  y[42]<=d[51];  y[43]<=d[43];  y[44]<=d[35];

23 y[45]<=d[27];  y[46]<=d[19];  y[47]<=d[11];  y[48]<=d[3];

24 y[49]<=d[61];  y[50]<=d[53];  y[51]<=d[45];  y[52]<=d[37];

25 y[53]<=d[29];  y[54]<=d[21];  y[55]<=d[13];  y[56]<=d[5];

26 y[57]<=d[63];  y[58]<=d[55];  y[59]<=d[47];  y[60]<=d[39];

27 y[61]<=d[31];  y[62]<=d[23];  y[63]<=d[15];  y[64]<=d[7];

28 end

29 endmodule

As this function is purely combinatorial we don’t need to have a register (i.e., clocked input) on this model, although we could implement that if required using a simple addition of a clock.

As shown in the previous description of the expansion function, we need to take a word consisting of 32 bits and expand it to 48 bits. This requires a translation table as shown below. Notice that there are duplicates in the cell, which means that you only need 32 input bits to obtain 48 output bits.

3212345
456789
8910111213
121314151617
161718192021
202122232425
242526272829
28293031321

t0030

We can use a Verilog model similar to the initial permutation function, except that in this case there are 32 input bits and 48 output bits. Notice that some of the input bits are repeated, giving a straightforward expansion function.

1 module dese ( y , d );

2 // Define the IO

3 input [32:1] d;

4 output [48:1] y;

5

6 // Define the output as a register

7 reg [48:1] y;

8

9 // Assign the permutations on any change in d

10 always @ d

11 begin

12 y [1]<= d [32]; 13y[2]<=d[1];  y[3]<=d[2];  y[4]<=d[3];

13 y[5]<=d[4];  y[6]<=d[5];  y[7]<=d[4];  y[8]<=d[5];

14 y[9]<=d[6];  y[10]<=d[7];  y[11]<=d[8];  y[12]<=d[9];

15 y[13]<=d[8];  y[14]<=d[9];  y[15]<=d[10];  y[16]<=d[11];

16 y[17]<=d[12]; y[18]<=d[13]; y[19]<=d[12]; y[20]<=d[13];

17 y[21]<=d[14]; y[22]<=d[15]; y[23]<=d[16]; y[24]<=d[17];

18 y[25]<=d[16]; y[26]<=d[17]; y[27]<=d[18]; y[28]<=d[19];

19 y[29]<=d[20]; y[30]<=d[21]; y[31]<=d[20]; y[32]<=d[21];

20 y[33]<=d[22]; y[34]<=d[23]; y[35]<=d[24]; y[36]<=d[25];

21 y[37]<=d[24]; y[38]<=d[25]; y[39]<=d[26]; y[40]<=d[27];

22 y[41]<=d[28]; y[42]<=d[29]; y[43]<=d[28]; y[44]<=d[29];

23 y[45]<=d[30]; y[46]<=d[31]; y[47]<=d[32]; y[48]<=d[1];

24 end

25 endmodule

The final permutation block is the permutation marked (P) on the fine structure after the key function. This is a straightforward bit substitution function with 32 bits input and 32 bits output. The bit translation is shown in the following table:

1672021
29122817
1152326
5183110
282414
322739
1913306
2211425

t0035

This is implemented in Verilog using exactly the same approach as the previous expansion and permutation functions as follows, with the behavior simply the assignment of bits from input to output according to the initial permutation table previously defined.

1 module desp ( y , d );

2 // Define the IO

3 input [32:1] d;

4 output [32:1] y;

5

6 // Define the output as a register

7 reg [32:1] y;

8

9 // Assign the permutations on any change in d

10 always @ d

11 begin

12 y [1]<= d [16]; 13y[2]<=d[7];  y[3]<=d[20];  y[4]<=d[21];

13 y[5]<=d[29];  y[6]<=d[12];  y[7]<=d[28];  y[8]<=d[17];

14 y[9]<=d[1]; y[10]<=d[15];  y[11]<=d[23]; y[12]<=d[26];

15 y[13]<=d[5];  y[14]<=d[18];  y[15]<=d[31]; y[16]<=d[10];

16 y[17]<=d[2];  y[18]<=d[8];  y[19]<=d[24];  y[20]<=d[14];

17 y[21]<=d[32]; y[22]<=d[27];  y[23]<=d[3];  y[24]<=d[9];

18 y[25]<=d[19]; y[26]<=d[13];  y[27]<=d[30]; y[28]<=d[6];

19 y[29]<=d[22]; y[30]<=d[11];  y[31]<=d[4];  y[32]<=d[25];

20 end

21 endmodule

The nonlinear part of the DES algorithm is the S box. This is a set of 6->4 bit transformations that reduce the 48 bits of the expanded word in the DES f function to the 32 bits for the next round. The required row and column are obtained from the data passed into the S box. The data into the S box is a 6-bit binary word. The row is obtained from 2.b1 + b6 and the column is obtained from b2b3b4b5. For example, S(011011) would give a row of 01 (1) and a column of 1101 (13). For S8 this would result in a value returning of 1110 (14).

The basic S-box model can therefore be constructed using the following Verilog, with one approach to define the row and column from the input d word and then calculate the output Y word from that using a lookup table approach or minimize the logic as a truth table. The basic architecture could then look something like this:

1 module sbox ( y , d );

2 // Define the IO

3 input [6:1] d;

4 output [4:1] y;

5

6 // Define the output as a register

7 reg [4:1] y;

8

9 // Assign the permutations on any change in d

10 reg [2:1] r;

11 reg [4:1] c;

12

13 always @ d

14 begin

15 r <= d [2:1];

16 c <= d [6:3];

17

 18 // The remainder of the SBOX logic goes here

19

20 end

21 endmodule

Another approach is to define a simple lookup table with the input d as the unique address and the output y stored in the memory; this is exactly the same as a ROM, so the input is defined as an unsigned integer to look up the required value. In this case the memory is defined in exactly the same way as the ROM is defined separately in this book.

In order to use the table, the appropriate S box is selected and then the 2 bits of the row select the appropriate row and the same for the column. For example, for Sbox S1, if the row is 3 (11) and the Column is 10 (1010) then the output can be read off as 3 (0011). This can be coded in VHDL using nested case statements as follows:

1 case ( row )

2 0: case ( column )

3 0: y =14;

4 1: y =4;

5 // and so on

6 endcase

 7 1: case ( column )

8 0: y =0;

9 1: y =15;

10 // and so on

11 endcase

12 // and so on for all the rows

13 endcase

Obviously this is quite cumbersome, but also very easy to code automatically using a simple code generator and offers the possibility of the synthesis tool carrying out logic optimization and providing a much more efficient implementation than a memory block.

10.3.4 Validation of DES

In order to validate the implementation of DES, a set of test vectors can be used (i.e., plaintext/ciphertext pairs to ensure that the correct processing is taking place). A suitable set of test vectors is given as:

PlaintextCiphertext
4E6F7720697320743FA40E8A984D4815
68652074696D65206A271787AB8883F9
666F7220616C6C20893D51EC4B563B53

In this case the key to be used is 0123456789ABCDEF.

Each of the groups of hexadecimal characters is represented by 7-bit ASCII and adding an extra bit.

10.4 Advanced Encryption Standard

In 1997, the U.S. National Institute of Standards and Technology (NIST) published a request for information regarding the creation of a new Advanced Encryption Standard (AES) for nonclassified government documents. The call also stipulated that the AES would specify an unclassified, publicly disclosed encryption algorithm(s), available royalty-free, worldwide. In addition, the algorithm(s) must implement symmetric key cryptography as a block cipher and (at a minimum) support block sizes of 128 bits and key sizes of 128, 192, and 256 bits.

After an open competition, the Rijndael algorithm was chosen as the winner and implemented as the AES standard. Rijndael allows key and block sizes to be 128, 192, or 256 bits. AES allows the same key sizes, but operates using a block size of 128 bits. The algorithm operates in a similar way to DES, with 10 rounds of confusion and diffusion operators (shuffling and mixing) blocks at a time. Each round has a separate key, generated from the overall key. The round structure is shown in Figure 10.7.

f10-07-9780080971292
Figure 10.7 AES round structure.

The overall AES structure is given in Figure 10.8.

f10-08-9780080971292
Figure 10.8 AES structure.

Each block consists of 128 bits, and these are divided into 16 8-bit bytes. Each of the operations acts upon these 8-bit bytes in a 4 × 4 matrix:

a0,0a0,1a0,2a0,3a1,0a1,1a1,2a1,3a2,0a2,1a2,2a2,3a3,0a3,1a3,2a3,3

si2_e

Note that each element (ai,j) is an 8-bit byte, viewed as elements of GF(28). The arithmetic operators take advantage of the Galois Field (GF) rules defined in the Rijndael algorithm; an example is that of addition that is implemented by XOR.

Multiplication is more complicated, but each byte has the multiplicative inverse such that b.b = 00000001 (apart from 00000000, whose multiplicative inverse is 00000000). The model of the finite field GF(28) depends on the choice of an irreducible polynomial of degree 8, which for Rijndael is:

X8+X4+X3+1

si3_e  (10.2)

Each of the round operations requires a specific mathematical exploration. Taking each in turn we can establish the requirements for each one.

Taking the original matrix:

a0,0a0,1a0,2a0,3a1,0a1,1a1,2a1,3a2,0a2,1a2,2a2,3a3,0a3,1a3,2a3,3

si2_e

each element can be replaced byte-by-byte to generate a new matrix:

b0,0b0,1b0,2b0,3b1,0b1,1b1,2b1,3b2,0b2,1b2,2b2,3b3,0b3,1b3,2b3,3

si5_e

Byte substitution requires that for each input data block a(3,3), we look up a table of substitutions and replace the bytes to produce a new matrix b(3,3). The way it works is that for each input byte abcdefgh, we look up row abcd and column efgh and use the byte at that location in the output.

The complete byte substitution table is defined as shown in Figure 10.9.

f10-09-9780080971292
Figure 10.9 AES byte substitution table.

For example: If the input data byte was 7A, then this in binary terms is:

0111 1010

So the row required is 7 (0111) and the column required is A (1010). From this we can read off the resulting number from the table:

218 = 1101 1010 = DA

This is illustrated in the byte substitution table in Figure 10.9.

We can see that this is a bit shuffling operation that is simply moving bytes around in a publicly defined manner that does not have anything to do with a key.

Also note that the individual bits within the byte are not changed as such. This is a bytewise operation. The Shift Row function is essentially a set of cyclic shifts to the left with offsets of 0, 1, 2, 3, respectively.

c0,0c0,1c0,2c0,3c1,0c1,1c1,2c1,3c2,0c2,1c2,2c2,3c3,0c3,1c3,2c3,3=b0,0b0,1b0,2b0,3b1,1b1,2b1,3b1,0b2,2b2,3b2,0b2,1b3,3b3,0b3,1b3,2

si6_e

The Mix Columns function is a series of specific multiplications:

d0,0d0,1d0,2d0,3d1,0d1,1d1,2d1,3d2,0d2,1d2,2d2,3d3,0d3,1d3,2d3,3=02030101010203010101020303010102*c0,0c0,1c0,2c0,3c1,0c1,1c1,2c1,3c2,0c2,1c2,2c2,3c3,0c3,1c3,2c3,3

si7_e

where 01 = 00000001, 02 = 00000010, 03 = 00000011.

All multiplications are GF(28) and this transformation is invertible.

The final operation in each round is to add the key using the following function:

e0,0e0,1e0,2e0,3e1,0e1,1e1,2e1,3e2,0e2,1e2,2e2,3e3,0e3,1e3,2e3,3=d0,0d0,1d0,2d0,3d1,0d1,1d1,2d1,3d2,0d2,1d2,2d2,3d3,0d3,1d3,2d3,3k0,0k0,1k0,2k0,3k1,0k1,1k1,2k1,3k2,0k2,1k2,2k2,3k3,0k3,1k3,2k3,3

si8_e

The round keys are generated using the following method. The original key of 128 bits is represented as a 4 × 4 matrix of bytes (of 8 bits). This can be thought of as four columns W(0),W(1),W(2),W(3). Adjoin 40 columns W(4),…,W(43). The round key for round i consists of i columns. If i is a multiple of 4:

W(i)=W(i4)T(W(i1))

si9_e  (10.3)

where T is a transformation of a, b, c, d in column W(i − 1) using the following procedure:

 Shift cyclically to get b, c, d, a.

 Replace each byte with S-box entry using ByteSub, to get e, f, g, h.

 Compute round constant r(i) = 00000010(i − 4)/4 in GF(28).

 T(W(i − 1)) = (er(i),f,g,h)

 If i is not a multiple of 4, W(i) = W(i − 4) ⊕ W(i − 1)

10.4.1 Implementing AES in VHDL

We have two options for implementing block cipher operations in VHDL. We can use the structural approach (shown in the DES example previously in this chapter), or sometimes it makes sense to define a library of functions and use those to make much simpler models.

In the example of AES, we can define a top level entity and architecture that has the bare minimum of structure and is completely defined using functions. This can be especially useful when working with behavioral synthesis software as this allows complete flexibility for architectural optimization.

1 library 2 ieee;

2 use ieee.std_logic_1164.all;

3 entity AES is

4 port(

5 plaintext : in std_logic_vector(127 downto 0);

6 keytext : in std_logic_vector(127 downto 0); 

7  encrypt : in std_logic;

8  go : in std_logic;

9  ciphertext : out std_logic_vector(127 downto 0);

10  done : out std_logic := ’0’

11 );

12 end;

13

14 use work.aes_functions.all;

15 architecture behavior of AES is

16 begin

17 process

18 begin

19 wait until go = ’1’;

20 done <= ’0’;

21 ciphertext <= aes_core(plaintext, keytext, encrypt);

22 done <= ’1’;

23 end process;

24 end;

In this example, we have the plaintext and keytext inputs defined as 128-bit-wide vectors and the ciphertext output is also defined as 128 bits wide. The go flag initiates the encryption and the done flag shows when this has been completed.

Notice that we have a work library defined called aes_functions which encapsulates all the relevant functions for the AES algorithm. The set of functions is defined in a package (aes_functions) and the VHDL is given:

1 library ieee;

2 use ieee.std_logic_1164.all;

3 use ieee.numeric_std.all;

4 package aes_functions is

5

6 constant nr : integer := 10;

7 constant nb : integer := 4;

8 constant nk : integer := 4;

9

10 subtype vec1408 is std_logic_vector(1407 downto 0);

11 subtype vec128 is std_logic_vector(127 downto 0);

12 subtype vec64 is std_logic_vector(63 downto 0);

13 subtype vec32 is std_logic_vector(31 downto 0);

14 subtype vec16 is std_logic_vector(15 downto 0);

15 subtype vec8 is std_logic_vector(7 downto 0);

16

17 subtype int9 is integer range 0 to 9;

18

19 function input_output (input : vec128) return vec128;

20 function sbox (pt : vec8) return vec8;

21 function subbytes (plaintext : vec128) return vec128;

22 function shiftrows (plaintext : vec128) return vec128;

23 function ffmul(pt : vec8; mul : vec8) return vec8;

24 function mixcl( l0 : vec8; l1 : vec8; l2 : vec8; l3 : vec8) return vec8;

25 function mixcolumns(pt : vec128) return vec128;

26 function rcon (input : int9) return vec8;

27 function aes_keyexpansion(key : vec128) return vec1408;

28 function aes_core (signal plaintext : vec128; signal keytext : vec128; signal encrypt : std_logic) return vec128;

29

30

31

32 end;

33

34 library ieee;

35 use ieee.std_logic_1164.all;

36 use ieee.numeric_std.all;

37 package body aes_functions is

38

39 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

40 function subbytes (plaintext : vec128)

41 −− moods inline

42 return vec128 is

43 variable ciphertext : vec128;

44 begin

45 ciphertext := sbox(plaintext(127  downto 120)) &

46 sbox(plaintext(119 downto 112)) &

47 sbox(plaintext(111 downto 104)) &

48 sbox(plaintext(103 downto 96)) &

49 sbox(plaintext(95  downto 88)) &

50 sbox(plaintext(87  downto 80)) &

51 sbox(plaintext(79  downto 72)) &

52 sbox(plaintext(71  downto 64)) &

53 sbox(plaintext(63  downto 56)) &

54 sbox(plaintext(55  downto 48)) &

55 sbox(plaintext(47  downto 40)) &

56 sbox(plaintext(39  downto 32)) &

57 sbox(plaintext(31  downto 24)) &

58 sbox(plaintext(23  downto 16)) &

59 sbox(plaintext(15  downto 8)) &

60 sbox(plaintext(7  downto 0));

61 return ciphertext;

62 end;

63

64 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

65 function shiftrows (plaintext : vec128)

66 −− moods inline

67 return vec128 is

68 variable ciphertext : vec128;

69 begin

 70−−line 0 (the first): no shift

71 ciphertext := plaintext(31  downto 24) &

72 plaintext(55  downto 48) &

73 plaintext(79  downto 72) &

74 plaintext(103 downto 96) &

75 plaintext(127 downto 120) &

76 plaintext(23  downto 16) &

77 plaintext(47  downto 40) &

78 plaintext(71  downto 64) &

79 plaintext(95  downto 88) &

80 plaintext(119 downto 112) &

81 plaintext(15  downto 8) &

82 plaintext(39  downto 32) &

83 plaintext(63  downto 56) &

84 plaintext(87  downto 80) &

85 plaintext(111 downto 104) &

86 plaintext(7  downto 0);

87 return ciphertext;

88 end;

89

90 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

91

92 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

93 function tablelog (input : vec8)

94  −− moods inline

95 return vec8 is

96 variable output : vec8;

97 type table256 is array(0 to 255) of vec8;

98 constant pt_256 : table256 := (

99 −− moods rom

100 x ”00”, x ”00”, x ”19”, x ”01”, x ”32”,

101 x ”02”, x ”1a”, x ”c6”, x ”4b”,

102 x ”c7”, x ”1b”, x ”68”, x ”33”,

103 x ”ee”, x ”df”, x ”03”, x ”64”,

104

105 x ”04”, x ”e0”, x ”0e”, x ”34”,

106 x ”8d”, x ”81”, x ”ef”, x ”4c”,

107 x ”71”, x ”08”, x ”c8”, x ”f8”,

108 x ”69”, x ”1c”, x ”c1”, x ”7d”,

109

110 x ”c2”, x ”1d”, x ”b5”, x ”f9”,

111 x ”b9”, x ”27”, x ”6a”, x ”4d”,

112 x ”e4”, x ”a6”, x ”72”, x ”9a”,

113 x ”c9”, x ”09”, x ”78”, x ”65”,

114

115 x ”2f”, x ”8a”, x ”05”, x ”21”,

116 x ”0f”, x ”e1”, x ”24”, x ”12”,

117 x ”f0”, x ”82”, x ”45”, x ”35”,

118 x ”93”, x ”da”, x ”8e”, x ”96”,

119

120 x ”8f”, x ”db”, x ”bd”, x ”36”,

121 x ”d0”, x ”ce”, x ”94”, x ”13”,

122 x ”5c”, x ”d2”, x ”f1”, x ”40”,

123 x ”46”, x ”83”, x ”38”, x ”66”,

124 x ”dd”, x ”fd”, x ”30”, x ”bf”,

125 x ”06”, x ”8b”, x ”62”, x ”b3”,

126 x ”25”, x ”e2”, x ”98”, x ”22”,

127 x ”88”, x ”91”, x ”10”, x ”7e”,

128

129 x ”6e”, x ”48”, x ”c3”, x ”a3”,

130 x ”b6”, x ”1e”, x ”42”, x ”3a”,

131 x ”6b”, x ”28”, x ”54”, x ”fa”,

132 x ”85”, x ”3d”, x ”ba”, x ”2b”,

133

134 x ”79”, x ”0a”, x ”15”, x ”9b”,

135 x ”9f”, x ”5e”, x ”ca”, x ”4e”,

136 x ”d4”, x ”ac”, x ”e5”, x ”f3”,

137 x ”73”, x ”a7”, x ”57”, x ”af”,

138

139 x ”58”, x ”a8”, x ”50”, x ”f4”,

140 x ”ea”, x ”d6”, x ”74”, x ”4f”,

141 x ”ae”, x ”e9”, x ”d5”, x ”e7”,

142 x ”e6”, x ”ad”, x ”e8”, x ”2c”,

143

144 x ”d7”, x ”75”, x ”7a”, x ”eb”,

145 x ”16”, x ”0b”, x ”f5”, x ”59”,

146 x ”cb”, x ”5f”, x ”b0”, x ”9c”,

147 x ”a9”, x ”51”, x ”a0”, x ”7f”,

148

149 x ”0c”, x ”f6”, x ”6f”, x ”17”,

150 x ”c4”, x ”49”, x ”ec”, x ”d8”,

151 x ”43”, x ”1f”, x ”2d”, x ”a4”,

152 x ”76”, x ”7b”, x ”b7”, x ”cc”,

153

154 x ”bb”, x ”3e”, x ”5a”, x ”fb”,

155 x ”60”, x ”b1”, x ”86”, x ”3b”,

156 x ”52”, x ”a1”, x ”6c”, x ”aa”,

157 x ”55”, x ”29”, x ”9d”, x ”97”,

158

159 x ”b2”, x ”87”, x ”90”, x ”61”,

160 x ”be”, x ”dc”, x ”fc”, x ”bc”,

161 x ”95”, x ”cf”, x ”cd”, x ”37”,

162 x ”3f”, x ”5b”, x ”d1”, x ”53”,

163

164 x ”39”, x ”84”, x ”3c”, x ”41”,

165 x ”a2”, x ”6d”, x ”47”, x ”14”,

166 x ”2a”, x ”9e”, x ”5d”, x ”56”,

167 x ”f2”, x ”d3”, x ”ab”, x ”44”,

168

169 x ”11”, x ”92”, x ”d9”, x ”23”,

170 x ”20”, x ”2e”, x ”89”, x ”b4”,

171 x ”7c”, x ”b8”, x ”26”, x ”77”,

172 x ”99”, x ”e3”, x ”a5”, x ”67”,

173

174 x ”4a”, x ”ed”, x ”de”, x ”c5”,

175 x ”31”, x ”fe”, x ”18”, x ”0d”,

176 x ”63”, x ”8c”, x ”80”, x ”c0”,

177 x ”f7”, x ”70”, x ”07” );

178 begin

179 output := pt_256(to_integer(unsigned(input)));

180 return output;

181 end;

182

183 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

184 function tableexp (input : vec8)

185 −− moods inline

186 return vec8 is

187 variable output : vec8;

188 type table256 is array(0 to 255) of vec8;

189 constant pt_256 : table256 := (

190 −− moods rom

191 x ”01”, x ”03”, x ”05”, x ”0f”,

192 x ”11”, x ”33”, x ”55”, x ”ff”,

193 x ”1a”, x ”2e”, x ”72”, x ”96”,

194 x ”a1”, x ”f8”, x ”13”, x ”35”,

195

196 x ”5f”, x ”e1”, x ”38”, x ”48”,

197 x ”d8”, x ”73”, x ”95”, x ”a4”,

198 x ”f7”, x ”02”, x ”06”, x ”0a”,

199 x ”1e”, x ”22”, x ”66”, x ”aa”,

200

201 x ”e5”, x ”34”, x ”5c”, x ”e4”,

202 x ”37”, x ”59”, x ”eb”, x ”26”,

203 x ”6a”, x ”be”, x ”d9”, x ”70”,

204 x ”90”, x ”ab”, x ”e6”, x ”31”,

205

206 x ”53”, x ”f5”, x ”04”, x ”0c”,

207 x ”14”, x ”3c”, x ”44”, x ”cc”,

208 x ”4f”, x ”d1”, x ”68”, x ”b8”,

209 x ”d3”, x ”6e”, x ”b2”, x ”cd”,

210

211 x ”4c”, x ”d4”, x ”67”, x ”a9”,

212 x ”e0”, x ”3b”, x ”4d”, x ”d7”,

213 x ”62”, x ”a6”, x ”f1”, x ”08”,

214 x ”18”, x ”28”, x ”78”, x ”88”,

215

216 x ”83”, x ”9e”, x ”b9”, x ”d0”,

217 x ”6b”, x ”bd”, x ”dc”, x ”7f”,

218 x ”81”, x ”98”, x ”b3”, x ”ce”,

219 x ”49”, x ”db”, x ”76”, x ”9a”,

220

221 x ”b5”, x ”c4”, x ”57”, x ”f9”,

222 x ”10”, x ”30”, x ”50”, x ”f0”,

223 x ”0b”, x ”1d”, x ”27”, x ”69”,

224 x ”bb”, x ”d6”, x ”61”, x ”a3”,

225

226 x ”fe”, x ”19”, x ”2b”, x ”7d”,

227 x ”87”, x ”92”, x ”ad”, x ”ec”,

228 x ”2f”, x ”71”, x ”93”, x ”ae”,

229 x ”e9”, x ”20”, x ”60”, x ”a0”,

230

231 x ”fb”, x ”16”, x ”3a”, x ”4e”,

232 x ”d2”, x ”6d”, x ”b7”, x ”c2”,

233 x ”5d”, x ”e7”, x ”32”, x ”56”,

234 x ”fa”, x ”15”, x ”3f”, x ”41”,

235

236 x ”c3”, x ”5e”, x ”e2”, x ”3d”,

237 x ”47”, x ”c9”, x ”40”, x ”c0”,

238 x ”5b”, x ”ed”, x ”2c”, x ”74”,

239 x ”9c”, x ”bf”, x ”da”, x ”75”,

240

241 x ”9f”, x ”ba”, x ”d5”, x ”64”,

242 x ”ac”, x ”ef”, x ”2a”, x ”7e”,

243 x ”82”, x ”9d”, x ”bc”, x ”df”,

244 x ”7a”, x ”8e”, x ”89”, x ”80”,

245

246 x ”9b”, x ”b6”, x ”c1”, x ”58”,

247 x ”e8”, x ”23”, x ”65”, x ”af”,

248 x ”ea”, x ”25”, x ”6f”, x ”b1”,

249 x ”c8”, x ”43”, x ”c5”, x ”54”,

250

251 x ”fc”, x ”1f”, x ”21”, x ”63”,

252 x ”a5”, x ”f4”, x ”07”, x ”09”,

253 x ”1b”, x ”2d”, x ”77”, x ”99”,

254 x ”b0”, x ”cb”, x ”46”, x ”ca”,

255

256 x ”45”, x ”cf”, x ”4a”, x ”de”,

257 x ”79”, x ”8b”, x ”86”, x ”91”,

258 x ”a8”, x ”e3”, x ”3e”, x ”42”,

259 x ”c6”, x ”51”, x ”f3”, x ”0e”,

260

261 x ”12”, x ”36”, x ”5a”, x ”ee”,

262 x ”29”, x ”7b”, x ”8d”, x ”8c”,

263 x ”8f”, x ”8a”, x ”85”, x ”94”,

264 x ”a7”, x ”f2”, x ”0d”, x ”17”,

265

266 x ”39”, x ”4b”, x ”dd”, x ”7c”,

267 x ”84”, x ”97”, x ”a2”, x ”fd”,

268 x ”1c”, x ”24”, x ”6c”, x ”b4”,

269 x ”c7”, x ”52”, x ”f6”, x ”01”);

270 begin

271 output := pt_256(to_integer(unsigned(input)));

272 return output;

273 end;

274

275 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

276 function ffmul(pt : vec8; mul : vec8)

277 −− moods inline

278 return vec8 is

279 −−variable res : vec8;

280 variable tablogpt : vec8;

281 variable tablogmul : vec8;

282 variable tablogpt8 : unsigned(8 downto 0);

283 variable tablogmul8 : unsigned(8 downto 0);

284 variable carrie : std_logic_vector(8 downto 0);

285 variable power : vec8;

286 variable result: vec8;

287 begin

288 tablogpt := tablelog(pt);

289 tablogmul := tablelog(mul);

290

291 tablogpt8 := unsigned( ”0” & tablogpt);

292 tablogmul8 := unsigned( ”0” & tablogmul);

293

294 carrie := std_logic_vector(tablogmul8 + tablogpt8);

295

296 if pt = x ”00” or mul = x ”00” then

297 result := x ”00”;

298 elsif carrie(8) = ’1’ or carrie(7 downto 0) = x ”ff” then  −− mod 255

299 power := std_logic_vector(unsigned(carrie(7 downto 0)) + 1);  −− power = power − 255

300 result := tableexp(power);

301 else

302 power := carrie(7 downto 0);

303 result := tableexp(power);

304 end if;

305 return result;

306 end;

307

308 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

309 function mixcl( l0 : vec8; l1 : vec8; l2 : vec8; l3 : vec8)

310 −− moods inline

311 return vec8 is

312 variable ct : vec8;

313 begin

314

315 ct := ffmul(l0, x ”02”) xor ffmul(l1, x ”01”) xor ffmul(l2, x ”01”) xor ffmul(l3, x ”03”);

316

317 return ct;

318 end;

319

320 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

321 function mixcolumns(pt : vec128)

322 −− moods inline

323 return vec128 is

324 variable ct : vec128;

325 begin

326

327 ct := mixcl(pt(127 downto 120), pt(119 downto 112), pt(111 downto 104), pt(103 downto 96)) &

328 mixcl(pt(119 downto 112), pt(111 downto 104), pt(103 downto 96), pt(127 downto 120)) &

329 mixcl(pt(111 downto 104), pt(103 downto 96), pt(127 downto 120), pt(119 downto 112)) &

330 mixcl(pt(103 downto 96), pt(127 downto 120), pt(119 downto 112), pt(111 downto 104)) &

331

332 mixcl(pt(95 downto 88), pt(87 downto 80), pt(79 downto 72), pt(71 downto 64)) &

333 mixcl(pt(87 downto 80), pt(79 downto 72), pt(71 downto 64), pt(95 downto 88)) &

334 mixcl(pt(79 downto 72), pt(71 downto 64), pt(95 downto 88), pt(87 downto 80)) &

335 mixcl(pt(71 downto 64), pt(95 downto 88), pt(87 downto 80), pt(79 downto 72)) &

336

337 mixcl(pt(63 downto 56), pt(55 downto 48), pt(47 downto 40), pt(39 downto 32)) &

338 mixcl(pt(55 downto 48), pt(47 downto 40), pt(39 downto 32), pt(63 downto 56)) &

339 mixcl(pt(47 downto 40), pt(39 downto 32), pt(63 downto 56), pt(55 downto 48)) &

340 mixcl(pt(39 downto 32), pt(63 downto 56), pt(55 downto 48), pt(47 downto 40)) &

341

342 mixcl(pt(31 downto 24), pt(23 downto 16), pt(15 downto 8), pt(7 downto 0)) &

343 mixcl(pt(23 downto 16), pt(15 downto 8), pt(7 downto 0), pt(31 downto 24)) &

344 mixcl(pt(15 downto 8), pt(7 downto 0), pt(31 downto 24), pt(23 downto 16)) &

345 mixcl(pt(7 downto 0), pt(31 downto 24), pt(23 downto 16), pt(15 downto 8));

346

347 return ct;

348 end;

349

350 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

351 function input_output (input : vec128)

352 −− moods inline

353 return vec128 is

354 variable output : vec128;

355 function flip(input:vec32) return vec32 is

356 −− moods inline

357 begin

358 return input(7 downto 0) & input(15 downto 8)

359 & input(23 downto 16) & input(31 downto 24);

360 end;

361 begin

362 return flip(input(127 downto 96)) & flip(input(95 downto 64))

363 & flip(input(63 downto 32)) & flip(input(31 downto 0));

364 end;

365

366 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

367 function aes_keyexpansion(key : vec128)

368 −− moods inline

369 return vec1408 is

370 variable iok : vec128;

371 variable er0,er1,er2,er3,er4,er5,er6,er7,er8,er9:vec128;

372  −−variable zero: vec128;

373  −−variable expandedkeys: vec1408;

374

375 function exp_round(input : vec128; round: int9) return vec128 is

376  −− moods inline

377 variable r1,r2,r3,r4,r5: vec32;

378 begin

379 r1 := sbox(input(7 downto 0)) &

380 sbox(input(31 downto 24)) &

381 sbox(input(23 downto 16)) &

382 (sbox(input(15 downto 8)) xor rcon(round));

383

384 r2 := input(127 downto 96) xor r1;

385 r3 := input(95 downto 64) xor r2;

386 r4 := input(63 downto 32) xor r3;

387 r5 := input(31 downto 0) xor r4;

388 return r2 & r3 & r4 & r5;

389 end;

390

391 begin

392  −− first round

393 iok := input_output(key);

394  −− other rounds

395 er9 := exp_round(iok,9);

396 er8 := exp_round(er9,8);

397 er7 := exp_round(er8,7);

398 er6 := exp_round(er7,6);

399 er5 := exp_round(er6,5);

400 er4 := exp_round(er5,4);

401 er3 := exp_round(er4,3);

402 er2 := exp_round(er3,2);

403 er1 := exp_round(er2,1);

404 er0 := exp_round(er1,0);

405

406 return (iok & er9 & er8 & er7 & er6 & er5 & er4 & er3 & er2 & er1 & er0);

407 end;

408

409 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

410

411 function aes_core (signal plaintext : vec128; signal keytext : vec128; signal encrypt : std_logic)

412 −− moods inline

413 return vec128 is

414 variable rk0 : vec128;

415 variable ciphertext, expkey : vec128;

416 variable ct1, ct2,ct3,ct4,ct5,ct6,ct7,ct8: vec128;

417 variable expandedkeys : vec1408;

418 begin

419 −− expanded key schedule

420 expandedkeys := aes_keyexpansion(keytext);

421

422 −− round 0

423 ct1 := input_output(plaintext) xor expandedkeys(1407 downto 1280);

424

425 −− round 1 to nr−1

426 −−for i in 1 to nr−1 loop

427 for i in 1 to 9 loop

428 ct2 := subbytes(ct1);

429 ct3 := shiftrows(ct2);

430 ct4 := mixcolumns(ct3);

431 case(i) is

432 when 1 => expkey := expandedkeys(1279 downto 1152);

433 when 2 => expkey := expandedkeys(1151 downto 1024);

434 when 3 => expkey := expandedkeys(1023 downto 896);

435 when 4 => expkey := expandedkeys(895  downto 768);

436 when 5 => expkey := expandedkeys(767  downto 640);

437 when 6 => expkey := expandedkeys(639  downto 512);

438 when 7 => expkey := expandedkeys(511  downto 384);

439 when 8 => expkey := expandedkeys(383  downto 256);

440 when 9 => expkey := expandedkeys(255  downto 128);

441 when others => null;

442 end case;

443 ct1 := ct4 xor expkey;

444 end loop;

445

446  −− final round nr=10

447 ct5 := subbytes(ct1);

448 ct6 := shiftrows(ct5);

449 ct7 := ct6 xor expandedkeys(1407−128*nr downto 1280−128*nr);

450

451 ciphertext := input_output(ct7);

452

453 return ciphertext;

454 end;

455

456

457

458 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

459 function rcon (input : int9)

460 −− moods inline

461 return vec8 is

462 type rcont_t is array(0 to 9) of vec8;

463 constant table_rcon: rcont_t := (

464 −− moods rom

465 x ”36”, x ”1b”, x ”80”, x ”40”, x ”20”, x ”10”, x ”08”, x ”04”, x ”02”, x ”01”);

466 begin

467 return table_rcon(input);

468 end;

469

470 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

471 function sbox (pt : vec8)

472 −− moods inline

473 return vec8 is

474 variable ct : vec8;

475 type table256 is array(0 to 255) of vec8;

476 constant pt_256 : table256 := (

47 −− moods rom7

478 x ”63”, x ”7c”, x ”77”, x ”7b”, x ”f2”, x ”6b”, x ”6f”, x ”c5”,

479 x ”30”, x ”01”, x ”67”, x ”2b”, x ”fe”, x ”d7”, x ”ab”, x ”76”,

480 x ”ca”, x ”82”, x ”c9”, x ”7d”, x ”fa”, x ”59”, x ”47”, x ”f0”,

481 x ”ad”, x ”d4”, x ”a2”, x ”af”, x ”9c”, x ”a4”, x ”72”, x ”c0”,

482 x ”b7”, x ”fd”, x ”93”, x ”26”, x ”36”, x ”3f”, x ”f7”, x ”cc”,

483 x ”34”, x ”a5”, x ”e5”, x ”f1”, x ”71”, x ”d8”, x ”31”, x ”15”,

484 x ”04”, x ”c7”, x ”23”, x ”c3”, x ”18”, x ”96”, x ”05”, x ”9a”,

485 x ”07”, x ”12”, x ”80”, x ”e2”, x ”eb”, x ”27”, x ”b2”, x ”75”,

486 x ”09”, x ”83”, x ”2c”, x ”1a”, x ”1b”, x ”6e”, x ”5a”, x ”a0”,

487 x ”52”, x ”3b”, x ”d6”, x ”b3”, x ”29”, x ”e3”, x ”2f”, x ”84”,

488 x ”53”, x ”d1”, x ”00”, x ”ed”, x ”20”, x ”fc”, x ”b1”, x ”5b”,

489 x ”6a”, x ”cb”, x ”be”, x ”39”, x ”4a”, x ”4c”, x ”58”, x ”cf”,

490 x ”d0”, x ”ef”, x ”aa”, x ”fb”, x ”43”, x ”4d”, x ”33”, x ”85”,

491 x ”45”, x ”f9”, x ”02”, x ”7f”, x ”50”, x ”3c”, x ”9f”, x ”a8”,

492 x ”51”, x ”a3”, x ”40”, x ”8f”, x ”92”, x ”9d”, x ”38”, x ”f5”,

493 x ”bc”, x ”b6”, x ”da”, x ”21”, x ”10”, x ”ff”, x ”f3”, x ”d2”,

494 x ”cd”, x ”0c”, x ”13”, x ”ec”, x ”5f”, x ”97”, x ”44”, x ”17”,

495 x ”c4”, x ”a7”, x ”7e”, x ”3d”, x ”64”, x ”5d”, x ”19”, x ”73”,

496 x ”60”, x ”81”, x ”4f”, x ”dc”, x ”22”, x ”2a”, x ”90”, x ”88”,

497 x ”46”, x ”ee”, x ”b8”, x ”14”, x ”de”, x ”5e”, x ”0b”, x ”db”,

498 x ”e0”, x ”32”, x ”3a”, x ”0a”, x ”49”, x ”06”, x ”24”, x ”5c”,

499 x ”c2”, x ”d3”, x ”ac”, x ”62”, x ”91”, x ”95”, x ”e4”, x ”79”,

500 x ”e7”, x ”c8”, x ”37”, x ”6d”, x ”8d”, x ”d5”, x ”4e”, x ”a9”,

501 x ”6c”, x ”56”, x ”f4”, x ”ea”, x ”65”, x ”7a”, x ”ae”, x ”08”,

502 x ”ba”, x ”78”, x ”25”, x ”2e”, x ”1c”, x ”a6”, x ”b4”, x ”c6”,

503 x ”e8”, x ”dd”, x ”74”, x ”1f”, x ”4b”, x ”bd”, x ”8b”, x ”8a”,

504 x ”70”, x ”3e”, x ”b5”, x ”66”, x ”48”, x ”03”, x ”f6”, x ”0e”,

505 x ”61”, x ”35”, x ”57”, x ”b9”, x ”86”, x ”c1”, x ”1d”, x ”9e”,

506 x ”e1”, x ”f8”, x ”98”, x ”11”, x ”69”, x ”d9”, x ”8e”, x ”94”,

507 x ”9b”, x ”1e”, x ”87”, x ”e9”, x ”ce”, x ”55”, x ”28”, x ”df”,

508 x ”8c”, x ”a1”, x ”89”, x ”0d”, x ”bf”, x ”e6”, x ”42”, x ”68”,

509 x ”41”, x ”99”, x ”2d”, x ”0f”, x ”b0”, x ”54”, x ”bb”, x ”16”);

510 begin

511 ct := pt_256(to_integer(unsigned(pt)));

512 return ct;

513 end;

514

515 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

516 end;

After the functions and top level entity have been defined, we can implement a test bench that applies a set of test data to the inputs and verifies that the correct output has been obtained. Notice that we use the assertion technique to identify correct operation.

1 library  ieee;

2 use ieee.std_logic_1164.all;

3 entity testAES is

4 end;

5

6

7 library ieee;

8 use ieee.std_logic_1164.all;

9 use work.aes_functions.all;

10 architecture behavior of testAES is

11

12 component aes

13 port(

14 plaintext : in std_logic_vector(127 downto 0);

15 keytext : in std_logic_vector(127 downto 0);

16 encrypt : in std_logic;

17 go : in std_logic;

18 ciphertext : out std_logic_vector(127 downto 0);

19 done : out std_logic

20 );

21 end component;

22

23 for all : aes use entity work.aes;

24

25 signal plaintext : std_logic_vector(127 downto 0);

26 signal keytext : std_logic_vector(127 downto 0);

27 signal encrypt : std_logic;

28 signal go : std_logic := ’0’;

29 signal ciphertext : std_logic_vector(127 downto 0);

30 signal done : std_logic;

31 signal ok : std_logic := ’0’;

32begin

33 plaintext <=  X ” 00000000000000000000000000000000 ” , X ” 3243f6a8885a308d313198a2e0370734 ”  after 50 ns ;

34 keytext <=  X ” 00000000000000000000000000000000 ” , X ” 2b7e151628aed2a6abf7158809cf4f3c ”  after 100 ns;

35 process (ciphertext)

36 variable ct : std_logic_vector(127 downto 0) := X ” 3925841d02dc09fbdc118597196a0b32 ”;

37 begin

38 assert ct = ciphertext

39 report  ” Test vectors do not match ”

40 severity note;

41 assert not (ct = ciphertext)

42 report  ” Test vectors Matched ”

43 severity note;

44 end process;

45

46

47 process

48 begin

49 go <= not go after 20 ns;

50 end process;

51

52 DUT : aes port map (plaintext, keytext, encrypt, go, ciphertext, done);

53end;

10.5 Summary

This chapter shows how standard block ciphers can be implemented in VHDL and Verilog using DES as an example. AES has been developed further using VHDL, but can use the same principles in Verilog. Both of these algorithms are in common usage today and in operational hardware. There are numerous other methods, as security requires a constant evolution of encryption techniques and no doubt more robust and secure methods will emerge that require implementation in VHDL and/or Verilog in the future.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.222.200.143