© Karan Singh Garewal 2020
K. S. GarewalPractical Blockchains and Cryptocurrencieshttps://doi.org/10.1007/978-1-4842-5893-4_3

3. Symmetric Encryption

Karan Singh Garewal1 
(1)
Toronto, ON, Canada
 

The mathematical foundations of blockchains and cryptocurrencies are rooted in an area of Mathematics called cryptography (or cryptology). In this chapter and the three chapters that follow, we will examine these mathematical foundations. Even though I am not going to present the mathematical theory in a rigorous lemma-theorem-corollary style, I am going to present the theory in a lucid and understandable manner. The goal is to provide you with a clear understanding of the underlying mathematical concepts and theory. This deep exposure will be more than sufficient to enable you to develop blockchain and cryptocurrency applications with confidence.

Alright, let’s begin our deep dive into cryptography.

How Symmetric Encryption Works

Symmetric or single-key encryption has been around for thousands of years. Symmetric encryption works as follows: a message (called plaintext) is encrypted with a secret key. The encryption process converts the plaintext into a garbled message called a ciphertext. The secret key is simply some sequence of characters or symbols. Only a person who has the secret key can decode the ciphertext and recover the origin plaintext. Mathematically, we can represent the encryption process as
c = E(k,p)
Here, E is the encryption algorithm (function), k is the secret key, and p is the plaintext. c is the ciphertext produced by the encryption algorithm. The decryption process can be represented as
p = D(k,c)

where D is the decryption algorithm. Note that the same key is used for encryption and decryption. Hence the term symmetric encryption key.

Figure 3-1 illustrates the encryption and decryption process.
../images/492478_1_En_3_Chapter/492478_1_En_3_Fig1_HTML.jpg
Figure 3-1

Symmetric Encryption and Decryption

A classic example of symmetric encryption is the Caesar cipher presumably used by Julius Caesar's Roman legions. In this encryption algorithm, every alphabetic character in the plaintext is replaced by a character which is three positions to the right (rotating to the beginning of the alphabet, if necessary) and spaces between words are ignored. For example, the letter p is replaced by the letter s and the letter z is replaced by the letter c. Thus, the plaintext the legion has arrived is encrypted as follows:
plaintext:  the legion has arrived
ciphertext: xkhohjlrqkdvduulzhg

In order to decrypt this ciphertext, we simply replace each character in the ciphertext with the character which is three characters to its left.

Design of Symmetric Encryption Algorithms

As we have seen, an encryption algorithm takes plaintext and a secret key as inputs and produces a ciphertext as its output. The encryption algorithm must be independent of both the plaintext and the secret key. Typically, an encryption algorithm consists of a finite number of rounds where each round is a finite sequence of substitution and transposition operations. A substitution operation substitutes some other character for an input character in the text, and a transposition operation rearranges or permutes a portion of the input text. Each round of the encryption algorithm garbles the plaintext to some extent, and this garbled output becomes the input for the next round of the encryption algorithm. In order to recover the plaintext from the ciphertext, the operations of the encryption algorithm must be reversible. Therefore, the decryption algorithm simply executes the rounds and operations of the encryption algorithm in reverse. The Caesar cipher that we discussed previously is a simple substitution cipher, and there are no transposition operations.

Symmetric encryption algorithms come in two flavors. Block ciphers partition the plaintext into blocks of fixed length and encrypt the plaintext one block at a time. Stream ciphers encrypt plaintext one byte at a time and are meant to encrypt data streaming over a network.

Devising a strong symmetric encryption algorithm that can withstand cryptanalysis is a non-trivial endeavor.1 There are two main ways to attack an encryption algorithm and thus obtain the ability to decipher messages encoded by the algorithm. The first technique relies on a brute-force attack on the algorithm. This technique tries all possible secret keys until we obtain intelligible plaintext. The technique is not feasible if the keyspace is very large. Table 3-1 shows the relationship between the size of the encryption key and the time required to decrypt a message with a brute-force attack.
Table 3-1

The Relationship Between Key Size and Decryption Time

Key Size

Number of Keys in Keyspace

Average Time to Recover Plaintext (1016 Decryptions per Second)

8 bits

256

< 1 second

64 bits

264

0.25 hours

128 bits

2128

53.9*1012 years

256 bits

2256

18.3*1051 years

This table shows that larger keys offer greater protection against brute-force attacks than smaller keys.

The second avenue of attack relies upon the statistical analysis of the ciphertext. In the English language, certain letters of the alphabet appear with greater frequency than other letters and certain words appear with greater frequency than other words. For example, in a paragraph of text, the word “the” will ordinarily appear with greater frequency than the word “aardvark.” Similarly, certain two-letter combinations (digraphs) appear with greater frequency than other combinations; and certain three-letter combinations (trigraphs) appear with greater frequency than other combinations. Table 3-2 taken from an analysis of 40,000 words categorizes the frequency with which letters appear in a large sample of text.2
Table 3-2

Frequency Distribution of English Characters

a

8.12

b

1.49

c

2.71

d

4.32

e

12.02

f

2.30

g

2.03

h

5.92

i

7.31

j

0.10

k

0.69

l

3.98

m

2.61

n

6.95

o

7.68

p

1.82

q

0.11

r

6.02

s

6.28

t

9.10

u

2.88

v

1.11

w

2.09

x

9.17

y

2.11

z

0.07

    
An analysis of 40,000 words from the English language also indicates the frequency distribution of the 12 most common digraphs, as shown in Table 3-3.3
Table 3-3

Frequency Distribution of the 12 Most Common Diagraphs

th

1.52

he

1.28

in

0.94

er

0.94

an

0.82

re

0.68

nd

0.63

at

0.59

on

0.57

nt

0.56

ha

0.56

es

0.56

Cryptanalysis takes advantage of these structural language characteristics to decipher ciphertext. This analysis shows that a good encryption algorithm implements several substitution and transposition steps that alter or mask these language characteristics. In the ideal encryption algorithm, all digraphs, trigraphs, and n-word combinations should have an equal probability of appearing in the ciphertext. Other cryptanalysis techniques take advantage of word order and the rules of grammar.

Another desirable property of an encryption algorithm is that a small change in the plaintext should cause a large change in the ciphertext; this is called an avalanche effect. The avalanche effect thwarts differential cryptanalysis which attempts to decode ciphertext by examining how different messages encrypted by the same algorithm differ from each other.

Aside from cryptanalysis, language analysis is important in the field of natural language processing (NLP). NLP has important applications in the design of chatbots, spam filters, language translators, as well as sentiment analysis, behavior prediction, and classification.

Advanced Encryption Standard

The Advanced Encryption Standard (AES) is the US National Institute of Standards’ recommended symmetric encryption algorithm. In 1997, NIST announced a competition to replace the aging Data Encryption Standard. Four years later in 2001, AES was chosen as the winner among several competing encryption designs. AES is widely used by governments and financial institutions across the world. It is a very fast encryptor. Due to its small size, AES finds usage in devices ranging from microcontrollers to supercomputers. AES processes plaintext blocks of 128 bits and supports encryption key lengths of 128, 192, and 256 bits. It uses 10 encryption rounds if the selected key is 128 bits long, 12 rounds if the key is 192 bits, and 14 rounds if the key is 256 bits.

Table 3-4 enumerates some of the prominent symmetric encryption algorithms as well as their characteristics.
Table 3-4

Symmetric Encryption Algorithms

Name

Introduced In

Block Size (Bits)

Key Length (Bits)

Rounds

AES

2001

128

128, 192, 256

10, 12, 14

Serpent4

2001

128

128, 192, 256

32

RC2

1987

64

8–1024

18

Twofish5

2001

128

128, 192, 256

18

The Key Distribution Problem

For a party to decipher a ciphertext, he or she must possess the secret key. The problem arises when a secret key has to be distributed to a large number of recipients. In such a case, there is a significant danger that an adverse party will be able to intercept and appropriate the key. Symmetric key encryption is not a suitable technique when an encryption key has to be distributed to a large number of users. The solution to this key distribution problem lies in a cryptographic technique called public key encryption, which we will examine in a subsequent chapter.

Pseudo-Random Number Generators

Many cryptographic algorithms require the use of one or more random numbers, typically to initialize a process with a seed, generate keys, set a nonce, specify a salt, or randomize a variable. Instead of obtaining a random number by sampling some process in nature or on our computer, it is common practice to use a mathematical function to generate a sequence of pseudo-random numbers. A function that generates such a sequence is called a pseudo-random number generator (PRNG). A PRNG works as follows. The PRNG is seeded with some random number which is called a seed, and it then recursively generates a deterministic sequence of numbers. Note that, given the certain seed, the PRNG will always generate the same sequence of numbers.

A good pseudo-random number generator has statistical properties that make it very difficult to distinguish from a sequence of random numbers. In particular, in a high-quality PRNG, successive numbers are not correlated and the PRNG has a very large period before the numbers start to repeat.

The linear congruential generator (LCG) is a commonly used PRNG. It has a simple definition:6
Xn+1 = (m*xn + a) mod P
P >= 2 is an integer. P is the period of the PRNG.
The integer coefficient m is called the multiplier; m > 0 and m < P.
a is an additive integer constant where a > 0 and a < P.
The initial value of the generator x0 is the seed.
xn  is the nth pseudo-random number generated by the function.

Note that LCG is a recursive function. It is very easy to implement and has the advantage of being a fast pseudo-random number generator. The statistical randomness of LCG is sensitive to the modulus value P and the seed which is selected. The GNU C library, glibc, uses a period of 245, a multiplier value of 25214903917, and a seed value of 11, in the specification of its LCG.

The following code implements the linear congruential generator in Python:7
class LCG:
    def __init__(self,mult,addr,prd,seed):
       self.multiplier = mult
       self.addr = addr
       self.period = prd
       self.lastValue = seed
    def generator(self):
        self.lastValue = (self.multiplier*self.lastValue + self.addr) % self.period
        return self.lastValue
lcg = LCG(11, 37, 1000, 0)
ctr = 0
while ctr < 10:
  ret = lcg.generator()
  print(ret)
  ctr += 1

The Mersenne Twister8 is another high-quality PRNG of interest. In its most common derivation, it is seeded with the Mersenne prime number 219937 – 1 and it has a period of 219937 – 1.

Conclusion

In this chapter, we have examined the design of symmetric encryption algorithms and the cryptanalysis of these algorithms. We have also taken a look at the Advanced Encryption Standard, which is a widely used encryption algorithm. The distribution of encryption keys to a large number of users is a serious issue since it raises the possibility that a key may be stolen during the distribution process. In a subsequent chapter, we will show how public key encryption solves the key distribution dilemma. Finally, we have elucidated upon pseudo-random number generators.

In the next chapter, we will study cryptographic hash functions. These functions are particularly important in blockchain design.

References

Schneier, Bruce. Applied Cryptography: Protocols, Algorithms and Source Code in C, 1st ed., John Wiley & Sons, Inc., 1995

Ferguson Neils, Schneier, Bruce et al. Cryptography Engineering, 1st ed., Wiley, 2010

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.138.141