The mathematical foundations of blockchains and cryptocurrencies are rooted in an area of Mathematics called cryptography (or cryptology). In this chapter and the three chapters that follow, we will examine these mathematical foundations. Even though I am not going to present the mathematical theory in a rigorous lemma-theorem-corollary style, I am going to present the theory in a lucid and understandable manner. The goal is to provide you with a clear understanding of the underlying mathematical concepts and theory. This deep exposure will be more than sufficient to enable you to develop blockchain and cryptocurrency applications with confidence.
Alright, let’s begin our deep dive into cryptography.
How Symmetric Encryption Works
where D is the decryption algorithm. Note that the same key is used for encryption and decryption. Hence the term symmetric encryption key.
In order to decrypt this ciphertext, we simply replace each character in the ciphertext with the character which is three characters to its left.
Design of Symmetric Encryption Algorithms
As we have seen, an encryption algorithm takes plaintext and a secret key as inputs and produces a ciphertext as its output. The encryption algorithm must be independent of both the plaintext and the secret key. Typically, an encryption algorithm consists of a finite number of rounds where each round is a finite sequence of substitution and transposition operations. A substitution operation substitutes some other character for an input character in the text, and a transposition operation rearranges or permutes a portion of the input text. Each round of the encryption algorithm garbles the plaintext to some extent, and this garbled output becomes the input for the next round of the encryption algorithm. In order to recover the plaintext from the ciphertext, the operations of the encryption algorithm must be reversible. Therefore, the decryption algorithm simply executes the rounds and operations of the encryption algorithm in reverse. The Caesar cipher that we discussed previously is a simple substitution cipher, and there are no transposition operations.
Symmetric encryption algorithms come in two flavors. Block ciphers partition the plaintext into blocks of fixed length and encrypt the plaintext one block at a time. Stream ciphers encrypt plaintext one byte at a time and are meant to encrypt data streaming over a network.
The Relationship Between Key Size and Decryption Time
Key Size | Number of Keys in Keyspace | Average Time to Recover Plaintext (1016 Decryptions per Second) |
---|---|---|
8 bits | 256 | < 1 second |
64 bits | 264 | 0.25 hours |
128 bits | 2128 | 53.9*1012 years |
256 bits | 2256 | 18.3*1051 years |
This table shows that larger keys offer greater protection against brute-force attacks than smaller keys.
Frequency Distribution of English Characters
a | 8.12 | b | 1.49 | c | 2.71 | d | 4.32 |
e | 12.02 | f | 2.30 | g | 2.03 | h | 5.92 |
i | 7.31 | j | 0.10 | k | 0.69 | l | 3.98 |
m | 2.61 | n | 6.95 | o | 7.68 | p | 1.82 |
q | 0.11 | r | 6.02 | s | 6.28 | t | 9.10 |
u | 2.88 | v | 1.11 | w | 2.09 | x | 9.17 |
y | 2.11 | z | 0.07 |
Frequency Distribution of the 12 Most Common Diagraphs
th | 1.52 | he | 1.28 | in | 0.94 | er | 0.94 |
an | 0.82 | re | 0.68 | nd | 0.63 | at | 0.59 |
on | 0.57 | nt | 0.56 | ha | 0.56 | es | 0.56 |
Cryptanalysis takes advantage of these structural language characteristics to decipher ciphertext. This analysis shows that a good encryption algorithm implements several substitution and transposition steps that alter or mask these language characteristics. In the ideal encryption algorithm, all digraphs, trigraphs, and n-word combinations should have an equal probability of appearing in the ciphertext. Other cryptanalysis techniques take advantage of word order and the rules of grammar.
Another desirable property of an encryption algorithm is that a small change in the plaintext should cause a large change in the ciphertext; this is called an avalanche effect. The avalanche effect thwarts differential cryptanalysis which attempts to decode ciphertext by examining how different messages encrypted by the same algorithm differ from each other.
Aside from cryptanalysis, language analysis is important in the field of natural language processing (NLP). NLP has important applications in the design of chatbots, spam filters, language translators, as well as sentiment analysis, behavior prediction, and classification.
Advanced Encryption Standard
The Advanced Encryption Standard (AES) is the US National Institute of Standards’ recommended symmetric encryption algorithm. In 1997, NIST announced a competition to replace the aging Data Encryption Standard. Four years later in 2001, AES was chosen as the winner among several competing encryption designs. AES is widely used by governments and financial institutions across the world. It is a very fast encryptor. Due to its small size, AES finds usage in devices ranging from microcontrollers to supercomputers. AES processes plaintext blocks of 128 bits and supports encryption key lengths of 128, 192, and 256 bits. It uses 10 encryption rounds if the selected key is 128 bits long, 12 rounds if the key is 192 bits, and 14 rounds if the key is 256 bits.
The Key Distribution Problem
For a party to decipher a ciphertext, he or she must possess the secret key. The problem arises when a secret key has to be distributed to a large number of recipients. In such a case, there is a significant danger that an adverse party will be able to intercept and appropriate the key. Symmetric key encryption is not a suitable technique when an encryption key has to be distributed to a large number of users. The solution to this key distribution problem lies in a cryptographic technique called public key encryption, which we will examine in a subsequent chapter.
Pseudo-Random Number Generators
Many cryptographic algorithms require the use of one or more random numbers, typically to initialize a process with a seed, generate keys, set a nonce, specify a salt, or randomize a variable. Instead of obtaining a random number by sampling some process in nature or on our computer, it is common practice to use a mathematical function to generate a sequence of pseudo-random numbers. A function that generates such a sequence is called a pseudo-random number generator (PRNG). A PRNG works as follows. The PRNG is seeded with some random number which is called a seed, and it then recursively generates a deterministic sequence of numbers. Note that, given the certain seed, the PRNG will always generate the same sequence of numbers.
A good pseudo-random number generator has statistical properties that make it very difficult to distinguish from a sequence of random numbers. In particular, in a high-quality PRNG, successive numbers are not correlated and the PRNG has a very large period before the numbers start to repeat.
Note that LCG is a recursive function. It is very easy to implement and has the advantage of being a fast pseudo-random number generator. The statistical randomness of LCG is sensitive to the modulus value P and the seed which is selected. The GNU C library, glibc, uses a period of 245, a multiplier value of 25214903917, and a seed value of 11, in the specification of its LCG.
The Mersenne Twister8 is another high-quality PRNG of interest. In its most common derivation, it is seeded with the Mersenne prime number 219937 – 1 and it has a period of 219937 – 1.
Conclusion
In this chapter, we have examined the design of symmetric encryption algorithms and the cryptanalysis of these algorithms. We have also taken a look at the Advanced Encryption Standard, which is a widely used encryption algorithm. The distribution of encryption keys to a large number of users is a serious issue since it raises the possibility that a key may be stolen during the distribution process. In a subsequent chapter, we will show how public key encryption solves the key distribution dilemma. Finally, we have elucidated upon pseudo-random number generators.
In the next chapter, we will study cryptographic hash functions. These functions are particularly important in blockchain design.
References
Schneier, Bruce. Applied Cryptography: Protocols, Algorithms and Source Code in C, 1st ed., John Wiley & Sons, Inc., 1995
Ferguson Neils, Schneier, Bruce et al. Cryptography Engineering, 1st ed., Wiley, 2010