Blockchains are built based on a range of different cryptographic concepts. From safeguarding wallets and securing transactions to protecting consensus protocols and encrypting private data for anonymous accounts, almost everything needs cryptography to ensure proper functioning. Cryptography is the backbone of blockchain security.
This chapter will dive into everything you need to know about cryptography in blockchains, starting with the basics. Then, you will be introduced to the classical symmetric key cryptography, asymmetric cryptography, and more. As you advance, you will become well versed in how digital signatures work.
This chapter will also cover how to utilize hash functions to hash data, and at the end of the chapter, you will get an in-depth look at elliptic curve cryptography (ECC), a key type of encryption cryptography that is used in blockchain.
The basics of cryptography
Symmetric key cryptography
Asymmetric key cryptography
Digital signatures
Hash algorithms
ECC
Derived Ethereum addresses
The Basics of Cryptography
In Chapter 1, we have briefly discussed SHA-256 hashing cryptography which is used to hash block data.
The word “crypto” comes from the Greek word “kryptós,” meaning “hidden or secret.”
“Cryptography” means “secret writing” and allows for the exchange of secure messages between willing parties.
The message is converted into a secret code equivalent called “ciphertext” via an encryption algorithm to prevent unauthorized access. The ciphertext is then sent over a public network, it is decrypted at the receiving end, and the recipient can view its contents.
Cryptography is mainly divided into three categories: symmetric key cryptography, asymmetric key cryptography, and keyless primitives such as hash function.
Symmetric Key Cryptography
Symmetric key cryptography is a cryptographic algorithm that uses a shared secret key between a sender and a receiver to encrypt and decrypt data. This secret key is called a symmetric key.
In the preceding example, Alice and Bob share the same secret key. Alice uses the secret key to encrypt the message, and the secret key is used in the decryption process when Bob reads the message. This shows how symmetric encryption works.
Symmetric encryption is typically more efficient than asymmetric encryption. Therefore, it is often used with large amounts of data encryption, personal data encryption, and decryption.
While there are several symmetric algorithms, the first US standard was DES.
Symmetric key cryptography can use either stream ciphers or block ciphers to encrypt data. Using stream ciphers is the preferred way to encrypt data in most cases.
Stream Ciphers
Stream ciphers encrypt plaintext messages one bit or byte at a time, resulting in a single-character-in, single-character-out cipher. It applies a random keystream of characters and the XOR operation to each binary digit in a data stream.
XOR is a Boolean logic operation and is known as the “exclusive or” or exclusive disjunction. It yields true when only one out of two inputs is true. If both or no inputs are true, the XOR operation output is false.
Truth table for XOR
Input | Output | |
---|---|---|
A | B | A XOR B |
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 0 |
10101
XOR 00110
Output 10011
Keystream characters can be random combinations of any letters or numbers. Assume that we have a stream of plaintext bytes (p1, p2, p3, ..., pi), and the keystream generator outputs a stream of bytes (k1, k2, k3, ..., ki). To encrypt the stream of ciphertext bytes, the operand XOR needs to be applied to each plaintext and key to generate and encrypt the stream of ciphertext bytes (c1, c2, ..., ci). This can be expressed through the following:
c1 = p1 XOR k1, ..., ci = pi XOR ki
00110101 (plaintext)
XOR 11100011 (key)
Output 11010110 (plaintext)
11010110 (ciphertext)
XOR 11100011 (key)
Output 00110101 (plaintext)
Block Ciphers
The main difference between a block cipher and a stream cipher is that a block cipher takes a fixed-size block of plaintext bytes as a single unit and encrypts block data as a ciphertext byte. Generally, the block size is the same as the key size.
Assume we have a block of plaintext bytes p and key bytes k. To encrypt the block of ciphertext byte c, we need to encrypt the plaintext with key c = encrypt (p, k) and recover the plaintext by decrypting the ciphertext with key p = decrypt (c, k).
Each block is of equal size. Let’s say the input is larger than the number of blocks, where the input size is 38 bits, and the block size is 6 bits. After 6 blocks, there will be 2 bits left (38 – 6 * 6 = 2). In this case, we typically add padding (two 0) and append it to the end of the block.
The most commonly used types of block ciphers include advanced encryption standard (AES), data encryption standard (DES), and triple DES (3DES or TDEA).
Asymmetric Key Cryptography
Symmetric cryptography is relatively simpler and faster than asymmetric cryptography, only needing one key. It is typically used for big data encryption/decryption and confidentiality of bank transactions. Symmetric cryptography must share the same secret key before data encryption. In the previous example, the case becomes complex when Alice needs to send a private message to many people. If Alice uses a symmetric key (K) to encrypt all these messages and shares this key with Bob and others, there is a great risk that someone can secretly give a copy of K to others without Alice's knowledge. In this case, the entire communication channel has been compromised, and many unintended people can read and modify the messages and send them to any other members. To avoid this security risk, Alice must consider creating a large number of keys for each person that the message is sent to, but Alice will then have to remember all the secret keys. Alice needs to call her friends or be involved in face-to-face meetings over a trusted channel to distribute the secret keys. As a result, symmetric cryptography could quickly become less practical for many participants. This problem affected the industry in relation to the use of encryption for quite a long time, but in 1976, Diffie and Hellman introduced the concept of public key encryption, also known as asymmetric cryptography.1
Public and Private Keys
Whitfield Diffie and Martin Hellman described how public key encryption works when two communicating parties exchange information across an insecure channel using a key pair consisting of a public key and a private key.
A private key is known only to the owner, and a public key is considered public information that is available to anyone. Each key has been designed for a specific purpose.
The public key is used to encrypt a message and convert it into ciphertext. The private key is used to decrypt a message that has been encrypted with the public key.
How the Diffie-Hellman Algorithm Works
The Diffie-Hellman algorithm is based on a mathematic principle and uses the following formula:
ga (mod p)
Modulo is the remainder of a division operation. For example, 5 mod 3 = 2 because 2 would be left over. Mod can also be expressed as %.
With p, g as a prime number, g is a primitive root modulo p. The g and p numbers are public and can be seen and used by anyone.
In math, a g number is a primitive root modulo n if every integer relatively prime to n is congruent to a power of g modulo n.
20 = 1, 1 (mod 5) = 1, so 20 ≡ 1
21 = 2, 2 (mod 5) = 2, so 21 ≡ 2
23 = 8, 8 (mod 5) = 3, so 23 ≡ 3
22 = 4, 4 (mod 5) = 4, so 22 ≡ 4
Now, we will look at how the Diffie-Hellman algorithm works. Let's use a simple example to understand the algorithms of key exchange. Imagine Alice and Bob want to exchange information. Now, assume a hacker named Eve is trying to intercept the message.
Step 3 – Alice and Bob will then use the formula to calculate public values using the p and g parameters and their private values.
65 (mod 13) = 7776 mod 13 = 2
Bob will calculate using the following:
64 (mod 13) = 1296 mod 13 = 9
Step 5 – Alice and Bob both received the other side's public message, and they calculated the shared secret through the formula ga (mod p).
In this step, g is a public message sent from the other side.
95 (mod 13) = 59049 mod 13 = 3
Bob will calculate using the following equation:
24 (mod 13) = 16 mod 13 = 3
Both Alice and Bob have gotten the same values.
From the Diffie and Hellman example, we can see that both Alice and Bob, using the following equation, computed to get the same result:
(ga (mod p))b mod p = (gb (mod p))a mod p
The shared key is gab. Typically, a, b, and p are much larger values. This is needed to make results secure.
In the original Diffie and Hellman description, the algorithm does not provide identity verification for communicating parties. This led to the algorithm being vulnerable to man-in-the-middle attacks.
Look at the following example:
Eve intercepts the message between Alice and Bob and blocks communication between them.
Eve intercepts Alice’s public value (ga(mod p)) and sends Alice her own public value (gc(mod p)).
Eve intercepts Bob’s public value (gb(mod p)) and sends Bob his public value (gd(mod p)).
Diffie and Hellman is the first asymmetric cryptography protocol and provided the basis for many authenticated protocols, such as the elliptic curve Diffie-Hellman asymmetric algorithm and RSA, that are widely used today. Crypto, secure shell (SSH), secure sockets layer (SSL), email, and VPN security are all based on these asymmetric algorithms.
How Digital Signatures Work
- 1.
We begin with the message on Alice's (the sender’s) side.
- 2.
The algorithm generates a one-way hash of the message based on the document checksum.
- 3.
The hash is signed or encrypted using Alice's private key.
- 4.
The message is sent to Bob.
- 5.
Bob receives the message.
- 6.
Bob decrypts the message using Alice's public key and verifies the message authentication that was signed by Alice.
- 7.
The algorithm regenerates a hash of the message.
- 8.
The two hash values are compared. If they are identical, we ensure that the transmitted document has not been altered since signing.
Digital Signatures
A digital signature is an asymmetric public key cryptography technique that verifies digital messages or document owner authenticity.
We often use the public key for message encryption and the private key for message decryption in public key encryption. However, in the case of a digital signature, the message is encrypted with a sender's private key, or in other words, the message is signed.
As the signer's public key is known, anybody can verify a message's digital signature.
Authentication (verification of who signed the origin of the document)
Nonrepudiation (the identity of the signer and document that they have digitally signed should be undeniable)
Integrity (proof that the document has not been altered since signing)
Hash Algorithms
In Chapter 1, “How Blockchain Works” section, the hash concept was introduced, and the way that blockchain uses the SHA-256 hash function to hash transaction data was discussed. In the previous section, the process of digital signatures was discussed. The process indicated that the process needed to apply a hash function.
The Keccak-256 Algorithm
The Keccak-256 algorithm is a hash function that is widely used in an Ethereum blockchain. A few examples include Ethereum addresses, some smart contract functions, and the Ethereum consensus engine known as Ethash that plays an important role in producing blocks and other security actions.
The Keccak-256 is a family of SHA-3 hash functions. The function input can be a variable-length string or number, and generated output will always be a fixed-length, 64-character (letters and numbers) output. The output can be converted to hexadecimal numbers. Like all other hash functions, it is a one-way cryptographic hash function.
Keccak-256 is based on the sponge construction and is a sponge function family. Keccak-256 sponge function (Keccak[r,c]) needs two parameters: one of size r (the bitrate and the amount of data encoded for a single unit of time) and the other of size c (the capacity).
Padding
A padding function will append enough bits to the input data (M), and the length of the padded input can be split into multiple r-bit blocks.
Initialization
The padded input is broken into r-bit blocks, assuming the blocks' names are called M0, M1, M2, and so on.
The Absorbing Phase
The r-bit block is XORed with small chunks of the input data M0. The result is then passed to a compression function f. The output of function f XORed next message M1. The process is repeated until each of the message blocks Mn is processed. In each step, a small chunk of the input data (the bit length of r) is “absorbed” into the buffer.
The Squeezing Phase
The same process is repeated. The r-bit block of the buffer consists of the next r bits of output (Z0, Z1, Z2, and so on). The function f is used to extract r bits of data as the next r bits of the output. The process is repeated until the results are produced.
Elliptic Curve Cryptography (ECC)
ECC was discovered in 1985 by Victor Miller (IBM) and Neil Koblitz (University of Washington) independently and is currently one of the most robust and widely used types of encryption cryptography. Blockchain networks, such as Bitcoin and Ethereum, use ECC.
In 2005, the US National Security Agency (NSA) announced a set of unpublished algorithms known as Suite B protocols and posted a paper titled “The Case for Elliptic Curve Cryptography” in which they recommended that the US government use ECC to secure sensitive and unclassified communications. 2
The NSA commented that analysts should “take advantage of the past 30 years of public key research and analysis and move from first generation public key algorithms and on to elliptic curves.”
Suite B’s protocols included both elliptic curve Diffie-Hellman (ECDH) and elliptic curve Menezes-Qu-Vanstone (ECMQV) for key exchange, the elliptic curve digital signature algorithm (ECDSA) for digital signatures, the advanced encryption standard (AES) for symmetric encryption, and the secure hashing algorithm (SHA). After the release of these protocols, ECC soon became the de facto standard for protecting modern industry communications.
ECC is based on algebraic properties of elliptic curves and provides equivalent security with much smaller key sizes than other asymmetric cryptography systems, such as RSA or DSA. For example, a 256-bit ECC key is equal in power to a 3072-bit RSA key. This also makes elliptic curves significantly faster.
What Is an Elliptic Curve?
An elliptic curve is a curve given through the following equation:
y 2 = x 3 + Ax + B
The curve, as required, is nonsingular and needs to have no repeated roots or self-intersections. To ensure that the curve is nonsingular, the condition can be expressed through the following equation:
4A3 + 27B2 ≠ 0
y2 = x3 - x + 1
The group law for an elliptic curve is expressed through either of the following equations:
P + Q + R = 0, or P + Q = - R
The sum of the points P and Q is equal to the point - R.
With the addition of points P = (x1, y1) and Q = (x2, y2) of an elliptic curve, a third point can be calculated using the following formulas:
P + Q = R = (x3, y3) where
Elliptic curve E over Zp is defined by the following equations, and Zp is field modulo p, meaning that {0, 1, 2, …, n - 1}:
X = 0 → y2 = 3 → no solution (mod 5)
X = 1 → y2 = 6 → 6 (mod 5) = 1
→ y = 1, 4 because 12 (mod 5) = 1, and 42 (mod 5) = 1
X = 2 → y2 = 15 → 15 (mod 5) = 0
→ y = 0 because 02 (mod 5) = 0
X = 3 → y2 = 36 → 36 (mod 5) = 1
→ y = 1, 4 because 12 (mod 5) = 1, and 42 (mod 5) = 1
X = 4 → y2 = 75 → 75 (mod 5) = 0
→ y = 0 because 02 (mod 5) = 0
Then, the elliptic curve has the following seven points:
(1, 1), (1, 4), (2, 0), (3, 1), (3, 4), (4, 0), ∞.
Now, rearranging the modulus operator P = 263 leaves the following equation:
(x3 + 2x + 3) (mod 263)
In the elliptic curve, when if P is (x, y) ∈ E(Zp), then (x, y) + (x, -y) = ∞ (the point at infinity).
So, if adding ∞ to any point P in E(Zp), we can always get P back. This can be expressed through the following equation:
P + ∞ = ∞ + P = P for all ∈ E(Zp)
In Alice’s and Bob’s message communication, P is a (x,y) point on the curve that both Bob and Alice will agree to. Bob’s private key is represented by n, and K is his public key. We multiply P and n together to produce K, as shown as follows:
K = n P
If we multiply K with P, it will get the point on the elliptic curve.
Deriving an Ethereum Address
- 1.
Create a random private key and derive the public key from this private key using EC curve (secp256k1).
The private key is 64 hexadecimal characters long (32 bytes).
Here we use an online tool (https://kjur.github.io/jsrsasign/sample/sample-ecdsa.html) to get a public and private key pair. The generated public and private key pair can be seen in Figure 2-19.
Private key: 8c2b80899dd44981d7f8b38c1f5b13dbf1fbb98c360d9d1cfd63a3aed0d7b498
- 2.
Derive the address from the following public key:
Start with the public key (128 characters) and apply the Keccak-256 hash of the public key. It will generate a string that is 64 characters long (32 bytes). Let’s use the Keccak-256 online tool (https://emn178.github.io/online-tools/keccak_256.html) to see the hash result (see Figure 2-20).
- 3.
Get the Ethereum address
Take the last 20 bytes (40 characters) of the generated hash to get the Ethereum address with prefix 0x. 0x lets the people know the address in hexadecimal format. When prefixed with 0x, it becomes 42 characters long. So, in our example, the Ethereum address is
0x6a6d125cf3029f4ccbe72cc1b1a4c7a8c72467a3
- 4.
Verify the Ethereum address
We use one of the online Ethereum address validator tools (www.rfctools.com/ethereum-address-validator/) to verify our new Ethereum address.
Congratulations! We just generated a valid Ethereum address using our cryptography knowledge. At this level, you should get a good sense of how blockchain cryptography works.
Summary
Cryptography is an essential mechanism for securing blockchain technology. It is used to secure the blockchain consensus mechanism, protect blockchain data, keep user accounts safe, and more. The main purpose of this chapter was to give a more thorough understanding of cryptography by giving a quick overview of how it works.
Although it only scratches the surface of cryptography technologies, this chapter will help you to enrich your knowledge of symmetric key cryptography and asymmetric key cryptography. More importantly, we now know how digital signatures work. We covered the hash algorithm, and we walked-through elliptic curve cryptography to understand how it works. Lastly, we learned how to generate an Ethereum address.
We will continue our journey and learn about Bitcoin, the future of money in the next chapter.