Chapter 3: Asymmetric Encryption

Asymmetric encryption means using different pairs of keys to encrypt and decrypt a message. A synonym of asymmetric encryption is public/private key encryption, but there are some differences between asymmetric and public/private key encryption, which we will discover in this chapter. Starting with a little bit of history of this revolutionary method of encryption/decryption, we will look at different types of asymmetric algorithms and how they help secure our credit cards, identity, and data.

In this chapter, we are going to cover the following topics:

  • Public/private key and asymmetric encryption
  • The Diffie-Hellman key exchange and the related man-in-the-middle problem
  • RSA and an interesting application for international threats
  • Introduction to conventional and unconventional attacks on RSA
  • Pretty Good Privacy (PGP)
  • ElGamal and its possible vulnerabilities

Let's dive in!

Introduction to asymmetric encryption

The most important function of private/public key encryption is exchanging a key between two parties, along with providing secure information transactions.

To fully understand asymmetric encryption, we must understand its background. This kind of cryptography is particularly important in our day-to-day lives. This is the branch of cryptography that's deputed to cover our financial secrets, such as credit cards and online accounts; to generate the passwords that we use constantly in our lives; and, in general, to share sensitive data with others securely and protect our privacy.

Let's learn a little bit about the history of this fascinating branch of cryptography.

The story of asymmetric cryptography begins in the late 1970s, but it advanced in the 1980s when the advent of the internet and the digital economy started to introduce computers to family homes. The late 1970s and 1980s was the period in which Steve Jobs founded Apple Inc. and was during the Cold War between the USA and the USSR. It was also a period of economic boom for many Western countries, such as Italy, France, and Germany. And finally, it was the period of the advent of the internet. The contraposition of the two blocs, Western and Eastern, with US allies on one side and the Soviet bloc on the other side, created opposing networks of spies that had their fulcrum in the divided city of Berlin. During this period, keys being exchanged in symmetric cryptography reached the point that the US Government's Authority for Communications Security (COMSEC), which is responsible for transmitting cryptographic keys, transported tons of keys every day. This problematic situation degenerated to a breaking point. Just to give an example, with the DES algorithm in the 1970s, banks dispatched keys via a courier that were handed over in person. The National Security Agency (NSA) in America struggled a lot with the key distribution problem, despite having access to the world's greatest computing resources. The issue of key distribution seemed to be unsolvable, even for big corporations dedicated to solving the hardest problems related to the future of the world, such as RAND – another powerful institution created to manage the problems of the future and to prevent breakpoint failures. I think that, sometimes, a breakpoint is just a way to clear up the situation instead of just ignoring it. Sometimes, issues have different ways they can be solved. In the case of asymmetric encryption, no amount of government money nor supercomputers with infinite computation and multiple brains at their service could solve a problem that, at a glance, would appear rather easy to solve.

Now that you have an idea of the main problem that asymmetric encryption solves, which is the key exchange (actually, we will see that in RSA, this problem gets translated into the direct transmission of the message), let's go deeper to explore the pioneers involved in the history of this extremely intriguing branch of cryptography.

The pioneers

Cryptographers can often appear to be strange people; sometimes, they could be introverts, sometimes extroverts. This is the case with Whitfield Diffie, an independent freethinker, not employed by the government or any of the big corporations. I met Diffie for the first time at a convention in San Francisco in 2016 while he was discussing cryptography with his famous colleagues, Martin Hellmann and Ronald Rives. One of the most impressive things that remained fixed in my mind was his elegant white attire, counterposed by his tall stature and long white hair and beard. Similar to an ever-young guy still in the 1960s, someone whose contemporary could be an agent at the Wall Street Stock Exchange or a holy man in India. He is one of the fathers of modern cryptography, and his name will be forever imprinted in the history of public/private key encryption. Diffie was born in 1944 and graduated from MIT in Boston in 1965. After his graduation, he was employed in the field of cybersecurity and later became one of the most authentic independent cryptographers of the 1970s. He has been described as cyberpunk, in honor of the new wave science fiction movement of the 1960s and 1970s, where cybernetics, artificial intelligence, and hacker culture combined into a dystopian futuristic setting that tended to focus on a combination of low life and high tech.

Back in the 1960s, the US Department of Defense began funding a new advanced program of research in the field of communication called Defense Advanced Research Projects Agency (DARPA), also called ARPA. The main ARPA project was to connect military computers to create a more resilient grade of security in telecommunications. The project intended to prevent a blackout of communications in the event of a nuclear attack, but also, the network allowed dispatches and information to be sent between scientists, as well as calculations that had been performed, to exploit the spare capacity of the connected computers. ARPANET started officially in 1969 with the connection of only four sites and grew quickly so much that in 1982, the project spawned the internet. At the end of the 1980s, many regular users were connected to the internet, and thereafter their number exploded.

While ARPANET was growing, Diffie started to think that one day, everyone would have a computer, and with it, exchange emails with each other. He also imagined a world where products could be sold via the internet and real money was abandoned in favor of credit cards. His great consideration of privacy and data security led to Diffie being obsessed with the problem of how to communicate with others without having any idea who was at the opposite end of the cable. Moreover, encrypting messages and documents is often done when sending highly valuable information; data encryption was starting to be used by the general public to hide information and share secrets with others. This was the time when the usage of cryptography became common, and it was not just for the military, governments, or academics.

The main issue to solve was that if two perfect strangers meet each other via the internet, how would it be possible to encrypt/decrypt a shared document without exchanging any additional information except for the document itself, which is encrypted/decrypted through mathematical parameters? This is the key exchange problem in a nutshell.

One day in 1974, Diffie went to visit IBM's Thomas Watson laboratory, where he was invited to give a speech. He spoke about various strategies for attacking the key distribution problem but the people in the room were very skeptical about his solution. Only one shared his vision: he was a senior cryptographer for IBM who mentioned that a Stanford professor had recently visited the laboratory and had a similar vision about the problem. This man was Martin Hellman.

Diffie was so enthusiastic that the same evening, he drove to the opposite side of America to Palo Alto, California to visit Professor Hellman. The collaboration between Diffie and Hellman will remain famous in cryptography for the creation of one of the most beautiful algorithms in the field: the Diffie-Hellman key exchange.

We'll analyze this pioneering algorithm in the next section.

The Diffie-Hellman algorithm

To understand the Diffie-Hellman (D-H) algorithm, we can rely on the so-called thought experiments or mental representation of a theory often used by Einstein.

A thought experiment is a hypothetical scenario where you mentally transport yourself to a more real situation than in the purely theoretical way of facing an issue. For example, Einstein used a very popular thought experiment to explain the theory of relativity. He used the metaphor of a moving train observed by onlookers from different positions, inside and outside of the train.

I will often apply these mental figurative representations in this book.

Let's imagine that we have our two actors, Alice and Bob, who want to exchange a message (on paper) but the main post office in the city examines the contents of all letters. So, Alice and Bob struggled a lot with different methods to send a letter secretly while avoiding any intrusion; for example, putting a key inside a metallic cage and sending it to Bob. But because Bob wouldn't have the key to open the cage, Alice and Bob would have to meet somewhere first so that Alice could give the key to Bob. Again, we return to the problem of exchanging keys.

After many, many attempts, it seemed to be impossible to arrive at a logical solution that would solve this problem, but finally, one day, Diffie, with Hellmann's support, found the solution, about which Hellmann later said, "God rewards fools!"

Let's explore a mental representation of what Alice and Bob should do to exchange the key:

  • Step 1: Alice puts her secret message inside a metallic box closed with an iron padlock and sends it to Bob, but holds on to the key herself. Now, remember that Alice locks the box using her key and doesn't give it to Bob.
  • Step 2: Bob applies one more lock to the cage using his private key and resends the box to Alice. So, after Bob has received the box for the first time, he can't open it. He just adds one more lock to the box.
  • Step 3: When Alice receives the box the second time, she is free to remove her padlock since the box remains secured with Bob's key, as shown in the following diagram, when Alice resends the box for the last time. Remember that the message is always inside the box. Right now, the box is only locked with Bob's padlock.
  • Step 4: When Bob receives the box this time, he can open it, because the box only remains locked with his padlock. Finally, Bob can read the content of the message sent by Alice that has been preserved inside the box.

As you can see, Alice and Bob never met each other to exchange any padlock keys. Note that in this example, the box was sent twice from Alice to Bob, while in the algorithm, it is not:

Figure 3.1 – The D-H algorithm using the Alice and Bob example

Figure 3.1 – The D-H algorithm using the Alice and Bob example

The preceding explanation and representation are not exactly what the algorithm does, but it provides a practical solution to a problem believed unsolvable: the key exchange problem.

Now, we have to transpose this practical argumentation into a logical-mathematical representation.

First of all, we will return to using modular mathematics while taking advantage of some particular properties of the operations made in finite fields.

The discrete logarithm

I will try to explain the math behind cryptography without using excessive notations, just because I don't want you to get confused, or to load this book with heavy mathematical dissertations. It's not within the scope of this book.

When we talk about finite fields, we are considering a finite group of (n) integer numbers, (Z), laying in a ring – let's say, (Zn). This group of numbers will be subjected to all the same mathematical laws, such as operations with standard integers. Since we are working in a finite field called (modulo n) here, we have to consider some critical issues that involve modular mathematics. As we saw in the previous chapter, operating in modulus means wrapping back to the first number each time we arrive at the end of the set. This is just like the clock's math, where we wrap back to 1 when we reach 12.

Essentially, remember that in a finite field, there is a numerical period in which the numbers and the results of the operations of the field recur. For example, if we have a set of seven integers, {0, 1, 2, 3, 4, 5, 6}, often abbreviated as (Z7), we have all the operations that have been performed inside this finite field wrapping back inside the integers of the field.

Here is a short example of operations within a finite field, (Z7, +, x), of addition and multiplication. Since all the operations, (modulo 7), will work, we have to consider that the numbers will wrap back to 0 each time the operation exceeds the number 7:

1 x 1 ≡ 1 (mod 7)

2 x 4 ≡ 1 (mod 7)

3 + 5 ≡ 1 (mod 7)

3 x 5 ≡ 1 (mod 7)

Therefore, let's use this modality of counting and consider that the = notation is equivalent to , which is the mathematics we learned at elementary school, where 2 + 2 = 4 doesn't properly work if we consider, for example, a finite field of modulo 3:

2 + 2 ≡ 1 (mod 3)

From high school mathematics, we recall that the [log a(z)] logarithm is a function where (a) is the base. We have to determine the exponent to give to (a) to obtain the number (z). So, for example, if a = 10 and z = 100, we find that the logarithm is 2 and we say that logarithm base 10 of 100 is 2. If we use Mathematica to calculate a logarithm, we have to compose a different notation, that is, Log[10, 100]= 2.

While working in the discrete field, things became more complicated, so instead of using a normal logarithm, we started working with a discrete logarithm.

So, let's say we have to solve an equation like this:

a^[x] ≡ b (mod p)

This would be a very hard problem, even if we know the value of (a) and (b), because there is no efficient algorithm known to solve the discrete logarithm, that is, [x].

Important Note

I have used square brackets to say that [x] is secret. Technically [x] is a discrete power, but the problems of searching for discrete logarithm and discrete power have the same computational difficulty.

Let's go a little bit deeper now to explain the dynamics of this operation. Let's consider computing the following:

2^4 ≡ (x) (mod 13)

First, we calculate 2^4 = 16, then divide 16:13 = 1 by the remainder of 3 so that x = 3.

The discrete logarithm is just the inverse operation:

2^[y] ≡ (x) (mod 13)

In this case, we have to calculate [y] while knowing the base is 2. It's possible to demonstrate that there are infinite solutions for [y] that generate (x).

Taking the preceding example, we have the following:

2^[y] ≡ 3 (mod 13) for [y]

One solution is y = 4, but it is not the only one.

The result of 3 is also satisfied for all the integers, (n), of this equation:

[y] = [y + (p-1)*n]

Let's prove n = 1:

2^[4+(13-1)*1] ≡ 2^16 (mod 13)   

2^16 ≡ 3 (mod 13)

But it is also valid for n = 2:

2^[4 +(13-1)*2] ≡ 2^28 (mod 13)      

2^28 ≡ 3 (mod 13)

And so on…

Hence, the equation has infinite solutions for all the integers; that is, (n ≥ 0):

[y] ≡ 2^[4 + 12 n] (mod 13)

There is no method yet for solving the discrete logarithm in polynomial time. So, in mathematics, as in cryptography, this problem is considered very hard to solve, even if the attacker has a lot of computation power.

Finally, we have to introduce the definition and the notation of a generator, (g), which is a particular number where we say that (g) generates the entire group, (Zp). If (p) is a prime number, this means that (g) can take on any value between 1 and p-1.

Explaining the D-H algorithm

D-H is not exactly an asymmetric encryption algorithm, but it can be defined properly as a public/private key algorithm. The difference between asymmetric and public/private keys is not only formal but substantial. You will understand the difference better later, in the RSA section, which covers a pure asymmetric algorithm. Instead, D-H gets a shared key, which works to symmetrically encrypt the message, [M].

The encryption that's performed with a symmetric algorithm is combined with a D-H shared key transmission to generate the cryptogram, C:

Symmetric-Algorithm E((D-H[k]), M) = C

In other words, we use the D-H algorithm to generate the shared secret key, [k], then with AES or another symmetric algorithm, we encrypt the message, [M].

D-H doesn't directly encrypt the secret message; it can only determine a shared key between two parties. This is a critical point, as we will see in the next paragraph.

However, for working in discrete fields and applying a discrete logarithm problem to shield the key from attackers when sharing it, Diffie and Hellman implemented one of the most robust and famous algorithms in cryptography.

Let's see how D-H works:

  • Step 1: Alice and Bob first agree on the parameters: (g) as a generator in the ring, (Zp), and a prime number, p (mod p).

    Step 2: Alice chooses a secret number, [a], and Bob chooses a secret number, [b].

    Alice calculates A ≡ g^a (mod p) and Bob calculates B ≡ g^b (mod p).

  • Step 3: Alice sends (A) to Bob and Bob sends (B) to Alice.
  • Step 4: Alice computes ka ≡ B^a (mod p) and Bob computes kb ≡ A^b (mod p).

    [ka = kb] will be the secret that's shared key between Alice and Bob.

The following is a numerical example of this algorithm:

  • Step 1: Alice and Bob agree on the parameters they will use; that is, g = 7 and (mod 11).

    Step 2: Alice chooses a secret number, (3), and Bob chooses a secret number, (6).

    Alice calculates 7^3 (mod 11) ≡ 2 and Bob calculates 7^6 (mod 11) ≡ 4.

  • Step 3: Alice sends (2) to Bob and Bob sends (4) to Alice.
  • Step 4: Alice computes 4^3 (mod 11) ≡ 9 and Bob computes 2^6 (mod 11) ≡ 9.

The number [9] is the shared secret key, [k], of Alice and Bob.

Analyzing the algorithm

In Step 2 of the algorithm, Alice calculates A ≡ g^a (mod p) and Bob calculates B ≡ g^b (mod p). Alice and Bob have exchanged (A) and (B), the public parameters of the one-way function. A one-way function has this name because it is impossible, (-x%), to return from the public parameter, (A), to calculate the secret private key, [a] (and the same for (B) with [b]), for the robustness of the discrete logarithm (see the The discrete logarithm section).

Another property of the modular powers is that we can write B^a and A^b (mod p), as follows:

B ^a ≡ (g^b)^a (mod p)

A ^b ≡ (g^a)^b (mod p)

So, for the property of modular exponentiations, we have the following:

g^(b*a) ≡ g^(a*b) (mod p)

For example, we have the following:

  • Alice: (7^6)^3 ≡ 7^(6*3) ≡ 9 (mod 11)
  • Bob: (7^3)^6 ≡ 7^(3*6) ≡ 9 (mod 11)

This is the mathematical trick that makes it possible for the D-H algorithm to work.

Now that we have understood the algorithm, let's highlight its defects and the possible attacks that can be performed on it.

Possible attacks and cryptanalysis on the D-H algorithm

The most common attack that's performed on the D-H algorithm is the man-in-the-middle (MiM) attack.

A MiM attack is when the attacker infiltrates a channel of communication and spies on, blocks, or alters the communication between the sender and the receiver. Usually, the attack is accomplished by Eve (the attacker) pretending to be one of the two true actors in the conversation:

Figure 3.2 – Eve is the man in the middle

Figure 3.2 – Eve is the man in the middle

Recalling what happened in Step 3 of the D-H algorithm, Alice and Bob exchanged their public parameters, (A) and (B).

Here, Alice sends (A) to Bob, then Bob sends (B) to Alice.

Now, Eve (the attacker) interferes within the communication by pretending to be Alice.

So, a MiM attack looks like this:

  • Step 3: Alice sends (A) to Bob and Bob sends (B) to Alice.
  • Here, we have the MiM attack: Eve sends (E) to Bob and then Bob sends (B) to Eve, assuming it is Alice.

Let's analyze Alice's function, A ≡ g^a (mod p) and Bob's function B ≡ g^b (mod p). This passage would be crucial if it was done in normal arithmetic and not in finite fields.

There is another possible attack, also known as the birthday attack, which is one of the most famous attacks performed on discrete logarithms. It is consolidated that among a group of people, at least two of them will share a birthday so that in a cyclic group, it will be possible to find some equal values (collisions) to solve a discrete logarithm.

Important Note

(E) here represents Eve's public parameter, which has been generated by her private key, not encryption.

Proceeding with the final part of the algorithm, you can see the effects of the MiM trick.

Suppose that Alice and Bob are using D-H to generate a shared private key to encrypt the following message:

Alice's message: Bob, please transfer $10,000.00 to my account number 1234567.

After Step 3, in which Bob and Eve (pretending to be Alice) have exchanged their public parameters, Eve sends the modified message to Bob (intercepted from Alice), which has been encrypted with the shared key.

Suppose, the encrypted message from Eve is bu3fb3440r3nrunfjr3umi4gj57*je.

Bob receives the preceding encrypted message (supposedly from Alice) and decrypts it with the D-H shared key, obtaining some plaintext.

Eve's modified message after the MiM attack is Bob, please transfer $10,000.00 to my account number 3217654.

As you will have noticed, the account number is Eve's account. This attack is potentially disruptive.

Analyzing the attack, Step 3 is not critical (as we have said) because (A) and (B) have been communicated in clear mode, but the question is: how can Bob be sure that (A) is coming from Alice?

The answer is, by using the D-H algorithm, Bob can't be sure that (A) comes from Alice and not from Eve (the attacker); similarly, Alice can't know that (B) comes from Bob either. In the absence of additional information about the identity of the two parties, relying only on the parameters received, the D-H algorithm suffers from this possible substitution-of-identity attack called MiM

This example shows the need for the sender (Alice) and the receiver (Bob) to have a way to be sure that they are who they say they are, and that their public keys, (A) and (B), do come from Alice and Bob, respectively. To prevent the problem of a MiM attack and identify the users of the communication channel, one of the most widely used techniques is a digital signature. We will look at digital signatures in Chapter 4, Introducing Hash Functions and Digital Signatures, which is entirely dedicated to explaining these cryptographic methods.

Moreover, it's possible for a public/private algorithm to identify the parties and overcome the MiM attack. In Part 4 of this book, I will show you some public/private key algorithms of the new generation that, although not asymmetric, have multiple ways to be signed.

Finally, a version of D-H can be implemented using elliptic curves. We will analyze this algorithm in Chapter 7, Elliptic Curves.

RSA

Among the cryptography algorithms, RSA shines like a star. Its beauty is equal to its logical simplicity and hidden inside is such a force that after 40 years, it's still used to protect more than 80 percent of commercial transactions in the world.

Its acronym is made up of the names of its three inventors – Rivest, Shamir, and Aldemar. RSA is what we call the perfect asymmetric algorithm. Actually, in 1997, the CESG, an English cryptography agency, attributed the invention of public-key encryption to James Allis in 1970 and the same agency declared that in 1973, a document was written by Clifford Cocks that demonstrates a similar version of the RSA algorithm.

The essential concept of the asymmetric algorithm is that the keys for encryption and decryption are different.

Recalling the analogy to padlocks I made in the The Diffie-Hellman algorithm section, when I described the D-H algorithm, we saw that anybody (not just Alice and Bob) could lock the box with a padlock. This is the true problem of MiM because the padlock can't be recognized as specifically belonging to Bob or Alice.

To overcome this problem, another interesting mental experiment can be done using a similar analogy but making things a little different.

Suppose that Alice makes many copies of her padlock, and she sends these copies to every postal office in the country, keeping the key that opens all the padlocks in a secret place.

If Bob wishes to send a secret message to Alice, he can go to a postal office and ask for Alice's padlock, put the message inside the box, and lock it.

In this way, Bob (the sender), from the moment he locks the box, will be unable to unlock it, but when Alice receives the box, she will be able to open it with her unique secure key.

Basically, in RSA (as opposed to D-H), Bob encrypts the message with Alice's public key. After the encryption process, even Bob is unable to decrypt the message, while Alice can decrypt it using her private key.

This was the step that transformed the concept of asymmetric encryption from mere theory to practical use. RSA discovered how to encrypt a message with the public key of the receiver and decrypt it with the private key. To make that possible, RSA needs a particular mathematic function that I will explain further later on when we explore the algorithm in detail.

As we mentioned previously, there were three inventors of this algorithm. They were all researchers at MIT, Boston, at the time. We are talking about the late 1970s. After the invention of the D-H algorithm, Ronald Rivest was extremely fascinated by this new kind of cryptography. He first involved a mathematician, Leonard Adleman, and then another colleague from the Computer Science department, Aid Shamir, who joined the group. What Rivest was trying to achieve was a mathematical way to send a secret message encrypted with a public key and decrypt it with the private key of the receiver. However, in D-H, the message can only be encrypted once the key has been exchanged, using the same shared key. Here, the problem was to find a way to send the message that had been encrypted with a public key and decrypted through the private key. But as I've said, it needed a very particular inverse mathematical function. This is the real added value of the RSA invention that we are going to discover shortly.

The tale of this discovery, as Rivest told it, is funny. It was April 1977 when Rivest and Adleman met at the home of a student for Easter. They drank a little too much wine and at about midnight, Rivest went back home. He started to think over the problem that had been tormenting him for almost a year. Laying on his bed, he opened a mathematics book and discovered the function that could be perfect for the goal that the group had.

The function he found was a particular inverse function in modular mathematics related to the factorization problem.

As I introduced in Chapter 1, Deep Diving into Cryptography, the problem of factorizing a large number made by multiplying two big prime numbers is considered very hard to solve, even for a computer with immense computation power.

Explaining RSA

To understand this algorithm, we will consider Alice and Bob exchanging a secret message.

Let's say that Bob wants to send a secret message to Alice, given the following:

  • M: The secret message
  • e: A public parameter (usually a fixed number)
  • c: The ciphertext or cryptogram
  • p, q: Two large random prime numbers (the private keys of Alice)

The following is the public and private key generation. As you will see, the core of RSA (its magic function) is generating Alice's private key, [d].

Key generation:

Alice's public key, (N), is given by the following code:

N = p*q

As we mentioned earlier, multiplying two big prime numbers makes (N) very difficult to factorize, and makes it generally very difficult for an attacker to find [p] and [q]. Alice's private key, [d], is given by the following code:

[d] * e ≡ 1 (mod[p-1]*[q-1])

Bob performs the encryption:

c ≡ M^e (mod N)

Bob sends the ciphertext, (c), to Alice. She can now decrypt (c) using her private key, [d]:

c^dM (mod N)

And that's it!

Note

The bold elements – M, d, p, and q – in the algorithm are protected and secret.

Numerical example:

M = 88

e = 9007

p = 101

q = 67

N = 6767

Step 1: Bob's encryption is as follows:

88^9007 ≡ 6621 (mod 6767)

Alice receives a cryptogram, that is, c = 6621.

Step 2: Alice's decryption is as follows:

9007* d ≡ 1 (mod (101-1)*(67-1))

d = 3943

6621^3943 ≡ 88 (mod 6767)

As you can see, the secret message, [M] = 88, comes back from Alice's private key, [d] = 3943.

Analyzing RSA

There are several elements to explain but the most important is to understand why this function, which is used for decrypting (c) and obtaining [M], works:

M ≡ c^[d] (mod N)

This is Step 2; that is, the decryption function. I have just inverted the notation by putting [M] on the left.

The reason it works is hidden in the key generation equation:

[d] * e ≡ 1 (mod [p-1]*[q-1])

[d] is Alice's private key. For Euler's theorem, the function will probably be verified because the numbers [p] and [q] are very big and [M] is probably a co-prime of (N). If this equation is verified, then we can rewrite the encryption stage as follows:

(M^e)^d (mod N)

For the properties of the powers and Euler's theorem, we have the following:

M^(e*d) (mod N)

de ≡ 1 (mod (p-1)*(q-1))

That is the same as writing M^1 = M (mod N).

So, by inserting [d] inside the decryption stage, Alice can obtain [M].

Conventional attacks on the algorithm

All the attacks that will be explained in the first part of this section are recognized and well known. That is why we are talking about conventional attacks on RSA.

The first three methods of attack on RSA are related to the (mod N) public parameter. To perform an attack on N = p*q, the attacker could do the following:

  • Use an efficient algorithm of factorization to discover p and q.
  • Use new algorithms that, under certain conditions, can find the numbers.
  • Use a quantum computer to factorize N (in the future).

Let's analyze the following three cases:

  • In the first case, an efficient algorithm of factorization is not yet known. The most common methods are as follows:
    • The general number field sieve algorithm
    • The quadratic sieve algorithm
    • The Pollard algorithm
  • In the second case, if (n) is the number of digits of N = p*q and the attacker knows the first (n/4) digits or the last (n/4) digits of [p] or [q], then it will be possible to factorize (N) in an efficient way. Anyway, there is a very remote possibility of knowing it. For example, if [p] and [q] have 100 digits and the first (or the last) 50 digits of [p] are known, then it's possible to factorize N.

    For more information, you can refer to Coppersmith's method of factorization. More cases related to Coppersmith attacks, as explained later in this section, are where the exponents, (e) or [d], and even the plaintext, [M], are too short.

  • If an attacker uses a quantum computer, it will be theoretically possible to factorize N in a short time with Shor's algorithm, and I am convinced that in the future, other, more efficient quantum algorithms will arise. I will explain this theory in more detail in Chapter 8, Quantum Cryptography, where we talk about quantum computing and Q-cryptography.

Finally, if we have a very short piece of plaintext, [M], and even the exponent, (e), is short, then RSA could be breakable. This is because the power operation, M^e, remains inside modulo N. So, in this phase of encryption, let's say we have the following:

M^e < N

Here, it's enough to use the e-th root of (c) to find ]M[.

Important Note

I have used open brackets, ]M[, to denote that the message has been decrypted.

Numerical example:

M = 2

e = 3

N = 77

2^3 ≡ 8 (mod 77)

Since e = 3, by performing a simple cubic root, , we can obtain the message in cleartext:

8^(1/3) = 2

Here, we are working in linear mathematics and no longer in modular mathematics.

A way to overcome this problem is to lengthen the message by adding random bits to it. This method is very common in cryptography and cybersecurity and is known as padding.

There are different ways to perform padding, but here, we are talking about bit padding. As we covered in Chapter 1, Deep Diving into Cryptography, we can use ASCII code to convert text into a binary system, so the message, [M], is a string of bits. If we add random bits (usually at the end, though they could also be added at the start), we will obtain something like this:

... | 1011 1001 1101 0100 0010 0111 0000 0000 |

As you can see, the bold digits represent the padding.

This method can be used to pad messages that are any number of bits long, not necessarily a whole number of bytes long; for example, a message of 23 bits that is padded with 9 bits to fill a 32-bit block.

Now that we are more familiar with RSA and modular mathematics properties, we'll explore the first interesting application that was implemented with this algorithm.

The application of RSA to verify international treaties

Let's say that Nation Alpha wants to monitor seismic data from Nation Beta to be sure that they don't experiment with any nuclear bombs in their territory. A set of sensors has been installed on the ground of Nation Beta to monitor its seismic activity, recorded, and encrypted. Then, the output data is transmitted to Nation Alpha, let's say, via a satellite.

This interesting application of RSA works as follows:

  • Nation Alpha, (A), wants to be sure that Nation Beta, (B), doesn't modify the data.
  • Nation Beta wants to check the message before sending it (for spying purposes).

We name the data that's collected from the sensors [x]; so, the protocol works as follows:

Key generation:

  1. Alpha chooses the parameters, (N= p*q), as the product of two big prime numbers, and the (e) parameter.
  2. Alpha sends (N, e) to Beta.
  3. Alpha keeps the private key, [d], secret.

The protocol for threat verification on atomic experiments is developed as follows:

  • Step 1: A sensor located deep in the earth collects data, [x], performing encryption using the private key, [d]:

    x^d ≡ y (mod N)

  • Step 2: Initially, both the (x) and (y) parameters are sent by the sensor to Beta to let them verify the truthfulness of the information. Beta checks the following:

    y^e ≡ x (mod N)

  • Step 3: After the positive check, Beta forwards (x, y) to Alpha, who can control the result of (x):

    x ≡ y^e (mod N)

    Important Note

    (x) is the collected set of data from the sensor, while [d] is the private key of Alpha stored inside the protected software sensor that collects the data.

    This encryption is performed in the opposite way to how RSA usually works.

If the y^e ≡ x (mod N) equation is verified, Alpha can be confident that the data that's been sent from Beta is correct and they didn't modify the message or manipulate the sensor. That's because the encrypted message, (x), corresponding to the cryptogram, (y), can truly be generated by only those who know the private key, [d].

If Beta has previously attempted to manipulate the encryption inside the box that holds the sensor by changing the value of (x), then it will be very difficult for Beta to get a meaningful message.

As we mentioned previously, in this protocol, RSA is inverted, and the encryption is performed with the private key, [d], instead of (e), the public parameter, which is what normally happens.

Essentially, the difficulty here for Beta in modifying the encryption is to get a meaningful number or message. Trying to modify the cryptogram, (y), even if the (x) parameter is previously known, has the same complexity to perform the discrete logarithm (which is a hard problem to solve, as we have already seen).

We visualize the process by referring to the following diagram:

Figure 3.3 – The sensor buried underground with the data transmitted via satellite

Figure 3.3 – The sensor buried underground with the data transmitted via satellite

Now that we've learned how to use the international treaties that are performed with the RSA algorithm, I want to introduce a section that will be discussed later in Chapter 6, New Algorithms in Public/Private Key Encryption, related to unconventional attacks on the RSA algorithm and its most famous library, OpenSSL.

Unconventional attacks

I have called these algorithms unconventional because they have been implemented by me and not tested and published until now.

We will see as we continue through this book, these unconventional attacks against RSA are even valid for other asymmetric encryption algorithms. These unconventional attacks, implemented between 2011 and 2014, have the scope to recover the secret message, [M], without knowing Alice's decryption key, [d], and the prime secret numbers, [p] and [q], behind (N). I will showcase these unconventional attacks here in this section, but I will present these methods in more detail in Chapter 6, New Algorithms in Public/Private Key Encryption.

Among these algorithms, we have some attacks that are used against RSA, but they can be transposed to most of the asymmetric algorithms covered in this chapter.

A new algorithm that could break the factorization problem in the future is NextPrime. It derives from a genetic algorithm discovered by a personal dear friend who explained the mechanism to me in 2009, Gerardo Iovane. In his article, The Set of Prime Numbers, Gerardo Iovane describes how it is possible to get all the prime numbers starting from a simple algorithm, discarding the non-prime numbers from the pattern.

Note

For a more accurate and thorough discussion of Gerardo Iovane's genetic algorithm, you can read The Set of Prime Numbers at https://arxiv.org/abs/0709.1539.

After years of work and many headaches, I have arrived at a mathematical function that represents a curve; each position on this curve represents a prime number, and between many positions lie the semiprimes (N) generated by two primes. This curve geometrically represents all the primes of the universe. It turns out that the position of (N) lies always in-between the positions on the curve of the two prime numbers, [p] and [q], and (N) is almost equidistant from their positions. It's also possible to demonstrate that the prime numbers have a very clear order and are not randomly positioned and disordered as believed.

The distance between the two prime objects, [p] and [q], of the multiplication that determines (N) is equivalent to the number of primes lying between [p] and [q]. For example, the distance prime between 17 and 19 is zero, the distance between 1 and 100 is 25, and the distance between 10,000 and 10,500 is only 55.

Right now, this algorithm is only efficient under determinate conditions: for example, when [p] and [q] are rather close to each other (at a polynomial distance). However, the interesting thing is that it doesn't matter how big the two primes are. I did some tests with this algorithm using primes of the caliber of 10^1,000 digits.

Just to clarify how big such numbers are, you can consider that 10^80 represents the number of particles in the Universe. For instance, a semi-prime with an order of 10^1,000 digits corresponds to RSA's public key length of around 3,000 bits (one of the biggest public keys that's used in cryptography right now). It could be processed in an elapsed time of a few seconds using the NextPrime algorithm if the two primes are close to each other.

At the time of writing this book, I am working on a version of the NextPrime algorithm based on a quantum computer. It could be the next generation of quantum computing factorization algorithms (similar to the Shor algorithm, which we will look at in Chapter 8, Quantum Cryptography).

Now, let's continue to analyze how else it is possible to attack RSA. As shown in the following diagram, there are two points of attack in the algorithm: one is the factorization of (N), while the other is the discrete power, [M^e]:

Figure 3.4 – Points where it is possible to attack RSA

Figure 3.4 – Points where it is possible to attack RSA

Most of the conventional analysis of cryptologists is focused on factorizing (N), as we have seen. But RSA suffers not only from the factorization problem; there is also another problem linked to the exponent, (e), which we will see in Chapter 6, New Algorithms in Public/Private Key Cryptography. These methods are essentially backdoors that can recover the message, [M], without knowing the secret parameters of the sender: [d], [p], and [q]. What we will understand in Chapter 6, New Algorithms in Public/Private Key Cryptography, when we analyze these methods of attack, is that they are equivalent to creating a backdoor inside the RSA algorithm and its main library, OpenSSL.

The RSA paradigm that we've already examined states that Bob (the sender) cannot return the message once it has been delivered. However, if we apply those unconventional methods to break RSA, this paradigm is no longer valid. Bob can create his "fake encryption" by himself to return the message, [M], encrypted with Alice's public key, while the fake cryptogram can be decrypted by Alice using RSA's decryption stage.

Is this method an unreasonable model, unrepresentative of practical situations, or does it have practical uses?

I will leave the answer to this for Chapter 6, New Algorithms in Public/Private Key Cryptography. Now, let's explore another protocol based on the RSA implementation that has become a popular piece of software: PGP.

PGP

Pretty Good Privacy (PGP) is probably the most used cryptographic software in the world.

PGP was implemented by Philip Zimmermann during the Cold War. Philip started planning to take his family to New Zealand because he believed that, in the event of a nuclear attack, the country, so isolated from the rest of the world, would be less impacted by atomic devastation. At some point, while planning to move to New Zealand, something changed his mind and he decided to remain in the US.

To communicate with his friends, Zimmermann, who was an anti-nuclear activist, developed PGP to secure messages and files transmitted via the internet. He released the software as open source, free of charge for non-commercial use.

At the time, cryptosystems larger than 40 bits were considered to be like munitions. That is to say that cryptography is still considered a military weapon. There is a requirement that you must obtain if you decide to patent a new cryptosystem, you must have authorization from the Department of Defense to publish it. This was a problem that was encountered by PGP, which never used keys smaller than 128 bits. Penalties and criminal prosecutions for violating this legal requirement are very severe, which is why Zimmermann had a pending legal status for many years until the American government decided to close the investigation and clear him.

PGP is not a proper algorithm, but a protocol. The innovation here is just to merge asymmetric with symmetric encryption. The protocol uses an asymmetric encryption algorithm to exchange the key, and symmetric encryption to encrypt the message and obtain the ciphertext. Moreover, a digital signature is required to identify the user and avoid a MiM attack.

The following are the steps of the protocol:

  • Step 1: The key is transmitted using an asymmetric encryption algorithm (ElGamal, RSA).
  • Step 2: The key that's been transmitted with asymmetric encryption becomes the session key for the symmetric encryption (DES, Triple DES, IDEA, and AES – we covered this in Chapter 2, Introduction to Symmetric Encryption).
  • Step 3: A digital signature is used to identify the users (we will look at RSA digital signatures in Chapter 4, Introducing Hash Functions and Digital Signatures).
  • Step 4: Decryption is performed using the symmetric key.

PGP is a good protocol for very good privacy and for securing the transmission of commercial secrets.

The ElGamal algorithm

This algorithm is an asymmetric version of the D-H algorithm. ElGamal aims to overcome the problems of MiM and the impossibility of the signatures for key ownership in D-H. Moreover, ElGamal (just like RSA) is an authentic asymmetric algorithm because it encrypts the message without previously exchanging the key.

The difficulty here is commonly related to solving the discrete logarithm. As we will see later, there is also a problem related to factorization.

ElGamal is the first algorithm we'll explore that presents a new element: an integer random number, [k], that's chosen by the sender and kept secret. It's an important innovative element because it makes its encryption "ephemeral," in the sense that [k] makes the encryption function unpredictable. Moreover, we will frequently see this new element related to the zero-knowledge protocol in Chapter 5, Introduction to Zero-Knowledge Protocols.

Let's look at the implementation of this algorithm and how it is used to transmit the secret message, [M].

Alice and Bob are always the two actors. Alice is the sender and Bob is the receiver.

The following diagram shows the workflow of the ElGamal algorithm:

Figure 3.5 – Encryption/decryption of the ElGamal algorithm

Figure 3.5 – Encryption/decryption of the ElGamal algorithm

As shown in the last step of Bob's decryption, we can see an inverse multiplication in (mod p). This kind of operation is essentially a division that's performed in a finite field. So, if the inverse of A is B, we have A*B = 1 (mod p). The following example shows the implementation of this inverted modular function with Mathematica.

Now, having explained the algorithm, let's look at a numerical example to understand it.

Publicly defined parameters:

The public parameters are p (a large prime number) and g (a generator):

p = 200003

g = 7

Key generation:

Alice chooses a random number, [k], and keeps it secret.

k = 23 (Alice's private key).

Bob computes his public key, (B), starting from his private key, [b]:

b = 2367 (Bob's private key):

B ≡ 7^2367 (mod 200003)

B = 151854

Alice's encryption:

Alice generates a secret message, [M]:

M = 88

Then, Alice computes (y1, y2), the two public parameters that she will send to Bob:

y1 ≡ 7^23 (mod 200003)

y1 = 90914

y2 ≡ 88 * 151854^ 23 (mod 200003)

y2 = 161212

Alice sends the parameters to Bob (y1 = 90914; y2 = 161212).

Bob's decryption:

First, Bob computes (Kb) by taking y1 = 90914, which is elevated to his private key, [b] = 2367:

Kb ≡ 90914^2367 (mod 200003)

Kb = 10923

Inverted Kb in (mod p) [Reduce[Kb*x == 1, x, Modulus -> p] (performed with Wolfram Mathematica):

Inverted Kb ≡ 192331 (mod 200003)

Finally, Bob can return the message, [M].

Bob takes (y2) and multiplies it by [Inverted Kb], returning the message, [M]:

y2* InvKb ≡ M (mod 200003)

The final result is the message [M]:

161212 * 192331 ≡ 88 (mod 200003).

ElGamal encryption is used in the free GNU Privacy Guard (GnuPG) software. Over the years, GnuPG has gained wide popularity and become the de facto standard free software for private communication and digital signatures. GnuPG uses the most recent version of PGP to exchange cryptographic keys. For more information, you can go to the web page of this software: https://gnupg.org/software/index.html.

Important Note

As I have mentioned previously, it's assumed that the underlying problem behind the ElGamal algorithm is the discrete logarithm. This is because, as we have seen, the public parameters and keys are all defined by equations that rely on discrete logarithms.

For example, B ≡ g^b (mod p); Y1 ≡ g^k (mod p) and Kb ≡ y1^b (mod p) are functions related to the discrete logarithm.

Although the discrete logarithm problem is considered to be the main problem in ElGamal, we also have the factorization problem, as shown here. Let's go back to the encryption function, (y2):

y2 ≡ M*B^k (mod p)

Here, you can see that there is multiplication. So, an attacker could also try to arrive at the message by factorizing (y2).

This will be clearer if we reduce the function:

H ≡ B^k (mod p)

Then, we have the following:

y2 ≡ M* H (mod p)

This can be rewritten like so:

M ≡ y2/H (mod p)

As you can see, (y2) is the product of [M*H]. If someone can find the factors of (y2), they can probably find [M].

Summary

In this chapter, we analyzed some fundamental topics surrounding asymmetric encryption. In particular, we learned how the discrete logarithm works, as well as how some of the most famous algorithms in asymmetric encryption, such as Diffie-Hellmann, RSA, and ElGamal, work. We also explored an interesting application of RSA related to exchanging sensitive data between two nations. In Chapter 4, Introducing Hash Functions and Digital Signatures, we will learn how to digitally sign these algorithms.

Now that we have learned about the fundamentals of asymmetric encryption, it's time to analyze digital signatures. As you have already seen with PGP, all these topics are very much related to each other.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.116.21.109