Chapter 32. Entropy and Uncertainty

Entropy is an information-theoretic measure of the amount of uncertainty in a variable. Beginning with Shannon's seminal works [908, 909], cryptographers and information theorists used entropy to determine how well transformations on messages obscure their meaning. Entropy has applications in a wide variety of disciplines, including cryptography, compression, and coding theory. This chapter reviews the basics of entropy, which has its roots in probability theory.

Conditional and Joint Probability

  • Definition 32–1. A random variable is a variable that represents the outcome of an event.

A word about notation. We write p(X = x1) for the probability that the random variable X has value x1. When the specific value does not matter (for example, when all values are equiprobable), we abbreviate this as p(X).

Sometimes the results of two different events are of interest.

  • Definition 32–2. The joint probability of X and Y, written p(X, Y), is the probability that the random variables X and Y will simultaneously assume particular values.

For example, if X represents the roll of a die and Y the result of a coin flip, one such joint probability is p(X = 6, Y = heads). If the two random variables are independent (that is, if the value of one does not affect the value of the other), then

  • p(X, Y) = p(X)p(Y)
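
A minimal Python sketch makes this concrete, assuming a fair die X and a fair coin Y (the variable names are ours, chosen for illustration):

    from fractions import Fraction
    from itertools import product

    # All outcomes of a fair die (X) and a fair coin (Y) are equally likely.
    die = [1, 2, 3, 4, 5, 6]
    coin = ["heads", "tails"]
    outcomes = list(product(die, coin))

    p_joint = Fraction(1, len(outcomes))     # p(X = 6, Y = heads) = 1/12
    p_x = Fraction(1, len(die))              # p(X = 6) = 1/6
    p_y = Fraction(1, len(coin))             # p(Y = heads) = 1/2

    # Independence: the joint probability is the product of the individual ones.
    assert p_joint == p_x * p_y
    print(p_joint)                           # 1/12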

Now suppose that the two random variables are not independent. For example, let X be the roll of a red die and let Y be the sum of that roll and the roll of a blue die; the value of X clearly affects the probability of each value of Y.

When two events are not independent, the formula for joint probability is more complicated. The next definition captures the notion of dependence.

  • Definition 32–3. The conditional probability of X given Y (written p(X | Y)) is the probability that X takes on a particular value, given that Y has a particular value.

We can now write the formula for joint probability in terms of conditional probability:

  • p(X, Y) = p(X | Y)p(Y) = p(X)p(Y | X)

In fact, the formula for joint probability of two independent events is a special case of the formula above. When X and Y are independent random variables,

  • p(X | Y) = p(X)
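
These formulas can be checked by enumeration. The following Python sketch (an illustration, with helper names of our choosing) takes X to be the roll of a red die and Y to be the sum of the red die and a blue die, the dependent pair used in Exercise 1, and verifies both the chain rule p(X, Y) = p(X | Y)p(Y) and the fact that p(X | Y) differs from p(X) when the variables are not independent:

    from fractions import Fraction
    from itertools import product

    # X is the roll of a red die; Y is the sum of the red die and a blue die.
    # The 36 (red, blue) pairs are equally likely.
    outcomes = list(product(range(1, 7), repeat=2))

    def prob(event):
        # Probability of an event, given as a predicate on a (red, blue) pair.
        return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

    p_y8 = prob(lambda o: o[0] + o[1] == 8)                    # p(Y = 8) = 5/36
    p_x3_y8 = prob(lambda o: o[0] == 3 and o[0] + o[1] == 8)   # p(X = 3, Y = 8) = 1/36
    p_x3_given_y8 = p_x3_y8 / p_y8                             # p(X = 3 | Y = 8) = 1/5

    # Chain rule: p(X, Y) = p(X | Y) p(Y).
    assert p_x3_y8 == p_x3_given_y8 * p_y8

    # Not independent: p(X = 3 | Y = 8) differs from p(X = 3) = 1/6.
    assert p_x3_given_y8 != Fraction(1, 6)
    print(p_x3_given_y8)                                       # 1/5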

Entropy and Uncertainty

  • Definition 32–4. Let the random variable X take values from some set { x1, …, xn }. The value xi occurs with probability p(X = xi), where Σ1≤i≤n p(X = xi) = 1.

The entropy, or uncertainty, of X is

  • H(X) = −Σ1≤i≤n p(X = xi) lg p(X = xi)

where “lg x” is the base 2 logarithm of x. (For purposes of this definition, we define 0 lg 0 to be 0.)

This definition measures the uncertainty of a message in bits.
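
A small Python sketch, with an entropy helper of our own naming, computes H(X) for a few simple distributions; a fair coin carries 1 bit of uncertainty, a fair die lg 6 ≈ 2.585 bits, and a biased coin less than 1 bit:

    from math import log2

    def entropy(probs):
        # H(X) = -sum of p lg p over the distribution; 0 lg 0 is taken as 0,
        # so zero-probability terms are simply skipped.
        return -sum(p * log2(p) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))       # fair coin: 1.0 bit
    print(entropy([1/6] * 6))        # fair die: lg 6, about 2.585 bits
    print(entropy([0.9, 0.1]))       # biased coin: about 0.469 bits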

Joint and Conditional Entropy

Joint and conditional entropy are analogous to joint and conditional probability.

Joint Entropy

  • Definition 32–5. Let the random variable X take values from some set { x1, …, xn }. The value xi occurs with probability p(X = xi), where Σ1≤i≤n p(X = xi) = 1. Let the random variable Y take values from some set { y1, …, ym }. The value yj occurs with probability p(Y = yj), where Σ1≤j≤m p(Y = yj) = 1. The joint entropy of X and Y is

  • H(X, Y) = −Σ1≤i≤n Σ1≤j≤m p(X = xi, Y = yj) lg p(X = xi, Y = yj)
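
As an illustration (the joint_entropy helper is our own naming), the sketch below computes H(X, Y) for the independent die-and-coin pair, where each of the twelve outcomes has probability 1/12. The result, lg 12 ≈ 3.585 bits, equals H(X) + H(Y), anticipating Exercise 3:

    from math import log2
    from itertools import product

    def joint_entropy(joint):
        # H(X, Y) = -sum over (x, y) of p(x, y) lg p(x, y); 0 lg 0 is taken as 0.
        return -sum(p * log2(p) for p in joint.values() if p > 0)

    # A fair die and a fair coin are independent, so every pair has probability 1/12.
    joint = {(x, y): 1 / 12 for x, y in product(range(1, 7), ["heads", "tails"])}

    print(joint_entropy(joint))      # lg 12, about 3.585 bits = H(X) + H(Y)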

Conditional Entropy

  • Definition 32–6. Let the random variable X take values from some set { x1, …, xn }. The value xi occurs with probability p(X = xi), where Σ1≤i≤n p(X = xi) = 1. Let the random variable Y take values from some set { y1, …, ym }. The value yj occurs with probability p(Y = yj), where Σ1≤j≤m p(Y = yj) = 1. The conditional entropy, or equivocation, of X given Y = yj is

  • H(X | Y = yj) = −Σ1≤i≤n p(X = xi | Y = yj) lg p(X = xi | Y = yj)

The conditional entropy of X given Y is

  • H(X | Y) = −Σ1≤j≤m p(Y = yj) [Σ1≤i≤n p(X = xi | Y = yj) lg p(X = xi | Y = yj)]

The latter is the weighted mean of the conditional entropies of X given Y = yj for the possible values of Y.
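
The sketch below, again with helper names of our choosing, computes H(X | Y) as exactly this weighted mean for the red-die-and-sum example and checks numerically that H(X, Y) = H(X | Y) + H(Y), the identity of Exercise 4:

    from math import log2
    from itertools import product

    # X is the roll of a red die; Y is the sum of the red die and a blue die.
    outcomes = list(product(range(1, 7), repeat=2))

    joint = {}                                   # p(X = x, Y = y)
    for red, blue in outcomes:
        key = (red, red + blue)
        joint[key] = joint.get(key, 0.0) + 1 / len(outcomes)

    p_y = {}                                     # marginal p(Y = y)
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p

    def H(probs):
        # Entropy of a distribution given as an iterable of probabilities.
        return -sum(p * log2(p) for p in probs if p > 0)

    # H(X | Y) as the weighted mean of H(X | Y = y) over the values of Y.
    h_x_given_y = sum(
        py * H([joint.get((x, y), 0.0) / py for x in range(1, 7)])
        for y, py in p_y.items()
    )

    # Chain rule for entropy (compare Exercise 4): H(X, Y) = H(X | Y) + H(Y).
    print(H(joint.values()))                     # H(X, Y) = lg 36, about 5.17 bits
    print(h_x_given_y + H(p_y.values()))         # the same value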

Perfect Secrecy

Perfect secrecy arises when knowing the ciphertext does not decrease the uncertainty of the plaintext. More formally:

  • Definition 32–7. Let M be a random variable that takes values from the set of messages m1, …, mn. The cipher C = E(M) achieves perfect secrecy if H(M | C) = H(M).
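
The one-time pad is the classic example of a cipher with this property. The following sketch (an illustration, with variable names of our own) models a one-bit pad, C = M XOR K with the key K uniform and independent of M, and confirms that H(M | C) = H(M) even for a skewed message distribution:

    from math import log2

    # A one-bit one-time pad: C = M XOR K, with the key K uniform and independent
    # of the message M.  The message distribution below is arbitrary (chosen to be
    # skewed so the result is not an accident of symmetry).
    p_m = {0: 0.8, 1: 0.2}
    p_k = {0: 0.5, 1: 0.5}

    joint = {}                                   # p(M = m, C = c)
    for m, pm in p_m.items():
        for k, pk in p_k.items():
            c = m ^ k
            joint[(m, c)] = joint.get((m, c), 0.0) + pm * pk

    p_c = {}                                     # marginal p(C = c)
    for (m, c), p in joint.items():
        p_c[c] = p_c.get(c, 0.0) + p

    def H(probs):
        return -sum(p * log2(p) for p in probs if p > 0)

    # H(M | C): weighted mean of H(M | C = c) over the ciphertext values.
    h_m_given_c = sum(
        pc * H([joint.get((m, c), 0.0) / pc for m in p_m])
        for c, pc in p_c.items()
    )

    print(H(p_m.values()))       # H(M)     : about 0.722 bits
    print(h_m_given_c)           # H(M | C) : the same, so knowing C tells us nothing about M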

Exercises

1. Let X represent the roll of a red die, and let Y represent the sum of the values from rolling the red die and a blue die. Prove that p(X = 3 | Y = 8) = 1/5.

2. Prove that the maximum entropy for an unknown message chosen from the set of possible messages { “yes”, “no” } occurs when the probability of each message is 1/2.

3. Let X and Y be random variables that take values from finite sets. Prove that H(X, Y) ≤ H(X) + H(Y), with equality holding when X and Y are independent.

4. Let X and Y be random variables that take values from finite sets. Prove that H(X, Y) = H(X | Y) + H(Y).

5. Let M and C be random variables that take values from the set of possible plaintexts and the set of possible ciphertexts for some cryptosystem. Prove that the cryptosystem provides perfect secrecy if and only if p(M | C) = p(M).
