Arithmetic in Finite Fields

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

3.5. Arithmetic in Finite Fields

Many cryptographic protocols are based on the (apparent) intractability of the discrete logarithm problem (Section 4.2) in the multiplicative group of a finite field . The arithmetic of the finite fields , , and , , is easy to implement and run efficiently. In view of this, these two kinds of finite fields are most popular in cryptography and we concentrate our algorithmic study on these fields only.

A prime field is the quotient ring . In Section 3.3.4, we have already made a thorough study of the arithmetic of the rings , . We recall that the elements of are represented as integers from the set {0, 1, . . . , p – 1} and the arithmetic in is the modulo p integer arithmetic. Since p is typically multiple-precision, the characteristic p of is odd. The fields of even characteristic that we will study are the non-prime fields .

Section 2.9.3 explains several representations of extension fields. The most common one is the polynomial-basis representation for an irreducible polynomial f(X) of degree n in . In that case, an element of has the canonical representation as a polynomial a₀ + a₁X + · · · + a_n–1X^n–1, , of degree < n. An arithmetic operation on two elements of is the same operation in followed by reduction modulo the defining polynomial f(X). So we start with the implementation of the polynomial arithmetic over .

3.5.1. Arithmetic in the Ring

A polynomial over (or any field) is identified by its coefficients of which only finitely many are non-zero. Thus for storing a polynomial g(X) = a_dX^d + a_d–1X^d–1 + · · · + a₁X + a₀ it is sufficient to store the finite ordered sequence a_da_d–1 . . . a₁a₀. It is not necessary to demand a_d ≠ 0, but the shortest sequence representing a non-zero polynomial corresponds to a_d ≠ 0 and in this case deg g = d. On the other hand, as we see later it is often useful to pad such a sequence with leading zero coefficients. As an example, the polynomial is representable as 101 or as 0101 or as 00101 or · · ·.

Since can be viewed as the set {0, 1} with operations modulo 2, a polynomial in is essentially a bit string unique up to insertion (and deletion) of leading zero bits. As in the case of multiple-precision integers, we pack these coefficients in an array of 32-bit words and maintain the number of coefficients belonging to the polynomial. For example, the polynomial g(X) = X⁶⁴ + X³¹ + X⁷ + 1 can be stored in an array w₂w₁w₀ of three 32-bit words. w₀ consists of the coefficients of X⁰, X¹, . . . , X³¹, w₁ consists of the coefficients of X³², X³³, . . . , X⁶³, and w₂ consists of the coefficient of X⁶⁴. It is up to the implementation scheme to decide whether the coefficients are to be stored from left to right or from right to left in the bits of a word. We assume that less significant coefficients go to the less significant bits of a word. For the polynomial g above, the word w₀ viewed as an unsigned integer will then be w₀ = 2³¹ + 2⁷ + 1, whereas we have w₁ = 0. The least significant bit of w₂ would be 1. The remaining 31 bits of w₂ are not important and can be assigned any value as long as we maintain the information that only the coefficients of Xⁱ, 0 ≤ i ≤ 64, need to be considered. On the other hand, if we want to store the coefficients of g upto that of X⁸⁰, then the bits of w₂ at locations 1, . . . , 16 must be zero, whereas those at locations 17, . . . , 31 may be of any value. We, however, always recommend the use of leading zero-bits to fill the portion of the leading word not belonging to the polynomial.

Such a representation of elements of , in addition to being compact, facilitates efficient implementation of arithmetic functions. As we will shortly see, we need not often extract the individual coefficients of a polynomial but apply bit operations on entire words to process 32 coefficients simultaneously per operation. We usually do not need polynomials of degrees > 4096 for cryptographic applications. It is, therefore, sufficient to declare a static array capable of storing all the 8193 coefficients of a product of two such largest polynomials. The zero polynomial may be represented as one with zero word size, whereas the degree of the zero polynomial is taken to be –∞ which may be representable as –1.

We now describe the arithmetic functions on two non-zero polynomials

Equation 3.3

Under our implementation, a and b demand ρ := ⌈(r + 1)/32⌉ and σ := ⌈(s + 1)/32⌉ machine words α_{ρ – 1} . . . α₁α₀ and β_{σ – 1} . . . β₁β₀. We also assume paddings with leading zero bits in the areas not belonging to the operands.

Note that the addition of is the same as the XOR (⊕) of two bits. Applying this bit operation on words α_i and β_i adds 32 coefficients of the operand polynomials simultaneously (see Algorithm 3.18). Finally note that –1 = 1 in any field of characteristic 2, that is, subtraction is the same as addition in such a field.

The product a(X)b(X) can be computed as in Algorithm 3.19. Once again, using wordwise operations yields faster implementation. By AND and OR, we denote the bit-wise and and or operations on 32-bit words. The easy verification of the correctness of this algorithm is left to the reader. As in the case of addition, one might want to make the polynomial c compact after its words γ_{τ – 1}, . . . , γ₀ are computed.

Algorithm 3.18. Polynomial addition

Input: a(X), as in Equation (3.3).

Output: c(X) = a(X) + b(X) (to be stored in the array γ_{τ – 1} . . . γ₁γ₀).

Steps:

τ := max(ρ, σ).
for (i = 0, . . . , min(ρ, σ) – 1) γ_i := α_i ⊕ β_i.
if (ρ > σ) for (i = σ, . . . , ρ – 1) γ_i := α_i,
else if (ρ < σ) for (i = ρ, . . . , σ – 1) γ_i := β_i.
while (τ > 0) and (γ_{τ – 1} = 0) τ – –. /* Make c compact (optional) */

Algorithm 3.19. Polynomial multiplication

Input: a(X), as in Equation (3.3).

Output: c(X) = a(X)b(X) (to be stored in the array γ_{τ – 1} . . . γ₁γ₀).

Steps:

τ := ρ + σ – 1.     /* The size of the product */
for (i = 0, . . . , τ – 1) γ_i := 0.     /* Initialize the product */

/* The quadratic multiplication loop */
for (k = 0, . . . , 31) {    /* For each bit position in a word */
   for (j = 0, . . . , σ – 1) {     /* For each word of b */
      if (b_j AND 2^k) {     /* if the k-th bit of b_j is 1 */
         for (i = 0, . . . , ρ – 1) {    /* For each word of a */
            set γ_{i + j} := γ_{i + j} ⊕ (a_i ≪ k) and γ_{i + j + 1} := γ_{i + j + 1} ⊕ (a_i ≫ (32 – k)).
         }
      }
   }
}

The square of can be computed very easily using the fact that

a(X)² = (a_rX^r + · · · + a₁X + a₀)² = a_rX^2r + · · · + a₁X² + a₀.

This gives us a linear-time (in terms of r or ρ) algorithm instead of the quadratic general-purpose multiplication Algorithm 3.19. We leave the implementational details to the reader.

Division with remainder in is implemented in Algorithm 3.20. As before, we continue to work with the operands a(X) and b(X) as in Equation (3.3). But now we make a further assumption that b_s = 1, so that β_σ–1 ≠ 0, and also that s ≤ r. When the Euclidean division loop of Algorithm 3.20 terminates, the array locations δ_σ–1, . . . , δ₁, δ₀ contain the remainder. The arrays γ and δ may be made compact to discard the leading zero bits, if any.

Algorithm 3.20. Euclidean division of polynomials

Input: a(X), as in Equation (3.3).

Output: c(X) = a(X) quot b(X) (to be stored in the array γ_{τ – 1} . . . γ₁γ₀) and d(X) = a(X) rem b(X) (to be sored in the array δ_ρ–1 . . . δ₁δ₀).

Steps:

τ := ⌈(s – r + 1)/32⌉.    /* The size of the quotient */
for i = 0, . . . , τ – 1 { γ_i := 0 }    /* Initialize c(X) to 0 */

for i = 0, . . . , ρ – 1 { δ_i := α_i }   /* Copy a(X) to d(X) */

/* Euclidean division loop */
for i = r, . . . , s {
   if (the coefficient of Xⁱ in d(X) is 1) {
       j := (i – s) quot 32, k := (i – s) rem 32.

       /* Set the coefficient of X^{i – s} of c(X) */
       γ_j := γ_j OR 2^k.

       /* Update d(X) := d(X) – X^{i – s}b(X) */
       for l = 0, . . . , σ – 1 {
          δ_{l + j} := δ_{l + j} ⊕ (b_l ≪ k).
          δ_{l + j + 1} := δ_{l + j + 1} ⊕ (b_l ≫ (32 – k)).
       }
    }
}

Computing modular inverses requires computation of extended gcds of polynomials in . We again start with the non-zero polynomials a(X), and compute polynomials d(X), u(X) and v(X) in with d(X) = gcd(a(X), b(X)) = u(X)a(X) + v(X)b(X), deg u < deg b and deg v < deg a. For polynomials, we do not have an equivalent of the binary gcd algorithm (Algorithm 3.8). We use repeated Euclidean divisions instead.

The proof for the correctness of Algorithm 3.21 is similar to that for Algorithm 3.8. Here, we introduce the variables r_k, U_k and V_k for k = 0, 1, 2, . . . . The initialization goes as: r₀ := a, r₁ := b, U₀ := 1, U₁ := 0, V₀ := 0 and V₁ := 1. During the k-th iteration (k = 1, 2, . . .), we first use Euclidean division to get r_k–1 = q_kr_k + r_{k + 1} which gives r_{k + 1} = r_k–1 – q_kr_k. We also compute U_{k + 1} = U_k–1 – q_kU_k and V_{k + 1} = V_k–1 –q_kV_k using the values available from the previous two iterations so as to maintain the relation r_{k + 1} = U_{k + 1}r₀ + V_{k + 1}r₁ for all k = 1, 2, . . . . In Algorithm 3.21, the k-th iteration of the while loop begins with x = r_k–1, y = r_k, u₁ = U_k and u₂ = U_k–1 and ends after updating the values to x = r_k, y = r_{k + 1}, u₁ = U_{k + 1} and u₂ = U_k. It is not necessary to maintain the values V_k in the main loop. After the loop terminates, one computes V_k = (r_k – U_kr₀)/r₁.

Modular arithmetic in is very much similar to the modular arithmetic in . If f(X) is a non-constant polynomial of (not necessarily irreducible), we represent elements of as polynomials in of degrees < n. Given two such polynomials a and b, we compute the sum a + b simply as the sum in . The product ab is computed by first computing the product ab in and then computing the remainder of Euclidean division of this product by f. Inverse of a modulo f exists if and only if gcd(a, f) = 1 (in ). In that case, extended gcd computation gives us polynomials u, v such that 1 = ua + vf, so that ua ≡ 1 (mod f). If a ≠ 0, then Algorithm 3.21 computes u with deg u < deg f = n, so that we take this u to be the canonical representative of a^–1 in . Finally, for the computation of the modular exponentiation a^e (mod f) can be done using an algorithm very similar to Algorithm 3.9 or Algorithm 3.10. We leave the details to the reader.

Algorithm 3.21. Extended gcd of polynomials

Input: Nonzero polynomials a, .

Output: Polynomials d, u, satisfying

d = gcd(a, b) = ua + vb, deg u < deg b, deg v < deg a.

Steps:

/* Initialize */
x := a, y := b, u₁ := 1, u₂ := 0.

/* Repeated Euclidean division */
while (y ≠ 0) {
   Simultaneously compute q := x quot y and r := x rem y (Algorithm 3.20).
   u := u₂ – qu₁, u₂ := u₁, u₁ := u,
   x := y, y := r.
}
d := x, v := (d – ua)/b.

3.5.2. Finite Fields of Characteristic 2

For the polynomial basis representation , we need an irreducible polynomial of degree n. We shortly present a probabilistic algorithm that generates a random monic irreducible polynomial in of given degree . Although we are interested only in the case q = 2, this algorithm holds even if q is any arbitrary prime or an arbitrary prime power.

First, we describe a deterministic polynomial-time algorithm for checking the irreducibility of a non-constant polynomial (over ). If f is reducible, it has a factor of degree i ≤ ⌊n/2⌋. Also recall (Theorem 2.40, p 82) that X^qⁱ –X is the product of all monic irreducible polynomials of of degrees dividing i. Therefore, if f has an irreducible factor of degree i, then gcd(f, X^qⁱ – X) = gcd(f, X^qⁱ – X rem f) will be a non-constant polynomial. Algorithm 3.22 employs these simple observations.

Now, recall from Section 2.9.2 that a random monic polynomial of of degree n is irreducible with probability approximately 1/n. Therefore, if we keep on checking for irreducibility random monic polynomials in of degree n, then after O(n) checks we expect to find an irreducible polynomial. This leads to the Las Vegas probabilistic Algorithm 3.23.

Algorithm 3.22. Check for irreducibility of a polynomial

Input: A non-constant polynomial .

Output: A (deterministic) certificate whether f is irreducible or not.

Steps:

n := deg f, g := X.
for i = 1, . . . , ⌊n/2⌋ {
g := g^q (mod f). /* Here g = X^qⁱ rem f */
if (deg(gcd(f, g – X)) > 0) { Return “f is reducible”. }
}
Return “f is irreducible”.

Algorithm 3.23. Generation of a random irreducible polynomial

Input: , n ≥ 2.

Output: A random monic irreducible polynomial of degree n.

Steps:

while (1) {
f := a random monic polynomial in of degree n.
if (f is irreducible) { Return }
}

Once the defining irreducible polynomial f is available, we carry out the arithmetic in as modular polynomial arithmetic with respect to the modulus f. This is described at the end of Section 3.5.1. Since this modular arithmetic involves taking the remainder of Euclidean division by f, it is sometimes expedient to choose f to be an irreducible polynomial of certain special types. The randomized algorithm described above gives a random monic irreducible polynomial f of degree n having on an average ≈ n/2 non-zero coefficients. The division algorithm (Algorithm 3.20) in that case takes time O(n²). On the other hand, if f is a sparse polynomial (like a trinomial), the Euclidean division loop can be rewritten to exploit this sparsity, thereby bringing down the running time of the division procedure to O(n). (See Exercise 3.34. Also see Exercise 3.38 for computing isomorphisms between different polynomial-basis representations of the same field.)

Let p be a prime and let . We have seen how to implement arithmetic in and hence by Exercise 3.35 that in too. If is an irreducible polynomial of degree n and if q = pⁿ, then and we implement the arithmetic of as the polynomial arithmetic of modulo f. Again by Exercise 3.35, this gives us the arithmetic of . Now, for and a monic irreducible polynomial we have a representation . Instead of having such a two-way representation of we may also represent as , where is a monic irreducible polynomial of degree nm. It usually turns out that the second representation of is more efficient. However, there are some situations where the two-way representation performs better. This is, in particular, the case when the arithmetic of can be made more efficient than the modular polynomial arithmetic of . For example, we might precompute tables of arithmetic operations of and use table lookups for performing the coefficient arithmetic of . This demands O(q²) storage and is feasible only when q is small. On the other hand, if we find a primitive element γ of and precompute a table that maps i ↦ γⁱ and another that maps γⁱ → i, then products in can be computed in time O(1) using table lookups. If, in addition, we store the Zech’s logarithm table (Section 2.9.3 ) for , then addition in can also be performed in O(1) time with table lookup. Both these three tables take O(q) memory which (though better than O(q²) storage of the previous scheme) is feasible only for small q.

3.5.3. Selecting Suitable Finite Fields

Not all finite fields are suitable for cryptographic applications. In this section, we discuss the desirable properties of a field so that secured protocols on can be developed. We first note that such protocols are usually based on the apparent intractability of the so-called discrete logarithm problem (DLP) (Section 4.2). As a result, selections of suitable fields are dictated by the known cryptanalytic algorithms to solve the DLP (See Section 4.4). We shall mostly concentrate on with either q = p a prime or q = 2ⁿ for some . By the bit size of q, denoted |q|, we mean the number of bits in the binary representation of q, that is, |q| = ⌈lg q⌉. As we have seen, each element of is representable using O(|q|) bits and, therefore, |q| is often also called the size of .

The first requirement on a cryptographically suitable field is that the size |q| should be sufficiently large. Recent cryptanalytic studies show that sizes |q| ≤ 512 are not secure enough. Sizes |q| ≥ 768 are recommended for secure applications. For long-term security, one might even require |q| ≥ 2048.

Any field of the recommended size is, however, not adequately secure. The cardinality #F_q = q must be such that q – 1 has at least one large prime divisor q′ (See the Pohlig–Hellman method in Section 4.4). By large, we usually mean |q′| ≥ 160. In addition, this prime factor q′ of q – 1 should be known to us. If q = p is a prime, then a safe prime or a strong prime serves our purpose (Definition 3.5, Algorithm 3.14). Also see Exercise 3.25. On the other hand, if q = 2ⁿ, the only way to obtain q′ is by factorizing the Mersenne number M_n := q – 1 = 2ⁿ – 1. Factorizing M_n for n ≥ 768 is a very difficult task. Luckily, extensive tables of complete or partial factorizations of M_n are available. For example, for n = 769 (a prime number), we have

M₇₆₉ = 2⁷⁶⁹ – 1 = 1,591,805,393 × 6,123,566,623,856,435,977,170,641 × q′,

where q′ is a 657-bit prime. These tables should be consulted for choosing a suitable value of n.

The multiplicative group is cyclic (Theorem 2.38). If the complete integer factorization of q – 1 is known, then it is possible to find, in polynomial time (in |q|), a primitive element of . Algorithm 3.24 computes r = O(lg n) exponentiations in G in order to conclude whether a given element is a generator of G. For , we have polynomial-time exponentiation algorithms, so Algorithm 3.24 runs in deterministic polynomial time. By Exercise 2.47 , the probability of a randomly chosen element of G being primitive is φ(m)/m. In view of the lower bound on φ(m)/m, given in Theorem 3.1 and proved by Rosser and Schoenfield [253], Algorithm 3.25 is expected to return a random primitive element of G after O(ln ln m) iterations.

Theorem 3.1.

Let , m ≥ 5. Then φ(m)/m ≥ 1/(6 ln ln m).

Algorithm 3.24. Check for primitive element

Input: A cyclic group G of cardinality #G = m with known factorization and an element .

Output: A deterministic certificate that a is a generator of G.

Steps:

/* We assume that G is multiplicatively written and has the identity e */
for i = 1, . . . , r {
if (a^(n–1)/p_i = e) { Return “a is not a generator of G”. }
}
Return “a is a generator of G”.

Algorithm 3.25. Computation of a generator of a finite cyclic group

Input: A cyclic group G of cardinality #G = m with known factorization .

Output: A generator g of G.

Steps:

while (1) {
g := a random element of G.
if (g is a generator of G) /* Algorithm 3.24 */ { Return }
}

If, however, the factorization of #G = m is not known, there are no known (deterministic or probabilistic) algorithms for finding a random generator of G or even for checking if a given element of G is primitive. This is indeed one of the intractable problems of computational algebraic number theory. This problem for can be bypassed as follows.

Recall that we have chosen q in such a way that has a large known prime factor q′. Let H be the unique subgroup of G of order q′. Then H is also cyclic and we choose to work in H (using the arithmetic of G). It turns out that if q′ ≥ 2¹⁶⁰ and if H is not contained in a proper subfield of , the security of cryptographic protocols over does not degrade too much by the use of H (instead of the full G) as the ground group. But we now face a new problem, that is, the problem of finding a generator of H. Since #H = q′ is a prime, every element of H {1} is a generator of H. So the problem essentially reduces to that of finding any non-identity element of H. This latter problem has a simple probabilistic solution. First of all, if q – 1 = q′ is itself prime, choosing any random non-identity element of will do. So assume q′ < q – 1. Choose a random and let b := a^{(q – 1)/q′}. By Lagrange’s theorem (Theorem 2.2, p 24), b^q′ = a^q–1 = 1 and, therefore, by Proposition 2.5 . Now, being a field, the polynomial can have at most (q – 1)/q′ roots in (that is, in ) and hence the probability that b = 1 is ≤ ((q – 1)/q′)/(q – 1) = 1/q′. This justifies the randomized polynomial running time of the Las Vegas Algorithm 3.26. Indeed if q′ ≥ 2¹⁶⁰, the while loop of the algorithm is executed only once almost always.

Algorithm 3.26. Computation of an element of given order

Input: A finite field and an (odd) prime factor q′ of q – 1 with q′ < q – 1.

Output: An element of multiplicative order q′.

Steps:

while (1) {
   a := a random element of   {0, ±1}.
   b := a^{(q – 1)/q′}.
   if (b ≠ 1) { Return }
}

3.5.4. Factoring Polynomials over Finite Fields

Polynomial factorization over finite fields is an interesting computational problem. All deterministic algorithms known for this purpose are quite poor: that is, fully exponential in the size of the field. However, if randomization is allowed, we have reasonably efficient (polynomial-time) algorithms. In this section, we outline the basic working of the modern probabilistic algorithms for polynomial factorization over finite fields. We assume that a non-constant polynomial is to be factored. Without loss of generality, we can take f to be monic. We assume further that the arithmetic of and that of is available. We work with a general value of q = pⁿ, p prime and , though in some cases we have to treat the case p = 2 separately. Irreducibility (or otherwise) in this section means the same over .

The factorization algorithm we are going to discuss is a generalization of the root finding algorithm (see Exercise 3.36) and consists of three steps:

Square-free factorization (SFF) Decompose the input polynomial f into a product of square-free polynomials.

Distinct-degree factorization (DDF) Given a square-free polynomial f of degree d, compute f = f₁ . . . f_d with each f_i being a product of irreducible polynomials of degree i.

Equal-degree factorization (EDF) Given a product f of irreducible polynomials of the same degree, find out the irreducible factors of f.

We now provide a separate detailed discussion for each of these three steps.

Square-free factorization

Theorem 3.2 is at the very heart of the square-free factorization algorithm and is a generalization of Exercise 2.61.

Theorem 3.2.

Let K be a field and a non-constant monic polynomial. Then the polynomial f / gcd(f, f′) is square-free, where f′ is the formal derivative of f. In particular, f is square-free if and only if gcd(f, f′) = 1.

Proof

Let be the factorization of f with pairwise distinct monic irreducible polynomials f₁, . . . , f_r, , with and with . In order to determine v_f₁(f′), we employ the usual rules for derivatives to get for some . If , then v_f₁(f′) ≥ α₁. Otherwise, v_f₁(f′) = α₁ – 1, since , i > 1. Similar is the case for v_fi(f′) for i = 2, . . . , r. It follows that gcd, where each , so that , , is square-free.

The algorithm for SFF over is now almost immediate except for one subtlety, namely, the consideration of the case f/gcd(f, f′) = 1, or equivalently, f′ = 0. In order to see when this case can occur, let us write the non-zero terms of f as f = a₁X^e₁ + · · · + a_tX^e_t with distinct exponents e₁, . . . , e_t and . Then f′ = a₁e₁X^{e₁ – 1} + · · · + a_te_tX^{e_t – 1} = 0 if and only if , that is, if p divides all of e₁, . . . , e_t. But then f(X) = h(X)^p, where , since for all i. These observations motivate the recursive Algorithm 3.27. It is easy to check that this (deterministic) algorithm runs in time polynomially bounded by deg f and log q.

Algorithm 3.27. Square-free factorization

Input: A monic non-constant polynomial , q = pⁿ, p prime, .

Output: A square-free factorization of f.

Steps:

Compute f′.
if (f′ = 0) {
    Compute  such that f = h^p.
    Recursively compute a SFF h = h₁ · · · h_s of h.
    Return the SFF of f as f = (h₁ · · · h_s)(h₁ · · · h_s) · · · (h₁ · · · h_s) (p times).
} else {
    Recursively compute a SFF gcd(f, f′) = g₁ · · · g_s of gcd(f, f′).
    Return the SFF of f as f = (f/ gcd(f, f′))g₁ · · · g_s.
}

Distinct-degree factorization

Let be a square-free polynomial of degree d. We can write f = f₁ · · · f_d, where for each i the polynomial is the product of all the irreducible factors of f of degree i. If f does not have an irreducible factor of degree i, then we take f_i = 1 as usual.^[5] In order to compute the polynomials f_i, we make use of the fact that is the product of all monic irreducible polynomials in whose degrees divide i (see Theorem 2.40 on p 82). It immediately follows that . Thus a few (at most d) gcd computations give us all f_i. But the polynomials are of rather large degrees. But since , keeping polynomials reduced modulo f implies that we take gcds of polynomials of degrees ≤ d. This, in turn, implies that the DDF can be performed in (deterministic) polynomial time (in d and ln q).

^[5] Conventionally, an empty product is taken to be the multiplicative identity and an empty sum to be the additive identity.

Algorithm 3.28 shows an implementation of the DDF. Though the algorithm does not require f to be monic, there is no harm in assuming so.

Algorithm 3.28. Distinct-degree factorization

Input: A (non-constant) square-free polynomial .

Output: The DDF of f, that is, the polynomials f₁, . . . , f_d as explained above.

Steps:

g := f.   /* Make a local copy of f */
h = X. i = 1.
while (deg g ≠ 0) {
   h := h^q (mod f).   /* Modular exponentiation */
   f_i := gcd(h – X, g).
   g := g/f_i.    /* Factor out f_i from g */
   i++.
}
if (i < d) { f_{i + 1} := 1, . . . , f_d := 1. }

This simple-minded implementation of the DDF is theoretically not the most efficient one known. In fact, it turns out that the DDF (and not the seemingly more complicated EDF) is the bottleneck of the entire polynomial factorization process. Therefore, making the DDF more efficient is important and there are lots of improvements suggested in the literature. All these improved algorithms essentially do the same thing as above (that is, the computation of ), but they optimize the computation of the polynomials rem f. The best-known method (due to Kaltofen and Shoup) is based on the observation that, in general, most of the f_i are 1. Therefore, instead of computing each , one may break the interval 1, . . . , d into several subintervals I₁, I₂, . . . , I_l and compute , j = 1, . . . , l. Only those F_j that turn up to be non-constant are further decomposed.

For cryptographic purposes, we will, however, deal with rather small values of d = deg f. (Typically d is at most a few thousands.) The asymptotically better algorithms usually do not outperform the simple Algorithm 3.28 for these values of d.

Equal-degree factorization

Equal-degree factorization, the last step of the polynomial factorization process, is the only probabilistic part of the algorithm. We may assume that f is a (monic) square-free polynomial of degree d and that each irreducible factor of f has the same (known) degree, say δ. If d = δ, then f is irreducible. So we assume that d > δ, that is, d = rδ for some . Theorem 3.3 provides the basic foundations for the EDF.

Theorem 3.3.

Let g be any polynomial in and let . Then X^{q^δ} – X divides g^{q^δ} – g.

Proof

If g = 0, there is nothing to prove. If g = a_lX^l + · · · + a₁X + a₀ ≠ 0 with , then g^{q^δ} – g = a_l(X^{lq^δ} – X^l) + · · · + a₁(X^{q^δ} – X). It is easy to verify that X^{q^δ} – X divides X^{iq^δ} – Xⁱ for every .

Now, we have to separate two cases, namely, q is odd and q is even. Theorem 3.3 is valid for any q, even or odd, but taking q as odd allows us to write g^{q^δ} – g = g(g^{(q^δ –1)/2}–1)(g^{(q^δ –1)/2} + 1). With the above assumptions on f we have f|(X^{q^δ} –X) and, therefore, f|(g^{q^δ} – g), so that f = gcd(g^{q^δ} – g, f) = gcd(g, f) gcd(g^{(q^δ –1)/2} – 1, f) gcd(g^{(q^δ –1)/2} + 1, f). If g is randomly chosen, then gcd(g^{(q^δ –1)/2} – 1, f) is with probability ≈ 1/2 a non-trivial factor of f. The idea is, therefore, to keep on choosing random g and computing until one gets . One then recursively applies the algorithm to and . It is sufficient to choose g with deg g < 2δ. Obviously, the exponentiation g^{q^δ} has to be carried out modulo f. We leave the details to the reader, but note that trying O(1) random polynomials g is expected to split f and, therefore, the EDF runs in expected polynomial time.

For the case q = 2ⁿ, essentially the same algorithm works, but we have to use the split g^{q^δ} + g = g^{2^nδ} + g = (g^{2^nδ–1} + g^{2^nδ–2} + · · · + g² + g)(g^{2^nδ–1} + g^{2^nδ–2} + · · · + g² + g + 1). Once again computing gcd(g^{2^nδ–1} + g^{2^nδ–2} + · · · + g² + g, f) for a random splits f with probability ≈ 1/2 and, thus, we get an EDF algorithm that runs in expected polynomial time.

Exercise Set 3.5

3.33

Find a (polynomial-basis) representation of

. Compute a primitive element in this representation.

3.34

Show that the running time of Algorithm 3.20 is O(s(r – s)) which reaches the maximum order of O(r²) = O(s²), when s ≈ r/2.
Suppose b is known to have e non-zero coefficients. Modify the Euclidean division loop of Algorithm 3.20 so that the algorithm runs in time O((r –s)e). [H] In particular, if e = O(1), the running time of Algorithm 3.20 becomes linear, namely O(r).

3.35

Implement the polynomial arithmetic of

given that of

3.36

Let q = pⁿ (p prime and

a non-constant polynomial and let g := gcd(f, X^q – X).

If S is the set of all roots of f in , show that . Thus, g is a square-free polynomial which splits over and has the same roots (over ) as f. If deg g = 0 or 1, then we know all the roots of g and hence of f. So, for the rest of this exercise, we assume that deg g ≥ 2.

Consider the case that p is odd. Let be arbitrary. Show that

(X + b)((X + b)^(q–1)/2 – 1)((X + b)^(q–1)/2 + 1) = X^q – X

and that

g = gcd(g, X + b) gcd(g, (X + b)^(q–1)/2 – 1) gcd(g, (X + b)^(q–1)/2 + 1).

Explain how Algorithm 3.29 produces two non-trivial factors of g (over ) in probabilistic polynomial time. [H] Write an algorithm to compute all the roots of f in .

Algorithm 3.29. Computing roots of a polynomial: odd characteristic

Input: A square-free polynomial that splits over .

Output: Polynomials g₁, with g = g₁g₂ and deg g_i ≥ 1 for i = 1, 2.

Steps:

if (g(0) = 0) { (g₁, g₂) := (X, g(X)/X), return. }
while (1) {
  Select a random element .
  h := (X + b)^(q–1)/2 – 1 (mod g).
  g₁ := gcd(g, h).
  if (1 ≤ deg g₁ < deg g) { g₂ := g/g₁, return. }
}

Now, assume that p = 2 and define the polynomial

Let be arbitrary. Show that
H(X + b)(H(X + b) + 1) = X^q – X
[H] and that
g(X) = gcd(g(X), H(X + b)) gcd(g(X), H(X + b) + 1).

Explain how Algorithm 3.30 produces two non-trivial factors of g (over ) in probabilistic polynomial time. Write an algorithm to compute all the roots of f in .

Algorithm 3.30. Computing roots of a polynomial: characteristic 2

Input: A square-free polynomial that splits over .

Output: Polynomials g₁, with g = g₁g₂ and deg g_i ≥ 1 for i = 1, 2.

Steps:

if (g(0) = 0) { (g₁, g₂) := (X, g(X)/X), return. }
while (1) {
   Select a random element .
   h := (X + b) + (X + b)² + (X + b)⁴ + · · · + (X + b)^{2^n–1} (mod g).
   g₁ := gcd(g, h).
   if (1 ≤ deg g₁ < deg g) { g₂ := g/g₁, return. }
}

3.37

Use Exercise 3.36 to compute all the roots of the following polynomials:

X⁶ + 6X⁴ + 4X² + 6 in .
X³ + (α² + α)X² + (α² + α + 1) in , where is represented as , α being a root of the polynomial X³ + X + 1.

3.38

Let f and g be two monic irreducible polynomials over

and of the same degree

. Consider the two representations

. In this exercise, we study how we can compute an isomorphism between these two representations. The polynomial f(Y) splits into linear factors over

. Consider a root α = α(Y) of f(Y) in

. Show that 1, α, α², . . . , α^n–1 is an

-basis of (the

-vector space)

. For i = 0, . . . , n – 1, write (uniquely)

with

, and consider the matrix A = (α_ij)_{0≤i≤n–1, 0≤j≤n–1}. Show that the map

that maps (the equivalence class of) a₀ + a₁X + · · · + a_n–1X^n–1 to (the equivalence class of) b₀ + b₁Y + · · · + b_n–1Y^n–1, where (b₀b₁ . . . b_n–1) = (a₀a₁ . . . a_n–1)A, is an

-isomorphism.

3.39

Let q = pⁿ for a prime p and

. We have seen that the elements of

can be represented as integers between 0 and p – 1, whereas the elements of

can be represented as polynomials modulo some irreducible polynomial

of degree n, that is, as polynomials of

of degrees < n. Show that the substitution X = p in the polynomial representation of elements of

gives a representation of elements of

as integers between 0 and q – 1. We call this latter representation of elements of

the packed representation. Compare the advantages and disadvantages of the packed representation over the polynomial representation.

3.40

Let G be a cyclic multiplicatively written group of order m (and with the identity element e). Assume that the factorization of

is known. Devise an algorithm that computes the order of an arbitrary element in G. [H]

3.41

Berlekamp’s Q-matrix factorization Let be a monic square-free polynomial of degree d, that admits a factorization f(X) = f₁(X) . . . f_r(X) with each monic, non-constant and irreducible. (Note that f_i are pairwise distinct, since f is square-free.) Let d_i be the degree of f_i.

Consider the ring

Show that . [H] A is an -vector space of dimension d.
Consider the map that maps x = X + 〈f(X)〉 to x^q – x. Show that is an -linear transformation with Ker , and so the nullity of equals the number of irreducible factors of f.
Let Q be the matrix of with respect to the basis 1, x, . . . , x^d–1. Describe an algorithm to compute Q. Also design an algorithm to compute a basis of Ker .
Show that if , then

For a suitable h(X), this is a non-trivial factorization of f. This procedure is efficient, when q is small.
Use Berlekamp’s method to factor X⁶ + X⁵ + X² + 1 over .