Appendices

Appendix A Complex Numbers

The set ℝ of real numbers has deficiencies. For example, the equation x² + 1 = 0 has no real root; that is, no real number u exists such that u² + 1 = 0. This type of problem also exists for the set ℕ of natural numbers. It contains no solution of the equation x + 1 = 0, and the set ℤ of integers was invented to solve such equations. But ℤ is also inadequate (for example, 2x − 1 = 0 has no root in ℤ), and hence the set ℚ of rational numbers was invented. Again, ℚ contains no solution to x² − 2 = 0, so the set ℝ of real numbers was created. Similarly, the set ℂ of complex numbers was invented; it contains a root of x² + 1 = 0. More precisely, there is a complex number i such that

i² = −1.

However, the process ends here. The complex numbers have the property that every nonconstant polynomial with complex coefficients has a (complex) root. In 1799, at the age of 22, Carl Friedrich Gauss first proved this result, which is known as the Fundamental Theorem of Algebra. We give a proof in Section 6.6.

In this appendix, we describe the set ℂ of complex numbers. The set of real numbers is usually identified with the set of all points on a straight line. Similarly, the complex numbers are identified with the points in the Euclidean plane by labeling the point with Cartesian coordinates (a, b) as

(a, b) = a + bi.

Then the set ℂ of complex numbers is defined by

ℂ = {a + bi | a and b are real numbers}.

When this is done, the resulting Euclidean plane is called the complex plane.

Each real number a is identified with the point a = a + 0i = (a, 0) on the x-axis in the usual way, and for this reason the x-axis is called the real axis. The points bi = 0 + bi = (0, b) on the y-axis are called imaginary numbers, and the y-axis is called the imaginary axis.138 The diagram shows the complex plane and several complex numbers.

Identification of the complex number a + bi = (a, b) with the ordered pair (a, b) immediately gives the following condition for equality:

Equality Principle. a + bi = a′ + b′i if and only if a = a′ and b = b′.

For a complex number z = a + bi, the real numbers a and b are called the real part of z and the imaginary part of z, respectively, and are denoted by a = re z and b = im z. Hence, the equality principle becomes: two complex numbers are equal if and only if their real parts are equal and their imaginary parts are equal.

With the requirement that i² = −1, we define addition and multiplication of complex numbers as follows:

(a + bi) + (a′ + b′i) = (a + a′) + (b + b′)i
(a + bi)(a′ + b′i) = (aa′ − bb′) + (ab′ + a′b)i

These operations are analogous to those for linear polynomials a + bx, with one difference: i² = −1. These definitions imply that complex numbers satisfy all the arithmetic axioms enjoyed by real numbers. Hence, they may be manipulated in the obvious fashion, except that we replace i² by −1 whenever it occurs.
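These rules are easy to experiment with. The following sketch (the helper names c_add and c_mul are ours, not the text's) represents a + bi as the pair (a, b), implements the two definitions above, and checks them against Python's built-in complex arithmetic:

```python
def c_add(z, w):
    # (a + bi) + (c + di) = (a + c) + (b + d)i, with z = (a, b), w = (c, d)
    a, b = z
    c, d = w
    return (a + c, b + d)

def c_mul(z, w):
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, using i^2 = -1
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

z, w = (2, -3), (1, 4)
built_in = complex(*z) * complex(*w)
assert c_mul(z, w) == (built_in.real, built_in.imag)  # agrees with Python's complex type
assert c_mul((0, 1), (0, 1)) == (-1, 0)               # i^2 = -1
```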

Example 1. If z = 2 − 3i and img

img

Example 2. Find all complex numbers z such that z² = −i.

Solution. Write z = a + bi, where a and b are to be determined. Then the condition z² = −i becomes

(a² − b²) + 2abi = 0 + (−1)i.

Equating real and imaginary parts gives a² = b² and 2ab = −1. The solution is

z = ±(1/√2)(1 − i).
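The two solutions can be verified numerically; a quick check using Python's built-in complex type:

```python
import math

r = 1 / math.sqrt(2)
for z in (complex(r, -r), complex(-r, r)):       # z = ±(1/√2)(1 - i)
    assert abs(z * z - complex(0, -1)) < 1e-12   # z² = -i, up to rounding
```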

Theorem 1 collects the basic properties of addition and multiplication of complex numbers. The verifications are straightforward and left to the reader.

Theorem 1. If z, u, and w are complex numbers, then

1. z + u = u + z and zu = uz

2. (z + u) + w = z + (u + w) and (zu)w = z(uw)

3. z(u + w) = zu + zw

4. z + 0 = z and 1z = z

5. z + (−z) = 0

The following two notions are indispensable when working with complex numbers. If z = a + bi is a complex number, the conjugate z̄ and the absolute value (or modulus) |z| are defined by

z̄ = a − bi and |z| = √(a² + b²).

Thus, z̄ is a complex number and is the reflection of z in the real axis (see the diagram), whereas |z| is a nonnegative real number and equals the distance between z and the origin. Note that the absolute value of a real number a = a + 0i is |a| = √(a² + 0²) = √(a²) using the definition of absolute value for complex numbers, which agrees with the absolute value of a regarded as a real number.


Theorem 2. Let z and w denote complex numbers. Then

1. (z ± w)‾ = z̄ ± w̄

2. (zw)‾ = z̄w̄

3. (z̄)‾ = z

4. z is real if and only if z̄ = z

5. zz̄ = |z|²

6. |z| ≥ 0 and |z| = 0 if and only if z = 0

7. |zw| = |z| |w|

Proof. We prove (2), (5), and (7) and leave the rest to the reader. If z = a + bi and w = c + di, we compute

(zw)‾ = [(ac − bd) + (ad + bc)i]‾ = (ac − bd) − (ad + bc)i = (a − bi)(c − di) = z̄w̄,

which proves (2). Next, (5) follows from

zz̄ = (a + bi)(a − bi) = a² − (bi)² = a² + b² = |z|².

Finally, (2) and (5) give

|zw|² = (zw)(zw)‾ = zwz̄w̄ = (zz̄)(ww̄) = |z|²|w|².

Then (7) follows when we take positive square roots.

Let z be a nonzero complex number. Then (6) of Theorem 2 shows that |z| ≠ 0, and so z(z̄/|z|²) = 1 by (5). As a result, we call the complex number z̄/|z|² the inverse of z and denote it z⁻¹ = 1/z. This proves Theorem 3.

Theorem 3. If z = a + bi is a nonzero complex number, then z has an inverse given by

z⁻¹ = z̄/|z|² = (a − bi)/(a² + b²).
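Theorem 3's formula computes directly; a minimal sketch (the function name c_inverse is ours):

```python
def c_inverse(a, b):
    # z⁻¹ = (a - bi)/(a² + b²) for z = a + bi ≠ 0 (Theorem 3)
    n = a * a + b * b          # |z|², nonzero when z ≠ 0
    return (a / n, -b / n)

a, b = 2.0, -5.0
x, y = c_inverse(a, b)
assert abs(complex(a, b) * complex(x, y) - 1) < 1e-12   # z · z⁻¹ = 1
```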

Hence, division by any nonzero complex number is possible. Example 3 shows how division is done in practice.

Example 3. Express img in the form a + bi.

Solution. We multiply the numerator and denominator by the conjugate 2 − 5i of the denominator:

img

The addition of complex numbers has a geometric description. The diagram shows plots of the complex numbers z = a + bi and w = c + di and their sum z + w. These points, together with the origin, form the vertices of a parallelogram, so we can find the sum z + w geometrically by completing the parallelogram. This method is the Parallelogram Law of Complex Addition and is a special case of vector addition, as students of linear algebra will recognize.


The geometric description of complex multiplication requires that complex numbers be represented in polar coordinates. The circle with its center at the origin and radius 1 shown in the diagram below is called the unit circle. An angle θ measured counterclockwise from the real axis is said to be in standard position. The angle θ determines a unique point P on this circle. The radian measure of θ is defined to be the length of the arc from 1 to P. Hence, the radian measure of a right angle is π/2 radians and that of a full circle is 2π radians. We define the cosine and sine of θ (written cos θ and sin θ) to be the x and y coordinates of P. Hence, P is the point (cos θ, sin θ) = cos θ + i sin θ in the complex plane. These complex numbers cos θ + i sin θ on the unit circle are denoted e^{iθ}; that is, e^{iθ} = cos θ + i sin θ.


A complete discussion of why we use this notation lies outside the scope of this book.139

The fact that e^{iθ} is actually an exponential function of θ is confirmed by verifying that the law of exponents holds, that is,

e^{iθ}e^{iφ} = e^{i(θ+φ)}.

This law is analogous to the exponent rule e^a e^b = e^{a+b} for real exponents a and b, and it is an immediate consequence of the identities for sin(θ + φ) and cos(θ + φ):

cos(θ + φ) = cos θ cos φ − sin θ sin φ and sin(θ + φ) = sin θ cos φ + cos θ sin φ.
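Both e^{iθ} = cos θ + i sin θ and the law of exponents can be checked numerically with Python's cmath module; the angles below are arbitrary test inputs:

```python
import cmath
import math

theta, phi = 0.7, 2.1   # arbitrary test angles
# e^{iθ} = cos θ + i sin θ
assert abs(cmath.exp(1j * theta) - complex(math.cos(theta), math.sin(theta))) < 1e-12
# law of exponents: e^{iθ}e^{iφ} = e^{i(θ+φ)}
assert abs(cmath.exp(1j * theta) * cmath.exp(1j * phi) - cmath.exp(1j * (theta + phi))) < 1e-12
```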

We can now describe complex multiplication geometrically. Let z = a + bi be any complex number. The distance r from z to 0 is the modulus r = |z|. If z ≠ 0, it determines an angle θ, as shown in the diagram, called an argument of z. This angle is not unique (θ + 2πk would do as well for any k = 0, ±1, ±2, . . .), but, as the diagram clearly shows,

a = r cos θ and b = r sin θ

always hold. Hence, in any case,

z = a + bi = r(cos θ + i sin θ) = re^{iθ}.

This expression is the polar form of the complex number z. The geometric description of complex multiplication follows from the law of exponents.

Theorem 4. Multiplication Rule. If z = re^{iθ} and w = se^{iφ} are two complex numbers in polar form, then

zw = (re^{iθ})(se^{iφ}) = rse^{i(θ+φ)}.

In other words, to multiply two complex numbers, simply multiply the absolute values and add the arguments. This method simplifies calculations and is valid for any arguments θ and φ.

Example 4. Multiply img by first converting the factors to polar form.

Solution. The polar forms (see the diagram) are

img

Hence, the multiplication rule gives

img

img

Of course, direct multiplication gives img so equating real and imaginary parts gives the (somewhat unexpected) formulas

img

If z = re^{iθ} is given in polar form, then z² = r²e^{2iθ} by the multiplication rule. Hence, z³ = (re^{iθ})(r²e^{2iθ}) = r³e^{3iθ}. In general, we have Theorem 5 for any n ≥ 1 (we leave the proof for n ≤ 0 as Exercise 15(b)). The name honors Abraham DeMoivre (1667–1754).

Theorem 5. DeMoivre's Theorem. If θ is any angle and r > 0, then (re^{iθ})ⁿ = rⁿe^{inθ} for all integers n.
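DeMoivre's theorem is easy to check numerically for particular values; r, θ, and the exponents n below are arbitrary test inputs:

```python
import cmath

r, theta = 1.5, 0.8
z = r * cmath.exp(1j * theta)          # z = re^{iθ}
for n in (-3, -1, 0, 2, 5):            # includes negative exponents
    assert abs(z ** n - r ** n * cmath.exp(1j * n * theta)) < 1e-9
```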

Example 5. Verify that img

The polar form is img Hence DeMoivre's theorem gives

img

If n ≥ 1, a complex number u is called an nth root of unity if uⁿ = 1. DeMoivre's theorem gives a way to find all possibilities (there are n). If we write u = re^{iθ} in polar form and use DeMoivre's theorem, the condition uⁿ = 1 becomes

rⁿe^{inθ} = 1 = 1e^{i0}.

Comparing absolute values gives rⁿ = 1, so r = 1 (because r is real and positive). However, the arguments may differ by integral multiples of 2π, so all we can conclude is that nθ = 2πk, where k is an integer; that is,

θ = 2πk/n, k an integer.

These arguments give distinct values of u on the unit circle for k = 0, 1, 2, . . ., n − 1, as shown in the diagram. But every choice of k yields a value of θ differing from one of these by a multiple of 2π, so they give all the possible roots. This proves Theorem 6.


Theorem 6. The nth roots of unity are e^{2πik/n} for k = 0, 1, 2, . . ., n − 1.

We find these roots geometrically as the distinct points on the unit circle, starting at 1, that cut the circle into n equal sectors. Note that if n = 2, the roots are 1 and −1, whereas the four 4th roots of unity are 1, i, −1, and −i. In general, if we write ω = e^{2πi/n}, then e^{2πik/n} = ωᵏ for each k ≥ 1, so the nth roots of unity are just the powers of ω:

1, ω, ω², . . ., ωⁿ⁻¹.

For this reason, ω is called a primitive nth root of unity. It follows easily (Exercise 16) that 1 + ω + ω² + ⋯ + ωⁿ⁻¹ = 0; that is, the sum of the nth roots of unity is zero.
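Theorem 6 and the remark above can be illustrated numerically; the choice n = 6 is arbitrary:

```python
import cmath

n = 6
omega = cmath.exp(2j * cmath.pi / n)      # a primitive nth root of unity
roots = [omega ** k for k in range(n)]    # 1, ω, ω², ..., ω^{n-1}

assert all(abs(u ** n - 1) < 1e-12 for u in roots)  # each is an nth root of unity
assert abs(sum(roots)) < 1e-12                      # 1 + ω + ... + ω^{n-1} = 0
```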

Exercises A

1. Solve each equation for the real number x.

(a) x − 4i = (2 − i)²

(b) (2 + xi)(3 − 2i) = 12 + 5i

(c) (2 + xi)² = 4

(d) (2 + xi)(2 − xi) = 5

2. Convert each expression to the form a + bi.

img

3. In each case, find the complex number z.

img

4. Let re z and im z denote the real and imaginary parts of z. Show that

img

5. In each case, describe the graph of the equation, where z denotes a complex number.

img

6. Verify img directly for z = a + bi and img

7. Prove that img for all complex numbers img and z.

8. Show that (1 + i)ⁿ + (1 − i)ⁿ is real for all integers n ≥ 1.

9.

(a) Complex Distance Formula. Show that |z − w| is the distance between the complex numbers z and w.

(b) Triangle Inequality. Show that |z + w| ≤ |z| + |w| for all complex numbers z and w. [Hint: Consider the triangle with vertices 0, z, and z + w.]

10. Write each expression in polar form.

img

11. Write each expression in the form a + bi.

img

12. Write each expression in the form a + bi.

img

13. Use DeMoivre's theorem to show that

img

14. Find all complex numbers such that

img

15. Let z = re in polar form.

img

16. Show that the sum of the nth roots of unity is 0.

img

17.

(a) Suppose that z1, z2, z3, z4, and z5 are equally spaced around the unit circle. Show that z1 + z2 + z3 + z4 + z5 = 0. [Hint: (1 − z)(1 + z + z² + z³ + z⁴) = 1 − z⁵ for any complex number z.]

(b) Repeat (a) for any n ≥ 2 points placed equally around the unit circle.

18. If z = a + bi, show that |a| + |b| ≤ √2 |z|. [Hint: (|a| − |b|)² ≥ 0.]

19. Let f(x) = a0 + a1x + a2x² + ⋯ + anxⁿ be a polynomial with real coefficients ai. If z is a complex root of f(x), that is, f(z) = 0, show that z̄ is also a root.

20. If f(x) is a polynomial with complex coefficients, let f̄(x) be the polynomial obtained from f(x) by taking the conjugate of every coefficient. Show that f(x)f̄(x) is a polynomial with real coefficients.

21. Let z ≠ 0 be a complex number. If t is real, describe tz geometrically if

(a) t > 0

(b) t < 0

22. If z and w are nonzero complex numbers, show that |z + w| = |z| + |w| if and only if one is a positive real multiple of the other. [Hint: Consider the parallelogram with vertices 0, z, w, and z + w. Use Exercise 21 and the fact that, if t is real, |1 + t| = 1 + |t| is impossible if t < 0.]

23. If a and b are rational numbers, let p and q denote numbers of the form a + b√2. If p = a + b√2, define p̃ = a − b√2 and [p] = a² − 2b². Show that each of the following expressions holds.

img

Appendix B Matrix Algebra

Matrix algebra will be familiar to most readers, as it is standard fare in beginning linear algebra courses. The new ingredient here is that the matrices we consider have entries drawn from an arbitrary commutative ring R, rather than from the real numbers ℝ. A ring is an algebraic system with two operations, addition and multiplication, for which the usual laws of arithmetic are valid (see Section 3.1). A ring R is called commutative if ab = ba for all a, b ∈ R. Examples include the familiar number systems ℤ, ℚ, ℝ, and ℂ. It is worth noting that the set M2(ℝ) of all 2 × 2 matrices over ℝ is a ring that is not commutative.

In this appendix, the standard results from linear algebra about matrices, adjugates, inverses, determinants, and so on are stated, with suitable minor modifications, over an arbitrary commutative ring R. We omit most proofs. Many proofs for the case R = ℝ carry over from linear algebra unchanged, but this is not always so. In many arguments, especially those dealing with inverses and determinants, it is essential that R is commutative. Hence,

R denotes a commutative ring throughout Appendix B.

Matrix Algebra

A rectangular array of elements of R is called a matrix and the elements themselves are called the entries of the matrix. Thus,

img

are matrices over img The shape of a matrix depends on the number of rows and columns, and an m × n matrix is one with m rows and n columns. Two matrices are the same size if they have the same number of rows and the same number of columns. Thus, the preceding matrices A, B, and C are of size 2 × 2, 2 × 3, and 3 × 1, respectively. An n × n matrix is called a square matrix.

The rows and columns of a matrix are numbered from the top down and from left to right, respectively. Then the entry in row i and column j of a matrix A is called the (i, j)-entry of A. If the (i, j)-entry is denoted aij, then A has the form

A =
[a11 a12 · · · a1n]
[a21 a22 · · · a2n]
[ · · ·           ]
[am1 am2 · · · amn]

which usually is abbreviated as A = [aij]. Two m × n matrices A = [aij] and B = [bij] are equal (written A = B) if they have the same size and corresponding entries are equal; that is,

aij = bij for all i and j.

The set of all m × n matrices with entries from R is denoted Mmn(R). For A and B in Mmn(R), we obtain their sum A + B by adding corresponding entries. If A = [aij] and B = [bij], this is

A + B = [aij] + [bij] = [aij + bij].

This addition enjoys many of the properties of numerical addition. For example, if A, B, and C are in Mmn(R), then

A + B = B + A and A + (B + C) = (A + B) + C.

The matrix in Mmn(R) each of whose entries is zero is called the zero matrix of size m × n and is denoted 0 (or 0mn if the size must be emphasized). Clearly,

A + 0 = A, for all A in Mmn(R).

So 0 plays the role in Mmn(R) that the number zero plays in img We obtain the negative −A of a matrix A in Mmn(R) by negating every entry of A. Hence,

A + (− A) = 0, for all A in Mmn(R).

Finally, we define subtraction by A − B = A + (−B). If A = [aij] and B = [bij], then

−A = [−aij] and A − B = [aij − bij].

With these definitions, the additive arithmetic in Mmn(R) is entirely analogous to numerical arithmetic.

We also use the following notation: If A is a matrix and r ∈ R, the matrix rA is obtained by multiplying every entry of A by r. More formally,

If A = [aij] then rA = [raij].

This is called scalar multiplication and one verifies that it enjoys the following useful properties for all scalars r, s and matrices A, B:

1. r(A + B) = rA + rB

2. (r + s)A = rA + sA

3. r(sA) = (rs)A

4. 1A = A

Example 1. Given img and img in img we have

img

Example 2. If img and img in M23(R), find X in M23(R) such that X + A = B.

Solution. We proceed as in numerical arithmetic and subtract A from both sides:

img

Multiplication of matrices is less natural than addition. To describe it, we define the dot product of a row matrix and a column matrix as follows:

[a1 a2 . . . ak] ·
[b1]
[b2]
[ ⋮]
[bk]
= a1b1 + a2b2 + ⋯ + akbk.

Now let A be an m × k matrix and B be a k × n matrix, chosen so that the rows of A and the columns of B have the same number k of entries. Then the product AB is defined to be the m × n matrix whose

(i, j)-entry is the dot product of row i of A and column j of B.

Thus to compute the (i, j)-entry, go across the ith row of A and down the jth column of B and form the dot product. Note: If A is m × k and B is k′ × n, then

AB is defined only if k = k′, and then the product AB is m × n.
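As a sketch of this definition, the following function computes AB entry by entry; the ring R is taken here to be the integers mod m, and the function name and sample matrices are ours:

```python
def mat_mul(A, B, m):
    # (i, j)-entry of AB = dot product of row i of A and column j of B, reduced mod m
    k = len(B)                         # inner size: columns of A must equal rows of B
    assert all(len(row) == k for row in A), "sizes must agree"
    return [[sum(A[i][t] * B[t][j] for t in range(k)) % m
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]            # 2 × 2 over Z_5
B = [[0, 1, 2], [3, 4, 0]]      # 2 × 3, so AB is 2 × 3
print(mat_mul(A, B, 5))         # → [[1, 4, 2], [2, 4, 1]]
```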

Example 3. For img and img compute AB and BA.

Solution. We write out the dot products explicitly.

img

Example 4. If img and img compute A², AB, and BA.

Solution. img

so A² = 0 can occur even when A ≠ 0. Next

img

Hence AB ≠ BA is possible even though they are both the same size.

Example 4 shows that two familiar properties of numerical algebra fail for matrices. Hence, it is surprising to learn that the following property does hold.

Theorem 1. Let A, B, and C be of sizes m × p, p × q, and q × n, respectively. Then (AB)C = A(BC).

Proof. Write A = [aij], B = [bij], and C = [cij]. Then AB = [xij], where xij = Σk aikbkj, and (AB)C = [yij], where yij = Σt xitctj. Hence,

yij = Σt (Σk aikbkt)ctj = Σt Σk (aikbkt)ctj = Σk aik(Σt bktctj).

This last expression is the (i, j)-entry of A(BC), and the theorem follows. Note that we needed the associativity of R to get aik(bktctj) = aikbktctj = (aikbkt)ctj.

We express this result by saying that matrix multiplication is associative when the matrix sizes are such that the products involved are all defined.

The number 1 plays a neutral role in numerical multiplication in the sense that 1a = a and a 1 = a for every number a. The analogous role in matrix algebra is played by the identity matrices In. For each n ≥ 1, the matrix In is defined to be the n × n matrix with 1s along the main diagonal (upper left to lower right), and 0s elsewhere. Thus,

I2 =
[1 0]
[0 1]
and I3 =
[1 0 0]
[0 1 0]
[0 0 1].

We use I without a subscript for the identity matrix when there is no need to emphasize the size. The reader can verify that the relations

AIn = A and ImA = A

hold whenever the matrix products are defined.

Note that rA = (rI)A for all r ∈ R and all matrices A. Moreover, since R is commutative, we have (when the matrix multiplications are defined)

r(AB) = (rA)B = A(rB), for all r ∈ R.

Square Matrices and Inverses

We are interested primarily in square matrices. For convenience, we use the notation

Mn(R) = Mnn(R), for any n ≥ 2.

If A and B lie in Mn(R) then A + B and AB are both in Mn(R). Theorem 2 collects several properties of Mn(R) for reference later.

Theorem 2. Let A, B, and C be matrices in Mn(R). Then

1. A + B = B + A

2. (A + B) + C = A + (B + C)

3. A + 0 = A

4. A + (− A) = 0

5. (AB)C = A(BC)

6. AI = A = IA

7. A(B + C) = AB + AC and (B + C)A = BA + CA

Proof. The only property not discussed previously is (7), and we leave the verification as Exercise 12.

The reader may have noted that Theorem 2 shows that Mn(R) is a ring (noncommutative if n ≥ 2). It is also noteworthy that Theorem 2 holds even if the ring R is not commutative.

If A is a square matrix, a matrix B is called an inverse of A if

AB = I and BA = I.

If it exists, this matrix B is uniquely determined by A. For if AC = I also holds, then AC = AB (both equal I), so left multiplication by B gives C = B. A square matrix A is called invertible if it has an inverse, and in this case the (unique) inverse is denoted A⁻¹. Note that 0 has no inverse; in fact, if AB = 0 for some B ≠ 0, then A has no inverse.

Write Rn for the set of n × 1 column matrices with entries from R. If R = ℝ, it is known that A is invertible if and only if the system AX = B of linear equations has a solution X for any B ∈ Rn. This holds in general because AB = I in Mn(R) implies BA = I (Exercise 13). On the other hand, if R = ℝ, it is known that A is invertible if and only if the homogeneous linear system AX = 0 has only the trivial solution X = 0. But this condition fails for R = ℤ (consider A = 2I). Hence if A is a square matrix, we need other ways to determine when A⁻¹ exists.

Again, it is well known that A ∈ Mn(ℝ) is invertible if and only if det A ≠ 0. If R is any commutative ring, it turns out that the determinant of A ∈ Mn(R) can be defined in such a way that A is invertible if and only if det A is a unit in R (u ∈ R is a unit if uv = 1 = vu for some v ∈ R). The idea is to define det A inductively. If n = 1 and A = [a], this holds if we define

det [a] = a.

If n ≥ 2, assume inductively that det A has been defined for all (n − 1) × (n − 1) matrices over R. If A is n × n, write Aij for the (n − 1) × (n − 1) matrix obtained from A by deleting row i and column j. Then, given 1 ≤ i, j ≤ n, we define the (i, j)-cofactor of A by

cij(A) = (−1)^{i+j} det Aij.

With this, we define det A as follows:

(*) det A = a11c11(A) + a21c21(A) + ⋯ + an1cn1(A).
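The inductive definition (*) translates directly into code. The sketch below (function names ours) expands along the first column and works over ℤ, using exact integer arithmetic:

```python
def minor(A, i, j):
    # the matrix A_ij: A with row i and column j deleted
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    # (*): det A = a11·c11(A) + a21·c21(A) + ... + an1·cn1(A),
    # where the cofactor ci1(A) = (-1)^{i+1} det A_i1
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(n))

print(det([[2, 1, 0], [1, 3, 1], [0, 1, 2]]))   # → 8
```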

Then one shows that the following properties of determinants hold for rows:

1. If B is formed by multiplying a row of A by u ∈ R, then det B = u det A.

2. If A contains a row of zeros then det A = 0.

3. If two rows of A are interchanged then det A changes sign.

4. If a multiple of a row of A is added to a different row, then det A is unchanged.

5. If two rows of A are identical then det A = 0.

With this, one can prove a result first given (for R = ℝ) in 1772 by Pierre Simon de Laplace.

Theorem 3. Cofactor Expansion Theorem. If A = [aij] is an n × n matrix over a commutative ring R, then

1. det A = a1jc1j(A) + a2jc2j(A) + ⋯ + anjcnj(A), for each j = 1, 2, . . ., n,

2. det A = ai1ci1(A) + ai2ci2(A) + ⋯ + aincin(A), for each i = 1, 2, . . ., n.

Furthermore, properties (1)–(5) above hold for columns as well as for rows.

We omit the details.140 The expressions in (1) and (2) of Theorem 3 are called the cofactor expansions along column j and row i, respectively. In words, to find the cofactor expansion of det A along a row or column, multiply the entries of the row (column) by the corresponding cofactors, and add the results.

The cofactor expansion theorem has many important consequences, and the following result is necessary to describe them. Given an n × n matrix A = [aij], the transpose AT of A is defined by

AT = [bij], where bij = aji for all i and j.

This is an n × n matrix obtained from A by interchanging elements symmetric about the main diagonal. For example, if img then img The familiar elementary properties of the transpose hold

1. (A + B)T = AT + BT

2. (rA)T = rAT

3. (AB)T = BTAT

when the matrix operations are defined (even for nonsquare matrices). We omit the proof of the following theorem.

Theorem 4. Transpose Theorem. If A is a square matrix then det AT = det A.

Surprisingly, the determinant function preserves matrix multiplication. Again we omit the proof.

Theorem 5. Multiplication Theorem. If A and B are n × n matrices then

det (AB) = det A det B.

Given an n × n matrix A = [aij] over a commutative ring R, we define the adjugate of A (also called the classical adjoint of A) as follows:

adj A = [cij(A)]T.

This is also n × n and the cofactor expansion theorem (with some ingenuity) gives

Theorem 6. Adjugate Theorem. If A is n × n and d = det A, then

A (adj A) = dIn = (adj A) A.

Again we omit the details.

If it happens that d = det A is a unit in R, then (as R is commutative) the adjugate theorem gives

A[d⁻¹ adj(A)] = In = [d⁻¹ adj(A)]A,

from which A is invertible and A⁻¹ = d⁻¹ adj(A). This proves part of

Theorem 7. Invertibility Theorem. If A is n × n, then A is invertible if and only if det A is a unit in R. In this case, det(A⁻¹) = (det A)⁻¹.

The rest of the proof follows from Theorem 5 (Exercise 14).

The case of 2 × 2 matrices is simple and so arises frequently in examples. The relevant facts are displayed next.

Example 5. If n = 2 and
A =
[a b]
[c d],
then
adj A =
[ d −b]
[−c  a]
and det A = ad − bc. Hence, if ad − bc is a unit in R, then
A⁻¹ = (ad − bc)⁻¹
[ d −b]
[−c  a]
is a convenient formula for the inverse of A.
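Example 5 can be tried over a ring other than ℝ, say Z_10, where a 2 × 2 matrix is invertible exactly when ad − bc is a unit (coprime to 10). The helper below is ours and uses Python 3.8+'s three-argument pow for the modular inverse:

```python
def inv2(A, m):
    # A⁻¹ = (ad - bc)⁻¹ · adj A over Z_m, following Example 5
    (a, b), (c, d) = A
    det = (a * d - b * c) % m
    det_inv = pow(det, -1, m)              # raises ValueError unless det is a unit mod m
    adj = [[d, -b], [-c, a]]               # adjugate of A
    return [[(det_inv * x) % m for x in row] for row in adj]

A = [[2, 1], [1, 4]]                       # det = 7, a unit in Z_10 (7 · 3 = 21 ≡ 1)
B = inv2(A, 10)
# check that A·B ≡ I (mod 10)
prod = [[sum(A[i][t] * B[t][j] for t in range(2)) % 10 for j in range(2)] for i in range(2)]
assert prod == [[1, 0], [0, 1]]
```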

These results are enough to do most of the calculations in this book. A good reference source for this material (and much more) is the book by McDonald, B.R. Linear Algebra over Commutative Rings, Marcel Dekker Inc., New York, 1984.

Multilinear Approach

An n × n matrix A can be thought of as a row of columns in Rn, and it is instructive to view matrices that way. Hence, we write A = [A1, . . ., Ak, . . ., An], where Ak denotes column k of A. The determinant function det: Mn(R) → R has two basic properties from this point of view.

First, det is a multilinear function of the columns of a matrix, that is, it is a linear function of column k for each k (when we fix all other columns of A). More precisely, if we define

δk(X) = det [A1, . . ., X, . . ., An], where X ∈ Rn replaces Ak in column k,

the requirement is that, for each k,

δk(rX + sY) = rδk(X) + sδk(Y), for all r, s ∈ R and X, Y ∈ Rn.

Second, det is an alternating function of the columns of a matrix; that is, if two distinct columns of A are interchanged to form B, then det B = −det A.

Before proceeding, write Xn = {1, 2, . . ., n} and recall the symmetric group Sn (see Section 1.4) of all permutations of Xn, that is, all bijections σ: Xn → Xn. Each such σ is a product of transpositions, that is, permutations that interchange two members of Xn. And σ is called even or odd according as it can be expressed as a product of an even, respectively odd, number of transpositions (the parity theorem ensures that this is well defined). The sign of σ is defined to be 1 or −1 according as σ is even or odd, and is written (−1)^σ.

Hence, the fact that det is alternating means that if B is obtained from A by a series of column transpositions, then det B = (−1)^σ det A, where σ is the corresponding column permutation. With this it can be shown that d = det is the only multilinear, alternating function d: Mn(R) → R that satisfies d(I) = 1. Moreover, if we write σ(k) = σk when σ ∈ Sn and k ∈ Xn, the following characterization of det A can be proved.

Theorem 8. If A = [aij] is n × n, then det A = Σσ (−1)^σ aσ1,1 aσ2,2 ⋯ aσn,n, where the sum ranges over all n! elements σ of Sn.

Theorem 8 leads to other important properties of the determinant, and is often taken as the definition.
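Theorem 8 can also be implemented directly, summing over all n! permutations; the sign is computed by counting inversions. The names below are ours, and the approach is practical only for small n:

```python
from itertools import permutations
from math import prod

def sign(sigma):
    # (-1)^σ: -1 if the number of inversions of σ is odd, +1 if it is even
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def det_perm(A):
    # Theorem 8: det A = Σ_σ (-1)^σ · a_{σ(1),1} a_{σ(2),2} ... a_{σ(n),n}
    n = len(A)
    return sum(sign(s) * prod(A[s[j]][j] for j in range(n))
               for s in permutations(range(n)))

assert det_perm([[1, 2], [3, 4]]) == 1 * 4 - 2 * 3   # agrees with ad - bc for n = 2
```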

Exercises B

Throughout these exercises, R is assumed to be a commutative ring.

1. If A is square, AB = 0, and B ≠ 0, show that A cannot be invertible.

2.

(a) If B = [B1, . . ., Bk, . . ., Bn] where Bk is column k of B, show that AB = [AB1, . . ., ABk, . . ., ABn].

(b) If AX = 0 for every X ∈ Rn, show that A = 0.

3. If img and img show that AB = I2 but BA ≠ I3.

4. Show that img is invertible in M2(R) if and only if a and c are both units. In this case, find A⁻¹.

5. Find invertible 2 × 2 matrices A and B such that A + B is not invertible.

6. Find a matrix X such that AX = B if img and img

7. If A and B are n × n, show that (A − B)(A + B) = A² − B² if and only if AB = BA.

8. If A² = 0, show that I + A is invertible and find (I + A)⁻¹ in terms of A.

9. If img show that A³ = I and use this result to find A⁻¹ in terms of A.

10. Show that img satisfies A² − 3A − 10I = 0 and use this result to find A⁻¹ in terms of A.

11. Let A and B denote n × n matrices.

(a) If A and B are invertible, show that A⁻¹ and AB are also invertible, and find formulas for the inverses.

(b) If A, B, and A + B are invertible, show that A⁻¹ + B⁻¹ is also invertible, and find a formula for the inverse.

(c) If I + BA is invertible, show that I + AB is invertible and find a formula for (I + AB)⁻¹. [Hint: A(I + BA) = (I + AB)A.]

12. Prove (7) of Theorem 2. (These expressions are called the distributive laws.)

13.

(a) If A, B ∈ Mn(R), show that AB = I implies that BA = I.

(b) Show that A is invertible if and only if AX = B has a solution for every B ∈ Rn. [Hint: Write I = [E1, . . ., Ek, . . ., En] and use (a) and Exercise 2(a).]

14. Prove Theorem 7 using Theorems 5 and 6.

15. Let Eij denote the matrix in Mn(R) with (i, j)-entry 1 and all other entries 0.

(a) Show that EikEmj = δkmEij, where δkm = 1 or 0 according as k = m or k ≠ m.

(b) Show that E11 + E22 + ⋯ + Enn = I.

(c) Show that if A = [aij] ∈ Mn(R), then A = Σi,j aijEij.

Here the Eij are called matrix units in Mn(R), and δkm is called the Kronecker delta.

Appendix C Zorn's Lemma

The independent sets in a vector space are undeniably important, but the largest independent sets are the most important of all: they are the bases of the space. This theme, that the “largest” objects of a given type are the most interesting, is universal in mathematics, certainly in algebra. Zorn's lemma is a set-theoretic principle that shows that maximal objects of various types exist, and it has become indispensable in many parts of mathematics. What “maximal” means is best formulated using the concept of a partial ordering.

A partial order on a nonempty set P is a relation141 ≤ on P that satisfies the following conditions (where x, y, and z denote elements of P):

1. x ≤ x for all x ∈ P (reflexive);

2. if x ≤ y and y ≤ x, then x = y (antisymmetric);

3. if x ≤ y and y ≤ z, then x ≤ z (transitive).

A set P with a partial ordering is called a partially ordered set (poset for short). We say that (P, ≤) is a poset to assert that ≤ is a partial ordering on the set P. Posets occur everywhere in mathematics; here are a few examples.

The following are easily verified to be partial orders:

Inclusion ⊆ on any nonempty collection of sets.

Divisibility | on any nonempty set of positive integers.

The usual ordering ≤ on the set ℝ of real numbers.

Nonempty subsets of a poset are again posets with the same partial order.

Other examples will occur later.

If ≤ is a partial order on a set P, we write x < y to mean x ≤ y and x ≠ y. An element m ∈ P is called maximal in P if there is no element x of P such that m < x; equivalently,

if m ≤ x where x ∈ P, then m = x.

For example, if U is any nonempty set, let 𝒫 denote the set of proper subsets X ⊂ U. Then the maximal members of 𝒫 (under inclusion) are the sets U \ {a} obtained by omitting a single element a ∈ U. Here is a more algebraic example.
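Zorn's lemma is needed only for infinite posets; in a finite poset, maximal elements always exist and can be found by exhaustive search. A small sketch of the definition (names ours), using {1, 2, . . ., 12} under divisibility:

```python
def maximal(P, leq):
    # m is maximal if no x in P satisfies m < x, i.e. leq(m, x) with x != m
    return [m for m in P if not any(leq(m, x) and x != m for x in P)]

def divides(a, b):
    return b % a == 0

print(maximal(range(1, 13), divides))   # → [7, 8, 9, 10, 11, 12]
```

The elements 7 through 12 are maximal because no proper multiple of any of them lies in the set.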

Example 1. If R is a ring, the maximal ideals of R (Section 3.3) are defined to be the maximal members of the poset {A ≠ R | A is an ideal of R}, partially ordered by inclusion. If R is commutative, Corollary 1 of Theorem 6 §3.3 shows that an ideal A is maximal if and only if the factor ring R/A is a field.

Zorn's lemma gives a condition that guarantees that maximal elements exist in certain posets. To state it, we need some terminology. Let (P, ≤) be a poset. If X ⊆ P is a nonempty subset, an element u ∈ P is called an upper bound for X if x ≤ u for every x ∈ X. Note that u need not be an element of X. For example, in the poset ℝ, the interval X = [0, 1) = {x | 0 ≤ x < 1} has no maximal member, but any number u ≥ 1 is an upper bound for X.

If we are looking for maximal elements in a poset P, choose x1 ∈ P. If x1 is maximal, we are done. Otherwise, x1 < x2 for some x2 ∈ P. If x2 is maximal, we are finished; otherwise x1 < x2 < x3 for some x3 ∈ P. Hence, either we find a maximal element at some stage, or we create elements x1, x2, x3, . . . in P such that x1 < x2 < x3 < ⋯ is a strictly increasing sequence. So it is plausible that guaranteeing the existence of maximal elements in P will require some restriction on such ascending sequences from P. In 1935, Max Zorn found a condition on P that does this and is reasonably easy to verify in specific situations.

Two elements x and y in a poset P are called comparable if either x ≤ y or y ≤ x, and the poset P is called a chain if any two of its elements are comparable. Thus, ℝ is a chain with respect to the usual ordering. A partially ordered set P is said to be inductive if every chain in P has an upper bound in P.

Theorem. Zorn's Lemma. Every inductive partially ordered set has a maximal element.

Note that to show that a poset is inductive, it is not enough to check that all countable142 chains x1 < x2 < x3 < ⋯ have upper bounds in P. For example, consider the set 𝒞 of all countable subsets of ℝ, partially ordered by inclusion. Then every countable chain in 𝒞 has an upper bound in 𝒞, namely, its union (since a countable union of countable sets is again countable). But 𝒞 has no maximal member, because such a maximal set would have to be ℝ itself, and ℝ is not a countable set.

Zorn's lemma has a wide variety of applications throughout mathematics; we give three examples from algebra. We begin with vector spaces. In Section 6.1, we proved that every finite dimensional vector space has a basis, but the proof that this holds in the infinite dimensional case requires Zorn's lemma. If V is a vector space over a field F, let X be a nonempty (possibly infinite) subset of V. Then X is called independent if every finite subset of X is independent (in the sense of Section 6.1), and X is said to span V if every element of V is a linear combination of (a finite number of) elements of X. Finally, X is called a basis of V if it is independent and spans V.

Example 2. If F is a field, show that every nonzero vector space V has a basis.

Solution. The idea is to show that any maximal independent set is a basis. Let ℱ denote the set of all independent subsets of V and partially order ℱ by inclusion. We begin by using Zorn's lemma to show that ℱ has a maximal member. First, ℱ is not empty since {v} is in ℱ for every v ≠ 0 in V. Next, suppose that {Xi | i ∈ I} is a chain in ℱ; if X = ∪i∈I Xi, we claim that X is independent. To see this, let {x1, x2, . . ., xn} be a finite subset of X. Then each xk is in some Xi and, since the Xi form a chain, {x1, x2, . . ., xn} ⊆ Xm for some m. Hence, {x1, x2, . . ., xn} is independent (since Xm is in ℱ), as required. This shows that X is in ℱ, and so X is an upper bound for the chain {Xi} in ℱ.

Now Zorn's lemma shows that ℱ has a maximal member B, and we claim that B is a basis of V. Since B is independent (being in ℱ), it remains to prove that span(B) = V, where span(B) consists of all linear combinations of vectors in B. Assume on the contrary that span(B) ≠ V, and choose v in V with v ∉ span(B); we show that B ∪ {v} is independent, contradicting the maximality of B.

So let X be a finite subset of B ∪ {v}; we must show that X is independent. If X ⊆ B, this follows because B is independent. If v ∈ X, write X = {v, x1, x2, . . ., xn}, where each xi ∈ B. Let a0v + a1x1 + a2x2 + ⋯ + anxn = 0, where each ai ∈ F. Then a0 = 0 because otherwise v = −a0⁻¹(a1x1 + ⋯ + anxn) ∈ span(B), contrary to our choice of v (since a0⁻¹ exists in the field F). Hence, a1x1 + a2x2 + ⋯ + anxn = 0, so ai = 0 for i ≥ 1 because B is independent, as required.
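The finite-dimensional shadow of this argument can be run by machine: over the field GF(2) we can grow an independent set one vector at a time until it is maximal, then check directly that the result spans. A Python sketch (all names are ad hoc; GF(2)³ is chosen for concreteness):

```python
# Finite analogue of Example 2: grow a maximal independent set over GF(2)
# and verify that it is a basis. Vectors are tuples in GF(2)^3.

from itertools import product

def is_independent(vectors):
    """A finite set over GF(2) is independent iff all 2^k subset sums
    are distinct (no nonempty subset sums to the zero vector)."""
    sums = set()
    for bits in product([0, 1], repeat=len(vectors)):
        s = tuple(sum(b * v[j] for b, v in zip(bits, vectors)) % 2
                  for j in range(3))
        if s in sums:
            return False
        sums.add(s)
    return True

V = list(product([0, 1], repeat=3))          # all of GF(2)^3
B = []
for v in V:
    if v != (0, 0, 0) and is_independent(B + [v]):
        B.append(v)                          # extend toward a maximal independent set

# B spans: its 2^|B| subset sums produce every vector of V
span = {tuple(sum(b * v[j] for b, v in zip(bits, B)) % 2 for j in range(3))
        for bits in product([0, 1], repeat=len(B))}
print(len(B), span == set(V))                # prints: 3 True
```

The greedy loop plays the role of Zorn's lemma here: in the finite case, simply adding vectors while independence survives reaches a maximal independent set, and maximality forces spanning, exactly as in the proof above.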

It is worth noting that Example 2 is true for any division ring in place of F. In fact, most of the theory of vector spaces goes through for division rings.

The next two examples of Zorn's lemma come from ring theory. An additive subgroup L of a ring R is called a left ideal if Ra ⊆ L for all a ∈ L, where Ra = {ra | r ∈ R}. Right ideals are defined similarly, and the ideals of R (see Section 3.3) are just the additive subgroups that are both left and right ideals. A maximal left ideal M is defined to be a maximal member of the set {L ≠ R | L is a left ideal of R}, partially ordered by inclusion.

Example 3. If L ≠ R is any left ideal of R, then L is contained in a maximal left ideal.

Solution. Let ℱ = {X | X is a left ideal and L ⊆ X ≠ R}, partially ordered by inclusion. Then ℱ is not empty (L ∈ ℱ), so it suffices to show that ℱ contains a maximal element. By Zorn's lemma, it is enough to show that ℱ is inductive. Hence, let {Xi | i ∈ I} be a chain from ℱ; we show that X = ∪i∈I Xi is an upper bound for {Xi}. Clearly, X is a left ideal containing L, and we claim that X ≠ R. For if X = R, then 1 ∈ X, say 1 ∈ Xk for some k ∈ I. Since Xk is a left ideal, it follows that R = R1 ⊆ Xk ⊆ R, so Xk = R, contrary to the fact that Xk ∈ ℱ. So X is an upper bound in ℱ for the chain {Xi}, as required.
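In a finite ring, the family of proper left ideals can be listed outright, so maximal members can be found without Zorn's lemma. A Python sketch for the commutative example R = Z_12 (an assumed example, not from the text), using the fact that every ideal of Z_n is generated by a divisor of n:

```python
# Illustration of Example 3 in R = Z_12: enumerate the proper ideals,
# find the maximal ones, and locate a maximal ideal containing L = 4R.

n = 12
R = set(range(n))

def ideal(d):
    """The ideal generated by d in Z_n."""
    return {(d * r) % n for r in range(n)}

divisors = [d for d in range(1, n + 1) if n % d == 0]
proper_ideals = [ideal(d) for d in divisors if ideal(d) != R]

def is_maximal(I):
    # maximal member: no proper ideal lies strictly above I
    return all(not (I < J) for J in proper_ideals)

maximal = [I for I in proper_ideals if is_maximal(I)]
L = ideal(4)                                  # {0, 4, 8}
containing = [I for I in maximal if L <= I]
print(sorted(containing[0]))                  # prints: [0, 2, 4, 6, 8, 10]
```

Here the maximal ideals of Z_12 turn out to be 2Z_12 and 3Z_12, and L = 4Z_12 is contained in the first, as the example guarantees.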

As our last example of the use of Zorn's lemma, we prove a theorem that is of central importance in the theory of commutative rings. Let R be a commutative ring. An ideal P of R is called a prime ideal of R if R/P is an integral domain or, equivalently, if rs ∈ P, where r, s ∈ R, implies that either r ∈ P or s ∈ P (see Theorem 3 §3.3). Every commutative ring has at least one prime ideal by Example 3 (since maximal ideals are prime in any commutative ring).

An element a ∈ R is said to be nilpotent if aⁿ = 0 for some n ≥ 1, and the set nil(R) of all nilpotent elements in a commutative ring R is called the nil radical of R. It is easy to verify that nil(R) ⊆ P for every prime ideal P, and hence that nil(R) ⊆ ∩{P | P is a prime ideal of R}. The following example uses Zorn's lemma to show that this inclusion is in fact equality.
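The easy inclusion nil(R) ⊆ P follows by induction on the exponent; the inductive step, in brief:

```latex
a^{n} = 0 \in P
\;\Longrightarrow\; a \cdot a^{\,n-1} \in P
\;\Longrightarrow\; a \in P \ \text{ or } \ a^{\,n-1} \in P,
```

and repeating the argument on a^{n−1} (if necessary) eventually yields a ∈ P.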

Example 4. If R is a commutative ring, then nil(R) = ∩{P | P is a prime ideal of R}.

Solution. Let a ∈ P for every prime ideal P; by the above remarks, we must show that a is nilpotent. So we assume that a is not nilpotent and show (using Zorn's lemma) that a ∉ P for some prime ideal P. To this end, let

ℱ = {A | A is an ideal of R and aⁿ ∉ A for every n ≥ 1}.

Then ℱ is not empty as {0} ∈ ℱ since a is not nilpotent. Suppose {Ai | i ∈ I} is a chain from ℱ, and let A = ∪i∈I Ai. Then A is an ideal, and if aⁿ ∈ A, then aⁿ ∈ Ak for some k, contradicting the fact that Ak ∈ ℱ. Hence, aⁿ ∉ A for each n, and so A ∈ ℱ. This shows that ℱ is inductive, so, by Zorn's lemma, let P be a maximal member of ℱ. Then certainly a = a¹ ∉ P, so it remains to show that P is a prime ideal. To that end, let rs ∈ P, where r, s ∈ R; we must show that r ∈ P or s ∈ P. Suppose, on the contrary, that r ∉ P and s ∉ P. Then

Rr + P is an ideal of R with P ⊆ Rr + P,

and P ≠ Rr + P because r ∉ P. Since P is maximal in ℱ, it follows that Rr + P is not in ℱ, and hence that aⁿ ∈ Rr + P for some n ≥ 1, say aⁿ = t1r + p1, where t1 ∈ R and p1 ∈ P. Similarly, since s ∉ P, there exists m ≥ 1 such that aᵐ = t2s + p2, where t2 ∈ R and p2 ∈ P. But then

aᵐ⁺ⁿ = (t1r + p1)(t2s + p2) = (t1t2)rs + (t1r)p2 + (t2s)p1 + p1p2.

Hence, aᵐ⁺ⁿ ∈ P because rs, p1, and p2 are all in P. But no power of a lies in P because P ∈ ℱ, and this contradiction completes the proof.
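The equality of Example 4 can be spot-checked numerically in the finite rings Z_n, where the prime ideals are exactly pZ_n for the prime divisors p of n. A small Python verification (illustrative only, not part of the proof):

```python
# Numerical check of Example 4 in Z_n: the nilpotent elements coincide
# with the intersection of the prime ideals pZ_n, p a prime divisor of n.

def nilradical(n):
    """All a in Z_n with a^k = 0 for some k >= 1."""
    return {a for a in range(n)
            if any(pow(a, k, n) == 0 for k in range(1, n + 1))}

def prime_ideals(n):
    """The prime ideals of Z_n are pZ_n for prime divisors p of n."""
    primes = [p for p in range(2, n + 1)
              if n % p == 0 and all(p % q for q in range(2, p))]
    return [{(p * r) % n for r in range(n)} for p in primes]

def intersection_of_primes(n):
    result = set(range(n))
    for P in prime_ideals(n):
        result &= P
    return result

for n in (12, 72, 30):
    print(n, nilradical(n) == intersection_of_primes(n))
```

For n = 12, for instance, the nilpotents are {0, 6}, and the two prime ideals 2Z_12 and 3Z_12 intersect in exactly the multiples of 6, in agreement with the example.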

The proof of Zorn's lemma is difficult and will be omitted.143 It requires (and is in fact equivalent to) the Axiom of Choice, which asserts that if ℱ is any family of nonempty sets, we can form a set containing one element of each of the sets in ℱ. The axiom was first proposed in 1904 by Ernst Zermelo. At first glance, it seems self-evident, but there may be infinitely many choices of elements to make. For example, if ℱ consists of all the bounded intervals in ℝ, we can choose (say) the midpoint of each interval; but if ℱ consists of all nonempty subsets of ℝ, then it is not clear how to make all the choices. Bertrand Russell illustrates the point as follows: If a man has infinitely many pairs of shoes and infinitely many pairs of socks, then he can easily choose one shoe from each pair (the left one, say), but choosing one sock from each pair requires the axiom of choice.

Mathematicians' attitudes toward the axiom of choice vary from never using it to making no distinction between mathematics that assumes the axiom and mathematics that does not. Irving Kaplansky takes a middle position: “I try to remember to make a note of it when I use it, but I do not hesitate to use it.” Whatever your attitude, Kurt Gödel showed in 1940 that the axiom of choice is consistent with the other axioms of set theory; that is, it cannot be disproved from those axioms. Then, in 1963, Paul Cohen showed that the axiom of choice is independent of the other axioms; that is, it cannot be proved from them either.

Exercises C

1. If M is a finitely generated module (see Section 6.1), show that every submodule K ≠ M is contained in a maximal submodule N (that is, a submodule N ≠ M such that if N ⊆ X, where X ≠ M is a submodule, then N = X).

2. Let K ⊆ M be modules (see Section 6.1).

(a) Show that there exists a submodule N of M maximal with respect to K ∩ N = 0.

(b) If N is as in (a), show that (K + N) ∩ X ≠ 0 for every submodule X ≠ 0 of M. [Hint: Consider the cases X ⊆ N and X ⊈ N separately.]

3. Show that every commutative ring R contains a minimal prime ideal Q, that is, a prime ideal Q such that, if P ⊆ Q and P is a prime ideal, then P = Q.

Appendix D Proof of the Recursion Theorem

If A is a set, a mapping α : ℕ → A is called a sequence from A. Sequences are usually described as follows: If we write α(n) = an for each n ∈ ℕ, then the sequence is denoted a0, a1, a2, a3, . . ., an, . . . . The recursion theorem is concerned with recursively defined sequences wherein the first term a0 is specified and the later terms are uniquely determined by the earlier ones. Such sequences are unique if they exist (see Theorem 3 §1.1); our task here is to prove existence.

Theorem. Recursion Theorem. Given a set A and a ∈ A, there is exactly one sequence a0, a1, a2, a3, . . ., an, . . . from A that satisfies the following requirements:

1. a0 = a.

2. For n ≥ 1, the term an is uniquely determined by a0, a1, a2, . . ., and an−1.

Proof. For each n ≥ 1, let βn : Aⁿ → A be a mapping. We want to show that a sequence a0, a1, a2, . . . exists such that

a0 = a and an = βn(a0, a1, . . ., an−1), for each n ≥ 1.

The sequence is just a mapping α : ℕ → A such that α(n) = an, and we construct α as a set of ordered pairs in ℕ × A (see Section 0.3). Call a subset λ ⊆ ℕ × A “nice” if it satisfies the following two conditions:

a. (0, a) is in λ.

b. If (k, xk) is in λ for k = 0, 1, . . ., n − 1, so also is (n, βn(x0, x1, . . ., xn−1)).

The set ℕ × A is clearly “nice”. Let α denote the intersection of all “nice” subsets λ, that is,

α = ∩{λ | λ ⊆ ℕ × A is “nice”}.

It is a routine verification that α is itself “nice”.

Claim 1. Given n ∈ ℕ, there exists x ∈ A such that (n, x) is in α.

Proof. We argue by induction on n. If n = 0, then (0, a) is in α because α is “nice”. If n > 0, assume inductively that (k, xk) is in α for k = 0, 1, . . ., n − 1, and write x = βn(x0, . . ., xn−1). By the definition of α, the pairs (k, xk) lie in every “nice” set λ, so (n, x) is in λ by (b). It follows that (n, x) is in α. This proves Claim 1.

Claim 2. If both (n, x) and (n, y) are in α then x = y.

Proof. If n = 0, suppose that (0, y) is in α, where y ≠ a. Consider α′ = α ∖ {(0, y)}. It suffices to show that α′ is “nice”, since this contradicts the choice of α as the intersection of all “nice” sets. Clearly (0, a) is in α′. If (k, xk) is in α′ for k = 0, 1, . . ., n − 1, then all these pairs are in α, so (n, βn(x0, . . ., xn−1)) is also in α. But n ≠ 0, so this pair is actually in α′. This shows that α′ is “nice”. Now assume that Claim 2 is true for k = 0, 1, . . ., n − 1, and that (n, x) and (n, y) are both in α, where x ≠ y. Since x ≠ y, we may assume that y ≠ βn(x0, . . ., xn−1), where xk denotes the unique element with (k, xk) in α. Then α′′ = α ∖ {(n, y)} is “nice” just as before, contrary to the choice of α. This proves Claim 2.

Finally, let n ∈ ℕ. Then (n, an) is in α for some an ∈ A by Claim 1, and an is unique by Claim 2. Hence a0, a1, a2, . . . is the desired sequence.
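The theorem just proved is what licenses ordinary recursive definitions in programs. As a concrete (and entirely hypothetical) instance of the maps βn, take a0 = 1 and βn(a0, . . ., an−1) = a0 + ⋯ + an−1, which forces an = 2^(n−1) for n ≥ 1. A Python sketch:

```python
# Build the sequence of the theorem: a_0 = a, and each later term
# a_n = beta(n, (a_0, ..., a_{n-1})) is determined by the earlier ones.

def recursive_sequence(a, beta, length):
    """Return [a_0, ..., a_{length-1}] with a_0 = a and
    a_n = beta(n, (a_0, ..., a_{n-1})) for n >= 1."""
    seq = [a]
    for n in range(1, length):
        seq.append(beta(n, tuple(seq)))   # a_n determined by the prefix
    return seq

beta = lambda n, prefix: sum(prefix)      # the assumed choice of beta_n
print(recursive_sequence(1, beta, 6))     # prints: [1, 1, 2, 4, 8, 16]
```

The uniqueness half of the theorem is visible here too: once `a` and `beta` are fixed, the loop can produce only one sequence.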

Notes

138 As the terms complex and imaginary suggest, these numbers met with some resistance when they were first introduced. The names are misleading: These numbers are no more complex than the real numbers, and i is no more imaginary than −1. Descartes introduced the term imaginary numbers.

139 An entire theory exists for the study of functions such as ez, sin z, and cos z, where z is a complex variable. Many theorems can be proved in this theory, including the Fundamental Theorem of Algebra mentioned previously.

140 The argument when img appears in Section 3.6 of Nicholson, W.K. Linear Algebra with Applications, 7th ed., McGraw-Hill Ryerson, 2012.

141 See Section 0.4.

142 A set X is called countable if it can be enumerated by the set ℕ of natural numbers, that is, if X = {x0, x1, x2, . . .}.

143 See Kaplansky, I. Set Theory and Metric Spaces, Boston: Allyn & Bacon, 1972, Section 3.3.
