CHAPTER THREE
Matrix Operations

3.1 MATRICES

If an assertion about matrices is false, there is usually a 2 × 2 matrix that reveals this.

—OLGA TAUSSKY-TODD (1906–1995)

I am sorry to say that the subject that I most disliked was mathematics…. I think the reason was that mathematics leave no room for argument. If you made a mistake, that was all there was to it.

—MALCOLM X (1925–1965)

Matrix Addition and Scalar Multiplication

Matrices play a central role in linear algebra. We can add matrices to each other, multiply a matrix by a scalar, and multiply matrices. Some matrices have inverses; square matrices have eigenvalues and eigenvectors. Many of these matters lie ahead of us in the unfolding of linear algebra.

The word matrix now has many commercial and linguistic spinoffs for movies, computer games, computer memory, and medical search engines. There is even a matrix market, which is a repository for matrix test data that can be used in comparative studies of linear algebra algorithms.

It is customary to write A = (aij) to signify that the generic elements in A are named aij. Sometimes it is more convenient to write the generic elements of A as Aij or (A)ij. If A is m × n, then 1 ≤ i ≤ m and 1 ≤ j ≤ n in this discussion. The notation ℝ^(m×n) denotes the set of all m × n matrices. For a fixed pair (m, n), the set ℝ^(m×n) becomes a vector space if we agree to use the ordinary definitions of adding matrices and multiplying matrices by scalars. Thus, if A and B are in ℝ^(m×n), then the definitions of matrix addition and multiplication by a scalar are

(A + B)ij = Aij + Bij    and    (cA)ij = cAij

The zero element in this space is designated 0, and defined by 0ij = 0. Notice that ℝ^(m×n) is a doubly indexed infinite family of distinct vector spaces; the parameters m and n range independently over the set of all positive integers.

EXAMPLE 1

To review the addition of two matrices, use a pair of matrices from the same space ℝ^(m×n).

SOLUTION

 

EXAMPLE 2

Using addition and multiplication by scalars in the space ℝ^(m×n), form an arbitrary linear combination in the same space.

SOLUTION

 

In summary, the following are the salient properties of the vector space ℝ^(m×n):

• A = B if and only if Aij = Bij for all i and j. (Equality of two matrices.)

• (A + B)ij = Aij + Bij (Addition of two matrices.)

• (cA)ij = cAij for any scalar c (Scalar multiplication of a matrix.)

Here we take all values such that 1 ≤ i ≤ m and 1 ≤ j ≤ n. Once these definitions have been made, it is an easy matter to prove all the vector–space properties in Section 2.4. For example, A + 0 = A, and so forth. Thus, ℝ^(m×n) is a vector space.
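As a quick illustration (a minimal NumPy sketch, not from the text; the matrices here are arbitrary sample values), these componentwise definitions look like this in code:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])
c = 2.5

# (A + B)ij = Aij + Bij, computed componentwise
S = A + B
assert S[0, 1] == A[0, 1] + B[0, 1]

# (cA)ij = c * Aij
T = c * A
assert T[1, 0] == c * A[1, 0]

# A + 0 = A, one of the vector-space properties
Z = np.zeros_like(A)
assert np.array_equal(A + Z, A)
```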

We have already gone far beyond this stage in matrix algebra. Recall that the products Ax and AB have been defined in Sections 1.2 and 1.3. Specifically, if A is m × n and x is n × 1, then

Ax = x1a1 + x2a2 + ··· + xnan

where aj denotes the jth column of A. Notice that the product in this case is an m × 1 matrix, or in other words a column vector containing m components.

EXAMPLE 3

What is the product

SOLUTION This is an example of a matrix–vector product, obeying the preceding rule. The calculation produces

 

Matrix–Matrix Multiplication

The matrix–matrix product was defined in Section 1.3, and goes like this: The general matrix–matrix product AB exists whenever the number of columns in A matches the number of rows in B. Let the columns in B be denoted by b1, b2, …, bk. Then AB is the matrix whose columns are Ab1, Ab2, …, Abk. Notice that the matrix–vector product is a special case of the matrix–matrix product, since a column vector is an n × 1 matrix.

If A is m × n and if B is n × k, then AB will have k columns, each being a vector in ℝ^m. In short, AB is m × k. The pattern of dimensions here should be remembered like this:

(m × n)(n × k) = (m × k)

The common index n is necessitated by the fact that we must multiply each column bj in B by the matrix A, and Abj makes sense only when the number of columns in A matches the number of components in each column vector bj.
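Here is a sketch of this column-by-column definition in NumPy (the helper name matmul_by_columns is ours, not the book's):

```python
import numpy as np

def matmul_by_columns(A, B):
    """Form AB column by column: column j of AB is A @ B[:, j]."""
    m, n = A.shape
    n2, k = B.shape
    assert n == n2, "columns of A must match rows of B"
    C = np.empty((m, k))
    for j in range(k):
        C[:, j] = A @ B[:, j]   # the matrix-vector product A b_j
    return C

A = np.array([[1, 2], [3, 4], [5, 6]])   # 3 x 2
B = np.array([[7, 8, 9], [0, 1, 2]])     # 2 x 3
print(matmul_by_columns(A, B))           # agrees with A @ B, a 3 x 3 result
```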

EXAMPLE 4

What is the product

 

SOLUTION We denote the columns of B by the notation b1, b2, b3 and obtain

 

There are several ways to interpret matrix multiplication.1 We have just seen the first way, which is taken as the definition. If it is desired to know the (i, j)-element in the product AB, we must first concentrate on the jth column of the product, which is Abj. Here (as previously) we use bj to denote the jth column in B. Now, what is this vector Abj? It is a linear combination of the columns of A with coefficients taken from the vector bj. (This was the matrix-product definition given in Section 1.3.) Suppose A is an m × n matrix and B is an n × k matrix. Note the critical entry of n in this calculation. Thus, using ar for column r in A, we have


1 James Joseph Sylvester (1814–1897) and Arthur Cayley (1821–1895) are considered joint founders of matrix theory because of their many years of contributing to this branch of mathematics. Cayley was the first to recognize how matrices should be multiplied so that the correspondence between linear transformations and matrices given by S ↔ A and T ↔ B resulted in (S ∘ T)(x) = (AB)x. We will mention Cayley again in Section 8.1.

The term matrix was first used by Sylvester in 1850 to mean a rectangular array of numbers. (By this time, the study of determinants had been going on for more than 150 years. See Sections 4.1 and 4.2.) Sylvester was the Second Wrangler in the Tripos exams at Cambridge University in the year 1837. He was, however, barred from taking a degree or teaching at Cambridge because he was Jewish! (Being a religious entity, Cambridge University required all degree candidates and professors to belong to the Church of England.) He immigrated to the United States in 1841 and taught at the University of Virginia from 1841 to 1845. Sylvester was a fiery and passionate person who once had an altercation with a badly behaved student during a lecture! In 1845, Sylvester returned to England and for ten years was an actuary and lawyer in London. Between 1854 and 1871 he taught at the Royal Military Academy at Woolwich. Then he became a professor at Johns Hopkins University in Baltimore, where he wrote many papers, especially on quadratic forms (which we treat in a later chapter). In 1878 (at the age of 64), he founded the American Journal of Mathematics, the first mathematical journal published in the United States. From 1884 to the end of his life, he was a professor at Oxford University.


Abj = b1j a1 + b2j a2 + ··· + bnj an

Here, the entries a1, a2, …, an are the columns of A. Because we asked for the ith element in this vector, we have

(Abj)i = b1j ai1 + b2j ai2 + ··· + bnj ain

We prefer to write this as

(AB)ij = Σ_{r=1}^{n} air brj

Here, 1 ≤ i ≤ m and 1 ≤ j ≤ k. This formula provides a second way of interpreting the product AB.

Each summation in computing the product AB contains n multiplications, and there are mk of them because we need all values of (i, j). Consequently, matrix–matrix multiplication can be a computationally expensive operation, involving nmk scalar multiplications. If the matrices are square, say n × n, then the number of multiplications is n^3. (This does not include the additions involved.) Of course, these considerations come into play only when the matrices are enormous, as occurs in many practical problems. Think of n = 10^8, for which we obtain n^3 = 10^24.
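A small Python sketch (our own illustration) that carries out the triple-loop computation and counts the scalar multiplications:

```python
import numpy as np

def matmul_count(A, B):
    """Textbook triple loop: returns AB and the number of scalar multiplications."""
    m, n = A.shape
    _, k = B.shape
    C = np.zeros((m, k))
    mults = 0
    for i in range(m):
        for j in range(k):
            for r in range(n):          # n multiplications per (i, j) entry
                C[i, j] += A[i, r] * B[r, j]
                mults += 1
    return C, mults

A = np.random.rand(4, 5)
B = np.random.rand(5, 3)
C, mults = matmul_count(A, B)
print(mults)                    # 4 * 5 * 3 = 60 = m * n * k
print(np.allclose(C, A @ B))    # True
```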

Pre-multiplication and Post-multiplication

Another example is given here to illustrate in complete detail that we can carry out a matrix–matrix product as either a post-multiplication by columns or a pre-multiplication by rows. First, let us use post-multiplication of the columns bi of the matrix B into the matrix A. In this example, A is 3 × 2 and B is 2 × 2.

 

Next, we illustrate pre-multiplication by the rows ri of matrix A with the same matrices B.

 

Of course, the two procedures lead to identical results!
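A NumPy sketch of the two procedures (with made-up 3 × 2 and 2 × 2 matrices of our choosing):

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])   # 3 x 2, rows r_i
B = np.array([[7, 8], [9, 10]])          # 2 x 2, columns b_j

# Post-multiplication by columns: column j of AB is A @ b_j.
by_cols = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

# Pre-multiplication by rows: row i of AB is r_i @ B.
by_rows = np.vstack([A[i, :] @ B for i in range(A.shape[0])])

print(np.array_equal(by_cols, by_rows))  # True: identical results
```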

Dot Product

The dot product between two vectors u = (u1, u2, u3, …, un) and v = (v1, v2, v3, …, vn) is

u · v = u1v1 + u2v2 + ··· + unvn

which is the standard inner product in the Euclidean vector space ℝ^n. In matrix notation, the dot product can also be written as

u · v = uTv

If we use this notation, then we can say that (AB)ij is the dot product of row i in A with column j in B. This is a third way of interpreting AB. Let us use ri to designate row i in A. Then we have the formula

(AB)ij = ri · bj

(Here we continue to denote column j in B by bj.) This formula enables us to compute any element in the product AB with one simple dot product.
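For instance, in NumPy (an illustrative sketch; the matrices are arbitrary):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])      # 2 x 3
B = np.array([[7, 8], [9, 10], [11, 12]]) # 3 x 2

i, j = 1, 0
entry = np.dot(A[i, :], B[:, j])   # (AB)ij = (row i of A) . (column j of B)
print(entry, (A @ B)[i, j])        # both give 4*7 + 5*9 + 6*11 = 139
```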

THEOREM 1

The (i, j)-element in the product of two matrices A and B can be written in summation notation or as the dot product of row i in A with column j in B:

(AB)ij = Σ_{r=1}^{n} air brj = ri · bj

EXAMPLE 5

What is the (2, 4)-element in this product? That is, what is (AB)2,4?

 

SOLUTION We need not compute the entire product AB to answer this question. We simply compute the dot product of row 2 in A with column 4 in B, which is

 

Special Matrices

A number of special categories of matrices will be enumerated here. A matrix is square if it has the same number of rows as columns. The diagonal of a square matrix A is the sequence of elements Aii. The other elements are called off-diagonal elements. If all off-diagonal elements are 0, we say that A is a diagonal matrix. Among the diagonal matrices, the identity matrix is especially important: it has 1’s on the diagonal, and 0’s elsewhere. Here are examples of a square matrix, a diagonal matrix, and an identity matrix:

 

The n × n identity matrix is denoted by In or by I, if its size is clear from the context. We often denote the elements of an identity matrix by δij. This is called the Kronecker delta.2 The formal definition is

δij = 1 if i = j,  and  δij = 0 if i ≠ j

EXAMPLE 6

If we compute the product of a 3 × 3 matrix and a 3 × 3 identity matrix, what do we expect to get?


2 Leopold Kronecker (1823–1891) was born in Liegnitz, near Breslau (in modern-day Poland). The family was independently wealthy by its ownership and management of a large estate. Kronecker himself participated in this activity but reserved time for his mathematical research. This saved him from having to hold a regular academic position. When he did teach, his lectures were demanding and stimulating for the best students but not popular with the average students. He is famous for the dictum concerning mathematics: ‘‘God made the integers; all else is the work of man.’’


SOLUTION Here is a case in point:

 

THEOREM 2

For any n × n matrix A, we have AIn = InA = A.

PROOF We compute the (i, j)-element in the product AIn using methods discussed previously:

(AIn)ij = Σ_{r=1}^{n} air δrj = Aij

Note that in the summation, the term δrj will be zero except when r = j. Then its value is 1. Hence, only one term in the sum survives, namely the one where r = j. The other equation is proved in the same manner.

Matrices with special structure occur in many applications. A square matrix A is said to be upper triangular if aij = 0 when i > j. Similarly, we have a concept of lower triangular: aij = 0 when i < j. (These definitions are also meaningful for nonsquare matrices.) See Figure 3.1 for illustrations of some matrices with special structure.

FIGURE 3.1 Matrices with special structure.

EXAMPLE 7

If we add two upper triangular matrices together, what do we expect to find?

SOLUTION Here is a numerical case:

 

THEOREM 3

The set of upper triangular matrices in ℝ^(n×n) is a subspace of ℝ^(n×n). That is, it is a vector space in its own right.

Subspaces are taken up systematically in Section 5.1.

EXAMPLE 8

If we multiply two upper triangular matrices of the same size, what do we expect to obtain?

SOLUTION Here is a simple case to guide us:

 

THEOREM 4

The product of two upper triangular matrices (of the same size) is also upper triangular.

PROOF Let A and B be n × n upper triangular matrices. Let 1 ≤ j < i ≤ n. It is to be proved that (AB)ij = 0. We have

(AB)ij = Σ_{r=1}^{n} air brj

In this calculation, use first the fact that air = 0 if r < i. Then only terms having r ≥ i enter the sum. Next, use the fact that brj = 0 when r > j. The restriction on r is then i ≤ r ≤ j, which is impossible, because we assumed j < i.

A similar result is true for lower triangular matrices, and is left as General Exercise 62.
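A quick numerical check of Theorem 4 in NumPy (our illustration, using random integer upper triangular matrices):

```python
import numpy as np

A = np.triu(np.random.randint(1, 9, (4, 4)))  # random upper triangular
B = np.triu(np.random.randint(1, 9, (4, 4)))
C = A @ B

# Every entry below the diagonal of the product is zero (Theorem 4).
print(np.array_equal(C, np.triu(C)))  # True
```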

Matrix Transpose

When we exchange the rows and columns of a matrix, we obtain the transpose of the matrix, which is of particular importance.

DEFINITION

The transpose of a matrix A is the matrix AT whose rows are the columns of A, in the same order. In symbols: (AT)ij = Aji.

EXAMPLE 9

Give a 3 × 4 matrix and its 4 × 3 transpose.

SOLUTION

 

THEOREM 5

If the product AB of two matrices A and B exists, then (AB)T = BTAT.

PROOF We use the subscript notation for elements of matrices and (not unexpectedly) the definition of a matrix product:

((AB)T)ij = (AB)ji = Σ_{r} ajr bri = Σ_{r} (BT)ir (AT)rj = (BTAT)ij

There is another way to describe what is happening in the preceding proof: To determine the (i, j)-element in BTAT, we should take the dot product of row i in BT with column j in AT. But this is the same as the dot product of column i in B with row j in A. That is the same as the dot product of row j in A with column i in B. Finally, this is the (j, i)-element in AB or the (i, j)-element in (AB)T.

Here are four computational rules involving the matrix transpose:

(AT)T = A    (A + B)T = AT + BT    (cA)T = cAT    (AB)T = BTAT
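These rules are easy to test numerically; a NumPy sketch (assuming the four rules listed above, with arbitrary sample matrices):

```python
import numpy as np

A = np.random.rand(3, 4)
B = np.random.rand(3, 4)
M = np.random.rand(4, 2)
c = 3.7

print(np.allclose((A.T).T, A))              # (A^T)^T = A
print(np.allclose((A + B).T, A.T + B.T))    # (A + B)^T = A^T + B^T
print(np.allclose((c * A).T, c * A.T))      # (cA)^T = c A^T
print(np.allclose((A @ M).T, M.T @ A.T))    # (AB)^T = B^T A^T
```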

Symmetric Matrices

A matrix A such that A = AT is said to be symmetric. Here is a symmetric matrix whose elements are random numbers from the interval [0, 1], generated by mathematical software:

 

The symmetric property here depends upon three equations being true: a21 = a12, a31 = a13, and a32 = a23.

Skew–Symmetric Matrices

While a symmetric matrix A has the property AT = A, a skew–symmetric matrix has the property AT = −A, and this contains a minus sign! If a matrix is simultaneously symmetric and skew–symmetric, then it must be the zero matrix. (If AT = A = −AT, then A = 0.) Moreover, there is a representation of a square matrix as the sum of a symmetric matrix and a skew–symmetric matrix. The equation that proves this is

A = ½(A + AT) + ½(A − AT)

The first matrix on the right is symmetric and the second matrix is skew–symmetric.

The unicity of the splitting of A into symmetric and skew–symmetric parts can be easily established. Suppose that

A = B + C

where B is symmetric and C is skew–symmetric. We shall prove first that B = ½(A + AT). By the first hypothesis, we obtain

AT = (B + C)T = BT + CT

Because C is skew–symmetric by hypothesis, CT = −C. Using the symmetry of B, we see that

AT = B − C,  and therefore  B = ½(A + AT)

Then C is determined by the equation

C = A − B = ½(A − AT)

EXAMPLE 10

Express this matrix as the sum of a symmetric matrix and a skew–symmetric matrix:

 

SOLUTION We let the symmetric part be

 

and let the skew–symmetric part be

 

Verifying, we find that the sum of the two parts reproduces the original matrix.
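In code, the splitting is one line each way; a NumPy sketch (our illustration):

```python
import numpy as np

A = np.random.rand(3, 3)
S = (A + A.T) / 2          # symmetric part
K = (A - A.T) / 2          # skew-symmetric part

print(np.allclose(S, S.T))     # S is symmetric
print(np.allclose(K, -K.T))    # K is skew-symmetric
print(np.allclose(S + K, A))   # the parts recover A
```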

Non-commutativity of Matrix Multiplication

Now we come to the question of commutativity of matrix multiplication. Let us choose at random several square matrices A and B and see whether AB = BA.

 

Because these matrices seem typical, can we not assume that, in general, AB = BA? Most emphatically, the answer is no! In the real world, Baconian inductive reasoning is often used. It proceeds by looking at many examples, from which some tentative conclusion can be drawn. But this can be used in mathematics only to suggest something that may be true. In the present case, the suspected theorem is false, and generally AB ≠ BA. For example, we have

 

Of course, one example is sufficient to demolish the conjecture that matrix multiplication is commutative. The apparently random integer matrices displayed at the beginning of this section show that sometimes a pair of matrices will commute with each other. But those matrices are not actually random; they were carefully constructed to illustrate the existence of commuting pairs. Maybe you can find others.

CAUTION

Matrix multiplication is not commutative; that is, in general, one must expect that

AB ≠ BA
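A concrete NumPy check (with sample matrices of our choosing):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

print(A @ B)   # [[2, 1], [4, 3]]  (columns of A swapped)
print(B @ A)   # [[3, 4], [1, 2]]  (rows of A swapped)  -- not equal to AB
```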

Associativity Law for Matrix Multiplication

What can be said about associativity of matrix multiplication? Here the result is simple (and very useful).

THEOREM 6

Matrix multiplication is associative. Thus, if the products exist, we have

A(BC) = (AB)C

PROOF We calculate (and get practice in handling summations):

((AB)C)ij = Σ_{s} (AB)is csj = Σ_{s} ( Σ_{r} air brs ) csj = Σ_{r} air ( Σ_{s} brs csj ) = Σ_{r} air (BC)rj = (A(BC))ij

An example of the associative law for matrix multiplication is given here:

 

Calculating the product in a different order, we have

 

Linear Transformations

The next theorem shows why matrix multiplication is defined in the manner explained in Section 1.3.

THEOREM 7

Let S and T be two linear transformations defined by S(y) = Ay and T(x) = Bx. (A and B are matrices.) If the composition S ∘ T exists, then its matrix is AB.

PROOF We use the associativity of matrix multiplication in this quick calculation:

(S ∘ T)(x) = S(T(x)) = S(Bx) = A(Bx) = (AB)x

Another interpretation of this proof is as follows. Column i of matrix A, say ai, is the image of the ith standard unit vector ei = (0, …, 1, 0, …, 0) under the linear transformation S; namely, S(ei) = ai. We obtain similar results for matrix B and linear transformation T; that is, T(ei) = bi, where bi is the ith column of matrix B. Consequently, we obtain

(S ∘ T)(ei) = S(T(ei)) = S(bi) = Abi

which is the ith column of matrix AB. Now associativity of matrix multiplication is a corollary of associativity of functional composition.

Elementary Matrices

When a single row operation is applied to the identity matrix, the result is called an elementary matrix. Thus, if ρ denotes any row operation, we can set E = ρ(I) and get an elementary matrix, E. It is easy to recognize an elementary matrix because it differs very little from an identity matrix. Here, we interpret a row operation as a function that applies to a matrix and produces another matrix. Examples of elementary matrices are

 

Why are elementary matrices useful? They enable us to interpret row operations on a matrix as multiplication on the left by one or more elementary matrices. For example, using E1, E2, and E3 as in the preceding display, we have

 

Now we have a more sophisticated way of thinking about row equivalence. If A ~ B, then a sequence of row operations can be applied to A, producing B. Hence, there will exist elementary matrices Ei such that

Ek Ek−1 ··· E2E1A = B

THEOREM 8

If the matrix à is the result of applying a row operation to the matrix A, and if E is the matrix that results from applying to I, then à = EA.

PROOF We give the proof for one type of row operation—namely, the addition of a scalar multiple of one row onto another row. (This is the most tiresome of the three cases.) Suppose specifically that ρ is the operation of adding c times row s in A onto row r. (Of course, s ≠ r.) The action of ρ on the n × n identity matrix produces the elementary matrix E with these entries:

Eij = δij + c δir δsj

(To verify this, begin by noticing that E and I differ only in row r because of the factor δir.) The matrix Ã that results from applying ρ to A has this definition:

Ãij = Aij + c δir Asj

Next, we compute EA, element-by-element:

(EA)ij = Σ_{k} Eik Akj = Σ_{k} (δik + c δir δsk) Akj = Aij + c δir Asj = Ãij
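A NumPy sketch of Theorem 8 for this replacement operation (the helper name elementary_replacement is ours, not the book's):

```python
import numpy as np

def elementary_replacement(n, r, s, c):
    """E = result of adding c times row s onto row r of the n x n identity."""
    E = np.eye(n)
    E[r, s] += c
    return E

A = np.array([[2.0, 1.0, 0.0],
              [4.0, 3.0, 1.0],
              [0.0, 5.0, 2.0]])

E = elementary_replacement(3, 1, 0, -2.0)   # row1 <- row1 - 2*row0
A_tilde = A.copy()
A_tilde[1, :] += -2.0 * A_tilde[0, :]       # the same row operation on A

print(np.allclose(E @ A, A_tilde))          # Theorem 8: EA equals the row-operated A
```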

More on the Matrix–Matrix Product

We have been using the product Ax in many situations involving systems of equations. This product exists if A is, say, m × n and x is n × 1. An analogous product yTA exists when yT is 1 × m. This will certainly arise naturally if we find it necessary to proceed from Ax = b to (Ax)T = bT, because this is the same as xTAT = bT. (Remember that taking the transpose of a product requires the order of the factors to be reversed.) We can prove theorems about this product yTA that are logically parallel to the results about the product Ax. We have

Ax = b    and    yTA = cT

where the column vector b is m × 1 and the row vector cT is 1 × n.

Recall, from Section 1.2, that the product Ax was defined to be

Ax = x1a1 + x2a2 + ··· + xnan

where aj denotes the jth column of A. The vector x is a column vector, and n is the number of components in x as well as the number of columns in A. This definition of a matrix multiplying a vector led us to the appropriate definition of AB. Namely, if the matrix B has columns bj, then

AB = [Ab1, Ab2, …, Abk]

EXAMPLE 11

Use the preceding definition of matrix multiplication to determine the entry occupying position (i, j) in the matrix AB.

SOLUTION The desired entry is in column j of AB, and this column is Abj, by the definition of the product AB (as quoted previously). The ith component in this vector is the desired entry in AB. Thus, it is

(AB)ij = (Abj)i = Σ_{r} air brj = ri · bj

Just as we expect, it is the dot product of row i of matrix A (denoted here by ri) and column j of matrix B (written here as bj).

Vector–Matrix Product

Suppose that we have the m × n matrix A, the 1 × n row vector cT, and the 1 × m row vector yT. Formulas similar to those obtained previously can be given for the product

yTA

when yT is a row vector. Namely, we write

yTA = y1r1 + y2r2 + ··· + ymrm

In this equation, the rows of A are the vectors ri. The matrix A is m × n. We can verify the correctness of the displayed formula by computing the jth component of the vectors on the right and on the left. The one on the left is

(yTA)j = Σ_{i=1}^{m} yi aij

The one on the right is

( y1r1 + y2r2 + ··· + ymrm )j = Σ_{i=1}^{m} yi (ri)j = Σ_{i=1}^{m} yi aij

Consequently, we have the following two important theorems.

THEOREM 9

If yT is a row vector, then yTA is a linear combination of the rows of A, with coefficients taken to be the components of the row vector yT:

yTA = y1r1 + y2r2 + ··· + ymrm

where ri are the rows of A.
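A NumPy sketch of Theorem 9 (illustrative values of our choosing):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])   # rows r1, r2
y = np.array([10, 100])                # treat y^T as a 1 x m row vector

lhs = y @ A                            # the vector-matrix product y^T A
rhs = y[0] * A[0, :] + y[1] * A[1, :]  # y1*r1 + y2*r2

print(lhs, rhs)                        # both: [410 520 630]
```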

THEOREM 10

If the rows of A are denoted by r1, r2, …, rm, then the rows of AB are r1B, r2B, …, rmB. In other words, row i of AB is riB.

PROOF The (i, j)-element in AB is ri · bj, where bj is column j in B.

The jth element in riB is (riB)j = ri · bj.

This is just the dot-product formula for any entry in the product AB.

EXAMPLE 12

If A and B are n × n nonzero matrices, can it happen that AB = 0? If so, this would be contrary to what can happen with real numbers. In that more elementary setting the equation xy = 0 implies that at least one of the factors x and y is zero.

SOLUTION As an example to exhibit this phenomenon in matrix theory, we offer

 

Thus, one must resist the temptation in matrix algebra to go from the equation AB = AC to the equation B = C, solely on the strength of the fact that A ≠ 0. See General Exercises 3, 12, 34, and 44.
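A NumPy sketch of this phenomenon, using a pair of nonzero matrices with no zero entries (a standard choice, not necessarily the book's displayed pair):

```python
import numpy as np

A = np.array([[1, 1], [1, 1]])
B = np.array([[1, 1], [-1, -1]])

print(A @ B)          # the zero matrix, although A != 0 and B != 0
```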

Application: Diet Problems

Linear algebra can play a role in dietetics, whether for humans, animals, or bacteria, among others. The objective might be to achieve some nutritional goal through a suitably compounded mixture of foods and dietary supplements. These problems often lead to systems of linear inequalities, a topic treated briefly here.

We assume that there are foods labeled f1, f2, …, fn, and that the nutritional content of each is known. There are m different nutrients being considered, and they are labeled v1, v2, …, vm. (Think of these as vitamins in a typical case.) The nutritional values are displayed in a table.

The entries aij signify the amounts of nutrients in each gram of each food. Specifically, aij is the amount of nutrient vi in food fj. (Standard units must have been established.) In a given row, the units should be the same, as they pertain to a single nutrient. Units (for example, grams) would be assigned to each food, fj.

Looking down a column, say column j, we see the amounts of each nutrient contained in one unit of food fj. Looking across a row, say row i, we see what foods contain nutrient vi and in what quantities. For example, if fj is orange juice and vi is vitamin C, then aij might indicate how many milligrams of vitamin C are contained in each gram of orange juice.

A diet will be a vector x = (x1, x2, …, xn). If each day a patient ingests x1 units of food f1, x2 units of food f2, and so on, then he or she will get Σ_{j=1}^{n} aij xj units of nutrient vi. The problem is to formulate a diet that provides on a daily basis certain prescribed amounts of each nutrient. We can define a vector b containing these minimum daily requirements and ask for a diet, x, such that

Ax ≥ b

For example, suppose that we are considering a diet that involves three nutrients and four foods. The matrix of nutritional values is known, and we have a target for minimal nutrition. The table of nutrients is the following matrix:

 

A diet is a vector x in ℝ^4 whose components indicate the number of units of each food to be made available to the patient or animal, as the case may be. The product Ax is a vector of three components, and we want these components to be at least 84, 43, 68. We can start by solving the system

Ax = b, where b = (84, 43, 68)

In the usual way, we have

 

We find the general solution to be

 

Here we have one free variable, x4. We do not want any component of the vector x to be negative, and this restricts x4 to the interval 0 ≤ x4. The resulting extreme cases are x = (5, 8, 10, 0) and (0, 17/4, 15/2, ). Different choices of pivot rows lead to different extreme solutions.

This is probably a good opportunity to raise the issue of systems of linear inequalities. We might relax the requirement Ax = b to be

Ax ≥ b

(For two vectors u and v in ℝ^n, we write u ≥ v when ui ≥ vi for 1 ≤ i ≤ n.) That would certainly be better if there exists no solution to the first problem, Ax = b.

In the diet problems, it can often happen that the system of linear equations is inconsistent. (Remember that a system of equations in which the equations outnumber the variables is likely to be inconsistent.) On the other hand, it may not be necessary to meet the requirements exactly, because overnourishment is usually not serious—it is merely wasteful. What is the right formulation of the problem if it is inconsistent as it originally stands? Obviously we can require simply Ax ≥ b, as explained previously. Is this a numerical problem that we can easily solve with our trustworthy row-reduction algorithm? Here, we tread on dangerous ground. Consider a very simple case

 

Yielding to the temptation of using row operations, we proceed to

 

This is wrong! For example, it seems to say that x2 < 0. Where is the error? The manipulation on the inequalities is completely unjustified. A better approach is to turn the inequalities into equations by inserting positive parameters p1, p2 like this:

 

One step of the standard row-reduction leads to

 

Notice that the term −2p1 + p2 can be either positive or negative. In order that x2 ≥ 0, we must have 2 − 2p1 + p2 ≤ 0. For example, we could let p1 = 1, p2 = 0, x2 = 0, and then x1 is determined. In problems of this type, there can be many solutions, and usually there is a cost function involved. In that case, the variables should have low values, in general. This leads to a type of problem called linear programming, in which we seek to minimize a linear objective function c1x1 + ··· + cnxn subject to constraints of the form x ≥ 0 and Ax ≥ b. Special codes are available for this type of problem. Row-reduction techniques by themselves are virtually useless in these optimization problems.
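As a sketch of how such a problem could be handed to a linear-programming solver, here is a SciPy example; the nutrient matrix, requirements, and cost vector are hypothetical values of our own, not data from the text:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: A[i, j] = amount of nutrient i per unit of food j,
# b = minimum daily requirements, cost = price per unit of each food.
A = np.array([[2.0, 1.0, 3.0],
              [1.0, 2.0, 1.0]])
b = np.array([10.0, 8.0])
cost = np.array([1.0, 1.5, 2.0])

# linprog minimizes cost @ x subject to A_ub @ x <= b_ub and x >= 0,
# so the requirement Ax >= b is rewritten as -Ax <= -b.
result = linprog(cost, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 3)
print(result.x, result.fun)   # an optimal diet and its cost
```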

EXAMPLE 13

A diet problem leads to this system of linear inequalities:

 

Find one or more good solutions.

SOLUTION We change from the system of inequalities Ax ≥ b to a system of equations by introducing a vector p whose coordinates are to be nonnegative: Ax = b + p. There are now five unknowns: the two components of x and the three components of p. Here is the augmented matrix for the problem:

 

The reduced echelon form of this matrix is

 

We obtain the general solution

 

From the third equation we see that the smallest possible value for p1 is 7.5, obtained by selecting p2 = p3 = 0. With these values of p, we compute an acceptable solution (x1, x2) = ( , 4). Hence, we obtain (15.5, 5, 12) ≥ (8, 5, 12) on the right-hand side of the preceding inequality.

Dangerous Pitfalls

We end this section with some words of caution.

1. Usually AB ≠ BA. For example, we see that

 

Thus, the factors in matrix multiplication do not commute, in general.

2. If AB = AC, we cannot infer that B = C. For example, we have

 

where cancellation of the matrix factor A is not correct!

3. If AB = 0, we cannot conclude that A = 0 or B = 0. For example,

 

but neither factor A nor B is the zero matrix!

4. If a practical problem results in a system of linear inequalities, say Ax ≥ b, do not expect the row-reduction techniques to work. See the discussion of diet problems earlier in this section for one way of proceeding.

SUMMARY 3.1

• Matrix properties:

• The equation A = B means that Aij = Bij for all appropriate indices. (Equality of matrices)

• (A + B)ij = Aij + Bij (Matrix addition)

• (cA)ij = cAij when c is a scalar (Scalar– matrix multiplication)

• AB = [Ab1, Ab2, …, Abn], where bj are the column vectors of B. Also,

(AB)ij = Σ_{r} air brj = ri · bj (Matrix–matrix multiplication)

• Special matrices:

• I = In = (δij) where δij = 1 if i = j and δij = 0 otherwise. (Identity matrix)

• For a square matrix A, Diag(A) has the diagonal of A, but all other elements of the matrix are zero.

• A lower triangular matrix L has the property ℓij = 0 when j > i.

• An upper triangular matrix U has the property uij = 0 when j < i.

• (AT)ij = Aji (Matrix transpose)

• If A = AT, then A is said to be symmetric.

• If AT = −A, then A is said to be skew– symmetric.

• Every square matrix is the sum of a symmetric matrix and a skew–symmetric matrix: A = ½(A + AT) + ½(A − AT)

• Examples of elementary matrices:

• If A ~ B, then Ek Ek−1 ··· E2E1 A = B, where all Ei are elementary matrices.

• The product Ax can be computed with a post-multiplication of A by column vector x.

• The product yTA can be computed by a pre-multiplication of A by the row vector yT.

• We can sometimes solve a system of linear inequalities by introducing a nonnegative vector so as to get a system of equations.

• Theorems about matrices:

• If A is an n × n matrix, then AIn = InA = A.

• The set of all upper triangular matrices is a subspace of ℝ^(n×n).

• The product of two n × n upper triangular matrices is also upper triangular. The same is true for lower triangular matrices.

• (AB)T = BTAT

• In general, AB ≠ BA. (Matrix multiplication is usually not commutative.)

• (AB)C = A(BC) (Matrix multiplication is associative.)

• If S(y) = Ay and T(x) = Bx, then the composition of S and T obeys the rule (S ∘ T)(x) = (AB)x.

• If ρ(A) is the result of applying a row operation ρ to A, and E = ρ(I) is the result of applying the same row operation to I, then ρ(A) = EA.

• yTA = y1r1 + y2r2 + ··· + ymrm, where ri are the rows of A. (Vector–matrix product)

• The rows of AB are r1B, r2B, …, rmB, where ri are rows of A.

• Cautions:

• Usually AB ≠ BA. Factors in matrix multiplication do not commute, in general.

• If AB = AC, we cannot infer that B = C. Cancellation of the matrix factor A is usually incorrect!

• If AB = 0, we cannot conclude that A = 0 or B = 0.

• Do not expect the row-reduction techniques to work on a system of linear inequalities such as Ax ≥ b or Ax ≤ b.

KEY CONCEPTS 3.1

The space of all m × n matrices, addition and multiplication of matrices, dot product of two vectors, multiplication of a vector by a matrix, diagonal matrices, identity matrices, transpose of a matrix, symmetric matrices, skew–symmetric matrices, elementary matrix, triangular matrices, associativity, commutativity, upper triangular, lower triangular, Kronecker delta, outer products, diet problems

GENERAL EXERCISES 3.1

1. Refer to Example 4. Explain why BA cannot be computed.

2. If A is a square matrix, the product AA exists, and is denoted by A2. Compute the square of the matrix

 

3. For real numbers r, we know that if r2 = 0, then r = 0. Is this true also for square matrices? A theorem or example is required.

4. Is this the correct definition of upper triangular, for a matrix A: Aij ≠ 0 when j ≥ i?

5. Find all the 2 × 2 matrices that commute with the matrix

6. Explain why, in defining row operations, we must insist on i ≠ j in the operation ri ← ri + αrj.

7. Which of these matrices is not an elementary matrix?

a.

b.

c.

d.

e.

f.

8. Define

Verify that BA = I. Then solve the system Ax = c by multiplying both sides of this equation by B. This produces BAx = Bc or x = Bc. Verify by an independent calculation that Ax = c.

9. Let . What is AT? What is Ax if ?

10. Let ,

C = AB

Compute these: C23, AT, A + B, and B2.

11. Find all the matrices that commute with

12. Certainly we can find n × n matrices that are nonzero, yet satisfy the equation AB = 0. Is this possible with B = A? Is it possible with B = AT?

13. Explain why this diet problem leads to an inconsistent system of linear equations:

 

14. (Continuation.) Find a good solution of the inequality Ax ≥ b, using A and b from the preceding exercise. Ideally, there will exist a vector x such that if the vector y satisfies b ≤ Ay ≤ Ax, then x = y. Try to find such an x.

15. Solve this diet problem:

 

As usual, there may be some solutions that are better than others. Try to find an optimal solution, using the least quantities of foodstuffs to accomplish the nutritional objective.

16. Criticize this solution to a diet problem: Ax ≥ b, where A and b are as displayed.

We find .

Thus, we have x1 = 1 and x2 is either 1 or .

17. Show in detail the operations for the matrix–matrix multiplication ABT with

 

a. post-multiplication

b. pre-multiplication

18. Split the matrix into the sum of a symmetric and a skew– symmetric matrix.

19. Find all 2 × 2 matrices that commute with

20. Explain the flaw in this reasoning: We are given BA = I. To solve Ax = b, we multiply by B, getting BAx = Bb and x = Bb.

21. Establish that the product of two upper triangular matrices is upper triangular. (The two matrices are square and of the same size.)

22. Each expression in the following list represents a certain element in a product, such as A = (aij), (BTA)ks, or (A2)st. Identify each. (In each example, the index of summation is indicated; it runs from 1 to some upper limit appropriate for the matrices in question.)

a.

b.

c.

d.

e.

f.

g.

h.

i.

j.

k.

l.

m.

n.

23. Explain why, for any square matrix A, A + AT and AAT are symmetric. Explain why A must be square in these assertions.

24. (Continuation.) Explain why every square matrix is the sum of a symmetric matrix and a skew–symmetric matrix. General Exercise 3 may be helpful.

25. Explain why every square matrix is the sum of a lower and an upper triangular matrix.

26. Establish that the diagonal matrices, the upper triangular matrices, and the symmetric matrices form three subspaces in .

27. Explain why, for any matrix A, this equation holds: (AT)T = A.

28. Establish that if (m, n) ≠ (α, β), then ℝ^(m×n) ≠ ℝ^(α×β).

29. Explain why or find a counterexample: The product of two symmetric n × n matrices is symmetric.

30. Let A be a fixed matrix in ℝ^(n×n). Consider the set of all matrices that commute with A. Is this set a subspace of ℝ^(n×n)?

31. Let A be an n × n matrix. Establish that A commutes with every polynomial in A. (That terminology means a matrix of the form α0I + α1A + α2A2 + ··· + αmAm.)

32. Is it possible for a 2 × 2 matrix A to have the property that its powers (including A0 = I) span ℝ^(2×2)?

33. Explain why, for 2 × 2 matrices, A2 is a linear combination of I and A.

34. Find a 2 × 2 matrix, none of whose entries is zero, such that its square is zero.

35. Is there a 7 × 7 matrix of rank 7 all of whose entries are either 5 or 8?

36. Establish that (A + B)T = AT + BT.

37. Explain why or find a counterexample: The product of two skew–symmetric matrices is skew–symmetric.

38. Give an example of two matrices, A and B, that are not row equivalent to each other, but their transposes are row equivalent to each other.

39. Give an argument why every 2 × 2 matrix A satisfies the equation

A2 − (A11 + A22)A + (A11A22 − A12A21)I = 0 (This is one case of the Cayley–Hamilton Theorem. See Section 8.1.)

40. Establish the remaining case of Theorem 8. (Recall that the interchange of two rows can be executed by row operations of the other two types.)

41. If x and y are two vectors in ℝ^n, their outer product is

 

Here we assume that x and y are column vectors. The outer product then is an n × n matrix A, specified by the equation Aij = xiyj. Explain why the rows of this matrix are scalar multiples of each other. What is the rank of the matrix xyT? (The rank of a matrix is defined on p. 63.)

42. Is every square matrix the product of a lower triangular matrix and an upper triangular matrix?

43. Explain why the dot product has this property: xy = yx.

44. Find all the 2 × 2 matrices A such that A2 = 0.

45. If A is an m × n matrix, what can be said of AIn, AIm, ImA, and InA?

46. If A is a 2 × 2 upper triangular matrix and B is a 2 × 2 lower triangular matrix, what special properties does the product AB have? Is it possible to factor any 2 × 2 matrix into the product of an upper and a lower triangular matrix?

47. Explain this apparent contradiction: For some pair of matrices A and B we can have AB ≠ BA, yet in this calculation we have reversed the terms AikBkj:

48. Find two different 2 × 2 matrices (other than ±I) such that AAT = I.

49. Define A ∘ B = AB + BA. Is this operation commutative? Is it associative? Answer the same questions for the operation A ⋆ B = AB − BA.

50. Establish Theorem 9 by taking transposes in the equation for the matrix–vector product Ax.

51. Let A be a square matrix. Assume that the homogeneous system of equations A2x = 0 has a nontrivial solution. Establish that the system Ay = 0 has a nontrivial solution.

52. Let A, B, and P be n × n matrices. If A = PBPT, does it follow that A is symmetric?

53. Establish that if A ~ B, then there is a nonsingular matrix C such that A = CB.

54. Suppose that the column sums of a square matrix A are all equal to 1. Does it follow that the same property is true for Ak for all integers 2, 3, and so on?

55. (Continuation.) Repeat the argument in the preceding exercise when the column sums are all equal but not necessarily 1.

56. Establish the statements that are true and give counterexamples for the others. In each case we are dealing with n × n matrices.

a. (A + B)2 = A2 + 2AB + B2

b. If AB = 0, then either A = 0 or B = 0

c. (A − B)(A + B) = A2 − B2

d. A2B2 = (AB)2

e. C(A + B) = CB + CA

f. If A2 = 0, then A = 0

g. AB = BA

h. (A + B)C = BC + AC

57. Suppose that A and B are n × n symmetric matrices. Does it follow that (AB)T = BA? Is the converse true?

58. Establish that for any matrix A (not necessarily square), AAT is symmetric.

59. Explain why the hypothesis “A is square” leads to the conclusion “A + AT is symmetric.”

60. Refer to the elements in the matrices A = (aij) and B = (bij). Compute these quantities: , , , . Recall the Kronecker delta: δij is 1 when i = j; otherwise it is 0.

61. Suppose that a matrix A has the property ATA = AAT. Does it follow that A is square? Is the converse true? That is, if A is square, does it follow that ATA = AAT?

62. Establish that the product of two n × n lower triangular matrices is lower triangular.

63. For a vector x, we write x ≥ 0 if all components of x are nonnegative. Suppose that A is an m × n matrix such that Ax ≥ 0 whenever x ≥ 0. Does it follow that all entries in A are nonnegative?

64. Let A be an m × n matrix in which each entry is positive (i.e., greater than zero). Establish that, for any b in ℝ^m, the inequality Ax ≥ b has a solution. Also, show that if we drop the assumption on A that each entry be positive, the result is not true.

65. Let

 

What can be done with this diet problem? (We want Ax ≥ b.)

66. If x and y are vectors, does it follow from x > y that Ax > Ay? What hypothesis on A would be useful here? Establish a theorem of your own devising, and give examples.

67. Find the necessary and sufficient conditions on two symmetric 2 × 2 matrices so that their product is symmetric.

68. (Continuation.) Find the most general pair of symmetric 2 × 2 matrices whose product is not symmetric.

69. Let Ω be the operator that produces from A its reduced row echelon form Ω(A). Explain why Ω(AB) = Ω(A)B.

70. In ℝ^(n×n), suppose addition is defined by the equation (A ⊕ B)ij = Aij + Bji. Would ℝ^(n×n) then be a vector space? (A simple yes or no is not sufficient.)

71. Establish that the operation A ↦ AT is linear.

72. In Example 12, we found that if A and B have no zero entries, then it can happen that AB = 0. For 2 × 2 matrices, find general conditions for this to be true and give another simple example.

COMPUTER EXERCISES 3.1

1. Use mathematical software such as MATLAB, Maple, or Mathematica to generate random matrices of various sizes and types with elements that are general, upper triangular, lower triangular, symmetric, and skew–symmetric. Repeat making the entries random integers.

2. (Continuation.) Multiply two random integer matrices of the same type (upper triangular, lower triangular, symmetric, and skew–symmetric). In each case, is the resulting matrix of the same type?

3. (Continuation.) Illustrate the cautions in Summary 3.1 using random integer matrices.

4. (Continuation.) Use random matrices to illustrate that (AB)T = BTAT.

5. Consider a general 2 × 2 matrix

Using mathematical software such as Maple, determine when these expressions are correct:

a. Trace(AB) = Trace(BA)

b. Trace(AB) = Trace(A) · Trace(B)

c. Trace(A2) > 0

The trace of matrix A = (aij) is the sum of the diagonal elements: Trace(A) = a11 + a22 + ··· + ann.

3.2 MATRIX INVERSES

Life is good for only two things, discovering mathematics and teaching mathematics.

—SIMÉON POISSON (1781–1840)

God created the integers; all else is the work of man.

—LEOPOLD KRONECKER (1823–1891)

Would it not be advantageous if, to solve a system of equations, Ax = b, we simply could divide by A and get the solution x = A−1b? Something similar to this turns out to be possible in many cases! (In MATLAB, we get the solution with the single command x = A\b. How this works will be explained later.)

Solving Systems with a Left Inverse

Suppose it is known that a given system, Ax = b, has a solution. Instead of treating x as an unknown, we can let x denote one such concrete solution. Then Ax really is b. Suppose also that we are in possession of a matrix B such that BA = I. Then we can multiply both sides of the equation Ax = b by B and arrive at BAx = Bb, or Ix = Bb, or x = Bb. We have concluded that if x is a solution of the equation, then x must equal Bb.

Because this statement is often misinterpreted, we state it again as a theorem.

THEOREM 1

If the system of equations Ax = b is consistent, and if a matrix B exists such that BA = I, then the system of equations has a unique solution, namely x = Bb.

EXAMPLE 1

Consider the system Ax = b, in which

 

Fortune has smiled upon us and given us a matrix B such that BA = I. The verification is

 

How can we use this information to solve the given system?

SOLUTION If the system of equations has a solution, it must be Bb. So we compute x = Bb and then test it by substitution:

 

Notice that we do not talk about verifying the solution; we test it to see whether it is a solution. (There is a subtle difference.) Our work prior to the testing does not prove that the given x is a solution. Remember that we started by assuming that x was a solution! That assumption could have been false. A false assumption can get you in trouble!

EXAMPLE 2

Here we illustrate an incorrect use of Theorem 1. Consider this system of equations:

 

The coefficient matrix is C. Let D be the matrix shown.

Then DC = I, as we quickly verify:

 

How can we use this information to solve the given system?

SOLUTION The original system of equations is Cy = g, where g is the vector on the right-hand side. Multiply both sides of this equation by D, getting DCy = Dg.

(There is no question that this is legitimate.) The equation reduces to y = Dg, so the solution must be

 

Just to be sure, we verify the solution by substitution and get a surprise:

 

Where is the error?

The procedure used depends for its validity on the consistency of the system of equations. That is, we must know in advance that there exists at least one solution. In this example, such knowledge is not present, and in fact the given system is inconsistent. This is revealed by the row-reduction process of the augmented matrix:

 

Similar fallacies can arise outside of linear algebra. (For example, see General Exercise 27.)

Solving Systems with a Right Inverse

To pursue these ideas further, suppose that again we want to solve a system of linear equations, Ax = b. Assume now that we have another matrix, B, such that AB = I. Then we can write A(Bb) = (AB)b = Ib = b; whence Bb solves the equation Ax = b. This conclusion did not require an a priori assumption that a solution exist; we have produced a solution. The argument does not reveal whether Bb is the only solution. There may be others.

THEOREM 2

Suppose that two matrices A and B have the property that AB = I. Then the system of linear equations Ax = b has at least one solution, namely x = Bb.

EXAMPLE 3

Consider another system of linear equations:

 

How can we solve this system using information from Example 2?

SOLUTION The coefficient matrix here is the matrix D from Example 2.

Hence, the problem to be solved is Dx = f, where f is the right-hand side of the system. We know that DC = I, where C is as in Example 2. Hence, by Theorem 2, one solution is x = Cf = [−3, 6, −12]T. To check these results, we find that

 

Analysis

There are two phenomena here, illustrated by Examples 1 and 3. In the first case, if we know from some external information that the system Ax = b has a solution, then it must be Bb. That is not the same as saying flatly that Bb is a solution, because there may be no solution. There is that annoying additional hypothesis that there must exist a solution. In the second case, we are more fortunate, as the calculation shows that Cb is a solution, without further hypotheses. However, we do not know whether, in fact, it is the only solution.

DEFINITION

If AB = I, then B is a right inverse of A and A is a left inverse of B.

EXAMPLE 4

Are these matrices related with regard to left and right inverses?

 

SOLUTION In Example 2, we had DC = I. Therefore, D is a left inverse of C and C is a right inverse of D.

EXAMPLE 5

Find all the solutions of the system Ax = d, where A and d are as in Example 3.

SOLUTION We carry out a row reduction, getting

 

As usual, we write this in a form with the free variable on the right:

 

The problem has infinitely many solutions. The solution of the equation obtained in Example 3 arises by taking x3 = −4.

To summarize in slightly different language, suppose that we wish to solve the system of linear equations Ax = b, and have a right inverse of A, say AB = I. Then Bb is a solution, as is verified by noting A(Bb) = (AB)b = Ib = b. Conversely, if we have a left inverse of A, say CA = I, then we can only conclude that Cb is the sole candidate for a solution; however, it must be checked by substitution to determine whether, in fact, it is a solution.

THEOREM 3

If the matrix A has a right inverse, then for each b the equation Ax = b has at least one solution. If A has a left inverse, then that equation has at most one solution.
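A NumPy sketch of the first assertion; the construction B = AT(AAT)−1 is one standard right inverse, valid when A has full row rank (our choice of construction, not one the text has introduced):

```python
import numpy as np

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])          # 2 x 3, full row rank

# One right inverse of A (assuming A A^T is invertible):
B = A.T @ np.linalg.inv(A @ A.T)
print(np.allclose(A @ B, np.eye(2)))     # AB = I2

b = np.array([3.0, 5.0])
x = B @ b                                # Theorem 2: Bb solves Ax = b
print(np.allclose(A @ x, b))             # True
```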

Square Matrices

The situation for square matrices is much simpler, and we turn to it next with a theorem of great importance. For example, in Section 7.1, we shall use the following theorem to deduce that if the rows of a square matrix form an orthonormal set, then the same is true of its columns.

THEOREM 4

If A and B are square matrices such that BA = I, then AB = I.

PROOF Step 1. Let A and B be n × n matrices, and assume that BA = I. The equation Ax = 0 has only the trivial solution because if Ax = 0, then BAx = 0 and x = 0. (Use the equation BA = I.)

Step 2. The reduced echelon form of A must have n pivots, for if there were fewer there would exist free variables and there would exist nontrivial solutions to the homogeneous equation Ax = 0, contrary to the conclusion in Step 1.

Step 3. Because the matrix is square, each row has a pivot, and the reduced echelon form of A is In.

Step 4. We can solve any equation Ax = b by the row-reduction process, because A reduces to I in this process. We can also solve the equation AX = I, where I is the identity matrix and X is an n × n matrix. (This calculation can be done using one column at a time from the matrix I. It is more efficient to use the technique explained in Example 8 in Section 1.3.)

Step 5. AB = (AB)I = (AB)(AX) = A(BA)X = AIX = AX = I.

THEOREM 5

If A, B, and C are square matrices such that AB = AC = I, then B = C.

PROOF By Theorem 4, BA = I. Multiply each side of the equation in the hypothesis on the left by B, getting BAB = BAC. This reduces to B = C.

Invertible Matrices

Theorem 5 asserts that if a matrix has an inverse, then it has only one. We need this fact to justify the following definition.

DEFINITION

A square matrix A is invertible if there is a matrix B such that BA = I. In this case, B is the inverse of A and we write B = A−1. The equation AB = I is also true.

An invertible matrix is also said to be nonsingular. If a matrix has no inverse, it is said to be singular or noninvertible.

EXAMPLE 6

Here are two matrices

 

Is one of these the inverse of the other? If so, which is which?

SOLUTION Do not make the mistake of thinking that it is necessary to compute the inverse of one of these two matrices. It is much easier to compute one of the products AB or BA. It turns out that AB = I. Several conclusions can be drawn, namely, BA = I, A−1 = B, and B−1 = A. Thus, each matrix is the inverse of the other. Theorem 4 applies here. If you compute the products to see whether they equal the identity matrix, the work will be quite different for AB and BA. But Theorem 4 asserts that both of the products will be I3.

THEOREM 6

If A is invertible, then for any prescribed b the equation Ax = b has one and only one solution, namely x = A−1b.

THEOREM 7

If A, B, and C are square matrices such that AC = BA = I, then B = C.

PROOF B = BI = B(AC) = (BA) C = IC = C

EXAMPLE 7

The inverse of

SOLUTION It is verified by multiplying the matrices, in either order, to get I2. Recall Theorem 4, which asserts that in verifying that a square matrix B is the inverse of a square matrix A, we need check only one of the two equations AB = I or BA = I.

Do not be misled by this example into thinking that the inverse of an integer matrix is always an integer matrix, because it rarely is! For a 2 × 2 matrix

A = [ a  b ]
    [ c  d ]

the inverse is

A−1 = 1/(ad − bc) [  d  −b ]
                  [ −c   a ]

where ad − bc ≠ 0.
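A small Python function implementing this 2 × 2 formula (the function name is ours):

```python
import numpy as np

def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix [[a, b], [c, d]] via the adjugate formula."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular: ad - bc = 0")
    return (1.0 / det) * np.array([[d, -b], [-c, a]])

A = np.array([[3.0, 1.0], [5.0, 2.0]])   # det = 1
print(inverse_2x2(A))                    # [[2, -1], [-5, 3]]
print(np.allclose(A @ inverse_2x2(A), np.eye(2)))
```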

EXAMPLE 8

Solve this equation using an inverse matrix:

 

SOLUTION This equation has the form Ax = b, where A is the matrix in Example 7. To solve it, multiply both sides of the equation by A−1, getting x = A−1b. Thus, we find the solution

 

We can check our result by substitution into the original equation.

THEOREM 8

For invertible n × n matrices A and B, we have (AB)−1 = B−1 A−1. Thus, the product of two invertible matrices is invertible. The result extends to the product of any number of invertible matrices.

PROOF We wish to know that B−1A−1 is the inverse of AB. It only requires direct verification, and this gives us a chance to use our new knowledge that matrix multiplication is associative (Theorem 6 in Section 3.1):

(B−1A−1)(AB) = B−1(A−1A)B = B−1IB = B−1B = I

Elementary Matrices and LU Factorization

Some matrices have inverses that are easy to find. For example, the inverse of the diagonal matrix D = Diag(di) is clearly D−1 = Diag(1/di). Here Diag(αi) is notation for a square diagonal matrix with diagonal elements α1, α2, …, αn. Also, the inverse matrix of an elementary matrix that corresponds to a replacement operation such as

 

is found by simply changing the sign of the multiplier c in the matrix:

 

Inverses of elementary matrices arise naturally when we recall that elementary row operations on a matrix can be interpreted as pre-multiplication of A by elementary matrices. (Each elementary matrix carries out one row operation, as explained in Section 3.1, Theorem 8.) Thus, the row-reduction process ending at U can be expressed like this:

Ek ··· E2E1A = U

This equation shows immediately that

A = E1−1E2−1 ··· Ek−1U = LU

where

L = E1−1E2−1 ··· Ek−1 = (Ek ··· E2E1)−1

By Theorem 8, any product of elementary matrices is invertible.

EXAMPLE 9

Suppose we have a reduction process that uses only replacement operations such as this one:

 

Establish the LU factorization of A.

SOLUTION By writing down the elementary matrices

 

we obtain

 

where

 

Notice that the multiplication of these elementary matrices is particularly easy! We multiply the elementary matrices together to form E = E3E2E1 rather than finding the inverse of E, which would be more difficult. Finally, we obtain

 

The row-reduction process can include swapping and scaling of rows to avoid small pivots, which brings in pivoting and a permutation matrix. We explore matrix factorizations further in Section 8.2.
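In practice such factorizations are computed by library routines; a SciPy sketch (using scipy.linalg.lu, which returns a permuted factorization A = PLU):

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])

P, L, U = lu(A)        # A = P L U, with row swaps recorded in P
print(np.allclose(P @ L @ U, A))   # True
print(L)               # unit lower triangular
print(U)               # upper triangular
```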

Computing an Inverse

We turn now to an algorithm for finding the inverse of a square matrix, if it has one. From Theorem 4, we know that if AX = I then the equation XA = I follows, it being assumed that all three matrices involved are square and of the same size. Thus, if the equation AX = I has a solution X, then X is the inverse of A.

In Section 1.3, systems of equations of this sort were discussed. One simply forms the augmented matrix [A | I] and carries out the row-reduction process. If all goes well, the result will be [I | X]. In other words, the result will display the inverse X = A−1 in the last n columns.

EXAMPLE 10

Compute an inverse of

 

SOLUTION We consider the augmented matrix

 

We remind the reader that this calculation is not dispositive, by itself, in concluding that we have the inverse. As established by the calculation, we know that AX = I and X is a right inverse. An appeal to Theorem 4 is needed so that we can also infer XA = I, making X a left inverse as well.
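A sketch of the algorithm as code (our own implementation; it adds the partial pivoting mentioned earlier in connection with row swaps):

```python
import numpy as np

def inverse_by_row_reduction(A):
    """Row-reduce [A | I] to [I | X]; then X = A^(-1). Raises if A is singular."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for col in range(n):
        pivot = np.argmax(np.abs(M[col:, col])) + col   # partial pivoting
        if np.isclose(M[pivot, col], 0.0):
            raise ValueError("matrix is singular")
        M[[col, pivot]] = M[[pivot, col]]               # swap rows
        M[col] /= M[col, col]                           # scale pivot row to 1
        for r in range(n):
            if r != col:
                M[r] -= M[r, col] * M[col]              # clear the rest of the column
    return M[:, n:]

A = np.array([[2.0, 1.0], [7.0, 4.0]])
X = inverse_by_row_reduction(A)
print(np.allclose(A @ X, np.eye(2)))   # AX = I, so X is the inverse
```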

More on Left and Right Inverses of Non-square Matrices

The method suggested for computing an inverse can be modified for finding right inverses or left inverses, as shown in the next examples.

EXAMPLE 11

Find all the right inverses of the matrix

 

SOLUTION To do this, we solve AX = I2, where A is 2 × 3, X is 3 × 2, and I2 is 2 × 2. The augmented matrix for this problem and its reduced row echelon form are

 

Because of column 3, we have one free variable for each of the two subproblems, and thus there are many solutions. Setting the free variables equal to zero, we get—as one possible right inverse—the 3 × 2 matrix

 

Because there are two columns and ultimately two free variables, there is a two-parameter family of solutions. Concentrating on the first column of X, we find that

 

We can treat the second column of X in the same way and obtain the general form of the right inverse:

 

We have left to General Exercise 50 a verification of this result.

EXAMPLE 12

For the matrix in Example 11, find a left inverse, if it has one.

SOLUTION To execute the search, we begin by setting up the equation to be solved: YA = I3, where A is 2 × 3 and Y must be 3 × 2, since I3 is 3 × 3. To make this into a familiar sort of problem, take transposes, getting ATYT = I3. Here is the result of the row-reduction process:

 

Clearly, this system is inconsistent and A has no left inverse!

Invertible Matrix Theorem

The set of all n × n matrices is divided into two classes: the invertible matrices and the noninvertible matrices. The latter are also called singular, and we have noted previously that the word nonsingular is synonymous with ‘‘invertible.’’ The favorable and useful properties of an invertible matrix are many in number. Let us contemplate a few of them here, with proofs. Later in this section, a more comprehensive list of equivalent conditions is stated (Theorem 10).

THEOREM 9

The following properties are equivalent for an n × n matrix A. That is, each property implies all the others.

1. A is invertible.

2. For each b in ℝ^n, the equation Ax = b has a unique solution.

3. The kernel of A consists of only the zero vector 0.

4. Every column of A has a pivot position.

5. Every row of A has a pivot position.

PROOF The efficient way to prove such a theorem is to establish the circle of implications 1 ⇒ 2 ⇒ 3 ⇒ 4 ⇒ 5 ⇒ 1.

For 1 ⇒ 2, note that invertibility implies that an inverse A−1 exists. Because A(A−1b) = b, the system Ax = b has a solution. To see that it is unique, suppose that Ax = b. Multiply on the left by A−1 to get x = A−1b.

For 2 ⇒ 3, observe that 0 is the only solution of the equation Ax = 0. In other words, 0 is the only member of the kernel of A.

For 3 ⇒ 4, recall that if any column of A has no pivot position, then the homogeneous system of equations will have free variables and many solutions. By 3, this does not happen, and every column of A must have a pivot position.

For 4 ⇒ 5, observe that 4 forces there to be n pivot positions in A. Hence, each row must have a pivot.

For 5 ⇒ 1, recall that if each row of A has a pivot, then the row-reduction process applied to the augmented system [A | I] ends with [I | X], thereby establishing the invertibility of A. This argument gives us AX = I, and that suffices because of Theorem 4.

The following important theorem has many parts, each part asserting some property that an invertible matrix must possess. Each property in the theorem is equivalent logically to all the others. You can think of this as a list of important properties of any invertible matrix. But it is more than that, because a matrix that has any one of the listed properties must be invertible, without any further evidence. Caution: The theorem applies only to square matrices.

This theorem and other versions of it are collectively known as the Invertible Matrix Theorem. Some of the notation and terminology used in this theorem have not yet been dealt with in this text, but it is convenient to have all these properties listed together. For example, column spaces and row spaces arise in Section 5.1 and eigenvalues are covered in Section 6.1. To avoid almost doubling the number of equivalent statements in the Invertible Matrix Theorem, most statements involving AT have been omitted. To help make the statement of this theorem more digestible for the reader, we intersperse it with some labels.

THEOREM 10 INVERTIBLE MATRIX THEOREM

The following conditions on an n × n matrix A are logically equivalent. In other words, if the matrix has any one of the given properties, then it has all of them.

Coefficient Matrix

1. A is invertible.

2. AT is invertible.

Linear System

3. The system Ax = b has at least one solution for every b in ℝn.

4. For every b in ℝn, the equation Ax = b has a unique solution.

5. The homogeneous system Ax = 0 has only the trivial solution x = 0.

Pivots

6. A has n pivot positions.

7. Every column of A has a pivot position.

8. Every row of A has a pivot position.

Rank and Row Equivalence

9. A is row equivalent to In.

10. Rank(A) = n.

Linear Transformations

11. The transformation x ↦ Ax is one-to-one (injective).

12. The transformation x ↦ Ax maps ℝn onto ℝn (surjective).

Left Inverse and Right Inverse

13. There is an n × n matrix B such that BA = In.

14. There is an n × n matrix C such that AC = In.

Column Vectors

15. The columns of A form a linearly independent set.

16. The set of columns in A spans ℝn.

17. The columns of A form a basis for ℝn.

18. The column space of A is ℝn: Col(A) = ℝn.

19. Dim(Col(A)) = n.

Row Vectors

20. The rows of A form a linearly independent set.

21. The set of rows in A spans ℝn.

22. The row space of A is ℝn: Row(A) = ℝn.

23. The rows of A form a basis for ℝn.

24. Dim(Row(A)) = n.

Eigenvalues, Singular Values, and Determinant

25. 0 is not an eigenvalue of A.

26. 0 is not a singular value of A.

27. A has n nonzero singular values.

28. Det(A) ≠ 0.

Kernel or Null Space

29. The kernel of A contains only the zero vector 0.

30. Null(A) = {0}.
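Several of these conditions are easy to sample numerically. A minimal MATLAB sketch, assuming a randomly generated 3 × 3 matrix (such a matrix is invertible with probability 1):

A = -5 + 10*rand(3);   % random 3-by-3 matrix with entries in (-5, 5)
rank(A)                % condition 10: equals n = 3 exactly when A is invertible
det(A)                 % condition 28: nonzero exactly when A is invertible
svd(A)                 % conditions 26-27: all n singular values nonzero
eig(A)                 % condition 25: 0 does not appear as an eigenvalue

In floating-point arithmetic these tests are only approximate; rank, which MATLAB bases on the singular value decomposition, is the most trustworthy of the four.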

Application: Interpolation

One application of linear algebra that arises in many scientific enterprises involves the reconstruction of a function from knowledge of some of its values. For example, we might have empirical data that we wish to reproduce by means of a formula. Or, an experiment might lead to a table of data consisting of points in the plane (ti, yi) for i = 0, 1, 2, …, n. The variable ti is thought of as the independent variable and yi is the dependent variable. These data are discrete, meaning that we have no curve but only individual points. If a continuous function f is sought such that yi = f(ti) for all of the points in the given table, then this problem is called interpolation. Frequently we use polynomials for this task of interpolation. (Spline functions are usually better, but their theory is beyond the scope of this book.)

EXAMPLE 13

Find a polynomial of least degree that reproduces these data: (1, 3), (2, 7), and (3, 9).

SOLUTION The polynomial will have these values: p(1) = 3, p(2) = 7, p(3) = 9. Thus, three conditions are placed on p. Degree 2 is chosen, because a polynomial of degree 2 has three available coefficients. Let p(t) = a + bt + ct2. Then

\[
\begin{cases}
p(1) = a + b + c = 3\\
p(2) = a + 2b + 4c = 7\\
p(3) = a + 3b + 9c = 9
\end{cases}
\]

The augmented matrix and its reduced echelon form are

\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & 3\\
1 & 2 & 4 & 7\\
1 & 3 & 9 & 9
\end{array}\right]
\sim
\left[\begin{array}{ccc|c}
1 & 0 & 0 & -3\\
0 & 1 & 0 & 7\\
0 & 0 & 1 & -1
\end{array}\right]
\]

The answer to the problem is then the polynomial p defined by p(t) = −3 + 7tt2. One verifies this result by evaluating the polynomial at the three given points and getting the values 3, 7, and 9.
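This small system is easy to check in software. A minimal MATLAB sketch using the data of Example 13:

V = [1 1 1; 1 2 4; 1 3 9];   % rows are [1 t t^2] at t = 1, 2, 3
y = [3; 7; 9];
c = V \ y                    % yields [-3; 7; -1], i.e., p(t) = -3 + 7t - t^2
polyval(flipud(c)', 1:3)     % evaluates p at t = 1, 2, 3; returns 3, 7, 9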

The polynomial interpolation problem, if carefully stated, will always have a unique solution, as established in the next theorem.

THEOREM 11

If t0, t1, …, tn are distinct real numbers and if y0, y1, …, yn are arbitrary real numbers, then there is a unique polynomial p of degree at most n such that p(ti) = yi for 0 ≤ i ≤ n.

PROOF We have seen in the preceding example that such a polynomial is governed by a system of equations. There will be n + 1 equations and n + 1 unknown coefficients in the general case considered here. The matrix that arises will be (n + 1) × (n + 1). Consider the homogeneous problem, in which all the values yi are zero. The polynomial p must satisfy the equation p(ti) = 0 for 0 ≤ i ≤ n. However, a nontrivial polynomial of degree n or less can have at most n zeros in the complex plane (see Fundamental Theorem of Algebra in Appendix B). Hence, we conclude that p = 0 and that the homogeneous problem has only the zero solution. The matrix mentioned previously is therefore invertible, and the system of equations in the nonhomogeneous problem will have a unique solution. (We are using one part of the Invertible Matrix Theorem: 5 ⇒ 4.)
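In matrix form, writing p(t) = a0 + a1t + ⋯ + antn, the system in the proof reads as follows; the coefficient matrix is commonly called the Vandermonde matrix of the nodes:

\[
\begin{bmatrix}
1 & t_0 & t_0^2 & \cdots & t_0^n\\
1 & t_1 & t_1^2 & \cdots & t_1^n\\
\vdots & \vdots & \vdots & & \vdots\\
1 & t_n & t_n^2 & \cdots & t_n^n
\end{bmatrix}
\begin{bmatrix} a_0\\ a_1\\ \vdots\\ a_n \end{bmatrix}
=
\begin{bmatrix} y_0\\ y_1\\ \vdots\\ y_n \end{bmatrix}.
\]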

EXAMPLE 14

Find a linear combination of these three functions

 

that takes values 4, 3, −4 at the points 0, 1, 2, respectively.

SOLUTION This is obviously impossible because each basic function takes the value zero at the point zero. This impossibility does not contradict the preceding theorem because that theorem does not apply in this situation. (It requires all the powers of the variable t, from 0 to n, inclusive.)

Interpolation by other functions is also possible, as illustrated in the next example.

EXAMPLE 15

Find a linear combination of these three functions

\[
h_1(t) = \frac{1}{t-1}, \qquad h_2(t) = \frac{1}{t-2}, \qquad h_3(t) = \frac{1}{t-3}
\]

that takes values 1, 7/12, 13/30 at the points 4, 5, 6.

SOLUTION Let the unknown function be f = a1h1 + a2h2 + a3h3. When we impose the interpolation conditions, we obtain

\[
\begin{cases}
\tfrac{1}{3}a_1 + \tfrac{1}{2}a_2 + a_3 = 1\\[2pt]
\tfrac{1}{4}a_1 + \tfrac{1}{3}a_2 + \tfrac{1}{2}a_3 = \tfrac{7}{12}\\[2pt]
\tfrac{1}{5}a_1 + \tfrac{1}{4}a_2 + \tfrac{1}{3}a_3 = \tfrac{13}{30}
\end{cases}
\]

We multiply each equation by a suitable factor to eliminate the fractions, thus arriving at the augmented matrix and its row reduced echelon form:

\[
\left[\begin{array}{ccc|c}
2 & 3 & 6 & 6\\
3 & 4 & 6 & 7\\
12 & 15 & 20 & 26
\end{array}\right]
\sim
\left[\begin{array}{ccc|c}
1 & 0 & 0 & 3\\
0 & 1 & 0 & -2\\
0 & 0 & 1 & 1
\end{array}\right]
\]

This work reveals that a1 = 3, a2 = −2, and a3 = 1.

The preceding example has some interesting theoretical background. The relevant theorem is as follows.

THEOREM 12

If t1, t2, …, tn are distinct points in an interval [a, b], and if distinct points s1, s2, …, sn are outside that interval, then the matrix having elements aij = (ti − sj)−1 is invertible.

(For a proof of this, see Cheney [1982, p. 195].)
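Example 15 fits this theorem: its nodes 4, 5, 6 lie in [4, 6], while the points 1, 2, 3 lie outside that interval. A minimal MATLAB check using the data of that example:

C = [1/3 1/2 1; 1/4 1/3 1/2; 1/5 1/4 1/3];   % C(i,j) = 1/(t_i - s_j)
det(C)                        % nonzero, as Theorem 12 guarantees
a = C \ [1; 7/12; 13/30]      % recovers a1 = 3, a2 = -2, a3 = 1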

Before leaving the subject of interpolation, we should point out that it is a topic in all elementary textbooks on numerical analysis. Interpolation by polynomials is emphasized, and efficient algorithms are developed for finding the polynomial that takes on prescribed values. The subject has a rich history, going back to Newton and some of his predecessors.

Mathematical Software

Mathematical software systems contain procedures useful in linear algebra, such as computing the inverse of a given matrix. These procedures are easy to use in the three well-known packages shown here. We use a 3 × 3 matrix randomly generated by a MATLAB command.

 

MATLAB commands are given for these four tasks: (1) displaying numbers to full precision, (2) constructing a random matrix A, (3) computing the inverse of A, and (4) checking the result. This program and similar ones in Maple and Mathematica are shown below. The programs may not automatically display numbers with the full precision in which they are computed and stored.


MATLAB

format long
A = -5.0+10*rand(3)
B = inv(A)
B*A

Maple

with(LinearAlgebra):
A := RandomMatrix(3,3,
generator=-5.0..5.0);
B := MatrixInverse(A);
B.A;

Mathematica

MatrixForm[A = Table[Random[Real,{-5,5},15],3,3]]
MatrixForm[B = Inverse[A]]
B.A
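In each system, the computed product of B and A should differ from the identity only at roundoff level. A minimal MATLAB check, continuing the listing above:

norm(B*A - eye(3))   % on the order of machine epsilon when B = inv(A)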

SUMMARY 3.2

• If AB = I, then B is a right inverse of A, and A is a left inverse of B.

• If A is square and either BA = I or AB = I, then B is the inverse of A and A is the inverse of B.

• Computing an inverse: use row reduction to solve [A | I] ~ [I | A−1].

• If the equation Ax = b has a solution and if BA = I, then Bb is the unique solution of the system.

• If AB = I, then the system of equations Ax = b has at least one solution, namely x = Bb.

• If A has a right inverse, then the equation Ax = b has at least one solution.

• If A has a left inverse, then the equation Ax = b has at most one solution.

• For square matrices only: If BA = I, then AB = I.

• For square matrices only: If AB = AC = I, then B = C.

• If A is an n × n invertible matrix, then the equation Ax = b has a unique solution: x = A−1b.

• For square matrices only: If AC = BA = I, then B = C.

• (AB)−1 = B−1A−1 (Here A and B are n × n invertible matrices.)

• Let A be an arbitrary 2 × 2 matrix, \(A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\), and let Δ = ad − bc. The invertibility of A is equivalent to the condition Δ ≠ 0.

• For distinct points t0, t1, …, tn and arbitrary values y0, y1, …, yn, there is a unique polynomial p of degree at most n such that p(ti) = yi for 0 ≤ i ≤ n.

• Invertible Matrix Theorem.

KEY CONCEPTS 3.2

Inverse, right inverse, left inverse, invertible, noninvertible, singular, nonsingular, an inverse as the product of elementary matrices, Invertible Matrix Theorem, interpolation

GENERAL EXERCISES 3.2

1. Compute the inverses of

a.

b.

c.

d.

2. Find another right inverse for the matrix D in Example 2, and verify its correctness.

3. Find a left inverse for each of the following matrices, if they have one:

a.

b.

4. Establish that if the quantity Δ = ad − bc is not zero, then the inverse of \(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\) is \(\Delta^{-1}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}\).

5. (Continuation.) Establish the converse of the assertion in General Exercise 4. That is, show that the invertibility of the given matrix implies Δ ≠ 0.

6. Without doing any computing, explain why the following equation is not true.

 

7. Explain what problems are solved by row-reducing these three augmented matrices.

a.

b.

c.

8. Find all the right inverses of the matrix

9. Find the inverses of these matrices. (Two of them consist of integers.)

 

 

10. Critique the solution offered for the system

of equations

First write

Next, multiply both sides of the equation on the left by the matrix ,

getting the solution .

Verify your solution by substitution.

11. Find the inverse of . Verify your answer in an independent manner.

12. If A and B are n × n matrices such that BA = I, does it follow that AB = BA? If so, explain.

13. Is there an invertible 2 × 2 matrix such that ?

14. Derive the formula for the inverse of a 2 × 2 matrix by using the equation \(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\) and solving for the unknowns e, f, g, h.

15. Find all the right inverses of the matrix

16. Let ,

Verify that BA = I2. Next, consider the equation Ax = c, where c = [−4, 3, 2]T. Solve this equation by multiplying both sides by B, on the left, getting BAx = Bc or x = Bc = [1, 9]T. Verify that the solution is correct.

17. Consider . The inverse of this matrix consists of integers. Find it.

18. Find the polynomial of least degree that interpolates these data: (0, −5), (1, −3), (2, 1), and (−1, −11).

19. Find the polynomial of least degree whose graph passes through these points: (−2, 29), (−1, 12), (0, 3), and (1, 4).

20. Find all left inverses of .

(These make a two-parameter family of left inverses.)

21. Find all left inverses of

22. Establish the validity of Theorem 1. Explain all details.

23. Explain why every invertible matrix is row equivalent to I.

24. Establish that if A is a nonsquare matrix, then it cannot have both a right inverse and a left inverse.

25. Find the general right inverse of the matrix in Example 2 in the form C + tE + sF, where t and s are arbitrary real numbers, while C, E, and F are 3 × 2 matrices.

26. Establish that if a matrix has a right inverse but has no left inverse, then the right inverse is not unique.

27. Explain why this solution of the equation is faulty: square both sides: ; collect similar terms: ; square again: 25 − 10x + x2 = 4(2x + 2); collect similar terms: x2 − 18x + 17 = 0; factor: (x − 1)(x − 17) = 0; get a root from each factor showing that the solutions are x = 1 and x = 17.

28. Let A and B be n × n matrices. Assume that B ≠ 0 and AB = 0. Does it follow that A is not invertible?

29. Try to generalize Theorem 4 as follows: If A and B are square matrices such that AB is a diagonal matrix with nonzero values on its diagonal, then AB = BA.

30. Establish that if A is invertible, then (A−1)−1 = A.

31. Explain why the invertibility of A implies the invertibility of AT .

32. For an invertible matrix A, the negative integer powers of A are defined by A−j = (A−1)j, where j is a positive integer. Explain why, for such a matrix, (A−1)j = A−j for all integers j (positive and negative).

33. If A4 = I, what is A−1? Generalize.

34. Give an argument to show that if a symmetric matrix is invertible, then its inverse is symmetric.

35. For an invertible matrix A, establish that (AT )−1 = (A−1)T .

36. Let A and B be n × n invertible matrices. Let the reduced echelon form of the augmented matrix [A | B] be denoted by [C | E]. What are C and E explicitly?

37. Establish that if a matrix has more rows than columns, then it has no right inverse.

38. (Continuation.) State and establish the form of a left inverse.

39. Explain why the nonzero scalar multiples of an invertible matrix are invertible.

40. Establish that if A is noninvertible, then there is a nonzero row vector, x, such that xAT = 0.

41. Explain why the inverse of an invertible lower triangular matrix is lower triangular.

42. Find some square matrices, A, having the property A−1 = AT. Later, we shall see such matrices arising naturally. Perhaps you can find all the 2 × 2 matrices having this property.

43. Is there a valid computational rule that states (AT BT)−1 = (A−1B−1)T ? Explain.

44. Suppose that A is an m × n matrix that has a left inverse and a right inverse. These inverses are not assumed to be identical. Establish that, in fact, they are identical and m = n.

45. (Research project.) Considering only n × n matrices A and B, find out when AB = BA.

46. Define a linear transformation T by the equation T(x) = Ax, where A is an n × n matrix. Explain why T is injective if and only if it is surjective.

47. Verify the result in Example 6 by computing AB and BA.

48. Give an example other than the one in the text in which BA = I and ABb ≠ b for some vector b.

49. Establish that if an m × n matrix has a right inverse, then m ≤ n.

50. Verify that X, as given in Example 11, is indeed a right inverse of A.

51. Let A, B, and C be matrices such that AB = Im and CA = In. Establish that n = m and that C = B.

52. Let A, B, and C be square matrices such that BA = CA = I. Does it follow that B = C?

53. Do the invertible n × n matrices form a subspace of the vector space of all n × n matrices? What about the noninvertible matrices? Look ahead to Section 5.1 for the definition of subspace.

54. Is there a matrix A that has one and only one right inverse, but has no left inverse?

55. A beginner writes and concludes that

Is this ever true? Find all the cases when it is true, allowing the dimension of the square matrix to vary.

56. Let A, B, and C be n × n matrices such that BA = CA, and BA is invertible. Does it follow that B = C?

57. Establish that from the equation AB = CA, we cannot conclude, in general, that B = C, even if A is invertible.

58. Use the four numbers 2, 3, 5, 7 to create a 2 × 2 matrix whose inverse has integers for all its entries. Do the same for 4, 5, 6, 11 and 5, 6, 9, 11 and 2, 95, 117, 5557.

59. Let A, B, and I be n × n matrices, where I is the identity matrix. What conclusion can be drawn from the hypothesis [I | A] ~ [B | I]?

60. Critique the following procedure for finding the inverse of a matrix A: Select a nonzero vector x and set b = Ax. Define γ = xT b and C = γ−1xxT. It follows that 1 = γ−1 γ = γ−1xT b. Furthermore, we have Cb = γ−1xxT b = xγ−1xT b = x(γ−1xT b) = x. From the two equations Cb = x and x = A−1b, we conclude that Cb = A−1b and C = A−1. (This example is from the book by Barbeau [2000].)

61. If A and B are n × n invertible matrices, can the equation A + B = AB be true?

62. Let A, B, and C be n × n matrices such that B ≠ C and AB = AC. Find the inverse of A, if it exists.

63. Let A be a square matrix whose rows form a linearly independent set. Do the columns of A necessarily form a linearly independent set?

64. For square matrices, prove that if AB = I, then AB = BA. Is this assertion true for matrices other than I?

65. Suppose that we have n × n matrices such that AE1E2 ··· Em = I. What is A−1? (It is not being assumed that the matrices Ei are elementary matrices.)

66. Justify these implications by use of the Invertible Matrix Theorem. (All matrices here are n × n.)

a. If A is invertible and Ax = Ay, then x = y.

b. If BA = I, then the columns of B span ℝn.

c. If Ax ≠ 0 whenever x ≠ 0, then the equation Ay = b has a solution for any b in ℝn.

d. If A is row equivalent to In, then A is invertible.

e. If A has n pivot positions, then the rows of A span ℝn.

f. If the columns of A form a linearly independent set, then the rows also form a linearly independent set.

g. If the equation Ax = b has a solution for every b in ℝn, then 0 is the only solution of the equation Ax = 0.

h. If the mapping x ↦ Ax is one-to-one, then A is invertible.

67. Find a linear combination of the three functions t ↦ 1, t ↦ cos t, t ↦ sin t that interpolates these data: (0, 3), (π/2, −2), and (3π/2, 4).

68. Present an argument to explain why a matrix that has both a right inverse and a left inverse is invertible.

69. Verify that with the three functions f0(t) = 1, f1(t) = t3, and f2(t) = t6, we can interpolate arbitrary data at any three distinct points.

70. Explain why with the three functions g0(t) = 1, g1(t) = t, and g2(t) = t3, we cannot necessarily interpolate arbitrary data at three distinct points. Are there conditions that can be placed on the nodes (the points ti) to ensure that the interpolation problem can be solved for arbitrary values of y0, y1, and y2?

71. If the data in an interpolation problem come from a polynomial p, will the resulting interpolating polynomial be p? Discover the correct theorem in the answer to this question.

72. The values of t that are involved in an interpolation problem are often called nodes. Suppose that our nodes are ordered like this: 0 < t0 < t1 < t2 < ⋯ < tn. Can we interpolate arbitrary data at the nodes with the functions t ↦ tj/2, where 0 ≤ j ≤ n?

73. Explain why, if interpolation nodes are specified in the order 0 < t0 < t1 < ⋯ < tn, arbitrary data at these points can be interpolated by a function of the form .

74. Find the coefficients a1, a2, a3 so that the function f(t) = a1(1 + t)−1 + a2(2 + t)−1 + a3(3 + t)−1 will interpolate arbitrary values y1, y2, y3 at the points 1, 2, 3. In this situation, the inverse matrix would be useful so that we could easily compute the coefficients aj from the data yi.

75. Let f(t) = a1(t + 1)−1 + a2(t − 2)−1 + a3(t + 2)−1. Find the coefficients aj so that f(0) = 15/2, f(1) = 43/6, and . Draw a rough graph of the resulting function f. Is this function a satisfactory interpolating function for the given data?

76. In Theorem 11, it is asserted that the polynomial of degree at most n that takes assigned values at n + 1 distinct points on the real line is unique. How then can you account for the fact that p(x) = 4x3 − 22x2 + 35x − 12 and q(x) = 3 + 5(x − 3) − 2x(x − 3) + 4x(x − 3)(x − 2) take the same values at the points 0, 1, 2, 3? Do these two polynomials agree in value at any other points?

77. Find a function of two variables, f(x, y) = ax + by + c, that takes these values: f(2, −1) = 0, f(3, 2) = 11, and f(−1, 4) = 9. This is an example of bivariate interpolation, there being two independent variables.

78. What is the maximum number of zeros that can be present in an invertible n × n matrix?

79. Establish that if A and B are n × n matrices such that AB is invertible, then A and B are invertible.

80. Establish this string of implications, assuming (a) to start. (a) A is invertible. (b) The system of equations Ax = b has a solution for every b in ℝn. (c) The set of columns in A spans ℝn. (d) The transformation x ↦ Ax maps ℝn onto ℝn. (e) The columns of A form a linearly independent set.

81. Establish that if A and B are n × n matrices, and if one of A and B is invertible, then AB is row equivalent to BA.

82. Explain why if all the elements of an invertible matrix are positive, then the same cannot be true of its inverse. Generalize. What can you say about an invertible matrix whose elements are nonnegative?

83. Establish that if Ak is invertible for some value of k greater than zero, then A is invertible.

84. Suppose that a matrix A has the property that AAT has a left inverse. Explain why the rows of A must form a linearly independent set.

85. Argue that if A is an m × n matrix of rank m, and if B is an n × n invertible matrix, then ABBTAT is invertible. You should try to generalize to ABAT when B is invertible.

86. In Example 9, form E and then calculate E−1 using the row-reduction process to illustrate that this approach is more difficult than the method shown in the text for finding the inverse.

87. Solve Example 5 using a different free variable.

88. In Example 6, verify that AB = I, BA = I, and A−1 = B.

89. Carry out the details in solving Example 10 and verify that AX = I and XA = I.

90. Explain why the system in Example 12 is inconsistent after carrying out the calculations. In general, does every entry on the right-hand side of the last row have to be nonzero for the system to be inconsistent?

COMPUTER EXERCISES 3.2

1. Find out how quickly your software package can invert a random matrix. If you decide to use MATLAB, you will find these commands useful:

a. To generate a random matrix of size 250 × 250, use the command A=rand(250,250); Notice the symbol ‘‘;’’. It suppresses the display of the matrix A. You certainly do not want to see all rows and columns of a huge matrix. The MATLAB command rand produces random numbers uniformly distributed in the interval from 0 to 1.

b. To time the execution of the inversion process, use tic, inv(A); toc. This will carry out the inversion of the matrix and report the elapsed time for the process. The output of the inversion command is suppressed by the symbol ‘‘;’’ as described previously.

c. If you have access to different machines or different operating systems, time several of them. You may prefer to use larger matrices, such as 1000 × 1000. In our experimentation we found a time of 40 seconds on one machine and 12 seconds on another; in both cases the matrices were of order 1000.

2. Devise an algorithm for inverting a unit lower triangular matrix. Program and test the computer code for inverting some matrices. For example, invert

 

3. Using rational integer arithmetic, compute A−1 and x = A−1b:

a.

b.

c.

4. Using rational integer arithmetic, compute A−1:

a. Consider

Here i = √−1, and complex numbers are involved. (See Appendix B.)

b.

c.

d.

e.

f.

g.

5. Let

Explore how changing only one entry in this matrix by the addition of 0.01 affects the values in A−1. For example, replace either 92 by 92.01 or 78 by 78.01 or 10 by 10.01.
