In Example 6 of Section 3.1 we applied the method of elimination to solve the linear system

     x + 2y +  z =  4
    3x + 8y + 7z = 20
    2x + 7y + 9z = 23        (1)

There we employed elementary operations to transform this system into the equivalent system

     x + 2y +  z = 4
          y + 2z = 4
               z = 3         (2)

which we found easy to solve by back substitution. Here we have printed in color the coefficients and constants (including the 0s and 1s that would normally be omitted) because everything else—the symbols x, y, and z for the variables and the + and = signs—is excess baggage that means only extra writing, for we can keep track of these symbols mentally. In effect, in Example 6 we used an appropriate sequence of operations to transform the array

    1  2  1   4
    3  8  7  20
    2  7  9  23              (3)

of coefficients and constants in (1) into the array

    1  2  1  4
    0  1  2  4
    0  0  1  3               (4)

of coefficients and constants in (2).
Rectangular arrays of numbers like those in (3) and (4) are called matrices. Thus, a matrix is simply a rectangular array of numbers, which are called the entries or elements of the matrix. The size, or shape, of a matrix is specified by telling how many horizontal rows and vertical columns it has. Each matrix in (3) and (4) has three rows and four columns; therefore, each is a 3 × 4 (read “three by four”) matrix. We always specify first the number of rows and then the number of columns.
The matrices
have sizes and respectively.
A general system of m linear equations in the n variables x_1, x_2, ..., x_n may be written in the form

    a_11 x_1 + a_12 x_2 + ··· + a_1n x_n = b_1
    a_21 x_1 + a_22 x_2 + ··· + a_2n x_n = b_2
        ···
    a_m1 x_1 + a_m2 x_2 + ··· + a_mn x_n = b_m        (5)

Observe that a_ij denotes the (constant) coefficient in the ith equation of the jth variable x_j and that b_i denotes the constant on the right-hand side in the ith equation. Thus the first subscript i refers to the equation and the second subscript j to the variable. For instance, a_32 denotes the coefficient of x_2 in the 3rd equation. This scheme of systematic double subscripts enables us to specify readily the location of each coefficient in a system with many equations and variables. It also helps us get rid of the excess baggage in (5) by focusing our attention on the coefficients.
The coefficient matrix of the linear system in (5) is the m × n matrix

    A = [ a_11  a_12  ···  a_1n ]
        [ a_21  a_22  ···  a_2n ]
        [  ···              ··· ]
        [ a_m1  a_m2  ···  a_mn ]
We will use boldface capital letters to denote matrices and lowercase letters to denote numbers. When we want to refer more briefly to the matrix A and its entries, we can write

    A = [a_ij].

The first subscript i specifies the row and the second subscript j the column of A in which the element a_ij appears:
| First subscript | Second subscript |
|---|---|
| Row | Column |
Although double subscripts may seem tedious when first encountered, their usefulness should not be underestimated. For instance, they are consistent with the notation used in most programming languages. Many practical applications lead to linear systems with hundreds or even thousands of variables and equations. In a typical computer system, the coefficient matrix A of such a system would be represented by a two-dimensional array in which A(i, j) denotes a_ij. With this approach, the computer can store a large matrix in the same way that it stores a small one.
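As a concrete illustration (a Python sketch, not part of the text's computing examples), the 3 × 4 array of Example 3 can be stored as a nested list, with a small helper (hypothetical, named `entry` here) translating the 1-based double subscripts of the mathematics into the language's 0-based indexing:

```python
# The 3 x 4 array of Example 3, stored row by row as a nested list.
A = [
    [1, 2, 1,  4],
    [3, 8, 7, 20],
    [2, 7, 9, 23],
]

def entry(A, i, j):
    """Return the entry a_ij, using the 1-based double-subscript
    convention of the text (Python lists index from 0)."""
    return A[i - 1][j - 1]

print(entry(A, 2, 3))  # the coefficient of the 3rd variable in equation 2: 7
```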
The matrices in (3) and (4) include not only the coefficients but also the constants on the right-hand sides in the corresponding linear systems in (1) and (2). Let us write

    b = [ b_1 ]
        [ b_2 ]
        [ ··· ]
        [ b_m ]

for the column of constants in the general system in (5). An m × 1 matrix—that is, one with a single column—is often called a (column) vector and is denoted by a boldface letter. When we adjoin the constant vector b to the coefficient matrix A (as a final column), we get the m × (n + 1) matrix

    [ a_11  a_12  ···  a_1n  b_1 ]
    [ a_21  a_22  ···  a_2n  b_2 ]
    [  ···              ···  ··· ]
    [ a_m1  a_m2  ···  a_mn  b_m ]

This matrix is called the augmented coefficient matrix, or simply the augmented matrix, of the system in (5).
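In code, adjoining the constant vector b to A as a final column is a one-line operation. Here is a minimal Python sketch (the function name `augment` is ours), using the coefficients of Example 3:

```python
def augment(A, b):
    """Adjoin the column vector b to the coefficient matrix A
    as a final column, producing the augmented matrix."""
    return [row + [bi] for row, bi in zip(A, b)]

A = [[1, 2, 1], [3, 8, 7], [2, 7, 9]]   # coefficient matrix of Example 3
b = [4, 20, 23]                          # column of constants
print(augment(A, b))  # [[1, 2, 1, 4], [3, 8, 7, 20], [2, 7, 9, 23]]
```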
Although it takes a good deal of notation to describe the augmented matrix of a general linear system, it is a very simple matter to write the augmented matrix of a particular system. If the system is written in chalk on a blackboard with the variables in the same order in each equation, we merely erase all the x's and the plus and equal signs (retaining a minus sign for each negative coefficient) and insert a zero in each spot where a variable is missing in an equation.
The augmented coefficient matrix of the system
of three equations in four variables is the matrix
In Section 3.1 we described the three elementary operations that are used in the method of elimination. To each of these corresponds an elementary row operation on the augmented matrix of the system. For instance, when we interchange two equations in the system, we interchange the corresponding two rows of its augmented coefficient matrix.
Figure 3.2.1 shows notation that we use to describe elementary row operations briefly. For instance, the operation (3)R1 + R2 shows the result of adding 3 times row 1 to row 2. Note that, when we perform the operation (c)Rp + Rq on a matrix—that is, when we add c times row p to row q—row p itself remains unchanged.
| Type | Row Operation | Notation |
|---|---|---|
| 1 | Multiply row p by c | (c)Rp |
| 2 | Interchange row p and row q | SWAP(Rp, Rq) |
| 3 | Add c times row p to row q | (c)Rp + Rq |
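Each of the three row operations is easy to implement directly. The following Python sketch (function names are ours) returns a new matrix in each case, so that row p itself is left unchanged by the type 3 operation, as noted above:

```python
def mul_row(A, p, c):
    """Type 1: multiply row p by the nonzero constant c
    (rows are numbered from 1, as in the text)."""
    B = [row[:] for row in A]
    B[p - 1] = [c * x for x in B[p - 1]]
    return B

def swap_rows(A, p, q):
    """Type 2: interchange rows p and q."""
    B = [row[:] for row in A]
    B[p - 1], B[q - 1] = B[q - 1], B[p - 1]
    return B

def add_row(A, p, q, c):
    """Type 3: add c times row p to row q; row p is unchanged."""
    B = [row[:] for row in A]
    B[q - 1] = [x + c * y for x, y in zip(B[q - 1], B[p - 1])]
    return B
```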
To solve the system

     x + 2y +  z =  4
    3x + 8y + 7z = 20
    2x + 7y + 9z = 23

whose augmented coefficient matrix is exhibited in (3), we carry out the following sequence of elementary row operations, corresponding to the steps in the solution of Example 6 in Section 3.1:

    [ 1  2  1   4 ]
    [ 3  8  7  20 ]
    [ 2  7  9  23 ]

    (-3)R1 + R2:
    [ 1  2  1   4 ]
    [ 0  2  4   8 ]
    [ 2  7  9  23 ]

    (-2)R1 + R3:
    [ 1  2  1   4 ]
    [ 0  2  4   8 ]
    [ 0  3  7  15 ]

    (1/2)R2:
    [ 1  2  1   4 ]
    [ 0  1  2   4 ]
    [ 0  3  7  15 ]

    (-3)R2 + R3:
    [ 1  2  1   4 ]
    [ 0  1  2   4 ]
    [ 0  0  1   3 ]

The final matrix here is the augmented coefficient matrix of the system

    x + 2y + z = 4
         y + 2z = 4
              z = 3

whose unique solution (readily found by back substitution) is x = 5, y = -2, z = 3.
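The four operations of this reduction can be replayed mechanically. A brief Python check (ours, not part of the text), using exact Fraction arithmetic to avoid roundoff:

```python
from fractions import Fraction

# Augmented matrix of the system, as in (3).
A = [[Fraction(v) for v in row] for row in
     [[1, 2, 1, 4], [3, 8, 7, 20], [2, 7, 9, 23]]]

def add_multiple(A, p, q, c):
    """Add c times row p to row q, in place (0-based row indices)."""
    A[q] = [x + c * y for x, y in zip(A[q], A[p])]

add_multiple(A, 0, 1, -3)          # (-3)R1 + R2
add_multiple(A, 0, 2, -2)          # (-2)R1 + R3
A[1] = [x / 2 for x in A[1]]       # (1/2)R2
add_multiple(A, 1, 2, -3)          # (-3)R2 + R3

for row in A:
    print(row)                     # final rows: 1 2 1 4 / 0 1 2 4 / 0 0 1 3
```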
It is not quite self-evident that a sequence of elementary row operations produces a linear system having the same solution set as the original system. To state the pertinent result concisely, we need the following definition.

DEFINITION  Row-Equivalent Matrices
Two matrices are called row equivalent if either can be obtained from the other by a (finite) sequence of elementary row operations.

Thus the two matrices in (10) are row equivalent, as are the two matrices in (12) and (13). In Problem 29 we ask you to show that if B can be obtained from A by elementary row operations, then A can be obtained from B by elementary row operations. This follows from the observation that row operations are “reversible.” In Problem 30 we suggest a proof of the following theorem.

THEOREM 1
If the augmented coefficient matrices of two linear systems are row equivalent, then the two systems have the same solution set.

Thus, the linear systems in (11) and (14) have the same solution set, because their augmented matrices in (12) and (13) are row equivalent.
Up to this point we have been somewhat informal in our description of the method of elimination. Its objective is to transform a given linear system, by elementary row operations, into one for which back substitution leads easily and routinely to a solution. The following definition tells what should be the appearance of the augmented coefficient matrix of the transformed system.

DEFINITION  Echelon Matrix
The matrix E is called an echelon matrix provided it has the following two properties:
1. Every row of E that consists wholly of zeros (if any) lies beneath every row that contains a nonzero element.
2. In each row of E that contains a nonzero element, the first nonzero element lies strictly to the right of the first nonzero element in the preceding row (if there is a preceding row).

Echelon matrices are sometimes called row-echelon matrices. Property 1 says that if E has any all-zero rows, then they are grouped together at the bottom of the matrix. The first (from the left) nonzero element in each of the other rows is called its leading entry. Property 2 says that the leading entries form a “descending staircase” pattern from upper left to lower right, as in the following echelon matrix.
Here we have highlighted the leading entries and indicated the descending staircase pattern.
The following list exhibits all possibilities for the form of an echelon matrix, with asterisks denoting the (nonzero) leading entries; a # denotes an element that may be either zero or nonzero.
It follows from Properties 1 and 2 that the elements beneath any leading entry in the same column are all zero. The matrices
are not echelon matrices because A does not have Property 1 and B does not have Property 2.
Suppose that a linear system is in echelon form—its augmented matrix is an echelon matrix. Then those variables that correspond to columns containing leading entries are called leading variables; all the other variables are called free variables. The following algorithm describes the process of back substitution to solve such a system.

ALGORITHM  Back Substitution
1. Set each free variable equal to an arbitrary parameter (a different parameter for each free variable).
2. Solve the final (nonzero) equation for its leading variable.
3. Substitute the result in the next-to-last equation and then solve for its leading variable.
4. Continuing in this fashion, work upward through the system of equations until all variables have been determined.
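For the special case in which every variable is a leading variable (so the system has no free variables and a unique solution), back substitution reduces to a short loop. A Python sketch under that assumption (the function name is ours):

```python
from fractions import Fraction

def back_substitute(U):
    """Solve an n x (n+1) augmented echelon matrix U that has a
    leading entry in every one of its first n columns (unique
    solution): solve the last equation first, then work upward."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (U[i][n] - known) / U[i][i]
    return x

# The echelon matrix obtained in Example 3:
U = [[Fraction(v) for v in row] for row in
     [[1, 2, 1, 4], [0, 1, 2, 4], [0, 0, 1, 3]]]
print(back_substitute(U))  # the solution x = 5, y = -2, z = 3
```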
The augmented coefficient matrix of the system
is the echelon matrix
The leading entries are in the first, third, and fourth columns. Hence x_1, x_3, and x_4 are the leading variables and x_2 and x_5 are the free variables. To solve the system by back substitution, we first write

    x_2 = s,    x_5 = t,

where s and t are arbitrary parameters. Then the third equation in (15) gives
the second equation gives
finally, the first equation in (15) yields
Therefore,
Thus the system in (15) has an infinite solution set consisting of all solutions (x_1, x_2, x_3, x_4, x_5) given in terms of the two parameters s and t as follows:
For instance, each particular choice of values for s and t yields a particular solution of the system.
Because we can use back substitution to solve any linear system already in echelon form, it remains only to establish that we can transform any matrix (using elementary row operations) into an echelon matrix. The procedure that we have already illustrated in several examples is known as Gaussian elimination. The following systematic description of the procedure makes it clear that it succeeds with any given matrix A.
In brief, we work on the matrix A one column at a time, from left to right. In each column containing a leading entry (perhaps after a row interchange), we “clear out” the nonzero elements below it and then move on to the next column.
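That column-at-a-time sweep can be written out in a few lines. Here is a Python sketch of the idea (exact Fraction arithmetic; the function name `to_echelon` is ours), which interchanges rows when the pivot position holds a zero and then clears out the entries below each leading entry:

```python
from fractions import Fraction

def to_echelon(M):
    """Reduce a matrix to echelon form, working on one column at a
    time from left to right, as described above."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0                                # row that needs a leading entry next
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue                     # no leading entry in this column
        A[r], A[pivot] = A[pivot], A[r]  # row interchange, if needed
        for i in range(r + 1, rows):     # clear out entries below the pivot
            m = A[i][c] / A[r][c]
            A[i] = [a - m * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == rows:
            break
    return A
```

Applied to the augmented matrix of Example 3, this yields an echelon form whose leading entries are not normalized to 1 (which the definition of an echelon matrix permits); a final (1/2)R2 would reproduce the matrix found there.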
To solve the system
we reduce its augmented coefficient matrix to echelon form as follows.
Our final result is the echelon matrix in (16), so by Eqs. (17a)–(17d) in Example 4, the infinite solution set of the system in (18) is described in terms of arbitrary parameters s and t as follows:
Thus, the substitution of any two specific values for s and t in (19) yields a particular solution of the system, and each of the system’s infinitely many different solutions is the result of some such substitution.
Examples 3 and 5 illustrate the ways in which Gaussian elimination can result in either a unique solution or infinitely many solutions. On the other hand, if the reduction of the augmented matrix to echelon form leads to a row of the form

    [ 0  0  0  ···  0  * ]

where the asterisk denotes a nonzero entry in the last column, then we have an inconsistent equation,

    0·x_1 + 0·x_2 + ··· + 0·x_n = *,

and therefore the system has no solution.
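Detecting that situation in code amounts to scanning for a row whose coefficient part is all zeros but whose last entry is not. A small Python check (the function name is ours):

```python
def is_inconsistent(E):
    """Return True if the augmented echelon matrix E contains a row
    [0 0 ... 0 | *] with a nonzero entry * in its last column."""
    return any(all(v == 0 for v in row[:-1]) and row[-1] != 0
               for row in E)

print(is_inconsistent([[1, 2, 3], [0, 0, 4]]))  # True: encodes 0 = 4
print(is_inconsistent([[1, 2, 3], [0, 0, 0]]))  # False: 0 = 0 is harmless
```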
We use algorithms such as the back substitution and Gaussian elimination algorithms of this section to outline the basic computational procedures of linear algebra. In modern numerical work, these procedures often are implemented on a computer. For instance, linear systems of more than four equations are usually solved in practice by using a computer to carry out the process of Gaussian elimination.
The linear systems in Problems 1–10 are in echelon form. Solve each by back substitution.
In Problems 11–22, use elementary row operations to transform each augmented coefficient matrix to echelon form. Then solve the system by back substitution.
In Problems 23–27, determine for what values of k each system has (a) a unique solution; (b) no solution; (c) infinitely many solutions.
Under what condition on the constants a, b, and c does the system
have a unique solution? No solution? Infinitely many solutions?
This problem deals with the reversibility of elementary row operations.
If the elementary row operation (c)Rp (with c ≠ 0) changes the matrix A to the matrix B, show that (1/c)Rp changes B to A.
If SWAP(Rp, Rq) changes A to B, show that SWAP(Rp, Rq) also changes B to A.
If (c)Rp + Rq changes A to B, show that (-c)Rp + Rq changes B to A.
Conclude that if A can be transformed into B by a finite sequence of elementary row operations, then B can similarly be transformed into A.
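The reversibility claims of Problem 29 are easy to check numerically; a Python sketch (the helper name is ours) for the type 3 case:

```python
def add_row(A, p, q, c):
    """Add c times row p to row q on a copy of A (rows numbered from 1)."""
    B = [row[:] for row in A]
    B[q - 1] = [x + c * y for x, y in zip(B[q - 1], B[p - 1])]
    return B

A = [[1, 2, 1, 4], [3, 8, 7, 20], [2, 7, 9, 23]]
B = add_row(A, 1, 2, -3)          # (-3)R1 + R2 changes A to B ...
print(add_row(B, 1, 2, 3) == A)   # ... and (3)R1 + R2 changes B back to A: True
```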
This problem outlines a proof that two linear systems LS1 and LS2 are equivalent (that is, have the same solution set) if their augmented coefficient matrices A1 and A2 are row equivalent.
If a single elementary row operation transforms A1 to A2, show directly—considering separately the three cases—that every solution of LS1 is also a solution of LS2.
Explain why it now follows from Problem 29 that every solution of either of the systems LS1 and LS2 is also a solution of the other system; thus the two systems have the same solution set.
Computer algebra systems are often used to ease the labor of matrix computations, including elementary row operations. The augmented coefficient matrix of Example 3 can be entered with the Maple command
with(linalg):
A := array( [[1, 2, 1, 4],
[3, 8, 7, 20],
[2, 7, 9, 23]] );
or the Mathematica command
A = {{1, 2, 1, 4},
{3, 8, 7, 20},
{2, 7, 9, 23}}
or the Matlab command
A = [1 2 1 4
3 8 7 20
2 7 9 23]
The Maple linalg package has built-in elementary row operations that can be used to carry out the reduction of A exhibited in Example 3, as follows:
A := addrow(A,1,2,-3); # (-3)R1 + R2
A := addrow(A,1,3,-2); # (-2)R1 + R3
A := mulrow(A,2,1/2); # (1/2)R2
A := addrow(A,2,3,-3); # (-3)R2 + R3
The inserted remarks, which follow the notation of Fig. 3.2.1, should make clear the structure of these row operations in Maple.
In Mathematica the ith row of the matrix A is denoted by A[[i]], and we can operate directly on rows to carry out the reduction of Example 3, as follows:
A[[2]] = (-3)A[[1]] + A[[2]]; (* (-3)R1 + R2 *)
A[[3]] = (-2)A[[1]] + A[[3]]; (* (-2)R1 + R3 *)
A[[2]] = (1/2)A[[2]]; (* (1/2)R2 *)
A[[3]] = (-3)A[[2]] + A[[3]]; (* (-3)R2 + R3 *)
In Matlab, the ith row of the matrix A is denoted by A(i,:), and we can operate similarly on rows to carry out the reduction to echelon form, as follows:
A(2,:) = (-3)*A(1,:) + A(2,:) % (-3)R1 + R2
A(3,:) = (-2)*A(1,:) + A(3,:) % (-2)R1 + R3
A(2,:) = (1/2)*A(2,:) % (1/2)R2
A(3,:) = (-3)*A(2,:) + A(3,:) % (-3)R2 + R3
You should verify (by using the computer algebra system of your choice) that these operations yield the echelon matrix

    [ 1  2  1  4 ]
    [ 0  1  2  4 ]
    [ 0  0  1  3 ]
If you wanted to interchange the 1st and 3rd rows—and proceed to solve the corresponding system by “forward substitution” rather than back substitution—this could be done as follows:

    A := swaprow(A,1,3);           (Maple)
    A[[{1,3}]] = A[[{3,1}]]        (Mathematica)
    A([1 3],:) = A([3 1],:)        (MATLAB)
These “automated” elementary row operations can be used in Problems 11–22 of this section.