3.7 Linear Equations and Curve Fitting

Linear algebra has important applications to the common scientific problem of representing empirical data by means of equations or functions of specified types. We give here only a brief introduction to this extensive subject.

Typically, we begin with a collection of given data points $(x_0, y_0), (x_1, y_1), \dots, (x_n, y_n)$ that are to be represented by a specific type of function $y = f(x)$. For instance, $y$ might be the volume of a sample of gas when its temperature is $x$. Thus the given data points are the results of experiment or measurement, and we want to determine the curve $y = f(x)$ in the $xy$-plane so that it passes through each of these points; see Figure 3.7.1. Thus we speak of “fitting” the curve to the data points.

FIGURE 3.7.1.

A curve y=f(x) interpolating (that is, passing through) given data points.

We will confine our attention largely to polynomial curves. A polynomial of degree n is a function of the form

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n,$$
(1)

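where the coefficients $a_0, a_1, a_2, \dots, a_n$ are constants. The data point $(x_i, y_i)$ lies on the curve $y = f(x)$ provided that $f(x_i) = y_i$. The condition that this be so for each $i = 0, 1, 2, \dots, n$ yields the $n+1$ equations

$$
\begin{aligned}
a_0 + a_1 x_0 + a_2 (x_0)^2 + \cdots + a_n (x_0)^n &= y_0 \\
a_0 + a_1 x_1 + a_2 (x_1)^2 + \cdots + a_n (x_1)^n &= y_1 \\
a_0 + a_1 x_2 + a_2 (x_2)^2 + \cdots + a_n (x_2)^n &= y_2 \\
&\;\;\vdots \\
a_0 + a_1 x_n + a_2 (x_n)^2 + \cdots + a_n (x_n)^n &= y_n.
\end{aligned}
$$
(2)

Because the numbers $x_i$ and $y_i$ are given, this is a system of $n+1$ linear equations in the $n+1$ unknowns $a_0, a_1, a_2, \dots, a_n$ (the coefficients that determine the polynomial in (1)).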

The $(n+1) \times (n+1)$ coefficient matrix of the system in (2) is the Vandermonde matrix

$$
A = \begin{bmatrix}
1 & x_0 & (x_0)^2 & \cdots & (x_0)^n \\
1 & x_1 & (x_1)^2 & \cdots & (x_1)^n \\
1 & x_2 & (x_2)^2 & \cdots & (x_2)^n \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_n & (x_n)^2 & \cdots & (x_n)^n
\end{bmatrix},
$$
(3)

whose determinant is discussed in Problems 61–63 of Section 3.6. It follows from Eq. (25) there that, if the $x$-coordinates $x_0, x_1, x_2, \dots, x_n$ are distinct, then the matrix $A$ is nonsingular. Hence Theorem 7 in Section 3.5 implies that the system in (2) has a unique solution for the coefficients $a_0, a_1, a_2, \dots, a_n$ in (1). Thus there is a unique polynomial of degree at most $n$ that fits the $n+1$ given data points. We call it an interpolating polynomial, and say that it interpolates the given points.
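As a minimal sketch of this construction (an illustration, not part of the text's own development), the following Python code builds the Vandermonde matrix in (3) with NumPy and solves the system in (2). The function name `interpolating_coefficients` and the sample points are assumptions chosen for the example.

```python
import numpy as np

def interpolating_coefficients(xs, ys):
    """Return [a0, a1, ..., an] for the polynomial through the points (xs[i], ys[i])."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    if len(set(xs.tolist())) != len(xs):
        raise ValueError("x-coordinates must be distinct")
    # Each row is [1, x_i, x_i^2, ..., x_i^n], matching the matrix A in (3).
    A = np.vander(xs, increasing=True)
    return np.linalg.solve(A, ys)

# Hypothetical sample data lying on the parabola y = 1 + x + x^2.
coeffs = interpolating_coefficients([0, 1, 2], [1, 3, 7])
print(coeffs)
```

The distinctness check mirrors the hypothesis above: repeated x-coordinates would make the Vandermonde matrix singular.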

Example 1

Find a cubic polynomial of the form

$$y = A + Bx + Cx^2 + Dx^3$$

that interpolates the data points $(-1, 4)$, $(1, 2)$, $(2, 1)$, and $(3, 16)$.

Solution

In a particular problem, it generally is simpler to use distinct capital letters rather than subscripted symbols to denote the coefficients. Here we want to find the values of $A$, $B$, $C$, and $D$ so that $y(-1) = 4$, $y(1) = 2$, $y(2) = 1$, and $y(3) = 16$. These conditions yield the four linear equations

$$
\begin{aligned}
A - B + C - D &= 4 \\
A + B + C + D &= 2 \\
A + 2B + 4C + 8D &= 1 \\
A + 3B + 9C + 27D &= 16.
\end{aligned}
$$

We readily reduce this system to the echelon form

$$
\begin{aligned}
A - B + C - D &= 4 \\
B + D &= -1 \\
C + 2D &= 0 \\
D &= 2,
\end{aligned}
$$

and then back substitution yields $A = 7$, $B = -3$, $C = -4$, and $D = 2$. Thus the desired cubic polynomial is

$$y = 7 - 3x - 4x^2 + 2x^3.$$

The graph of this cubic is shown in Fig. 3.7.2, along with the four original data points.

FIGURE 3.7.2.

Cubic curve through the four data points of Example 1.
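The elimination and back substitution of Example 1 can be checked with exact rational arithmetic. The sketch below assumes the data points $(-1, 4)$, $(1, 2)$, $(2, 1)$, $(3, 16)$; the helper name `solve_exact` is hypothetical, and the routine is a plain textbook elimination rather than any particular library's method.

```python
from fractions import Fraction

def solve_exact(augmented):
    """Solve a square linear system, given as an augmented matrix, in exact fractions."""
    rows = [[Fraction(v) for v in row] for row in augmented]
    n = len(rows)
    # Forward elimination to echelon form.
    for col in range(n):
        # Find a row with a nonzero pivot (StopIteration here means a singular system).
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    # Back substitution, last unknown first.
    solution = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(rows[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (rows[i][n] - known) / rows[i][i]
    return solution

# One augmented row [1, x, x^2, x^3 | y] per data point.
points = [(-1, 4), (1, 2), (2, 1), (3, 16)]
system = [[1, x, x**2, x**3, y] for x, y in points]
A, B, C, D = solve_exact(system)
print(A, B, C, D)  # 7 -3 -4 2
```

Working in fractions avoids rounding, so the printed coefficients can be compared directly with the cubic found in the solution.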

Modeling World Population Growth

As a concrete example of interpolation, we consider the growth of the world’s human population. The table in Fig. 3.7.3 shows the total world population (in billions) at 5-year intervals. Actual populations are shown for the years 1975–2010. The figures listed for the years 2015–2040 are the world populations that were predicted by the United Nations on the basis of detailed demographic analysis of population trends during the 20th century on a country-by-country basis throughout the world. Each entry of the final column of this table gives the average annual percentage growth rate during the preceding 5-year period. For instance, $4.062\,(1.018)^5 \approx 4.44$ for the growth during the 5-year period 1975–1980, so the average annual growth during this period is about 1.8%.

FIGURE 3.7.3.

World population data.

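| Year | World Population (billions) | Percent Growth |
|------|-----------------------------|----------------|
| 1975 | 4.062 | |
| 1980 | 4.440 | 1.80% |
| 1985 | 4.853 | 1.79% |
| 1990 | 5.310 | 1.82% |
| 1995 | 5.735 | 1.55% |
| 2000 | 6.127 | 1.33% |
| 2005 | 6.520 | 1.25% |
| 2010 | 6.930 | 1.23% |
| 2015 | 7.349 | 1.18% |
| 2020 | 7.758 | 1.09% |
| 2025 | 8.142 | 0.97% |
| 2030 | 8.501 | 0.87% |
| 2035 | 8.839 | 0.78% |
| 2040 | 9.157 | 0.71% |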
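The entries in the final column can be recomputed from consecutive population values. The sketch below assumes compound growth at a constant annual rate over each 5-year period, which is how the 1975–1980 figure is explained in the text; the function name `average_annual_growth` is illustrative.

```python
def average_annual_growth(p_start, p_end, years):
    """Constant annual rate r satisfying p_start * (1 + r)**years == p_end."""
    return (p_end / p_start) ** (1.0 / years) - 1.0

# 1975 -> 1980 populations (billions) from the table.
rate = average_annual_growth(4.062, 4.440, 5)
print(f"{100 * rate:.2f}%")  # 1.80%
```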

We see that the world population grew at an annual rate of about 1.8% during the 1980s, but the rate of growth has slowed since then, and it is expected to slow even more during the coming decades of the 21st century. In particular, the growth of the world population at the present time in history is not natural or exponential in character—that characterization would imply a constant percentage rate of growth. We explore the possibility of interpolating world population data with polynomial models that might be usable to predict future populations. It seems natural to expect better results with higher-degree interpolating polynomials. Let’s see whether this is so.

Example 2

First, we fit a linear polynomial $P_1(t) = a + bt$ (with $t = 0$ in 1900) to the 1995 and 2005 world population values. We need only solve the equations

$$
\begin{aligned}
a + 95b &= 5.735 \\
a + 105b &= 6.520
\end{aligned}
$$

for $a = -1.7225$, $b = 0.0785$. Thus our linear interpolating polynomial is

$$P_1(t) = -1.7225 + 0.0785t.$$
(4)

Example 3

Now let’s fit a quadratic polynomial $P_2(t) = a + bt + ct^2$ (with $t = 0$ in 1900) to the 1995, 2000, and 2005 world population values. With the three data points $(95, 5.735)$, $(100, 6.127)$, and $(105, 6.520)$, the system in (2) yields the equations

$$
\begin{aligned}
a + 95b + 95^2 c &= 5.735 \\
a + 100b + 100^2 c &= 6.127 \\
a + 105b + 105^2 c &= 6.520
\end{aligned}
$$

having the calculator solution $a = -1.523$, $b = 0.0745$, $c = 0.00002$ (Fig. 3.7.4). Thus our quadratic interpolating polynomial is

$$P_2(t) = -1.523 + 0.0745t + 0.00002t^2.$$
(5)

FIGURE 3.7.4.

TI-84 Plus CE calculator solution of the 3×3 system in Example 3.

Example 4

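Next we fit a cubic polynomial $P_3(t) = a + bt + ct^2 + dt^3$ (with $t = 0$ in 1900) to the 1995, 2000, 2005, and 2010 world population values. With the four data points $(95, 5.735)$, $(100, 6.127)$, $(105, 6.520)$, and $(110, 6.930)$, the system in (2) yields the four equations

$$
\begin{aligned}
5.735 &= a + 95b + 95^2 c + 95^3 d \\
6.127 &= a + 100b + 100^2 c + 100^3 d \\
6.520 &= a + 105b + 105^2 c + 105^3 d \\
6.930 &= a + 110b + 110^2 c + 110^3 d.
\end{aligned}
$$

As in the 3.5 Application, a calculator or computer yields the solution

$$
\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix}
= \begin{bmatrix}
1 & 95 & 95^2 & 95^3 \\
1 & 100 & 100^2 & 100^3 \\
1 & 105 & 105^2 & 105^3 \\
1 & 110 & 110^2 & 110^3
\end{bmatrix}^{-1}
\begin{bmatrix} 5.735 \\ 6.127 \\ 6.520 \\ 6.930 \end{bmatrix}
= \begin{bmatrix} -22.803 \\ 0.713967 \\ -0.00638 \\ 0.000021333 \end{bmatrix}.
$$

Thus our cubic interpolating polynomial is

$$P_3(t) = -22.803 + 0.713967t - 0.00638t^2 + 0.000021333t^3.$$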
(6)

Example 5

In order to fit a fourth-degree population model of the form

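$$P_4(t) = a + bt + ct^2 + dt^3 + et^4$$

to the 1990-1995-2000-2005-2010 world population data, we need to solve the linear system

$$
\begin{bmatrix}
1 & 90 & 90^2 & 90^3 & 90^4 \\
1 & 95 & 95^2 & 95^3 & 95^4 \\
1 & 100 & 100^2 & 100^3 & 100^4 \\
1 & 105 & 105^2 & 105^3 & 105^4 \\
1 & 110 & 110^2 & 110^3 & 110^4
\end{bmatrix}
\begin{bmatrix} a \\ b \\ c \\ d \\ e \end{bmatrix}
= \begin{bmatrix} 5.310 \\ 5.735 \\ 6.127 \\ 6.520 \\ 6.930 \end{bmatrix}
$$

to find the values of the coefficients $a$, $b$, $c$, $d$, and $e$. The result is

$$P_4(t) = -154.473 + 5.867667t - 0.08195t^2 + 0.00051333t^3 - 0.0000012t^4.$$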
(7)

The table in Fig. 3.7.5 compares our linear, quadratic, cubic, and quartic predictions with the “correct” United Nations prediction for the year 2030. Each “error” in the third column of this table is the amount by which the corresponding prediction undershoots (positive error) or overshoots (negative error) the U.N. prediction. We see that the quadratic prediction is better than the linear but also markedly better than the cubic prediction. The quartic prediction is an improvement over the cubic, yet still not as good as the quadratic. Thus there is at best an uncertain relationship between the degree of the polynomial model and the accuracy of its predictions.

FIGURE 3.7.5.

Predictions of the 2030 world population.

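| Model | Year 2030 Prediction (billions) | Error (billions) |
|-------|---------------------------------|------------------|
| Linear | 8.482 | +0.019 |
| Quadratic | 8.500 | +0.001 |
| Cubic | 9.060 | −0.559 |
| Quartic | 8.430 | +0.071 |
| United Nations | 8.501 | |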
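The 2030 predictions in this table can be recomputed by fitting each polynomial and evaluating it at $t = 130$. The following is a rough sketch using NumPy's general linear solver (not the calculator procedure shown in the figures); the names `fit`, `evaluate`, and `models` are illustrative. Because the powers of $t$ near 100 are large, the raw systems are poorly conditioned, so the printed digits should be read as approximate.

```python
import numpy as np

# World population (billions) from Fig. 3.7.3, keyed by t = years since 1900.
population = {90: 5.310, 95: 5.735, 100: 6.127, 105: 6.520, 110: 6.930}
UN_2030 = 8.501  # United Nations prediction for 2030

def fit(ts):
    """Coefficients (constant term first) of the polynomial through the data at times ts."""
    A = np.vander(np.array(ts, dtype=float), increasing=True)
    y = np.array([population[t] for t in ts])
    return np.linalg.solve(A, y)

def evaluate(coeffs, t):
    """Evaluate a0 + a1*t + a2*t**2 + ... by Horner's rule."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * t + c
    return value

# Data years used by Examples 2 through 5.
models = {
    "Linear": [95, 105],
    "Quadratic": [95, 100, 105],
    "Cubic": [95, 100, 105, 110],
    "Quartic": [90, 95, 100, 105, 110],
}
predictions = {name: evaluate(fit(ts), 130.0) for name, ts in models.items()}
for name, p in predictions.items():
    # Positive error: the model undershoots the U.N. figure.
    print(f"{name:10s} {p:.3f} {UN_2030 - p:+.3f}")
```

Shifting the time origin closer to the data (for instance, measuring $t$ from 2000) would leave the predictions unchanged while improving the conditioning of the linear systems.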

Figure 3.7.6 shows the U.N. world population data points for the years 1975 through 2040, together with the plots of the quadratic, cubic, and quartic population functions of Examples 3, 4, and 5. (The plot of the linear population function of Example 2 is virtually indistinguishable from that of the quadratic function for the values of t shown in the figure.) It looks as though the more work we do to find a polynomial fitting selected data points, the less we get for our effort. It is certainly true in this figure that—outside the interval from 1990 to 2010—the higher the degree of the polynomial, the worse it appears to fit the given data points. The issue here is the difference between

  • interpolating data points within the interval of given points being fitted, and

  • extrapolating data points outside this interval.

FIGURE 3.7.6.

Comparison of world population data and the interpolating polynomials $P_n(t)$ for $n = 2, 3, 4$.

All four of our polynomials appear to do a good job of interpolating but, somewhat paradoxically, the higher the degree, the worse the apparent accuracy of extrapolation. The highly questionable accuracy of data extrapolation outside the interval of interpolation has significant implications. For instance, consider a news report that when a certain alleged carcinogen was fed to mice in sufficient amounts to kill an elephant, the mice developed cancer. It is then argued that moderate amounts of this carcinogen may cause cancer in humans; or that if 1 part per billion of this carcinogen in the environment kills 1 person, then 1 part per million (a thousand times as much) will kill 1000 people. Such arguments are common, but they may well be cases of extrapolation beyond the range of accuracy. The bottom line is that interpolation is fairly safe—though hardly fail-safe—but extrapolation is risky.

Geometric Applications

In contrast with population prediction, there are interesting situations where curve fitting is exact. For instance, the fact that “two points determine a line” in the plane means that, when we fit the linear function y=a+bx to a given pair of points, we get precisely the one and only straight line in the plane that passes through these points. Similarly, “three points determine a circle,” meaning that there is one and only one circle in the plane that passes through three given noncollinear points. In order to find this particular circle, we recall that the equation of a circle with center (h, k) and radius r is

$$(x - h)^2 + (y - k)^2 = r^2 \qquad \text{(Fig. 3.7.7)}.$$
(8)

FIGURE 3.7.7.

The circle with center (h, k) and radius r.

Simplification gives

$$(x^2 - 2hx + h^2) + (y^2 - 2ky + k^2) = r^2,$$

that is,

$$x^2 + y^2 + Ax + By + C = 0$$
(9)

(where $A = -2h$, $B = -2k$, and $C = h^2 + k^2 - r^2$) as the general equation of a circle in the plane.

Example 6

Find the equation of the circle that is determined by the points $P(-1, 5)$, $Q(5, -3)$, and $R(6, 4)$.

Solution

Substitution of the xy-coordinates of each of the three points P, Q, and R into (9) gives the three equations

$$
\begin{aligned}
-A + 5B + C &= -26 \\
5A - 3B + C &= -34 \\
6A + 4B + C &= -52.
\end{aligned}
$$

Reduction of the corresponding augmented coefficient matrix to reduced row-echelon form (Fig. 3.7.8) yields $A = -4$, $B = -2$, and $C = -20$. Thus the equation of the desired circle is

$$x^2 + y^2 - 4x - 2y - 20 = 0.$$

FIGURE 3.7.8.

TI-89 calculator solution of the 3×3 system in Example 6.

To find its center and radius, we complete the squares in x and y and get

$$(x - 2)^2 + (y - 1)^2 = 25.$$

Thus the circle has center (2, 1) and radius 5 (Fig. 3.7.9).

FIGURE 3.7.9.

The circle of Example 6.
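The steps of Example 6 (solve the linear system for the coefficients in (9), then recover the center and radius) can be sketched as follows. The points are taken to be $P(-1, 5)$, $Q(5, -3)$, $R(6, 4)$, and the function name `circle_through` is an assumption made for illustration.

```python
import math
import numpy as np

def circle_through(p, q, r):
    """Return (A, B, C, center, radius) for x^2 + y^2 + A*x + B*y + C = 0."""
    pts = np.array([p, q, r], dtype=float)
    # Substituting a point into (9) gives A*x + B*y + C = -(x^2 + y^2).
    M = np.column_stack([pts[:, 0], pts[:, 1], np.ones(3)])
    rhs = -(pts[:, 0] ** 2 + pts[:, 1] ** 2)
    # The system is singular (and the solve unreliable) if the points are collinear.
    A, B, C = np.linalg.solve(M, rhs)
    h, k = -A / 2, -B / 2                    # from A = -2h and B = -2k
    radius = math.sqrt(h * h + k * k - C)    # from C = h^2 + k^2 - r^2
    return A, B, C, (h, k), radius

A, B, C, center, radius = circle_through((-1, 5), (5, -3), (6, 4))
print(A, B, C, center, radius)
```

Up to rounding, the result should agree with the center $(2, 1)$ and radius 5 found by completing the square.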

Three appropriate points in the plane also determine a central conic with equation of the form

$$Ax^2 + Bxy + Cy^2 = 1.$$
(10)

This is a rotated conic section—an ellipse, parabola, or hyperbola—centered at the origin of the xy-coordinate system. Figure 3.7.10 shows a typical rotated ellipse in the plane.

FIGURE 3.7.10.

A rotated central ellipse.

Example 7

Find the equation of the central conic that passes through the same three points $P(-1, 5)$, $Q(5, -3)$, and $R(6, 4)$ of Example 6.

Solution

Substitution of the xy-coordinates of each of the three points P, Q, and R into (10) gives the linear system of three equations

$$
\begin{aligned}
A - 5B + 25C &= 1 \\
25A - 15B + 9C &= 1 \\
36A + 24B + 16C &= 1
\end{aligned}
$$
(11)

in the three unknowns A, B, and C. Reduction of the corresponding augmented coefficient matrix to reduced row-echelon form (Fig. 3.7.11) yields the values

$$A = \frac{277}{14212}, \quad B = -\frac{172}{14212}, \quad \text{and} \quad C = \frac{523}{14212}.$$

FIGURE 3.7.11.

TI-89 calculator solution of the reduced row-echelon form of the augmented coefficient matrix in (11).

If we substitute these coefficient values in (10) and multiply the result by the common denominator 14212, we get the desired equation

$$277x^2 - 172xy + 523y^2 = 14212$$
(12)

of our central conic. The computer plot in Fig. 3.7.12 verifies that this rotated ellipse does indeed pass through all three points P, Q, and R.

FIGURE 3.7.12.

Central ellipse passing through the points P, Q, and R of Example 7.
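As a numerical cross-check of Example 7, the sketch below solves the system in (11) and classifies the resulting conic by the sign of the discriminant $B^2 - 4AC$, a standard test that the text itself does not use here. The points are assumed to be $P(-1, 5)$, $Q(5, -3)$, $R(6, 4)$, and the function names are hypothetical.

```python
import numpy as np

def central_conic_through(points):
    """Return (A, B, C) with A*x^2 + B*x*y + C*y^2 = 1 at each of three points."""
    # One row [x^2, x*y, y^2] per point; every right-hand side equals 1.
    M = np.array([[x * x, x * y, y * y] for x, y in points], dtype=float)
    return np.linalg.solve(M, np.ones(3))

def classify(A, B, C, tol=1e-12):
    """Classify A*x^2 + B*x*y + C*y^2 = 1 by the discriminant B^2 - 4AC."""
    disc = B * B - 4 * A * C
    if disc < -tol:
        return "ellipse"
    if disc > tol:
        return "hyperbola"
    return "degenerate"

A, B, C = central_conic_through([(-1, 5), (5, -3), (6, 4)])
# Multiplying by 14212 should recover the integer coefficients in (12), up to rounding.
print(14212 * A, 14212 * B, 14212 * C, classify(A, B, C))
```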

3.7 Problems

In each of Problems 1–10, n+1 data points are given. Find the nth degree polynomial y=f(x) that fits these points.

  1. (1, 1) and (3, 7)

  2. (1,11) and (2,10)

     

  3. (0, 3), (1, 1), and (2,5)

     

  4. (1,1), (1,5), and (2, 16)

  5. (1, 3), (2, 3), and (3, 5)

  6. (1,1), (3,13), and (5, 5)

  7. (1,1), (0,0), (1,1), and (2,4)

     

  8. (1,3), (0,5), (1,7), and (2, 3)

  9. (2,2), (1,2), (1,10), and (2, 26)

  10. (1,27), (1,13), (2,3), and (3,25)

Three points are given in each of Problems 11–14. Find the equation of the circle determined by these points, as well as its center and radius.

  11. (1,1), (6,6), and (7, 5)

  12. (3,4), (5,10), and (9,12)

  13. (1,0), (0,5), and (5,4)

  14. (0, 0), (10, 0), and (7,7)

In Problems 15–18, find an equation of the central ellipse that passes through the three given points.

  15. (0, 5), (5, 0), and (5, 5)

  16. (0, 5), (5, 0), and (10, 10)

  17. (0, 1), (1, 0), and (10, 10)

  18. (0, 4), (3, 0), and (5, 5)

  19. Find a curve of the form $y = A + (B/x)$ that passes through the points (1, 5) and (2, 4).

  20. Find a curve of the form $y = Ax + (B/x) + (C/x^2)$ that passes through the points (1, 2), (2, 20), and (4, 41).

A sphere in space with center (h, k, l) and radius r has equation

$$(x - h)^2 + (y - k)^2 + (z - l)^2 = r^2.$$

Four given points in space suffice to determine the values of h, k, l, and r. In Problems 21 and 22, find the center and radius of the sphere that passes through the four given points P, Q, R, and S. Hint: Substitute each given triple of coordinates into the sphere equation above to obtain four equations that h, k, l, and r must satisfy. To solve these equations, first subtract the first one from each of the other three. How many unknowns are left in the three equations that result?

  21. P(4,6,15), Q(13,5,7), R(5,14,6), S(5,5,9)

  22. P(11,17,17), Q(29,1,15), R(13,1,33), S(19,13,1)
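The hint for Problems 21 and 22 can be sketched in code. Subtracting the first sphere equation from each of the other three cancels the quadratic terms in $h$, $k$, $l$ and $r$, leaving three linear equations in the center alone. The function name `sphere_through` and the four sample points (chosen to lie on the sphere with center $(1, 2, 3)$ and radius 3) are assumptions for illustration, not data from the problems.

```python
import numpy as np

def sphere_through(points):
    """Return (center, radius) of the sphere through four non-coplanar points."""
    pts = np.array(points, dtype=float)
    p0 = pts[0]
    # Subtracting the equation for p0 from the equation for p_i gives
    #   2 * (p_i - p0) . (h, k, l) = |p_i|^2 - |p0|^2,
    # which is linear in the center (h, k, l).
    M = 2.0 * (pts[1:] - p0)
    rhs = np.sum(pts[1:] ** 2, axis=1) - np.sum(p0 ** 2)
    center = np.linalg.solve(M, rhs)
    radius = np.linalg.norm(p0 - center)
    return center, radius

# Hypothetical points on the sphere with center (1, 2, 3) and radius 3.
center, radius = sphere_through([(4, 2, 3), (1, 5, 3), (1, 2, 6), (-2, 2, 3)])
print(center, radius)
```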

Population Modeling

Problems 23–34 are intended as calculator or computer problems and are based on the U.S. census data in the table of Fig. 3.7.13, listed by national region in millions for the census years 1950–1990. See www.census.gov/population/censusdata/table-16.pdf for further details.

In Problems 23–26, fit a quadratic function to the 1970, 1980, and 1990 population values for the indicated region.

  23. The Northeast

  24. The Midwest

  25. The South

  26. The West

  27–30. The same as Problems 23–26, except fit a cubic polynomial to the 1960, 1970, 1980, and 1990 population data for the indicated region.

  31–34. The same as Problems 23–26, except fit a quartic polynomial to the 1950, 1960, 1970, 1980, and 1990 population data for the indicated region.

Problems 35 through 40 illustrate the use of determinants in fitting polynomial curves to data points.

FIGURE 3.7.13.

Regional population data (in millions) for Problems 23–34.

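| Region | 1950 | 1960 | 1970 | 1980 | 1990 |
|--------|------|------|------|------|------|
| Northeast | 39.478 | 44.678 | 49.061 | 49.137 | 50.809 |
| Midwest | 44.461 | 51.619 | 56.590 | 58.867 | 59.669 |
| South | 47.197 | 54.973 | 62.813 | 75.367 | 85.446 |
| West | 20.190 | 28.053 | 34.838 | 43.171 | 52.786 |
| U.S. | 151.326 | 179.323 | 203.302 | 226.542 | 248.710 |
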

  35. Explain why the determinant equation

    $$
    \begin{vmatrix}
    y & x^2 & x & 1 \\
    y_1 & x_1^2 & x_1 & 1 \\
    y_2 & x_2^2 & x_2 & 1 \\
    y_3 & x_3^2 & x_3 & 1
    \end{vmatrix} = 0
    $$

    fits a quadratic polynomial of the form $y = Ax^2 + Bx + C$ to the three given points $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$.

  36. Expand the determinant in Problem 35 to find a parabola that interpolates the points (1, 3), (2, 3), and (3, 7).

  37. Explain why the determinant equation

    $$
    \begin{vmatrix}
    x^2 + y^2 & x & y & 1 \\
    x_1^2 + y_1^2 & x_1 & y_1 & 1 \\
    x_2^2 + y_2^2 & x_2 & y_2 & 1 \\
    x_3^2 + y_3^2 & x_3 & y_3 & 1
    \end{vmatrix} = 0
    $$

    fits a circle of the form $x^2 + y^2 + Ax + By + C = 0$ to the three given points $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$.

  38. Expand the determinant in Problem 37 to find the equation of a circle passing through the three points (3,4), (5,10), and (9,12). Then find its center and radius.

  39. Explain why the determinant equation

    $$
    \begin{vmatrix}
    x^2 & xy & y^2 & 1 \\
    x_1^2 & x_1 y_1 & y_1^2 & 1 \\
    x_2^2 & x_2 y_2 & y_2^2 & 1 \\
    x_3^2 & x_3 y_3 & y_3^2 & 1
    \end{vmatrix} = 0
    $$

    fits a central conic equation of the form $Ax^2 + Bxy + Cy^2 = 1$ to the three given points $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$.

  40. Expand the determinant in Problem 39 to find the equation of the ellipse passing through the three points (0, 4), (3, 0), and (5, 5).
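The determinant method of Problems 35–40 can be illustrated numerically by a cofactor expansion along the first row. The sketch below does this for the circle determinant of Problem 37, using the points of Example 6 (taken to be $(-1, 5)$, $(5, -3)$, $(6, 4)$) rather than the data of any problem; the function name `circle_by_cofactors` is hypothetical.

```python
import numpy as np

def circle_by_cofactors(points):
    """Coefficients (A, B, C) of x^2 + y^2 + A*x + B*y + C = 0 through three points."""
    # Rows 2-4 of the 4x4 determinant: [x_i^2 + y_i^2, x_i, y_i, 1].
    rows = np.array([[x * x + y * y, x, y, 1.0] for x, y in points])
    # Expanding along the first row [x^2 + y^2, x, y, 1] gives
    #   c0*(x^2 + y^2) + c1*x + c2*y + c3 = 0,
    # where c_j is (-1)^j times the 3x3 minor obtained by deleting column j.
    c = [(-1) ** j * np.linalg.det(np.delete(rows, j, axis=1)) for j in range(4)]
    if abs(c[0]) < 1e-9:
        raise ValueError("the three points appear to be collinear")
    # Divide by c0 so that the coefficient of x^2 + y^2 is 1.
    return c[1] / c[0], c[2] / c[0], c[3] / c[0]

A, B, C = circle_by_cofactors([(-1, 5), (5, -3), (6, 4)])
print(A, B, C)
```

Up to rounding, the coefficients should match those obtained from the linear system in Example 6.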
