Linear algebra has been described as the mathematics of computer science, and this chapter will be a bit different from prior chapters. Prior chapters discussed topics such as regression and statistical significance tests, which are techniques that can be applied directly to a dataset to produce a solution of interest. A single linear algebra technique in isolation rarely provides a solution of interest to a substantive researcher. However, many numerical analysis techniques rely on linear algebra and matrix operations, making them an important part of scientific computing.
In this chapter, we will discuss the following topics:
In mathematics, a matrix is simply an organized table of numbers. Matrices are used because scientists and mathematicians have found that some numerical problems (generally systems of equations) can be solved algorithmically when the data for the problem is organized this way. In mathematics, a row or column of a matrix is termed a "vector", though R has a vector data structure that is not the same thing as the row or column of a matrix. The numbers characterizing a vector in mathematics are thought of as the endpoint of that vector, with the vector's origin at the origin of the coordinate system (this is what we term a "vector" mathematically, but it is really a point).
While the term "matrix" is often used to refer to a rectangular arrangement for string data (that is, a table), we must be much more cautious with our use of the term in numeric computation since R has more than one way to store composite variables.
While data analysts often load a dataset from an external file into a data frame, which looks like a matrix at first glance, a matrix has a number of defining features, mentioned as follows:
To convert a data frame into a matrix, we use the as.matrix function, which requires that all data elements be of the same data type; columns of differing types are coerced to a common type.
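As a minimal sketch of this conversion, the following uses two small made-up data frames (the names df_num and df_mixed are illustrative and do not come from the text). In the mixed-type case, the result is a character matrix, since a matrix can hold only one data type.

```r
# Hypothetical data frames, for illustration only
df_num <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
m_num <- as.matrix(df_num)
is.matrix(m_num)   # TRUE
typeof(m_num)      # "double"

# Mixing a numeric column with a character column gives a character matrix
df_mixed <- data.frame(x = c(1, 2, 3), label = c("a", "b", "c"),
                       stringsAsFactors = FALSE)
m_mixed <- as.matrix(df_mixed)
typeof(m_mixed)    # "character"
```

Checking typeof() after the conversion is a quick way to confirm that the matrix is still numeric before using it in a computation.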
A vector data structure in R means something slightly different than a vector data structure in mathematics. In mathematics, a vector can either be a column vector or a row vector, but in R, a vector has no dimensions. Let's take a look at the following examples:
> a <- c(1,2,3,4)
> b <- matrix(a, nrow = 1)
> a
[1] 1 2 3 4
> b
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
As we can see, both a and b store the same values, but R recognizes columns and rows for b, something that it does not do for a. However, if we were to apply a in a linear algebraic operation, R would treat it as a column vector.
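The difference can be checked with dim(). The sketch below also multiplies a by a 4 x 4 identity matrix (the name m is introduced here purely for illustration) to show how R handles a dimensionless vector in a matrix product. Note that when the vector is the left operand, R promotes it to a row vector instead, so that the two operands are conformable.

```r
a <- c(1, 2, 3, 4)
b <- matrix(a, nrow = 1)

dim(a)         # NULL: a plain vector has no dimensions
dim(b)         # 1 4: one row, four columns

# m is a 4 x 4 identity matrix used only for this illustration
m <- diag(4)
dim(m %*% a)   # 4 1: a is treated as a column vector
dim(a %*% m)   # 1 4: on the left, a is treated as a row vector
```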
We typically refer to a matrix as an m x n data structure with m rows and n columns. There are a few different kinds of matrices that we will define, as follows:
matrix(c(1:24), nrow = 2)
Alternatively, we can also specify the number of columns as follows:
matrix(c(1:24), ncol = 12)
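A short check, with variable names (m1, m2, m3) chosen only for this sketch, suggests that the two calls produce the same 2 x 12 matrix. It also shows that matrix() fills values down the columns by default, while the byrow argument switches to filling across rows.

```r
m1 <- matrix(c(1:24), nrow = 2)
m2 <- matrix(c(1:24), ncol = 12)

dim(m1)             # 2 12
identical(m1, m2)   # TRUE: both calls describe the same matrix

# Values fill column by column by default
m1[1, 1:3]          # 1 3 5

# byrow = TRUE fills row by row instead
m3 <- matrix(c(1:24), nrow = 2, byrow = TRUE)
m3[1, 1:3]          # 1 2 3
```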
diag(c(1:3), 3, 3)
diag() function and passing only the dimensions to it as follows:
diag(5)
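The two uses of diag() shown above can be compared side by side. This is a small sketch in which the names d, i5, and m are introduced for illustration; it checks that the first call places the supplied values on the diagonal and that the second yields a matrix that leaves another matrix unchanged under multiplication.

```r
# Diagonal matrix with 1, 2, 3 on the diagonal and zeros elsewhere
d <- diag(c(1:3), 3, 3)
diag(d)              # 1 2 3: called on a matrix, diag() extracts the diagonal
d[1, 2]              # 0

# Passing only the dimension gives a 5 x 5 matrix with ones on the diagonal
i5 <- diag(5)
dim(i5)              # 5 5

# Multiplying by i5 returns the original values
m <- matrix(1:25, nrow = 5)
all(i5 %*% m == m)   # TRUE
```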
c() function, which will coerce all elements to be of the same data type.
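The coercion performed by c() can be seen in a brief sketch such as the following, where the variable names v_num and v_chr are made up for the example. When types are mixed, R converts every element to the most general type present.

```r
# Integer, double, and logical values are all coerced to double
v_num <- c(1L, 2.5, TRUE)
typeof(v_num)   # "double"

# Adding a string coerces every element to character
v_chr <- c(1, "two", TRUE)
typeof(v_chr)   # "character"
v_chr           # "1" "two" "TRUE"
```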