Basic matrix operations

It is important to first point out that computational operations can be done both on whole matrices, termed "matrixwise operations", and on the individual elements of a matrix, termed "element-wise operations". Matrix addition or subtraction is simply a matter of adding or subtracting each value in one matrix to or from the value in the corresponding location in another matrix, so the matrixwise and element-wise versions of these operations are the same. However, matrix multiplication is quite different from the element-wise multiplication of corresponding elements.
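For a quick illustration, here is a minimal sketch (the ex1 and ex2 matrices are defined purely for demonstration) contrasting the two kinds of multiplication:

ex1 <- matrix(1:4, nrow = 2)
ex2 <- matrix(5:8, nrow = 2)
ex1 * ex2   # element-wise: each element multiplied by its counterpart
ex1 %*% ex2 # matrix multiplication: rows of ex1 times columns of ex2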

Let's start by creating a correlation matrix of our physical functioning data and creating an abbreviated raw data matrix:

cor.mat <- matrix(cor(phys.func), ncol = 20)
phys.brief.mat <- as.matrix(phys.func[c(1:30),])

This dataset is categorical rather than continuous and is far from being normally distributed, but as we will discuss in the chapter on the common factor model, this may still be a legitimate way to handle the data.

We will now go over a few basic matrix operations with the two matrices stored as cor.mat and phys.brief.mat, which are the correlation matrix and the first 30 observations in the raw data (stored as a matrix rather than a data frame), respectively. It is worth noting that cor.mat has 20 rows and 20 columns, so it is a square matrix, while phys.brief.mat has 30 rows and 20 columns, making it a rectangular matrix.

Element-wise matrix operations

The topics that will be covered in this section are matrix subtraction, matrix addition, and matrix sweep.

Matrix subtraction

Matrix subtraction is element-wise. We can use a simple operator to transform the values of a matrix by operating on each element individually. For example, we can subtract the same value from each element of the matrix as follows:

phys.brief.mat - 1

Matrix addition

We can add each element of the matrix to the corresponding element of a different matrix:

phys.brief.mat + matrix(rnorm(600), ncol = 20)

We can also divide (or multiply) a matrix by the standard deviation of the entire matrix:

phys.brief.mat / sd(phys.brief.mat)

Matrix sweep

What if we want to divide each column by the column standard deviation and subtract the respective column mean from each individual element? To do this, we either need to do some coding to perform operations on individual columns or use R's built-in functions.

R's sweep function will "sweep out" a value or summary statistic from a matrix. This particular function can be confusing, but in summary, we tell R the matrix we want to do the sweeping on, whether we want to apply the sweep to rows or columns, the statistic or value we want to sweep out, and the operation used to do the sweeping. For example, if we want to sweep out the mean of each column by subtracting the column mean from each element of a matrix, we will do the following:

mean.phys <- apply(phys.brief.mat, 2, mean)
phys.sweep.1 <- sweep(phys.brief.mat, 2, mean.phys, '-')

We can then divide each element in each column by the standard deviation of the column:

sd.phys <- apply(phys.sweep.1, 2, sd)
phys.sweep.2 <- sweep(phys.sweep.1, 2, sd.phys, '/')
# Any variable with a standard deviation of zero will produce NaN values

We actually did things the hard way here. This type of rescaling of matrices is done so often that R actually has a built-in command:

phys.scaled <- scale(phys.brief.mat, center = TRUE, scale = TRUE)

Likewise, column means can be calculated with the built-in function:

colMeans(phys.brief.mat)

We can see that this gives us essentially the same answer as the two-step sweep process we performed previously by doing an element-wise division (a number divided by itself is 1, so the result should be a matrix of ones):

phys.sweep.2 / phys.scaled

Basic matrixwise operations

In the previous section, we reviewed how to apply an operation to each element of a matrix individually, as if each element stood on its own. Here, we will go over operations that involve the matrix as a whole.

Transposition

One of the simplest matrix operations is transposition, which switches columns and rows of a matrix, as follows:

(A^T)_ij = A_ji

This can be easily performed in R using the following code:

t(phys.brief.mat)

Note

The transposition of a matrix A is often denoted as A^T or as A'.

Matrix multiplication

To perform matrix multiplication with two matrices A and B, we multiply each element of each row in matrix A by the corresponding element of each column in matrix B and sum the products, yielding a matrix with a height equal to matrix A's height and a width equal to matrix B's width. For this to work, the number of columns in A must match the number of rows in B. Unlike multiplication of two numbers, matrix multiplication is not commutative. The importance of this is that matrix multiplication allows us to decompose systems of linear equations into matrices.

Suppose we have a system of three equations, such as the following:

2x + 3y +  z = 10
 x -  y + 2z =  5
3x +  y -  z =  2

Then, we could rewrite this as the product of two matrices as follows:

[ 2  3  1 ] [ x ]   [ 10 ]
[ 1 -1  2 ] [ y ] = [  5 ]
[ 3  1 -1 ] [ z ]   [  2 ]

To perform matrix multiplication, we use the special matrix multiplier binary operator %*%. A toy example to demonstrate the non-commutativity of matrix multiplication is as follows:

A <- matrix(c(rep(2, 3), rep(5, 3)), ncol = 2, byrow = FALSE) # 3 x 2
B <- matrix(c(1:16), nrow = 2, byrow = TRUE)                  # 2 x 8
C <- matrix(1, ncol = 2, nrow = 3)                            # 3 x 2
A %*% B    # works: (3 x 2) times (2 x 8)
B %*% A    # error: (2 x 8) times (3 x 2)
C %*% A    # error: (3 x 2) times (3 x 2)
A %*% t(C) # works: (3 x 2) times (2 x 3)

The second and third multiplications should give errors: in B %*% A, B has eight columns while A has only three rows. Matrix C, which is a perfectly valid matrix, cannot be multiplied by matrix A either, because C's two columns do not match A's three rows. This mismatch between columns and rows is reported as non-conformable arguments in R. The final multiplication does work because transposing C yields a 2 x 3 matrix, whose two rows match the two columns of A.
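A simple way to check conformability before multiplying is to compare dimensions with the dim() function; the column count of the left matrix must equal the row count of the right matrix:

dim(A) # 3 2
dim(B) # 2 8
dim(C) # 3 2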

Multiplying square matrices for social networks

Let's say that we have a social networking site where some members follow others (and nobody follows themselves). We can represent this network as a matrix, with a value of one in the position where a person follows another person. For example, in the matrix describing six people in the following figure, we would say that person 1 is followed by persons 2, 3, and 5 because there is a 1 in the second, third, and fifth columns of the first row. We would say that person 2 is followed by persons 1 and 3 and follows everyone else except person 3:

     P1 P2 P3 P4 P5 P6
P1    0  1  1  0  1  0
P2    1  0  1  0  0  0
P3    1  0  0  1  0  1
P4    0  1  0  0  0  1
P5    0  1  0  1  0  0
P6    0  1  0  0  1  0

We can recreate this matrix in R as follows:

small.network <- matrix(c(0, 1, 1, 0, 1, 0,
                          1, 0, 1, 0, 0, 0,
                          1, 0, 0, 1, 0, 1,
                          0, 1, 0, 0, 0, 1,
                          0, 1, 0, 1, 0, 0,
                          0, 1, 0, 0, 1, 0), nrow = 6, byrow = TRUE)

We can take the sum of each row to figure out who is the most influential in this network, because the sum of a row is simply the total number of followers a person has:

apply(small.network, 1, sum)

This suggests that persons 1 and 3 are equally influential, both with total counts of three, and persons 2, 4, 5, and 6 are all equally influential with counts of two. However, we must consider that a person in a social network may also be influential if they have second degree followers (that is, their followers are themselves followed). Conveniently, to determine how many second degree followers a person has, we simply have to multiply the matrix by itself:

> apply((small.network %*% small.network), 1, sum)
[1] 7 6 7 4 4 4

This tells us that person 2 is actually more influential than persons 4, 5, and 6 when second degree relationships are considered. If we simply add the original matrix to the squared matrix, we can get the total number of first or second degree relationships:

apply((small.network %*% small.network + small.network), 1, sum)

Real social networks, of course, have much larger graphs. Just to demonstrate that R can handle much larger matrices, we will use a bigger example with 1,000 people (for which we will simulate some data):

set.seed(51)
social.network.mat <- matrix(sample(c(0,1), 1000000, replace = TRUE, prob = c(0.7, 0.3)), ncol = 1000)
diag(social.network.mat) <- 0 # nobody follows themselves

Tip

When simulating random data, is there a way to make R give the same results each time?

The set.seed function allows the same result to be displayed each time random data is simulated. R will then generate pseudorandom data based on the given seed. If the same seed is used, the same pseudorandom data will be produced.
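As a quick illustration (using rnorm() purely as an example generator), resetting the seed reproduces exactly the same draws:

set.seed(51)
rnorm(3)
set.seed(51)
rnorm(3) # identical to the previous three values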

Next, determine first degree relationships:

influence.1 <- apply(social.network.mat, 1, sum)

Compute the matrix of second degree relationships:

second.degree.mat <- social.network.mat %*% social.network.mat
influence.2 <- apply(second.degree.mat, 1, sum)

Compute the total influence in first or second degree relationships:

influence.1.2 <- apply(social.network.mat + second.degree.mat, 1, sum)

Many online social networks have millions of members, making these graphs much larger, but even on my modest laptop, R handles these computations in seconds. Additionally, to find third degree relationships, we simply have to raise the matrix to the third power.
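As a sketch of that computation, we can reuse second.degree.mat so that only one more multiplication is needed (third.degree.mat and influence.3 are names introduced here for illustration):

third.degree.mat <- second.degree.mat %*% social.network.mat
influence.3 <- apply(third.degree.mat, 1, sum)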

Can we make our R code look more like the matrix formulas that we are likely to encounter in journals and textbooks?

We use the apply command with the sum function, which many R users are likely familiar with, but this is not standard mathematical notation. We can use matrix multiplication to sum the rows as well, which is typically how this operation is expressed in the quantitative literature. Let's take a look at the following example:

influence = (M + M^2) 1, where M is the network matrix and 1 is a column vector of ones

We simply need to create a vector of ones with as many rows as our matrix has columns and post-multiply the matrix by this vector:

(social.network.mat + second.degree.mat) %*% rep(1, 1000)
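As a quick check (the influence.mult name is introduced here just for the comparison), this agrees with the earlier apply() computation:

influence.mult <- (social.network.mat + second.degree.mat) %*% rep(1, 1000)
all.equal(as.vector(influence.mult), influence.1.2) # should be TRUE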

Outer products

In linear algebra, the outer product is classically applied to two vectors and produces a matrix with as many rows as the length of the first vector and as many columns as the length of the second vector. The elements in the matrix are produced by multiplying the corresponding elements in the two vectors together. The outer product is defined as follows:

(x ⊗ y)_ij = x_i * y_j

For example, if we have two vectors, we can compute outer products in R using the outer() command:

x <- c(1:3)
y <- c(4:6)
outer(x, y)

Alternatively, we can also compute the outer product as follows:

x %o% y

However, the outer command is actually much more flexible than this. It does not simply apply the multiplication operator but can apply any vectorized function to the elements of the vectors. This generalized outer product in R is defined as follows:

outer(x, y, FUN)[i, j] = FUN(x[i], y[j])

To apply this, we simply have to pass a function rather than let R default to multiplication:

outer(x, y, FUN = '+')
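Any vectorized binary function will work here; for instance, passing '^' produces a table of powers:

outer(x, y, FUN = '^') # element [i, j] is x[i] raised to the power y[j]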

Using sparse matrices in matrix multiplication

We can use matrix multiplication to transform datasets as follows:

> M <- matrix(rep(1, 9), nrow = 3)
> N <- diag(c(1:3), nrow = 3)
> P <- matrix(rep(c(1:3),3), nrow = 3)
> Q <- matrix(1, nrow = 3)
> M
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1
> N
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    2    0
[3,]    0    0    3
> P
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    2    2
[3,]    3    3    3
> Q
     [,1]
[1,]    1
[2,]    1
[3,]    1

Postmultiplying a matrix M by a diagonal matrix N is equivalent to multiplying the values in each column of M by the corresponding value along the diagonal of N:

M %*% N

Premultiplying a matrix M by a diagonal matrix N is equivalent to multiplying the values in each row of M by the corresponding value along the diagonal of N:

N %*% M

Postmultiplying a matrix by a column vector of 1s yields the sum of each row:

P %*% Q

We can obtain the total score of each person on our physical functioning index using matrix multiplication. First, we will create a matrix of all of our physical functioning observations:

phys.func.mat <- as.matrix(phys.func)

The following code will give us the total score for each person on the entire physical functioning measure:

total.scores <- phys.func.mat %*% matrix(rep(1, 20), nrow = 20)

If we look at the items on this physical functioning measure, they appear to cover different kinds of activities. Some are concerned with mobility, while others are concerned with cognition. Let's say that this measure consists of three different domains composed of the following items (previously published research actually does suggest this):

  • Cognition or social function: This contains items A, Q, R, and S (columns 1, 17, 18, and 19)
  • Lower extremity (leg or mobility) activities: This contains items B, C, D, H, I, J, M, and N (columns 2, 3, 4, 8, 9, 10, 13, and 14)
  • Upper extremity (arm and hand) activities: This contains items E, F, G, K, L, O, P, and T (columns 5, 6, 7, 11, 12, 15, 16, and 20)

We want to create a new matrix with three columns, each of which has the score of a particular domain. To do this, we simply multiply the raw data matrix, phys.func.mat (which has 20 columns), by a design matrix. This design matrix will be a sparse matrix that will have three columns (one for each domain) and 20 rows (one for each item). Each row will contain a 1 in the column corresponding to the domain in which that item belongs:

design.matrix <- matrix(rep(0, 60), nrow = 20)
# Place 1s for the cognitive domain
design.matrix[c(1,17,18,19), 1] <- 1
# Place 1s for the lower extremity domain
design.matrix[c(2,3,4,8,9,10,13,14), 2] <- 1
# Place 1s for the upper extremity domain
design.matrix[c(5, 6, 7, 11, 12, 15, 16, 20), 3] <- 1
total.scores <- phys.func.mat %*% design.matrix
summary(total.scores)

The notion of a design matrix is frequently used in linear algebra to assign group membership.
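As a quick check on this approach (a sketch, assuming phys.func is loaded as in the earlier examples), the first column of total.scores should match the row sums of the four cognition items:

all.equal(total.scores[, 1], rowSums(phys.func.mat[, c(1, 17, 18, 19)])) # should be TRUE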

Matrix inversion

There is no such thing as matrix division, but matrix inversion is something close to it. Only square matrices can be inverted, and even then, not all square matrices have an inverse. When a matrix is multiplied by its inverse, it produces an identity matrix; hence the analogy with division.

Inversion is defined as follows:

A A^-1 = A^-1 A = I, where I is the identity matrix

To invert a matrix in R, we simply use the solve() command (we will see a more general use for this command in the following sections):

solve(cor.mat)
cor.mat %*% solve(cor.mat)
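Due to floating-point arithmetic, the off-diagonal entries of this product will be very small numbers rather than exact zeroes; rounding makes the identity matrix easier to see:

round(cor.mat %*% solve(cor.mat), 12)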

Solving systems of linear equations

We can use R to solve a large system of linear equations with linear algebra. For example, say that we have a system of 10 equations in 10 unknowns. While such a system is solvable by hand, it is quite tedious. Instead, we can solve it in R by decomposing the system of equations into a coefficient matrix C, a vector of unknowns X, and a vector of constants Y, so that:

CX = Y

If we could simply divide Y by C, we would get the matrix of X values. However, since there is no matrix division, we will instead have to rely on multiplication by the inverse of C:

X = C^-1 Y

The coefficients to be placed in matrix C are available in the coefficients_matrix.csv file, in which columns 2 to 11 give the coefficients and column 12 gives the Y values:

Y <- as.matrix(read.csv('coefficients_matrix.csv')[, 12])
C <- as.matrix(read.csv('coefficients_matrix.csv')[, c(2:11)])
X <- solve(C) %*% Y
X

In fact, R automates this process for us when we pass two arguments to the solve command:

solve(C, Y)
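As a quick check (assuming the C and Y matrices from the preceding example), the two approaches give the same answer up to floating-point error:

max(abs(solve(C, Y) - solve(C) %*% Y)) # a value very close to zero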

For those who wish to experiment, we can simulate coefficients for very large systems of equations with many (for example, 1,000) unknowns. As can be seen, R can do these computations extremely quickly:

C.2 <- matrix(sample(c(1:100), 1000000, replace = TRUE), nrow = 1000)
Y.2 <- matrix(sample(c(1:1000), 1000, replace = TRUE), nrow = 1000)
solve(C.2, Y.2)

Determinants

Only square matrices have a determinant. The determinant can be used to test whether the vectors in the matrix are linearly independent of one another, which means that no vector can be written as a combination of the other vectors. If the vectors are linearly dependent, then there are two important consequences:

  • There is redundancy of information
  • The matrix cannot be inverted

The determinant of a matrix in R can be found with the det() function:

det(cor.mat)

The fact that the determinant is not equal to zero implies that the 20 vectors in the cor.mat matrix are linearly independent (and therefore this matrix can be inverted).
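To see the connection, here is a minimal sketch with a deliberately dependent matrix (dep.mat is a name introduced purely for illustration); its second column is twice its first, so the determinant is zero and the matrix cannot be inverted:

dep.mat <- matrix(c(1, 2, 3, 2, 4, 6, 1, 0, 1), ncol = 3) # column 2 = 2 * column 1
det(dep.mat)   # 0, up to floating-point error
# solve(dep.mat) would produce an error because the matrix is singular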
