Applications

We have shown some uses of linear algebra in this chapter, but for the most part, linear algebra techniques derive their practical value from the ways in which they are combined, both with each other and with other functions. This section is therefore included to show how linear algebra and matrix operations can be applied to substantive problems.

Rasch analysis using linear algebra and a paired comparisons matrix

In recent decades, a statistical technique termed "Rasch analysis" has become increasingly popular in psychology, education, and healthcare. It is attributed to the Danish statistician Georg Rasch, who published it in the 1960s. The basic idea is that a scale (for example, an academic test, a measure of a psychological trait, or a physical functioning measure) represents a one-dimensional latent trait, which can be represented as a number line with an arbitrary zero point. We can plot individuals along this line based on their ability: individuals with higher abilities on the latent trait are plotted at higher values. We can also plot items on this line ordered by difficulty, with more difficult items placed at higher-valued locations on the line. A formula based on logistic regression tells us the probability that a person with a given ability will give an affirmative answer (or a correct answer) to an item with a given difficulty.

The probability of a person with a particular ability giving an affirmative response is given as follows:

$P = \dfrac{e^{\theta - b}}{1 + e^{\theta - b}}$

In this formula, P is the probability of an affirmative response to the item, e is the base of the natural logarithm, $\theta$ is the person's ability, and b is the item's difficulty.

The difficulty of an item is equal to the ability level of a person who has a 50 percent chance of giving an affirmative answer to the item. Both the person's ability $\theta$ and the item's difficulty b are not directly observed; they must be estimated from the responses obtained on the scale.
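To make the formula concrete, the following is a minimal sketch of the Rasch probability in R (the function name rasch.prob is ours, not part of any package):

# Probability of an affirmative response given ability theta and item difficulty b
rasch.prob <- function(theta, b) {
  exp(theta - b) / (1 + exp(theta - b))
}

rasch.prob(theta = 1, b = 1)   # ability equal to difficulty gives 0.5
rasch.prob(theta = 2, b = 1)   # higher ability gives a higher probability (about 0.73)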

A tremendous amount has been written about Rasch analysis in both textbooks and psychometrics journals, but generally speaking, the question that people wish to answer with it is what the difficulty of each item is.

We will transform our physical functioning data into binary data to indicate some difficulty versus no difficulty and apply Rasch analysis to figure out which items are the most difficult. We will perform Rasch analysis on the lower extremity domain only.

We will begin by subsetting the columns of our physical functioning matrix to create a new matrix with just the lower extremity items:

lower.extremity.mat <- phys.func.mat[,c(2,3,4,8,9,10,13,14)]

We will then convert this matrix of ordinal data to a matrix of binary data as follows:

lower.extremity.binary <- replace(lower.extremity.mat, which(lower.extremity.mat %in% c(2:5)), 0)

In lower.extremity.binary, individuals who have no difficulty doing a task are given a value of 1 and individuals who have difficulty are given a value of 0.

Note

In the original data, higher scores meant more difficulty, but in this binary data frame, we have coded the data to be the opposite for the purpose of easily interpreting the results.

Thus, the latent trait that we are interested in now is the ability to perform tasks using the lower extremities without difficulty.
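As a quick sanity check of the recoding (a minimal sketch), we can confirm what values the binary matrix now contains:

all(lower.extremity.binary %in% c(0, 1))   # ideally TRUE; any other values would need further recoding
table(lower.extremity.binary)              # counts of 0s and 1s across all items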

There are many ways to statistically estimate the item difficulties in the Rasch model, but the method that we will use here is based on a paired comparisons matrix. The overall scheme is as follows:

  1. For each pair of items, count the number of individuals who can perform one item without difficulty but not the other. Populate a matrix R of raw comparison data:
    $R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{18} \\ r_{21} & r_{22} & \cdots & r_{28} \\ \vdots & \vdots & \ddots & \vdots \\ r_{81} & r_{82} & \cdots & r_{88} \end{bmatrix}$

    In the preceding matrix, $r_{ij}$ is the number of people who score a 1 on item i and a 0 on item j.

    The R code to create the paired comparisons matrix is as follows:

    create.paired.comparisons <- function (input.matrix) {
      n.items <- ncol(input.matrix)
      output.matrix <- matrix(0, nrow = n.items, ncol = n.items)
      for (i in 1:n.items) {
        for (j in 1:n.items) {
          output.matrix[i, j] <- length(which(input.matrix[,i] - input.matrix[,j] > 0))
        }
      }
      return(output.matrix)
    }
    R <- create.paired.comparisons(lower.extremity.binary)
    R
  2. Create a matrix D as follows:
    $d_{ij} = \dfrac{r_{ji}}{r_{ij}}$
  3. The corresponding R code is as follows (the diagonal elements of D are undefined, so we set them to 1):
    D <- t(R) / R
    diag(D) <- rep(1, 8)
    D
  4. Take the natural logarithm of each element in D:
    ln.D <- log(D, exp(1))
  5. Find the mean of each row as follows:
    (ln.D %*% matrix(rep(1, 8), nrow = 8)) / 8
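To make the output easier to read, we can attach the item letters to the row means; this is a minimal sketch, and the mapping of columns to item letters is our assumption based on the columns selected earlier:

item.locations <- as.vector((ln.D %*% matrix(rep(1, 8), nrow = 8)) / 8)
# Assumed mapping: columns 2, 3, 4, 8, 9, 10, 13, and 14 of phys.func.mat
# correspond to items B, C, D, H, I, J, M, and N
names(item.locations) <- c("B", "C", "D", "H", "I", "J", "M", "N")
round(item.locations, 3)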

The final command shows us that the locations for each item are as follows:

No.   Item                               Location
B     Walking for a quarter mile            0.854
C     Walking up 10 steps                  -0.169
D     Stooping, crouching, kneeling         2.076
H     Walking from one room to another     -3.561
I     Standing up from a chair             -0.440
J     Getting in or out of bed             -0.761
M     Standing for 2 hours                  1.936
N     Sitting for 2 hours                   0.065

These results suggest that the easiest mobility item on this scale is item H, and the most difficult is item D. Thus, if a person has only a little difficulty with the use of their legs (due to arthritis or other impairments), they might report problems with stooping, crouching, or kneeling, but if they report problems with simply walking between rooms, we would expect that they have a high degree of impairment and are likely to have difficulty with every other item on this scale.

There are a number of other methods to estimate the parameters of the Rasch model, with implementations in various R packages. The eRm package is one of the numerous packages available; it offers a full suite of functions for both fitting the Rasch model and testing model fit. The answers obtained in the previous example can be compared to those obtained from the RM() command in the eRm package:

library(eRm)
RM(lower.extremity.binary)

The units in which these difficulties are expressed are often called "logits". They have no meaning outside of the scale to which they are being applied, and the zero point is arbitrary, so it is perfectly acceptable to add or subtract any constant to them (but not to multiply or divide them).
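For instance, the whole set of difficulties can be shifted by a constant (here, centered at zero) without changing any of the differences between items; this is a minimal sketch reusing the item.locations vector created earlier:

# Shifting every difficulty by the same constant preserves the differences between items
shifted <- item.locations - mean(item.locations)
shifted["D"] - shifted["H"]   # identical to item.locations["D"] - item.locations["H"]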

Calculating Cronbach's alpha

Earlier, we went over Rasch analysis, which is part of a psychometric approach termed item response theory (IRT). However, long before there was IRT, there was classical test theory (CTT), which is concerned with estimating or measuring "true scores". The underlying idea of CTT is that, on a test, Observed Score = True Score + Error.

Therefore, much of classical test theory is concerned with determining how much of a person's observed score is due to the true score. One important metric for this that is still in heavy use today (in spite of arguments that it has outlived its usefulness) is Cronbach's coefficient alpha. The basic idea underlying internal consistency reliability (and CTT in general) is that the score on a subgroup of items should be correlated with the score on the rest of the test. In the case of alpha, the subgroups of items examined are split halves, such that alpha is equal to the average split-half reliability over all possible splits of the items. Conveniently, an algebraic formula for alpha has been developed so that analysts do not actually have to compute all the split-half correlations and average them.

The formula for Cronbach's alpha is:

$\alpha = \dfrac{k}{k-1}\left(1 - \dfrac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_T^2}\right)$

In the preceding formula, $\alpha$ is Cronbach's alpha, k is the number of items, $\sigma_i^2$ is the variance of item i, and $\sigma_T^2$ is the variance of the total test scores.

Here, we will use matrix operations in R to compute alpha for each domain of the NHANES physical functioning measure.

  1. First, compute the total score for each domain for each person as follows:
    domain.totals <- phys.func.mat %*% design.matrix

    The total score variance for any domain is the diagonal of the covariance matrix of total scores:

    tot.score.var <- diag(cov(domain.totals))
  2. We then find the variance of each item and convert these variances into a row vector:
    item.var <- diag(cov(phys.func.mat))
    item.var <- matrix(item.var, ncol = 20)
  3. We then post-multiply this row vector by our design matrix to obtain the sum of the item variances for each domain:
    item.var.tot <- item.var %*% design.matrix
  4. We can calculate the number of items in each domain using our design matrix:
    n <- matrix(1, ncol = 20) %*% design.matrix
  5. Finally, we put everything in the formula for alpha:
    alpha <- (n / (n-1)) * (1 - (item.var.tot / tot.score.var))
    alpha

This is the raw alpha for each domain. The standardized alpha can be obtained by performing these operations on correlations rather than covariances.
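As a minimal sketch, here is the standardized alpha for a single domain, computed directly from the correlation matrix of the same lower extremity columns subset earlier (the column indices are the same assumption as before; the names R.cor, k, and alpha.std are ours):

# With a correlation matrix, every item "variance" is 1, so the item variances sum to k
R.cor <- cor(phys.func.mat[, c(2, 3, 4, 8, 9, 10, 13, 14)])
k <- ncol(R.cor)
alpha.std <- (k / (k - 1)) * (1 - k / sum(R.cor))
alpha.std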

Image compression using the discrete cosine transform

Earlier, we saw how SVD can be used to compress a matrix that represents data. Here, we will use another decomposition method to break an image into spectral features, which we can use for image compression. We will do this with the discrete cosine transform (DCT) of a square matrix. The DCT is one of a variety of transformations that convert a signal expressed in terms of location into one expressed in terms of frequency. In the case of the DCT, this is done by decomposing a set of discrete values into a series of cosine functions.

Importing an image into R

Before we can start working with an image, we have to import the image, which we will do using the png package. We will import the picture as an array and examine its dimensions as follows:

> library(png)
> picture <- readPNG('landscape.png')
> dim(picture)
[1] 1536 2048    3

The first two dimensions give the pixel dimensions of the image, and the third dimension gives the number of color layers. This array has two important features. The first is that the third dimension is three, which indicates that even though this is a picture of a landscape that has been converted to shades of gray, it is still stored as a three-color PNG file with a red, a green, and a blue layer; however, each color layer has identical values in this array. The second notable feature is that the pixel dimensions of the image are divisible by eight. This is important because we will base our image compression example in the following section on the JPEG standard, which decomposes an image into 8*8-pixel blocks; image padding would be needed if either pixel dimension were not divisible by eight.
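Both of these features are easy to verify directly; a minimal sketch:

# The three color layers should be identical for a grayscale photograph stored as RGB
all(picture[, , 1] == picture[, , 2]) && all(picture[, , 2] == picture[, , 3])
# Both pixel dimensions should be divisible by eight (0 0 means no padding is needed)
dim(picture)[1:2] %% 8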

Here, we create an array composed of just the first color layer and then use R's image command to view the array as a graphic. We transpose the array and reverse the counting of rows to place the image in the correct orientation. We also tell it to plot intensity values using grayscale rather than color, as shown in the following code:

picture.1 <- picture[,,1]
image(t(picture.1)[,nrow(picture.1):1], col = gray(seq(0,1, length.out = 256)))

We can see the following screenshot imported into R plotted on a graphics device:

[Figure: the grayscale landscape image displayed on the R graphics device]

The compression technique

We will start with describing the basic image compression technique and show the corresponding R code further. The following are the basic steps in our image compression:

  1. Break the image into blocks of 64 (8*8) pixels, which in our case means 8*8 submatrices or subarrays.
  2. Convert each 8*8 block of pixel values into an 8*8 block of frequencies (that is, coefficients for the cosine terms). This is done using a transformation matrix T, which we will see shortly.
  3. Discard the frequencies of least importance in each 8*8 block and keep only those of the most importance, which we do with a quantization matrix Q.

Decompression is simply a reversal of the first two steps (we can't undiscard lost data). In the end, we will lose some of the data but hopefully not enough to be visually noticeable, which is similar to what we did with SVD previously.

Creating the transformation and quantization matrices

To perform the DCT, we will first need to compute a transformation matrix T, which we will later use in the actual image transformation. More detailed treatment of the underlying theory can be found in many signal processing textbooks, but here we will skip ahead to the needed formulas. We will assume that T is a square matrix with N rows and columns, indexed by i and j respectively, with the numbering of i and j starting at 0. The formula is shown as follows:

$T_{ij} = \begin{cases} \dfrac{1}{\sqrt{N}} & \text{if } i = 0 \\ \sqrt{\dfrac{2}{N}} \cos\left(\dfrac{(2j+1)\,i\,\pi}{2N}\right) & \text{if } i > 0 \end{cases}$

The quantization matrix Q will be based on a predeveloped quantization table, which in this case is the one provided by the Independent JPEG Group. This is the most commonly used table, but others are available, and some companies have patented methods for creating quantization tables ad hoc based on the image being compressed. The basic quantization table is shown as follows:

16    11    10    16    24    40    51    60
12    12    14    19    26    58    60    55
14    13    16    24    40    57    69    56
14    17    22    29    51    87    80    62
18    22    37    59    68   109   103    77
24    35    55    64    81   104   113    92
49    64    78    87   103   121   120   101
72    92    95    98   112   100   103    99

A quality level between 0 and 100 can be chosen; the matrix shown in the previous table produces compression at a quality level of 50. Quantization matrices for other quality levels can be created as scalar multiples of this table, with higher quality levels corresponding to higher image quality. The table has smaller values in the top-left corner and larger values towards the bottom-right corner; as we will see in a moment, this is an important part of the compression.

Putting the matrices together for image compression

To transform the original 8*8 matrix R to the discrete cosine transformed matrix D, we simply use matrix multiplication:

D = TRT'

Here, T' is the transpose of T.

The compressed matrix C is created by the element-wise division of D by Q, with each result rounded to the nearest integer:

$C_{i,j} = \mathrm{round}\left(\dfrac{D_{i,j}}{Q_{i,j}}\right)$

The most important frequencies in the matrix tend to congregate towards the top-left corner of D. Since Q has small values in the top-left corner and large values in the bottom-right corner, the division followed by rounding will mean that those values in the top-left corner will be retained as integers while those in the bottom-right corner will be rounded to zero and discarded.

To reverse the quantization of the image, we perform element-wise multiplication of C by Q:

$D_{i,j} = C_{i,j} \times Q_{i,j}$

Since we rounded some of the values of C in the compression process, this D matrix will be slightly different than the D matrix we started with.

We then perform the opposite matrix multiplication to get R:

R = T'DT

DCT in R

We will now implement the procedures described earlier in R code. The first thing we will do is convert our array of pixel values, which range from 0 to 1, into an array of integer pixel values ranging from -128 to 127, as shown in the following code:

picture.1.256 <- round(picture.1 * 255 - 128)

Our compression method relies on rounding to integer values, so some type of transformation to integer values is crucial.
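A quick check (a minimal sketch) that the conversion produced integers in the expected range:

range(picture.1.256)                         # expect values between -128 and 127
all(picture.1.256 == round(picture.1.256))   # TRUE: every entry is an integer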

Then, we will write a function that can calculate the values of the matrix T, for any desired size:

create.dct.matrix <- function(n) {
  output.matrix <- matrix(0, nrow = n, ncol = n)
  for (i in 1:n) {
    for (j in 1:n) {
      if (i == 1) {output.matrix[i,j] <- 1/sqrt(n)}
      if (i > 1) {
        output.matrix[i,j] <- 
         sqrt(2 / n) * cos( (2*(j-1) +1)*(i-1)*pi / (2* n))
      }
    }
  }
  return(output.matrix)
}

The formula to compute the elements of the transformation matrix assumes that matrix rows and columns are numbered starting from 0, but R numbering starts at 1, which we must account for in our code.
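Because T is an orthogonal matrix, a quick way to sanity check the function is to confirm that T multiplied by its transpose is, up to floating-point error, the identity matrix; a minimal sketch (the name T.check is ours):

T.check <- create.dct.matrix(8)
max(abs(T.check %*% t(T.check) - diag(8)))   # should be very close to 0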

We will simply define the basic quantization matrix as follows:

quant.matrix <- matrix(
  c(
  16,11,10,16,24,40,51,60,
  12,12,14,19,26,58,60,55,
  14,13,16,24,40,57,69,56,
  14,17,22,29,51,87,80,62,
  18,22,37,59,68,109,103,77,
  24,35,55,64,81,104,113,92,
  49,64,78,87,103,121,120,101,
  72,92,95,98,112,100,103,99
  ),
  byrow = TRUE,
  nrow = 8, ncol = 8
)

We will then set a compression ratio corresponding to the desired quality as follows:

compression.ratio <- 50

Now, we will compute the Q matrix as a scalar multiple of the basic quantization table based on our desired compression ratio, and we will create a matrix T for n equal to 8, as shown:

Q <- round(quant.matrix * (100-compression.ratio)/50)
T <- create.dct.matrix(8)

Finally, we will create the function that breaks the original image into 8*8 subimages, performs the DCT, and does the compression:

dct.compress <- function(input.matrix) {
  input.row <- nrow(input.matrix)
  input.col <- ncol(input.matrix)
  output.matrix <- matrix(0, nrow = input.row, ncol = input.col)
  working.row <- c(1:8)
  while (max(working.row) <= input.row) {
    working.col  <- c(1:8)
    while (max(working.col) <= input.col) {
      # Transform the block, divide element-wise by Q, and round; the rounding
      # is where the less important frequencies are discarded
      output.matrix[working.row, working.col] <-
      round((T %*% input.matrix[working.row, working.col] %*% t(T)) / Q)
      working.col <- working.col + 8
    }
    working.row <- working.row + 8
  }
  return(output.matrix)
}

Now, let's actually compress our image using the following function:

picture.compressed <- dct.compress(picture.1.256)
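Before plotting, we can inspect the effect of quantization directly; a minimal sketch looking at a single 8*8 block and at the overall share of zero coefficients:

picture.compressed[1:8, 1:8]    # entries away from the top-left corner are mostly zero
mean(picture.compressed == 0)   # overall fraction of coefficients rounded to zero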

This is the transformed and compressed image (if appropriately encoded, the abundance of zeros will require less storage space). If we want to see what the image looks like, we can plot it using the following code:

image(t(picture.compressed)[,nrow(picture.compressed):1], col = gray(seq(1,0, length.out = 2)))
[Figure: the quantized DCT coefficients plotted as a two-level image]

Now, this compression technique is useless if we can't recreate the original image, so we will create a function to reverse what our compression function did:

decompress.image <- function(input.matrix) {
  input.row <- nrow(input.matrix)
  input.col <- ncol(input.matrix)
  output.matrix <- matrix(0, nrow = input.row, ncol = input.col)
  working.row <- c(1:8)
  while (max(working.row) <= input.row) {
    working.col <- c(1:8)
    while (max(working.col) <= input.col) {
      output.matrix[working.row, working.col] <- 
      (t(T) %*% (Q * input.matrix[working.row, working.col]) %*% T)
      working.col <- working.col + 8
    }
    working.row <- working.row + 8
  }
  return(output.matrix)
}

Let's now apply our decompression function and plot the result to see how faithfully our original image was recreated:

picture.decompressed <- decompress.image(picture.compressed)
image(t(picture.decompressed)[,nrow(picture.decompressed):1], col = gray(seq(0,1, length.out = 256)))

The result of the function is as shown in the following figure:

[Figure: the decompressed image, which closely resembles the original]

We see that the original image is pretty well preserved. It may be interesting to try compression ratio values lower than 50 to see how well the original image is preserved as more and more information is discarded.
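For example, the following is a minimal sketch of a more aggressive setting that reuses the functions defined above (note that dct.compress and decompress.image read Q from the global environment, so Q must be recomputed first; the .20 object names are ours):

compression.ratio <- 20
Q <- round(quant.matrix * (100 - compression.ratio) / 50)
picture.compressed.20 <- dct.compress(picture.1.256)
picture.decompressed.20 <- decompress.image(picture.compressed.20)
image(t(picture.decompressed.20)[, nrow(picture.decompressed.20):1],
      col = gray(seq(0, 1, length.out = 256)))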
