Exploratory factor analysis and reflective constructs

The PCA method that we have discussed so far models all of the variance of the variables to which it is applied. An alternative approach, which is often confused with PCA, is to model only the common variance: an approach called factor analysis (FA). In this chapter, we will discuss exploratory factor analysis (EFA).

Familiarizing yourself with the basic terms

The following are the basic terms that you need to be aware of:

Latent trait or common factor: This is an unobserved variable that explains some or all of the variance in observed variables.
Path coefficient: This is the correlation coefficient between a latent trait and an observed variable.
Communality: This refers to the square of a path coefficient in a single factor model.
Uniqueness: Computationally, this is simply one minus the communality of an observed variable. (If a covariance matrix is used, it is equal to the variance of the variable minus its communality.)
Observed: This is used to describe matrices and values that are obtained by direct measure or calculations on directly observed values.
Implied: This is used to describe matrices and values that are not observed but estimated to be consistent with other values.
Orthogonal factor structure: We have already seen that PCA treats all of the original variables as being plotted on dimensions at 90 degrees to one another (that is, uncorrelated dimensions) and attempts to account for correlations that really do exist by rotating the coordinate space. Factor analysis that treats the factors as being represented by coordinates at 90 degrees to one another yields an orthogonal factor solution.
Oblique factor structure: This refers to a factor analysis solution in which the factors can be thought of as axes that are not perpendicular to one another and are actually correlated.

Matrices of interest

The following is a list of the matrices that you should familiarize yourself with:

Reduced correlation matrix: This is represented here as R_r; this is the correlation matrix of observed variables with the "1s" along the diagonal replaced by the communalities corresponding to the observed variables.
Implied correlation matrix: This is represented here as R_imp; this is the correlation matrix implied by the factor analysis solution.
Residual correlation matrix: This is denoted as R_resid. This is the matrix obtained when the implied correlation matrix is subtracted from the observed correlation matrix. The diagonal of this matrix contains the uniqueness values.
Factor pattern matrix: This is represented as P. This is a matrix in which factors are represented as columns and observed variables as rows. There should be more observed variables than factors, so this matrix should be taller than it is wide. Each element in the matrix is the path coefficient corresponding to the respective factor and observed variable.
Factor correlation matrix: This is represented as F. This is the correlation matrix of the factors with each other. In the case of a single factor model, this is simply a single value "1". In the case of an orthogonal factor structure, this is a diagonal matrix of "1s".
Uniqueness matrix: This is a diagonal matrix, represented as U, with individual uniqueness values of each observed variable.

Expressing factor analysis in a matrix model

In matrix terms, we can relate the previously mentioned matrices as follows (P' is the transpose of P):

R_r = PFP'

R_imp = PFP' + UU

Arriving at a factor analysis solution is then a matter of solving for the elements of these matrices.
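As a small numerical sketch of these two equations, the following R code builds the matrices for a hypothetical single-factor model with four observed variables. The loadings are made up for illustration, and the code assumes that the diagonal of U holds the square roots of the uniqueness values, so that the product UU places the uniquenesses themselves on the diagonal.

```r
# Hypothetical path coefficients for four observed variables on one factor
P <- matrix(c(0.8, 0.7, 0.6, 0.5), ncol = 1)
# Single-factor model: the factor correlation matrix is just 1
F.mat <- matrix(1)

# Reduced correlation matrix: communalities on the diagonal
R.r <- P %*% F.mat %*% t(P)

# Uniqueness = 1 - communality; U holds the square roots (assumption)
uniqueness <- 1 - diag(R.r)
U <- diag(sqrt(uniqueness))

# Implied correlation matrix: the diagonal is restored to 1
R.imp <- R.r + U %*% U
round(R.imp, 2)
```

The off-diagonal elements of R.imp and R.r are identical; only the diagonal differs.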

Basic EFA and concepts of covariance algebra

EFA assumes that the observed variables can be explained by some unobserved variables, also called latent variables, which are statistically modeled as a source of common variance.

The following diagram depicts a Trait that is a common source of variance for five observed variables, A through E. In this diagram, the arrows represent the path coefficients, that is, the correlations between the latent trait and each observed variable:

[Figure: a latent Trait with arrows pointing to the observed variables A through E]

For standardized variables, the rules of covariance algebra give us formulas such as the following:

cov(A, B) = cov(A, Trait) × cov(B, Trait)

cov(A, C) = cov(A, Trait) × cov(C, Trait)

Based on these rules, EFA tries to estimate the path coefficients using some type of statistical estimation method that achieves the best fit, since we are not able to directly calculate these correlations.
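The covariance algebra rule can be checked by simulation. The sketch below generates two standardized observed variables from a single latent trait, using made-up path coefficients (0.8 and 0.6 are assumptions for illustration), and compares their observed correlation with the product of the path coefficients.

```r
set.seed(42)
n <- 20000
a <- 0.8  # hypothetical path coefficient from Trait to A
b <- 0.6  # hypothetical path coefficient from Trait to B

trait <- rnorm(n)
# Each observed variable = loading * trait + unique part, scaled to unit variance
A <- a * trait + sqrt(1 - a^2) * rnorm(n)
B <- b * trait + sqrt(1 - b^2) * rnorm(n)

observed <- cor(A, B)
implied <- a * b
c(observed = observed, implied = implied)
```

With a large sample, the observed correlation should be close to the implied value, differing only by sampling error.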

Concepts of EFA estimation

To understand the basic idea of how this can be done for a single factor and introduce some concepts, we will start with a very old method of factor estimation, known as the centroid method. This method has largely been supplanted by new computerized methods but serves well to demonstrate the basic idea of factor analysis.

For the single factor model, we will denote each path coefficient by the lowercase version of the capital letter representing the observed variable (for example, cor(A, Trait) = a). Let's assume that we can place the products of all path coefficients with each other into a square matrix, which is called the reduced correlation matrix R_r:
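For the five observed variables A through E, this matrix can be written out as follows. This is a reconstruction from the description above, with the communalities on the diagonal:

```latex
R_r =
\begin{bmatrix}
a^2 & ab  & ac  & ad  & ae  \\
ab  & b^2 & bc  & bd  & be  \\
ac  & bc  & c^2 & cd  & ce  \\
ad  & bd  & cd  & d^2 & de  \\
ae  & be  & ce  & de  & e^2
\end{bmatrix}
```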

We do not know the values of the path coefficients a, b, c, d, or e. (After all, it is these values that we are trying to estimate.) However, because of the rules of covariance algebra, we know:

ab = cor(A,B)

Thus, while we do not know the values of any of the path coefficients, we are able to estimate the products of these coefficients simply by calculating the correlations between the corresponding observed variables. It is only the communalities along the diagonal that we are unable to calculate, and for these we must come up with some initial estimates.

There are a number of possible methods to generate initial communality estimates:

Just use "1s".
Use the highest value off of the diagonal in the same row (or column).
Use squared multiple correlations (SMCs). These are found by inverting the correlation matrix and subtracting the reciprocals of the values along the diagonal from one. This is the most computationally intensive of the three methods, but generally the best of the three as well.
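The three options can be compared on a small example. The correlation matrix below is hypothetical, and the second method is read here as the largest absolute off-diagonal value in each row, which is an assumption about how negative correlations would be handled.

```r
# Hypothetical 3 x 3 correlation matrix
R <- matrix(c(1.0, 0.5, 0.4,
              0.5, 1.0, 0.3,
              0.4, 0.3, 1.0), nrow = 3, byrow = TRUE)

# Method 1: just use 1s
h2.ones <- rep(1, nrow(R))

# Method 2: largest absolute off-diagonal value in each row
off.diag <- abs(R)
diag(off.diag) <- 0
h2.max <- apply(off.diag, 1, max)

# Method 3: squared multiple correlations, 1 - 1/diag(R^-1)
h2.smc <- 1 - 1 / diag(solve(R))

cbind(h2.ones, h2.max, h2.smc)
```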

The greater the number of variables involved, the lesser the importance of the initial estimates of the communalities. This is because when there are more observed variables, the size of the matrix increases dramatically, and the larger a matrix is, the smaller the proportion of elements that fall on the diagonal.

The centroid method

To give the mathematical background of the centroid method, we start with the reduced correlation matrix. To emphasize that the diagonal has merely initial estimates of communalities, we use the 0 subscript, as shown in the following matrix:

We then sum each row (or each column, since this is a symmetric matrix) to get the following matrix. Here we achieve the sum of rows by post-multiplying the reduced correlation matrix by a column matrix of 1s as shown in the following matrix:

We then sum all elements of this matrix of row sums to get the sum of all elements in the reduced correlation matrix, and we take the square root of the total as shown in the following formula:

We then divide the sum of each row by the square root of the total to create path coefficient estimates:
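These steps can be summarized compactly. The symbols s (the column of row sums), T (the total), and p-hat (the path coefficient estimates) are introduced here for illustration only. The second line shows why the procedure works when the reduced correlation matrix is exactly the product of the path coefficients and their sum is positive:

```latex
\mathbf{s} = R_{r_0}\,\mathbf{1}, \qquad
T = \mathbf{1}'\,R_{r_0}\,\mathbf{1} = \sum_i s_i, \qquad
\hat{p}_i = \frac{s_i}{\sqrt{T}}

\text{If } R_r = \mathbf{p}\,\mathbf{p}' \text{ then }
s_i = p_i \sum_j p_j, \quad
T = \Big(\sum_j p_j\Big)^{2}, \quad
\frac{s_i}{\sqrt{T}} = p_i
```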

If we wish to obtain better path coefficient estimates, we repeat this procedure multiple times by substituting the squared path coefficient estimates for the initial communality estimates.

Since this is a single factor, the matrix F is simply a 1. Using these solutions, we can now solve for U:

R_imp - PFP' = UU

And for the residual correlation matrix:

R_obs - R_imp = R_resid

Now, we will execute the same steps using R. We start with a single factor model from the physical functioning dataset, and pick just the items that we think are related to the leg function:

le.matrix <- as.matrix(phys.func[,c(2,3,4,8,9,10,13,14)])

When coming up with a numerical solution, we need initial estimates for the communalities. For simplicity, we will use "1" as the initial estimate of each communality, which gives the simplest possible reduced correlation matrix (that is, we use the observed correlation matrix as the reduced correlation matrix), as in the following code:

le.cor <- cor(le.matrix)
le.cor.reduced <- le.cor

Following this step, we sum the rows (or the columns), and from these row sums, we create a sum of all values in the matrix. Path coefficient estimates are equal to the row sums divided by the square root of the total sum of all values in the matrix:

  
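# Row sums, obtained by post-multiplying by a column of 1s
row.sums <- le.cor.reduced %*% matrix(rep(1, 8), nrow = 8)
# Sum of all elements in the reduced correlation matrix
total.sum <- sum(row.sums)
sqrt.total <- sqrt(total.sum)

# Path coefficient estimates
row.sums / sqrt.total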

Now, if we recall, our initial communality estimates were just "1". The communalities are simply the path coefficients squared. Therefore, if we like, we can use our solutions for path estimates to create a new reduced correlation matrix, and submit this new reduced correlation matrix to the centroid method all over again. We could do this repeatedly until the path coefficient estimates change minimally with each additional iteration.
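A minimal sketch of this iteration is shown below. Because the physical functioning dataset is not reproduced here, it uses a hypothetical correlation matrix generated from known path coefficients; the function name centroid.iterate and its stopping rule (maximum change below a tolerance) are assumptions made for illustration.

```r
# Iterated centroid method for a single factor (illustrative sketch)
centroid.iterate <- function(cor.mat, max.iter = 100, tol = 1e-6) {
  reduced <- cor.mat               # initial communalities are the 1s on the diagonal
  path <- rep(0, nrow(cor.mat))
  for (i in seq_len(max.iter)) {
    row.sums <- rowSums(reduced)
    new.path <- row.sums / sqrt(sum(row.sums))
    change <- max(abs(new.path - path))
    path <- new.path
    if (change < tol) break
    diag(reduced) <- path^2        # squared paths become the new communalities
  }
  path
}

# Hypothetical correlation matrix generated by known path coefficients
true.path <- c(0.8, 0.7, 0.6, 0.5)
R <- true.path %*% t(true.path)
diag(R) <- 1

estimated.path <- centroid.iterate(R)
round(estimated.path, 3)
```

In this idealized case, the estimates settle close to the path coefficients used to build the matrix; with real data, the correlations will not fit a single factor exactly.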

An important limitation of the centroid method is that it assumes that all observed variables are correlated in the same direction (for example, all positive) with the latent trait.

Multiple factors

Earlier in this chapter, we went through the basic idea of estimating the path coefficients for a single common factor, but EFA is typically used to model not a single factor but multiple underlying factors, as depicted in the following figure:

Here we see that there are two traits and six observed variables, with both unobserved traits assumed to have some (potentially almost zero) correlation with each of the observed variables and with each other (arrows not shown for this). Here we will use a small letter followed by a number to indicate the path. For example, a₁ is the path coefficient from Trait-1 to A.

The centroid method described earlier only extracts a single factor. If we want to extract multiple factors, we have to do this one at a time, subjecting residual matrices to the centroid method repeatedly. Since this is not a typically used method with modern computers, we will not go through this tedious exercise, but rather skip to a commonly used method that can extract multiple factors.

Direct factor extraction by principal axis factoring

Rather than iteratively factoring out residual matrices, we can directly extract the desired number of factors using principal axis factoring (PAF).

We start with our reduced correlation matrix. We then perform an eigenvalue decomposition of the reduced correlation matrix, yielding eigenvalues, L, rank ordered from large to small, and a matrix of corresponding eigenvectors, V. If we wish to compute a two-factor solution, we post-multiply the matrix of the first two eigenvectors by a diagonal matrix containing the square roots of the first two eigenvalues.

Performing principal axis factoring in R

We will now go through the basic steps of how PAF can be performed in R. We will be demonstrating this with multiple factors, so we will use the full physical functioning dataset. First, we obtain the correlation matrix using the following code:

phys.cor <- cor(phys.func)

Then, we create a reduced correlation matrix using squared multiple correlations:

reduce.cor.mat <- function(cor.mat) {
  inverted.cor.mat <- solve(cor.mat)
  reduced.cor.mat <- cor.mat
  diag(reduced.cor.mat) <- 1 - (1/diag(inverted.cor.mat))
  
  return(reduced.cor.mat)
}


phys.cor.reduced <- reduce.cor.mat(phys.cor)

Finally, we perform the PAF:

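paf.method <- function(reduced.matrix, nfactor) {
  # eigen() returns the eigenvalues in decreasing order
  eigen.r <- eigen(reduced.matrix, symmetric = TRUE)
  # Keep the first nfactor eigenvectors (drop = FALSE keeps a matrix when nfactor is 1)
  V <- eigen.r$vectors[, 1:nfactor, drop = FALSE]
  # Diagonal matrix of the square roots of the first nfactor eigenvalues
  L <- diag(sqrt(eigen.r$values[1:nfactor]), nrow = nfactor)

  return(V %*% L)
}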

path.coef <- paf.method(phys.cor.reduced, 3)
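To see what this extraction step does without access to the physical functioning data, the following sketch applies the same eigenvalue computation to a reduced correlation matrix built from a hypothetical two-factor loading matrix (the values in P.true are made up).

```r
# Hypothetical two-factor loading matrix for six observed variables
P.true <- matrix(c(0.8, 0.7, 0.6, 0.1, 0.0, 0.2,
                   0.1, 0.2, 0.0, 0.7, 0.6, 0.8), ncol = 2)

# Reduced correlation matrix with the true communalities on the diagonal
R.reduced <- P.true %*% t(P.true)

# Principal axis step: eigenvectors scaled by square roots of eigenvalues
eig <- eigen(R.reduced, symmetric = TRUE)
P.hat <- eig$vectors[, 1:2] %*% diag(sqrt(eig$values[1:2]))

# Largest discrepancy between the reproduced and the reduced matrix
max(abs(P.hat %*% t(P.hat) - R.reduced))
```

The extracted loadings reproduce the reduced correlation matrix, but they generally differ from P.true itself by a rotation, which is one motivation for the rotation step.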

You may find that all loadings on a factor are negative. There is nothing wrong with reversing the sign of a factor, so long as the reversal is applied to every path coefficient on that factor, so that the relative signs of its loadings are preserved. It is important to note that all observed variables have a loading on all extracted factors.

Depending on our goals, this can be the end of our factor analysis, but generally researchers seek to find a simpler structure by rotating the factor structure in a similar manner as PCA rotates axes.

Other factor extraction methods

Principal axis factoring is likely the oldest commonly used method of factor extraction, and it is probably still the most commonly used. It does not make distributional assumptions, and in the case of normally distributed data it gives estimates quite similar to those of methods that do make distributional assumptions. Maximum likelihood estimation is being used increasingly and is considered numerically superior on datasets that are close to multivariate normal. This method assumes that the data are multivariate normally distributed and maximizes the (usually log) likelihood function based on that distribution. It is relatively robust to mild or moderate deviations from this assumption. Minimum residual factoring seeks to minimize the off-diagonal residual correlations and gives estimates similar to those of maximum likelihood, while being robust to poorly behaved matrices.
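As one illustration of a method with distributional assumptions, base R's factanal function (in the stats package) performs maximum likelihood factor analysis. The sketch below fits a one-factor model to simulated normal data; the loadings and sample size are assumptions chosen for illustration.

```r
set.seed(1)
n <- 10000
true.load <- c(0.8, 0.7, 0.6, 0.5, 0.4)  # hypothetical loadings

trait <- rnorm(n)
# Each column: loading * trait + unique noise, scaled to unit variance
sim.data <- sapply(true.load, function(l) l * trait + sqrt(1 - l^2) * rnorm(n))
colnames(sim.data) <- paste0("item", 1:5)

# Maximum likelihood factor analysis with one factor
fit.ml <- factanal(sim.data, factors = 1)
ml.load <- as.vector(fit.ml$loadings)
round(cbind(true.load, ml.load), 2)
```

As with other extraction methods, the sign of an estimated factor is arbitrary, so the comparison is best made on absolute values.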

Factor rotation

The final step in most factor analyses is factor rotation. The goal of this step is to determine whether the cloud of data can be represented by a simpler set of coordinates by rotating the axes of the factors. Rotation should increase the number of near zero coefficients in the factor pattern matrix. All observed variables will still have a loading on all of the factors, but ideally, observed variables will show substantially higher loadings on a single particular factor than on other factors once this is completed.

Broadly speaking, there are two approaches to factor rotation: orthogonal and oblique. Orthogonal rotations are still the most commonly used methods, and many regard them as producing easier to interpret solutions. However, oblique rotations are thought to provide more of a real-world estimate. It is also worth noting that single factor models cannot be rotated.

There are many different factor rotation methods, but here we will delve into just four: two orthogonal and two oblique rotations.

Orthogonal factor rotation methods

In this section, we will discuss the commonly used orthogonal factor rotation methods. These rotation methods produce factors that have no correlation, which is why they are thought to be easier to interpret. The downside is that many of the constructs we think of in the real world are in fact correlated.

Quartimax rotation

Quartimax rotation seeks to maximize the sum of all values in the factor pattern matrix, each raised to the fourth power:
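In symbols, a reconstruction consistent with that description (the name Q for the criterion is introduced here for illustration) is:

```latex
Q = \sum_{i=1}^{v} \sum_{j=1}^{f} P_{ij}^{4}
```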

Here, P_ij is the element in the ith row and jth column of the factor pattern matrix P (in which variables are represented by rows and factors by columns). The number of variables is denoted by v, which is also the number of rows in P. The number of factors is denoted by f, which is also the number of columns in P.

Raising a number to the fourth power has the effect of exaggerating the differences between large and small numbers, so the quartimax criterion will be better met in a factor loading matrix with very large loadings and very small loadings than in one with many moderate-sized loadings. Notably, this rotation simply maximizes this very simple criterion without regard for whether the higher loadings are well distributed among factors or all load onto a single factor.

Varimax rotation

The varimax rotation starts from the same fourth-power sum but subtracts a term built from the sums of squared loadings within each column. It seeks the rotation that maximizes the following criterion:
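One common way of writing the (raw) varimax criterion, using the same symbols as for quartimax, is shown below. Other texts scale it by a constant, which does not change the maximizing rotation; the name V is introduced here for illustration.

```latex
V = \sum_{j=1}^{f} \left[ \sum_{i=1}^{v} P_{ij}^{4}
    - \frac{1}{v} \left( \sum_{i=1}^{v} P_{ij}^{2} \right)^{2} \right]
```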

The effect of this is to favor rotations in which the large loadings are spread across the factors over rotations in which the large loadings fall on a single factor (or relatively few factors).

Varimax is probably the most widely used factor rotation method.
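The two orthogonal criteria can be evaluated directly in R. The sketch below uses a hypothetical unrotated loading matrix and base R's varimax function (stats package), with row normalization switched off so that it targets the raw criterion; the helper names quartimax.crit and varimax.crit are assumptions made for illustration.

```r
# Hypothetical unrotated loadings: six variables, two factors
P <- matrix(c(0.7, 0.7, 0.7, 0.6, 0.6, 0.6,
              0.4, 0.4, 0.4, -0.4, -0.4, -0.4), ncol = 2)

# Quartimax criterion: sum of fourth powers of all loadings
quartimax.crit <- function(P) sum(P^4)

# Raw varimax criterion: per-column fourth-power sum minus a squared-sum term
varimax.crit <- function(P) sum(colSums(P^4) - colSums(P^2)^2 / nrow(P))

# Orthogonal varimax rotation (no row normalization)
rot <- varimax(P, normalize = FALSE)
P.rot <- unclass(rot$loadings)

c(before = varimax.crit(P), after = varimax.crit(P.rot))
c(before = quartimax.crit(P), after = quartimax.crit(P.rot))
```

Because the rotation is orthogonal, the row sums of squared loadings (the communalities) are the same before and after rotation.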

Oblique rotations

We saw that orthogonal transformations attempt to maximize a transformation of the sums of factor loadings. Oblique rotations do the opposite; they minimize such sums.

Oblimin rotation

Oblimin rotation seeks to minimize the following criterion:
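A commonly cited form of the direct oblimin criterion is sketched below, with x and y taken here to index columns of P (an assumption about the notation) and gamma a tuning parameter. Setting gamma to zero gives the quartimin special case.

```latex
\sum_{x < y} \left[ \sum_{i=1}^{v} P_{ix}^{2} P_{iy}^{2}
  - \frac{\gamma}{v} \left( \sum_{i=1}^{v} P_{ix}^{2} \right)
                     \left( \sum_{i=1}^{v} P_{iy}^{2} \right) \right]
```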

Here, (x,y) represents a pair of factors, and the summation is done over all such pairs.

Promax rotation

Promax is an oblique rotation that starts with varimax and then rotates the varimax solution to an oblique one. It takes the factor loadings in the varimax solution, raises them to a high power to bring small loadings close to zero, and then attempts a rotation that brings the loadings closest to zero as near to zero as possible.
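Base R provides a promax function in the stats package. The sketch below applies it to a hypothetical unrotated loading matrix and derives the factor correlation matrix from the returned rotation matrix; the loadings and the choice of power m = 4 are assumptions for illustration.

```r
# Hypothetical unrotated loadings: six variables, two factors
P <- matrix(c(0.7, 0.7, 0.7, 0.6, 0.6, 0.6,
              0.4, 0.4, 0.4, -0.4, -0.4, -0.4), ncol = 2)

# promax(): varimax first, then an oblique step
# m is the power to which the varimax loadings are raised
fit <- promax(P, m = 4)
P.oblique <- unclass(fit$loadings)

# Factor correlation matrix implied by the rotation matrix
Phi <- solve(crossprod(fit$rotmat))

round(P.oblique, 2)
round(Phi, 2)
```

Unlike an orthogonal rotation, the off-diagonal elements of Phi are no longer zero, but the product of the rotated loadings, Phi, and the transposed rotated loadings still reproduces the same correlations as the unrotated loadings.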

Factor rotation in R

The GPArotation package, which implements gradient projection algorithm (GPA) rotation, fits many different rotations. The algorithm is available in languages outside of R as well. Notably, it uses a method that can be applied to almost any rotation criterion, so the package offers functions not only for common rotations but for some obscure ones as well. For example:

> library(GPArotation)
> rotated.structure <- oblimin(path.coef)
> rotated.structure
Oblique rotation method Oblimin Quartimin converged.
Loadings:
         [,1]     [,2]      [,3]
 [1,] -0.0013  0.06999  3.58e-01
 [2,] -0.6804 -0.10073  4.18e-02
 [3,] -0.6434 -0.04880 -1.12e-02
 [4,] -0.6764  0.06330 -1.20e-01
 [5,] -0.4324  0.11336  2.17e-01
 [6,] -0.2496  0.06544  4.82e-01
 [7,]  0.0446  0.06167  5.50e-01
 [8,] -0.2300  0.31230  5.50e-02
 [9,] -0.4410  0.34663 -6.06e-02
[10,] -0.4072  0.41199 -1.13e-01
[11,]  0.1783  0.58615  1.36e-01
[12,] -0.0698  0.56071  1.03e-01
[13,] -0.6842 -0.01633  1.11e-01
[14,] -0.4586  0.20304 -6.44e-03
[15,] -0.2744  0.37031 -9.65e-05
[16,]  0.0308  0.60407  7.37e-03
[17,] -0.2598  0.22145  2.99e-01
[18,] -0.1519  0.29464  2.58e-01
[19,] -0.0840  0.38490  2.30e-02
[20,] -0.5786  0.00746  2.10e-01

Rotating matrix:
       [,1]   [,2]   [,3]
[1,]  0.627 -0.397 -0.202
[2,]  1.017  0.855  0.414
[3,] -0.025 -0.748  1.001

Phi:
       [,1]   [,2]   [,3]
[1,]  1.000 -0.519 -0.356
[2,] -0.519  1.000  0.374
[3,] -0.356  0.374  1.000

The object produced by this command has a number of important matrices, including the new factor loading matrix produced by the rotation, the correlation matrix of the factors (Phi), and the rotation matrix (post-multiplying the original factor pattern matrix by the rotation matrix gives the new factor loading matrix).

The question then is how to interpret these factors. The factor loading matrix shows that all 20 observed items load on all three factors (as is typical of EFA), but the loadings are pretty low on some factors. As such, we can interpret what each factor means by those items that load sufficiently heavily on it. What constitutes "sufficiently heavy" loading is far from clear. One of the most commonly used criteria is that the item should have a loading of at least 0.4. However, other criteria exist that require that an item load substantially more on a single factor than any other factor. In general, it is probably best to use some judgment rather than rigid criteria.

Here, we will reprint the loading matrix, replacing all values with an absolute value less than 0.3 with NA for ease of examination:

> loading.matrix <- rotated.structure$loadings
> loading.matrix[ abs(loading.matrix) < 0.3] <- NA
> loading.matrix
            [,1]      [,2]      [,3]
 [1,]         NA        NA 0.3582315
 [2,] -0.6804255        NA        NA
 [3,] -0.6434025        NA        NA
 [4,] -0.6763837        NA        NA
 [5,] -0.4324032        NA        NA
 [6,]         NA        NA 0.4824759
 [7,]         NA        NA 0.5499046
 [8,]         NA 0.3123046        NA
 [9,] -0.4410473 0.3466325        NA
[10,] -0.4071513 0.4119939        NA
[11,]         NA 0.5861508        NA
[12,]         NA 0.5607121        NA
[13,] -0.6842129        NA        NA
[14,] -0.4585589        NA        NA
[15,]         NA 0.3703143        NA
[16,]         NA 0.6040669        NA
[17,]         NA        NA        NA
[18,]         NA        NA        NA
[19,]         NA 0.3848953        NA
[20,] -0.5785879        NA        NA
>

Based on these results, it appears that we have a first factor dealing with gross motor function, a second factor dealing mostly with fine motor function, and a third factor concerned with household management. The two items concerned with recreation (rows 17 and 18) fall on none of these factors. Appropriately, the factor correlation matrix suggests that the factors we interpret as fine and gross motor are more highly correlated with each other (in absolute value) than with the household management factor.

Advanced EFA with the psych package

We have gone through the basic conceptual and computational ideas underlying EFA in R. As we discussed earlier, to get good estimates, these computations must be iterated until some criterion indicates that an optimal solution has been reached. An excellent package that bundles much of the work we have done earlier into convenient commands is the psych package. We will now go over how to use this package, including some of the advanced features that it offers.

We will continue to use the physical functioning dataset here. Our question at hand is whether a few common sources of variance are able to explain the responses to the 20 items. We saw that a three-factor solution is likely most appropriate (or maybe a four-factor solution, but we will stick with three here). For serious exploratory work, it is often ideal to split off a development and validation dataset, but we will skip that step here.

We also do one more thing here. So far, we have been working with ordinal data and treating it as if it were continuous. Now we will explicitly account for the fact that it is ordinal rather than continuous by using polychoric correlations to create our correlation matrix. The polychoric correlation assumes that the data is ordinal but represents some continuous underlying phenomenon that has simply been binned into discrete ordered categories. It attempts to estimate the correlation between the underlying continuous variables under this assumption and to estimate the thresholds at which the discretization occurs.
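The motivation can be seen in a small simulation using base R only. The sketch below bins two correlated normal variables into three ordered categories and compares the ordinary Pearson correlation of the binned codes with the correlation of the underlying variables; the correlation of 0.7 and the thresholds are assumptions chosen for illustration.

```r
set.seed(7)
n <- 20000
rho <- 0.7  # hypothetical correlation between two latent continuous variables

# Two standard normal latent variables with correlation rho
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# Bin each into three ordered categories at thresholds 0.5 and 1.5
thresholds <- c(0.5, 1.5)
item1 <- findInterval(z1, thresholds)
item2 <- findInterval(z2, thresholds)

cor.latent <- cor(z1, z2)
cor.binned <- cor(item1, item2)  # Pearson correlation on the ordinal codes
c(latent = cor.latent, binned = cor.binned)
```

The correlation computed on the binned codes is attenuated relative to the latent correlation, which is the quantity the polychoric correlation tries to recover.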

Let's start by finding the polychoric correlations:

library(psych)
fit.efa.prep <- polychoric(phys.func, polycor = TRUE)

We then take the correlation matrix from these polychoric correlations and place it into our factor analysis:

> fit.efa.3 <- fa(fit.efa.prep$rho, nfac = 3, rotate = 'promax')
> fit.efa.3
Factor Analysis using method =  minres
Call: fa(r = fit.efa.prep$rho, nfactors = 3, rotate = "promax")
Standardized loadings (pattern matrix) based upon correlation matrix
          MR1   MR3   MR2   h2   u2 com
PFQ061A -0.13  0.72 -0.09 0.33 0.67 1.1
PFQ061B  0.88  0.09 -0.27 0.62 0.38 1.2
PFQ061C  0.88  0.00 -0.19 0.59 0.41 1.1
PFQ061D  0.96 -0.27  0.03 0.65 0.35 1.2
PFQ061E  0.49  0.22  0.10 0.55 0.45 1.5
PFQ061F  0.35  0.46  0.04 0.62 0.38 1.9
PFQ061G -0.23  0.81  0.10 0.54 0.46 1.2
PFQ061H  0.52  0.31  0.05 0.67 0.33 1.7
PFQ061I  0.64  0.03  0.20 0.65 0.35 1.2
PFQ061J  0.60  0.01  0.28 0.66 0.34 1.4
PFQ061K -0.24  0.08  0.98 0.78 0.22 1.1
PFQ061L  0.17  0.17  0.56 0.67 0.33 1.4
PFQ061M  0.77  0.09 -0.06 0.63 0.37 1.0
PFQ061N  0.58  0.13  0.05 0.51 0.49 1.1
PFQ061O  0.38  0.01  0.40 0.52 0.48 2.0
PFQ061P  0.01 -0.07  0.85 0.65 0.35 1.0
PFQ061Q  0.26  0.69 -0.04 0.74 0.26 1.3
PFQ061R  0.13  0.74 -0.01 0.69 0.31 1.1
PFQ061S  0.12  0.51  0.14 0.49 0.51 1.3
PFQ061T  0.66  0.16 -0.01 0.60 0.40 1.1

                       MR1  MR3  MR2
SS loadings           6.09 3.55 2.57
Proportion Var        0.30 0.18 0.13
Cumulative Var        0.30 0.48 0.61
Proportion Explained  0.50 0.29 0.21
Cumulative Proportion 0.50 0.79 1.00

 With factor correlations of 
     MR1  MR3  MR2
MR1 1.00 0.71 0.67
MR3 0.71 1.00 0.68
MR2 0.67 0.68 1.00

Mean item complexity =  1.3
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  190  and the objective function was  15.46
The degrees of freedom for the model are 133  and the objective function was  2.09 

The root mean square of the residuals (RMSR) is  0.04 
The df corrected root mean square of the residuals is  0.05 

Fit based upon off diagonal values = 0.99
Measures of factor score adequacy             
                                                MR1  MR3  MR2
Correlation of scores with factors             0.97 0.95 0.95
Multiple R square of scores with factors       0.94 0.90 0.90
Minimum correlation of possible factor scores  0.89 0.80 0.81

There are a number of matrices that we can see in the fit.efa.3 object. The first is the factor pattern matrix, which contains the factor loadings. To the right of this are two columns, namely h2 and u2, the communality and uniqueness estimates for each item respectively. The two values should sum to one. The greater the communality, the more the total variance of an item is explained by the common factors. The next matrix informs us how much of the variance is explained both by the individual factors and by the whole EFA model. Then there is the matrix with the correlations between the common factors, which is not present for orthogonal solutions.

Let's start by looking at the communalities in the h2 column. Communalities tell us how much of the variance of each observed variable is explained by all factors together; each is computed as the sum of the squared factor loadings of the unrotated factor solution. Low communalities suggest that the latent variables do not explain the data well. The communalities are relatively high with the exception of item A, so we are a little suspicious of how well this item is explained even by all three factors together, but for now we will keep all items.
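The computation of communalities and uniquenesses can be sketched with a hypothetical unrotated loading matrix (the values below are made up). The example also uses base R's varimax function to show that an orthogonal rotation leaves the communalities unchanged.

```r
# Hypothetical unrotated loadings: four variables, two factors
P <- matrix(c(0.7, 0.6, 0.5, 0.4,
              0.3, -0.2, 0.4, -0.5), ncol = 2)

# Communality: sum of squared loadings across factors; uniqueness: the rest
h2 <- rowSums(P^2)
u2 <- 1 - h2

# An orthogonal rotation leaves the communalities unchanged
P.rot <- unclass(varimax(P)$loadings)
h2.rot <- rowSums(P.rot^2)

round(cbind(h2, u2, h2.rot), 3)
```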

Let's now look at the loadings. Remember that these loadings indicate how these items relate to some underlying unobserved variables that are causing the observed data. However, we have to try to make sense of what these unobserved factors actually are.

Item A (concerned with money management) loads heavily on MR3. Items Q, R, and S also load heavily on this factor. These items appear to be concerned with cognition and social engagement and make few physical demands. Items B, C, D, E, H, I, and T all load heavily on MR1. These items require leg use and are concerned with mobility. Items K, L, O, and P load fairly heavily onto MR2. These are items that tend to require hand use, suggesting that this is an arm or hand function factor. It is worth pointing out that item O loads almost as heavily on MR1, suggesting that some component of mobility is needed to reach overhead. This may be because reaching overhead requires good torso control, which is also needed to walk around and do basic mobility skills. Let me emphasize that this interpretation of the results is made based on a researcher's substantive understanding of the items rather than the statistical analysis alone.

There is some additional information provided, but I will bring your attention to the fit measures in particular. These are a bigger deal in confirmatory factor analysis (discussed in the next chapter), but in summary, they give us some sense of how well the model explains the data. There is more disagreement than agreement on which fit measures to use and how to interpret them. However, three commonly used measures are the Root Mean Square Residual (RMSR), the Root Mean Square Error of Approximation (RMSEA), and the Tucker-Lewis Index (TLI). We would see all three of these if we did not use polychoric correlations, but in this case we see only RMSR. An acceptable fit is usually thought to be indicated by an RMSR less than 0.08, an RMSEA less than 0.06, and a TLI greater than 0.95 (some would accept greater than 0.90 as adequate).
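To make the RMSR concrete, the sketch below computes it as the root mean square of the off-diagonal residual correlations, which is one common definition and not necessarily the exact formula used inside the psych package. Both matrices are hypothetical.

```r
# Hypothetical single-factor loadings and the correlations they imply
P <- matrix(c(0.8, 0.7, 0.6, 0.5), ncol = 1)
R.implied <- P %*% t(P)
diag(R.implied) <- 1

# Hypothetical observed matrix: one correlation differs from the model by 0.05
R.observed <- R.implied
R.observed[1, 2] <- R.observed[2, 1] <- R.implied[1, 2] + 0.05

# Root mean square of the off-diagonal residuals
resid.mat <- R.observed - R.implied
rmsr <- sqrt(mean(resid.mat[lower.tri(resid.mat)]^2))
rmsr
```

A model that reproduced every observed correlation exactly would give an RMSR of zero.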

Now that we have made sense of the factors and the model fit, let's look at the internal consistency reliability. The basic idea of internal consistency is to look at the proportion of variance in scale scores accounted for by the latent variables. In the previous chapter, we touched on this topic discussing how to calculate Cronbach's alpha. Coefficient alpha is the most widely used measure of internal consistency, but for multidimensional scales, McDonald's Omega (of which there are a few) is generally considered better. Let's use psych's omega function to examine internal consistency reliability:

omega(fit.efa.prep$rho, nfac = 3, rotate = 'promax')

Omega 
Call: omega(m = fit.efa.prep$rho, nfactors = 3, rotate = "promax")
Alpha:                 0.95 
G.6:                   0.97 
Omega Hierarchical:    0.83 
Omega H asymptotic:    0.86 
Omega Total            0.96

We show only the beginning of the output from this function, which reports a number of internal consistency reliability coefficients, including Cronbach's alpha, Guttman's lambda 6, omega hierarchical, omega asymptotic, and omega total. Cronbach's alpha is closely related to the classic split-half reliability (discussed in further detail under the applications section of Chapter 5, Linear Algebra). Guttman's lambda 6, while rarely used nowadays, is based on the squared multiple correlation of each item with the other items.

We will focus on omega hierarchical, omega asymptotic, and omega total here. Omega assumes that a multifactor scale has both specific factors onto which only some items load and a general factor onto which all items load. Omega hierarchical gives us the proportion of variance in scale scores explained by the general factor. Omega asymptotic is the estimated omega hierarchical for a test with the same structure and infinite length (reliability tends to increase with test length). Omega total is the total reliability of a test, including that attributable to both the general factor and the factors onto which not all items load.
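The relationship between omega hierarchical and omega total can be sketched from a set of loadings on a general factor and on specific factors. The loadings below are hypothetical, the factors are assumed to be orthogonal, and the total score is assumed to be the unweighted sum of standardized items; this is an illustration of the idea rather than the exact procedure used by psych's omega function.

```r
# Hypothetical bifactor-style loadings for six items
g  <- rep(0.6, 6)                  # general factor
f1 <- c(0.4, 0.4, 0.4, 0, 0, 0)    # specific factor 1
f2 <- c(0, 0, 0, 0.5, 0.5, 0.5)    # specific factor 2

# Model-implied correlation matrix (orthogonal factors, unit item variances)
R <- g %*% t(g) + f1 %*% t(f1) + f2 %*% t(f2)
diag(R) <- 1

# Variance of the unit-weighted total score is the sum of all elements of R
total.var <- sum(R)

omega.h <- sum(g)^2 / total.var
omega.t <- (sum(g)^2 + sum(f1)^2 + sum(f2)^2) / total.var
c(omega.hierarchical = omega.h, omega.total = omega.t)
```

Omega total can never be smaller than omega hierarchical, since it adds the variance attributable to the specific factors to the numerator.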

Before we look at these reliability coefficients, it may be worth looking back at the results of the fa function. We see that there are sizeable correlations between the factors, 0.67 to 0.71. If these correlations were low (for example, 0.2), then it would be questionable as to whether omega hierarchical should even be examined given that low correlations between the factors would suggest the non-existence of a general factor that explains scores well.

We can see from these results that omega hierarchical is relatively good (0.83), suggesting that a general factor explains a large proportion of the variance in the scale scores.
