Although the literature on confirmatory factor analysis (FA) is extensive and the technique is widely used, for example in the social sciences, we will focus only on exploratory FA, where the goal is to identify unknown, unobserved variables on the basis of other empirical data.
The latent variable model of FA was first introduced by Spearman in 1904 for one factor, and Thurstone generalized it to more than one factor in 1947. This statistical model assumes that the manifest variables available in the dataset are driven by latent variables that were not observed directly, but can be inferred from the observed data.
FA can deal with continuous (numeric) variables, and the model states that each observed variable is a weighted sum of some unknown, latent factors.
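In formula form, this common factor model can be sketched as follows (a standard textbook formulation; the symbols are generic and not tied to any particular R function):

X_i = λ_i1 F_1 + λ_i2 F_2 + ... + λ_im F_m + e_i

Here, each observed variable X_i is a weighted sum of the m common factors F_1, ..., F_m, where the weights λ are called factor loadings, and the unique error term e_i captures the variance not shared with the other variables.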
The most widely used exploratory FA method is maximum-likelihood FA, which is available in the factanal
function of the already installed stats
package. Other factoring methods are provided by the fa
function in the psych
package—for example, ordinary least squares (OLS), weighted least squares (WLS), generalized weighted least squares (GLS), or the principal factor solution. These functions take either raw data or a covariance matrix as input.
For demonstration purposes, let's see how the default factoring method performs on a subset of mtcars
. Let's extract all performance-related variables except for displacement, which presumably accounts for all the other relevant metrics:
> m <- subset(mtcars, select = c(mpg, cyl, hp, carb))
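As a side note, the same one-factor model could also be fitted with the maximum-likelihood factanal function mentioned earlier. A minimal sketch, assuming the m subset created in the previous step:

```r
# Maximum-likelihood FA via stats::factanal (no extra packages needed)
f_ml <- factanal(m, factors = 1)
# The ML loadings are directly comparable to the loadings from other methods
print(f_ml$loadings)
```

Besides the loadings, factanal also reports a chi-square test of the hypothesis that one factor is sufficient.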
Now simply call fa
on the preceding data.frame
and save the result:
> (f <- fa(m))
Factor Analysis using method =  minres
Call: fa(r = m)
Standardized loadings (pattern matrix) based upon correlation matrix
       MR1   h2   u2 com
mpg  -0.87 0.77 0.23   1
cyl   0.91 0.83 0.17   1
hp    0.92 0.85 0.15   1
carb  0.69 0.48 0.52   1

                MR1
SS loadings    2.93
Proportion Var 0.73

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are 6 and the objective function was 3.44 with Chi Square of 99.21
The degrees of freedom for the model are 2 and the objective function was 0.42

The root mean square of the residuals (RMSR) is 0.07
The df corrected root mean square of the residuals is 0.12

The harmonic number of observations is 32 with the empirical chi square 1.92 with prob < 0.38
The total number of observations was 32 with MLE Chi Square = 11.78 with prob < 0.0028

Tucker Lewis Index of factoring reliability = 0.677
RMSEA index = 0.42 and the 90 % confidence intervals are 0.196 0.619
BIC = 4.84
Fit based upon off diagonal values = 0.99
Measures of factor score adequacy
                                              MR1
Correlation of scores with factors           0.97
Multiple R square of scores with factors     0.94
Minimum correlation of possible factor scores 0.87
Well, this is a rather impressive amount of information with a bunch of details! MR1
stands for the first extracted factor, named after the default factoring method (minimum residual, or OLS). Since only one factor is included in the model, rotation of factors is not an option. The output also includes a hypothesis test of whether this number of factors is sufficient, and several coefficients, such as the fit based upon the off-diagonal values (0.99), suggest a good model fit.
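All of these numbers are also available programmatically in the returned object. A minimal sketch, assuming the psych package is loaded and f is the model fitted above:

```r
library(psych)
f <- fa(subset(mtcars, select = c(mpg, cyl, hp, carb)))
f$loadings      # standardized loadings (the MR1 column)
f$communality   # h2: variance of each variable explained by the factor
f$uniquenesses  # u2: the unexplained remainder (1 - h2)
f$TLI           # Tucker-Lewis Index
f$RMSEA         # RMSEA estimate with its confidence interval
```

This is handy when you want to compare fit indices across several candidate models in a loop rather than reading the printed summaries.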
The results can be summarized in the following plot:
> fa.diagram(f)
Here we can see the high correlation coefficients between the latent variable and the observed variables, and the direction of the arrows suggests that the factor affects the values found in our empirical dataset. Can you guess the relationship between this factor and the displacement of the car engines, which we left out of the model?
> cor(f$scores, mtcars$disp)
0.87595
Well, this seems like a good match.