Factor analysis can be thought of as an extension of principal components analysis (PCA). Here, we are interested in identifying the underlying factors that might explain a large number of variables. By examining the correlations among a group of variables, we look for the underlying factors that describe them. The difference between the two techniques is that in PCA we are only interested in the correlations of the variables, whereas in factor analysis we want to find underlying factors that are not directly described in the data. As such, rotations of the factors can be used to align them more closely with the structure in the variables.
The data has been collected from different automobile manufacturers. The variables look at weights of vehicles, fuel efficiency, engine power, capacity, and CO2 emissions.
We will use factor analysis to try to understand the underlying factors in the study. First, we try to identify the number of factors involved, and then we evaluate the study. Finally, we step through different methods and rotations to check for a suitable alignment between the factors and the component variables.
The following steps will help us identify the underlying factors in the automobile dataset:
Open the mpg.MTW worksheet.
Enter CO2, Cylinders, Weight, Combined mpg, Max hp, and Capacity into the Variables: section.
[…] CO2 and Max hp values, with a strong negative Combined mpg.
Enter 2 in the Number of factors to extract: section. Click on the Graphs… button and select Loading plot.
[…] Combined mpg, with PC2 showing a strong association with Weight; the factors associated with Capacity, Cylinders, hp, and CO2 tend to the upper-right corner of the chart.

These steps iteratively compare different factor analysis solutions. The strategy of checking different fitting methods and rotations to settle on the most suitable technique is discussed in "Applied Multivariate Statistical Analysis 5th Edition", Richard A. Johnson and Dean W. Wichern, Prentice Hall, page 517.
The method and type of rotation are probably less crucial decisions, but they can be useful in separating the loadings of the variables onto different factors, rather than having two factors that are each a mix of many component variables.
The strategy, as discussed by Johnson and Wichern in brief, is as follows:
In this example, we will get similar groups of loadings with both fitting methods and rotations. The loadings will be different with each rotation, but they group in a similar way; we can observe this from the loading plot.
The suggestion that the principal components method with Varimax rotation is suitable for this data comes from how the variables line up along factors 1 and 2 in the loading plot, as shown in the following figure:
Factor 1 appears to be associated with fuel efficiency versus power and factor 2 with vehicle weight.
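For readers who want to see what the principal components method of extraction computes, here is a minimal sketch in Python with NumPy. It is not Minitab's implementation. The function name `pc_factor_loadings` and the synthetic data are assumptions made for illustration only; the data is random and is not the mpg.MTW worksheet. Loadings are taken as the eigenvectors of the correlation matrix scaled by the square roots of their eigenvalues.

```python
import numpy as np

def pc_factor_loadings(data, n_factors):
    """Principal components method: loadings = eigenvectors * sqrt(eigenvalues)."""
    corr = np.corrcoef(data, rowvar=False)        # correlation matrix of the columns
    eigvals, eigvecs = np.linalg.eigh(corr)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]             # re-sort from largest to smallest
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    communalities = (loadings ** 2).sum(axis=1)   # variance of each variable explained
    return eigvals, loadings, communalities

# Synthetic stand-in data (hypothetical): two latent drivers and five noisy variables.
rng = np.random.default_rng(0)
n = 200
power = rng.normal(size=n)
mass = rng.normal(size=n)
data = np.column_stack([
    power + 0.3 * rng.normal(size=n),
    power + 0.3 * rng.normal(size=n),
    -power + 0.3 * rng.normal(size=n),
    mass + 0.3 * rng.normal(size=n),
    mass + 0.3 * rng.normal(size=n),
])

eigvals, loadings, communalities = pc_factor_loadings(data, n_factors=2)
print("Eigenvalues:", np.round(eigvals, 3))
print("Loadings:\n", np.round(loadings, 3))
print("Communalities:", np.round(communalities, 3))
```

The eigenvalues are the quantities usually inspected when deciding how many factors to extract, and each column of the loadings matrix corresponds to one axis of a loading plot.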
Minitab offers Equimax, Varimax, and Quartimax rotations, as well as Orthomax, where the user chooses the rotation gamma.
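The named rotations are all members of the orthomax family and differ only in gamma: 0 gives Quartimax, 1 gives Varimax, and half the number of factors gives Equimax. The following is a rough sketch of an orthomax rotation using the widely used SVD-based iteration. The function name `orthomax`, its defaults, and the loading matrix are hypothetical, and this version skips the row normalization that statistical packages often apply, so its output should not be expected to match Minitab's exactly.

```python
import numpy as np

def orthomax(loadings, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix; gamma selects the criterion."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = (rotated ** 2).sum(axis=0)    # sum of squared loadings per factor
        gradient = loadings.T @ (rotated ** 3 - (gamma / n_vars) * rotated * column_ss)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                         # nearest orthogonal matrix
        current = s.sum()
        if previous > 0 and current < previous * (1 + tol):
            break                                 # criterion has stopped improving
        previous = current
    return loadings @ rotation, rotation

# Hypothetical unrotated loadings for six variables on two factors (made-up numbers).
unrotated = np.array([
    [0.90, 0.30],
    [0.80, 0.40],
    [0.85, 0.35],
    [-0.80, -0.20],
    [0.50, -0.80],
    [0.40, -0.70],
])

varimax_loadings, varimax_rotation = orthomax(unrotated, gamma=1.0)
quartimax_loadings, _ = orthomax(unrotated, gamma=0.0)
equimax_loadings, _ = orthomax(unrotated, gamma=unrotated.shape[1] / 2)
print(np.round(varimax_loadings, 3))
```

Because the rotation is orthogonal, the communality of each variable is unchanged; only the way it is shared between the factors moves.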
The storage option allows us to store loadings, coefficients, scores, and matrices. Stored loadings can be used to predict factor scores for new data by entering them into the loadings section of the initial solution within options.
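To illustrate the idea of scoring new observations from stored loadings, the sketch below uses a least-squares formula, with coefficients L(L'L)^-1 applied to standardized data. This is one common choice for loadings from the principal components method, and it is an assumption here that it is a reasonable stand-in; whether Minitab uses exactly this formula is not confirmed by the text. The helper `score_new_data` and all of the data are hypothetical.

```python
import numpy as np

def score_new_data(new_data, train_mean, train_std, loadings):
    """Least-squares factor scores: standardize, then apply L (L'L)^-1."""
    z = (new_data - train_mean) / train_std       # use the ORIGINAL data's mean and std
    coefficients = loadings @ np.linalg.inv(loadings.T @ loadings)
    return z @ coefficients

# Synthetic training data (hypothetical): two latent factors, five variables.
rng = np.random.default_rng(1)
mixing = np.array([[1.0, 0.0], [0.9, 0.2], [-0.8, 0.1], [0.1, 1.0], [0.2, 0.9]])
train = rng.normal(size=(150, 2)) @ mixing.T + 0.3 * rng.normal(size=(150, 5))

# "Stored" quantities from the original analysis.
train_mean = train.mean(axis=0)
train_std = train.std(axis=0, ddof=1)
corr = np.corrcoef(train, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
top = np.argsort(eigvals)[::-1][:2]               # indices of the two largest eigenvalues
stored_loadings = eigvecs[:, top] * np.sqrt(eigvals[top])

# Three new observations scored with the stored loadings.
new_rows = rng.normal(size=(3, 2)) @ mixing.T + 0.3 * rng.normal(size=(3, 5))
new_scores = score_new_data(new_rows, train_mean, train_std, stored_loadings)
train_scores = score_new_data(train, train_mean, train_std, stored_loadings)
print(np.round(new_scores, 3))
```

The new rows are standardized with the mean and standard deviation of the original data, not their own, so that their scores are on the same scale as the scores from the original analysis.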