Formative constructs using PCA

Let's look at another PCA example that will be instructive when compared with factor analysis later on in the Exploratory factor analysis and reflective constructs section. We return to the physical functioning dataset used earlier in this book and use it to discuss the notions of formative and reflective constructs. The diagrams that follow show the difference visually; pay close attention to the arrow directions. A formative construct is one in which a general trait is composed of a number of very specific traits. The arrows pointing toward the construct indicate that the construct is derived from the traits:

[Figure: a formative construct, with arrows pointing from the specific traits toward the construct]

Alternatively, a reflective construct is one in which a general trait is thought to underlie and cause specific traits, as shown in the following diagram. The arrows pointing away from the construct reflect the fact that the construct drives the traits, and the specific traits are merely manifestations of this construct:

[Figure: a reflective construct, with arrows pointing from the construct toward the specific traits]

PCA is often considered a method of modeling formative constructs. Later on in this chapter, we will discuss factor analysis, which models reflective constructs.

The physical functioning dataset collects data on the ability of individuals to engage in 20 activities of daily living (ADLs) and instrumental activities of daily living (IADLs). Taken together, we would expect a person's scores on these items to be some sort of measure of functional independence with ADLs and IADLs. Unfortunately, reporting the scores for all 20 items for each individual is tedious, so we wish to use a summed score, or several summed scores, to summarize a person's functional status. The question is whether this can legitimately be done. Does it make sense to add standing to walking? There is one more question: what are we trying to achieve by creating summary scores? The answer lies in what we seek to measure with this scale. If we assume that a person should be able to do these 20 things independently to be truly independent, then we are assuming that functional independence is in some sense defined by the ability to perform these 20 items, and we have a formative construct. Alternatively, if we assume that the abilities to perform these 20 items simply serve as manifestations of an underlying trait of functional independence, then we have a reflective construct in mind. Statistically, this is the difference between a fixed effects and random effects model, respectively.
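To make the idea of a summary score concrete, here is a minimal sketch in base R. It uses simulated data, not the physical functioning dataset: the six items, the sample size, and all object names are hypothetical and chosen only for illustration. It compares an unweighted summed score with the first principal component score, which is itself a weighted sum of the standardized items.

```r
# Simulated stand-in data (hypothetical): 300 subjects, 6 correlated items.
set.seed(42)
n <- 300
shared <- rnorm(n)
items <- sapply(1:6, function(j) 0.8 * shared + rnorm(n, sd = 0.6))
colnames(items) <- paste0("item", 1:6)

# Unweighted summary: add up the standardized items.
summed.score <- rowSums(scale(items))

# PCA-based summary: the first component is a weighted sum of the
# standardized items, with the weights stored in the rotation matrix.
pca <- prcomp(items, center = TRUE, scale. = TRUE)
pc1.weights <- pca$rotation[, 1]
pc1.score <- pca$x[, 1]

# The sign of a principal component is arbitrary, so compare absolute values.
print(round(pc1.weights, 3))
print(abs(cor(summed.score, pc1.score)))
```

When the items are all similarly correlated, as in this simulation, the component weights are close to equal and the two summaries agree closely. Real items need not behave this way, which is why the variance explained and the squared cosines are worth inspecting before settling on a single score.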

We start with the idea that we are using these 20 items to model a formative construct, and we use PCA for this. We apply the PCA function from the FactoMineR package to the physical functioning dataset and look at the variance explained by each component:

> library(FactoMineR)
> phys.func <- read.csv('phys_func.txt')[, -1]
> phys.func.pca <- PCA(phys.func)
> summary(phys.func.pca)

Call:
PCA(phys.func) 


Eigenvalues
                       Dim.1   Dim.2   Dim.3   Dim.4   Dim.5
Variance               6.423   1.574   1.286   1.094   0.988
% of var.             32.113   7.869   6.428   5.470   4.939
Cumulative % of var.  32.113  39.983  46.410  51.880  56.818
                       Dim.6   Dim.7   Dim.8   Dim.9  Dim.10
Variance               0.865   0.800   0.738   0.699   0.657
% of var.              4.323   4.001   3.689   3.494   3.287
Cumulative % of var.  61.142  65.143  68.832  72.326  75.613
                      Dim.11  Dim.12  Dim.13  Dim.14  Dim.15
Variance               0.643   0.560   0.523   0.510   0.485
% of var.              3.213   2.802   2.613   2.552   2.425
Cumulative % of var.  78.826  81.628  84.241  86.793  89.218
                      Dim.16  Dim.17  Dim.18  Dim.19  Dim.20
Variance               0.467   0.454   0.440   0.416   0.380
% of var.              2.333   2.271   2.200   2.080   1.899
Cumulative % of var.  91.551  93.821  96.021  98.101 100.000

We can see that the first component explains more than four times as much of the variance as any other single component, and that four components are needed to explain the majority (more than 50 percent) of the variance. This suggests that there may be a simpler summary interpretation than inspecting all 20 variables for each subject.
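These figures can be checked by hand from the eigenvalues printed in the summary. The following is a small sketch; the vector is typed in from the rounded output above, and the object names are made up for illustration.

```r
# Eigenvalues copied from the summary output above (rounded to three decimals).
eig <- c(6.423, 1.574, 1.286, 1.094, 0.988, 0.865, 0.800, 0.738, 0.699, 0.657,
         0.643, 0.560, 0.523, 0.510, 0.485, 0.467, 0.454, 0.440, 0.416, 0.380)

# With standardized variables the eigenvalues sum to the number of items (20),
# so each eigenvalue divided by the total is that component's share of variance.
pct.var <- 100 * eig / sum(eig)
cum.pct <- cumsum(pct.var)

# How dominant is the first component relative to the second?
ratio.first.second <- eig[1] / eig[2]

# Smallest number of components whose cumulative share exceeds 50 percent.
n.majority <- which(cum.pct > 50)[1]

# Number of components with an eigenvalue above 1, the threshold used by the
# Kaiser-Guttman rule discussed later in this section.
n.above.one <- sum(eig > 1)

print(round(head(cbind(eig, pct.var, cum.pct), 5), 2))
print(c(ratio = ratio.first.second, majority = n.majority, above.one = n.above.one))
```

The ratio of the first to the second eigenvalue comes out just above 4, and both counts come out to four components, in line with the summary table.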

We can also draw a scree plot of the eigenvalues as follows:

> plot(phys.func.pca$eig[, 'eigenvalue'], type = 'b', xlab = 'Principal Component', ylab = 'Eigenvalue', main = 'Eigenvalues of Principal Components')

The result is as shown in the following scree plot:

[Figure: scree plot titled "Eigenvalues of Principal Components"; x-axis: Principal Component, y-axis: Eigenvalue]

The first question is whether we should simply treat physical functioning as a unidimensional scale (that is, reduce 20 dimensions to one) or as multidimensional. Based on the Kaiser-Guttman rule (retain the components whose eigenvalues are greater than 1), we should retain four components. Based on the scree plot, it is less clear how many components are worth retaining. Looking very closely at the preceding graph, there appear to be two plateaus: one that starts after the third component and another that starts after the sixth component. Thus, we should probably retain three components based on scree criteria. However, there is also the problem of interpreting the components. Let's try to make sense of a three- to four-component model by looking at the squared cosines. Remember that these reflect how well a variable is projected onto an axis, so if a variable is not well projected, then that axis is not helpful for measuring it:

> phys.func.cos <- phys.func.pca$var$cos2
> phys.func.cos[ phys.func.cos < 0.2 ] <- NA
> phys.func.cos
            Dim.1     Dim.2     Dim.3     Dim.4     Dim.5
PFQ061A        NA        NA 0.2180148        NA        NA
PFQ061B 0.3425603        NA        NA        NA        NA
PFQ061C 0.3197294        NA        NA        NA        NA
PFQ061D 0.3846285        NA        NA        NA        NA
PFQ061E 0.4029381        NA        NA        NA        NA
PFQ061F 0.3407052        NA        NA        NA        NA
PFQ061G        NA        NA 0.3237579        NA        NA
PFQ061H 0.2748247        NA        NA        NA 0.2984617
PFQ061I 0.4465091        NA        NA        NA        NA
PFQ061J 0.4352429        NA        NA        NA        NA
PFQ061K        NA 0.2953024        NA        NA        NA
PFQ061L 0.3623138        NA        NA        NA        NA
PFQ061M 0.4776348        NA        NA        NA        NA
PFQ061N 0.3667740        NA        NA        NA        NA
PFQ061O 0.3354625        NA        NA        NA        NA
PFQ061P 0.2383867        NA        NA        NA        NA
PFQ061Q 0.3746035        NA        NA        NA        NA
PFQ061R 0.3054978        NA        NA        NA        NA
PFQ061S        NA        NA        NA 0.2773378        NA
PFQ061T 0.4540454        NA        NA        NA        NA

We have set a relatively low threshold of 0.2 for a squared cosine. Most of the variables meet this criterion for the first component, but most fail to meet it for subsequent components. Every item meets the criterion on at least one component when we include four components, but two of those components (Dim.2 and Dim.4) carry only a single item each. As we can see, the items that have a low projection on the first component do not really relate to mobility, whereas those that meet our 0.2 criterion do. Based on this data, we may simply want to treat this outcome measure as a single dimension concerned with physical mobility and exclude from our scoring the four items (PFQ061A, PFQ061G, PFQ061K, and PFQ061S) that do not project well onto this dimension.
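For readers who want to see where a squared cosine comes from, the following is a minimal sketch in base R. It uses simulated data and prcomp rather than the physical functioning dataset and FactoMineR, so the data and all object names are assumptions made for illustration. For standardized variables, the squared cosine of a variable on a component coincides with the squared correlation between that variable and the component scores.

```r
# Simulated stand-in data (hypothetical): 200 subjects, 5 correlated items.
set.seed(1)
n <- 200
shared <- rnorm(n)
X <- sapply(1:5, function(j) 0.7 * shared + rnorm(n))
colnames(X) <- paste0("item", 1:5)

# PCA on standardized variables.
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Variable coordinates: each eigenvector column scaled by the standard
# deviation of its component (the square root of the eigenvalue).
coord <- sweep(pca$rotation, 2, pca$sdev, "*")

# Squared cosines are the squared coordinates; over all components they
# sum to 1 for every variable.
cos2 <- coord^2

# The same quantity computed as squared item-to-score correlations.
cos2.check <- cor(X, pca$x)^2

# Apply the same 0.2 threshold used in the text to hide weak projections.
cos2.shown <- cos2
cos2.shown[cos2.shown < 0.2] <- NA

print(round(cos2.shown, 3))
print(rowSums(cos2))
```

Because each row of squared cosines sums to 1 across all components, a variable with a low value on the first component must have its variance represented on later components, which is the pattern seen for the items that fall below the threshold on Dim.1.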

In the following sections, we will revisit this as a reflective construct yielding more interpretable results.

Tip

What kind of commonly encountered constructs are regarded as formative?

The analysis that we did begins to touch on concepts from psychometrics, a field that rarely models formative constructs because it is usually concerned with observable manifestations of invisible psychological processes (that is, reflective constructs). Socioeconomic status is one of the few well-accepted formative constructs in the psychological and social science canon.
