SAS Program and Output

You can perform a principal component analysis using the PRINCOMP, CALIS, or FACTOR procedure. This chapter uses PROC FACTOR because it is a somewhat more flexible SAS procedure. (PROC FACTOR and PROC CALIS can also perform an exploratory factor analysis.) Because the analysis is performed with the FACTOR procedure, the output at times refers to factors rather than principal components (e.g., component 1 is labeled FACTOR1 in the output). Remember, however, that you are performing a principal component analysis.
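For comparison, the PRINCOMP route mentioned above might look like the following minimal sketch. It assumes a dataset named D1 containing the variables V1-V6, like the one created later in this section. PROC PRINCOMP extracts the components but has no rotation option, which is one reason this chapter works with PROC FACTOR instead.

```sas
/* Minimal sketch: unrotated principal components with PROC PRINCOMP. */
/* Assumes dataset D1 with variables V1-V6 already exists.            */
PROC PRINCOMP DATA=D1;
   VAR V1-V6;
RUN;
```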

This section provides instructions on writing the SAS program and an overview of the SAS output. A subsequent section provides a more detailed treatment of the steps followed in the analysis as well as the decisions to be made at each step.

Writing the SAS Program

The DATA Step

To perform a principal component analysis, you can enter the data as raw data, a correlation matrix, a covariance matrix, or in other forms. (See Chapter 3, “Data Input,” for a further description of these data input options.) Raw data are analyzed for this chapter’s first example.

Assume that you administered the POI to 50 participants and entered their responses according to the following guide:

Line   Column   Variable Name   Explanation
1      1-6      V1-V6           Participants’ responses to survey questions 1 through 6. Responses were made using a 7-point Likert-type scale.

Here are the statements that input these responses as raw data. The first three and the last three observations are reproduced here; for the entire dataset, see Appendix B.

DATA D1;
   INPUT    #1    @1   (V1-V6)    (1.)  ;

DATALINES;
556754
567343
777222
.
.
.
767151
455323
455544
;
RUN;
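The INPUT statement above reads six one-column values from the start of the first line of each record. Because line 1 and column 1 are the default starting position, the `#1` and `@1` pointers can be omitted. The sketch below illustrates this with only the first three observations shown above; the dataset name D1_CHECK is hypothetical and is used so that D1 is not overwritten.

```sas
/* Sketch: same format list without the explicit #1 and @1 pointers. */
/* D1_CHECK is a hypothetical name; it holds three observations only. */
DATA D1_CHECK;
   INPUT (V1-V6) (1.);   /* six variables, each read with the 1. informat */
DATALINES;
556754
567343
777222
;
RUN;

/* Print the result to confirm each digit landed in its own variable */
PROC PRINT DATA=D1_CHECK;
RUN;
```

Printing a few records this way is a quick check that the column layout in the guide matches what SAS actually read.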

The dataset in Appendix B includes only 50 cases so that it is relatively easy to enter the data and replicate the analyses presented here. However, it should be remembered that 50 observations normally constitute an unacceptably small sample for a principal component analysis. Earlier, it was said that a sample should provide usable data from the larger of either 100 cases or five times the number of observed variables. A small sample is being analyzed here for illustrative purposes only.
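The sample-size guideline can be worked through for the present six-variable example. The following DATA step is only an arithmetic sketch; the variable names n_vars and min_n are illustrative and are not part of the analysis.

```sas
/* Sketch: larger of 100 cases or five times the number of variables */
DATA _NULL_;
   n_vars = 6;                      /* V1 through V6 */
   min_n  = MAX(100, 5 * n_vars);   /* 5 x 6 = 30, so 100 is the larger value */
   PUT min_n=;                      /* writes min_n=100 to the SAS log */
RUN;
```

With six variables, the guideline calls for at least 100 usable cases, twice the 50 analyzed here.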

The PROC FACTOR Statement

The general form for the SAS program to perform a principal component analysis is presented here:

PROC FACTOR   DATA=dataset-name
              SIMPLE
              METHOD=PRIN
              PRIORS=ONE
              MINEIGEN=p
              SCREE
              ROTATE=VARIMAX
              ROUND
              FLAG=desired-size-of-"significant"-factor-loadings ;
   VAR  variables-to-be-analyzed ;
RUN;

Options Used with PROC FACTOR

The PROC FACTOR statement begins the FACTOR procedure; a number of options can be requested in this statement before it ends with a semicolon. Some options that can be especially useful in social science research are:

FLAG

causes the output to flag (with an asterisk) any factor loading whose absolute value is greater than some specified size. For example, if you specify

     FLAG=.35

an asterisk appears next to any loading whose absolute value exceeds .35. This option can make a factor pattern much easier to interpret. Negative values are not allowed in the FLAG option. FLAG can be used in conjunction with the ROUND option.
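As a sketch of FLAG and ROUND used together, the following step assumes the dataset D1 and variables V1-V6 from the DATA step shown earlier. The .35 cutoff is simply the value from the example above, not a recommendation.

```sas
/* Sketch: flag loadings whose absolute value exceeds .35, */
/* and print coefficients as rounded integers.             */
PROC FACTOR DATA=D1
            METHOD=PRIN
            PRIORS=ONE
            MINEIGEN=1
            ROTATE=VARIMAX
            ROUND
            FLAG=.35 ;
   VAR V1-V6;
RUN;
```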

METHOD=factor-extraction-method

specifies the method to be used in extracting the factors or components. The current program specifies METHOD=PRIN to request the principal method for the initial extraction. Combined with PRIORS=ONE, this produces a principal component analysis; with other prior communality estimates, the same method produces a principal axis (principal factors) analysis.

MINEIGEN=p

specifies the critical eigenvalue a component must display if that component is to be retained (here, p = the critical eigenvalue). For example, the current program specifies

     MINEIGEN=1

This option causes PROC FACTOR to retain and rotate any component with an eigenvalue of 1.00 or larger. Negative values are not allowed.

NFACT=n

allows you to specify the number of components to be retained and rotated, where n = the number of components.
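For instance, if you decided to retain two components, the request might look like the sketch below. It assumes the dataset D1 from the earlier DATA step; the choice of two components is purely illustrative.

```sas
/* Sketch: retain and rotate exactly two components (illustrative choice) */
PROC FACTOR DATA=D1
            METHOD=PRIN
            PRIORS=ONE
            NFACT=2
            ROTATE=VARIMAX ;
   VAR V1-V6;
RUN;
```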

OUT=name-of-new-dataset

creates a new dataset that includes all variables of the existing dataset, along with factor scores for the components retained in the present analysis. Component 1 is given the variable name FACTOR1, component 2 is given the name FACTOR2, and so forth. The OUT option must be used in conjunction with the NFACT option, and the analysis must be based on raw data.
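A minimal sketch of saving and inspecting component scores follows. It assumes the raw-data dataset D1 from the earlier DATA step; the output dataset name D2 and the choice of two components are hypothetical.

```sas
/* Sketch: write component scores to a new dataset (D2 is a hypothetical name) */
PROC FACTOR DATA=D1
            METHOD=PRIN
            PRIORS=ONE
            NFACT=2
            ROTATE=VARIMAX
            OUT=D2 ;
   VAR V1-V6;
RUN;

/* List the scores for the first five participants */
PROC PRINT DATA=D2 (OBS=5);
   VAR FACTOR1 FACTOR2;
RUN;
```

The new dataset can then be used as input to later procedures that treat FACTOR1 and FACTOR2 as ordinary variables.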

PRIORS=prior-communality-estimates

specifies prior communality estimates. Users should always specify PRIORS=ONE to perform a principal component analysis.
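To see why the PRIORS setting matters, compare the two requests sketched below, which assume the dataset D1 from the earlier DATA step. The first is the principal component analysis used in this chapter. The second swaps in squared multiple correlations as prior communality estimates, which turns the run into a factor analysis; it is shown only for contrast.

```sas
/* Principal component analysis: prior communalities set to 1 */
PROC FACTOR DATA=D1 METHOD=PRIN PRIORS=ONE;
   VAR V1-V6;
RUN;

/* Contrast only: squared multiple correlations as priors,   */
/* which is a factor analysis, not a component analysis.     */
PROC FACTOR DATA=D1 METHOD=PRIN PRIORS=SMC;
   VAR V1-V6;
RUN;
```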

ROTATE=rotation-method

specifies the rotation method to be used. The preceding program requests a varimax rotation that provides orthogonal (uncorrelated) components. Oblique rotations, which allow the components to be correlated, can also be requested.
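As one example of an oblique request, the sketch below replaces VARIMAX with PROMAX. It assumes the dataset D1 from the earlier DATA step and is an illustration of the syntax, not a recommendation for these data.

```sas
/* Sketch: oblique (promax) rotation in place of varimax */
PROC FACTOR DATA=D1
            METHOD=PRIN
            PRIORS=ONE
            MINEIGEN=1
            ROTATE=PROMAX ;
   VAR V1-V6;
RUN;
```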

ROUND

causes all coefficients to be limited to two decimal places, multiplied by 100, and rounded to the nearest integer (thus eliminating the decimal point). This generally makes it easier to read coefficients because factor loadings and correlation coefficients in the matrices printed by PROC FACTOR are normally carried to several decimal places.

SCREE

creates a plot that graphically displays the size of the eigenvalues associated with each component. This can be used to perform a scree test to visually determine how many components should be retained.

SIMPLE

requests simple descriptive statistics: the number of usable cases on which the analysis was performed and the means and standard deviations of the observed variables.

The VAR Statement

The variables to be analyzed are listed on the VAR statement with each variable separated by at least one space. Remember that the VAR statement is a separate statement, not an option within the FACTOR statement; don’t forget to end the FACTOR statement with a semicolon before beginning the VAR statement.
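When variable names share a prefix and end in consecutive numbers, a numbered range list is a shorter way to write the VAR statement. The sketch below assumes the dataset D1 from the earlier DATA step; the two VAR statements name the same six variables, so only one should be used.

```sas
PROC FACTOR DATA=D1 METHOD=PRIN PRIORS=ONE ;   /* semicolon ends PROC FACTOR */
   VAR V1-V6;                  /* numbered range list */
   /* VAR V1 V2 V3 V4 V5 V6;      equivalent explicit list */
RUN;
```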

Example of an Actual Program

The following is an actual program, including the DATA step, which could be used to analyze fictitious study data. Only a few sample lines of data appear here; the entire dataset is in Appendix B.

DATA D1;
   INPUT    #1    @1   (V1-V6)    (1.)  ;

DATALINES;
556754
567343
777222
.
.
.
767151
455323
455544
;
RUN;

PROC FACTOR   DATA=D1
              SIMPLE
              METHOD=PRIN
              PRIORS=ONE
              MINEIGEN=1
              SCREE
              ROTATE=VARIMAX
              ROUND
              FLAG=.40   ;
   VAR V1 V2 V3 V4 V5 V6;
RUN;

Results from the Output

If printer options are set so that LINESIZE=80 and PAGESIZE=60, the preceding program produces five pages of output. The following list shows some of the most important output and the page on which it appears:

  • Page 1 includes simple statistics.

  • Page 2 includes the eigenvalue table.

  • Page 3 includes the scree plot of eigenvalues.

  • Page 4 includes the unrotated factor pattern and final communality estimates.

  • Page 5 includes the rotated factor pattern.
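The printer options mentioned above can be set with an OPTIONS statement placed ahead of the DATA step. This is a sketch only: these settings govern traditional listing output, so page breaks may differ if your SAS session routes results to another destination.

```sas
/* Sketch: set line and page size before running the program */
OPTIONS LINESIZE=80 PAGESIZE=60;
```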

The output created by the preceding program is reproduced here as Output 15.1:

Output 15.1. Results of the Initial Principal Component Analysis of the Prosocial Orientation Inventory (POI) Data


Page 1 from Output 15.1 provides simple statistics for the observed variables included in the analysis. Once the SAS log is checked to verify that no errors were made in the analysis, these simple statistics should be reviewed to determine how many usable observations were included in the analysis, and to verify that the means and standard deviations are in the expected range. The top line of Output 15.1, page 1, says “Means and Standard Deviations from 50 Observations,” meaning that data from 50 participants are included in the analysis.
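The same checks can be made independently of PROC FACTOR. The sketch below assumes the dataset D1 from the earlier DATA step and uses PROC MEANS to list the number of usable cases, the mean, the standard deviation, and the minimum and maximum for each item. Because responses were made on a 7-point scale, a minimum below 1 or a maximum above 7 would point to a data-entry error.

```sas
/* Sketch: cross-check the simple statistics reported by PROC FACTOR */
PROC MEANS DATA=D1 N MEAN STD MIN MAX;
   VAR V1-V6;
RUN;
```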
