Odd-Numbered Exercise Solutions

Chapter 6 Solutions

1. Tasks Statistics Summary Statistics

On the DATA tab select the data set Iris from the SASHELP library.

Under Analysis variables select PetalLength and PetalWidth and click OK.

On the OPTIONS tab, check boxes for Mean, Standard deviation, Median, Number of Observations, Number of missing values.

Under Additional Statistics check Confidence limits for the mean.

Under Plots, select Histogram.
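
If you prefer to work in code, the selections above can be approximated with the following sketch. It is not necessarily the code the task generates; it simply requests the same statistics with PROC MEANS and draws the histograms with PROC UNIVARIATE.

```sas
/* Sketch only: approximates the Summary Statistics task settings */
proc means data=sashelp.iris mean std median n nmiss clm;
   var PetalLength PetalWidth;
run;

/* Histograms of the same two variables; NOPRINT suppresses the tables */
proc univariate data=sashelp.iris noprint;
   var PetalLength PetalWidth;
   histogram PetalLength PetalWidth;
run;
```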

3. Tasks Statistics One-Way Frequencies

On the DATA tab, select the data set Iris from the SASHELP library.

Under Analysis variables select Species.

5. Tasks Statistics Distribution Analysis

On the DATA tab, select the data set Heart in the SASHELP library.

Under Analysis variables select Weight.

Under Additional Roles and Group analysis by, select Sex.

On the OPTIONS tab, check boxes for Normality and Histogram and Goodness of fit.

The p-value for the Anderson-Darling test is <.005, so you reject the null hypothesis that the data values are normally distributed.
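
A minimal code sketch of this analysis is shown below. It assumes that grouping by Sex is done with a BY statement, which requires sorted data; the data set name work.heart_sorted is made up for this illustration, and the code the task generates may differ.

```sas
/* Sort a copy of the data so that BY-group processing works */
proc sort data=sashelp.heart out=work.heart_sorted;   /* hypothetical output name */
   by Sex;
run;

/* NORMAL on the PROC statement requests the tests for normality;      */
/* NORMAL on the HISTOGRAM statement adds a fitted normal curve and    */
/* goodness-of-fit tests, including Anderson-Darling                   */
proc univariate data=work.heart_sorted normal;
   by Sex;
   var Weight;
   histogram Weight / normal;
run;
```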

7. Tasks Graphs Box plot

On the DATA tab, select the Cars data set in the SASHELP library.

Click on Horizontal plot.

Under Analysis variables select Invoice.

Under Category select Cylinders.

Run this task and then go back to the DATA tab and add the filter: Cylinders=4 or Cylinders=6.
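
The filtered version of this plot can also be sketched in code. The example below uses PROC SGPLOT with a WHERE statement in place of the task filter; the task itself may produce different code.

```sas
/* Horizontal box plots of Invoice for 4- and 6-cylinder cars only */
proc sgplot data=sashelp.cars;
   where Cylinders in (4, 6);
   hbox Invoice / category=Cylinders;
run;
```

Remove the WHERE statement to reproduce the first, unfiltered run.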

Chapter 7 Solutions

1. Tasks Statistics t Tests

On the DATA tab, select the Heart data set in the SASHELP library.

Be sure One-Sample t test is displayed in the box.

Under Analysis variables select Weight.

On the OPTIONS tab, be sure a two-tailed test is selected (it is the default).

The alternative hypothesis is mu ^= 150.

Check the box for a test of normality.

Under Plots select histogram and box plots.

You should NOT be concerned that the tests for normality are significant. The histogram looks symmetrical and the sample size is 5,000. With large sample sizes, tests for normality are often significant even when the departure from normality is small.
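
For reference, a rough code equivalent of this one-sample test is sketched below. The H0= and SIDES= options mirror the choices made on the OPTIONS tab; running PROC UNIVARIATE separately for the normality tests is an assumption of this sketch, not necessarily what the task does.

```sas
/* One-sample, two-tailed t test of H0: mu = 150 */
proc ttest data=sashelp.heart h0=150 sides=2
           plots(only)=(histogram boxplot);
   var Weight;
run;

/* Tests for normality on the same variable */
proc univariate data=sashelp.heart normal;
   var Weight;
run;
```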

3.

a. Tasks Data List Data

On the DATA tab, select the Air data set in the SASHELP library.

Under the List variables box, select Date and Air (the only variables in the data set).

On the OPTIONS tab, under Rows to list, select First n rows.

Enter 10 in the Amount box.

b. Tasks Statistics Summary Statistics

On the DATA tab, be sure the Air data set is still selected.

Under Analysis variables, select Air.

On the OPTIONS tab, select Plots and Histogram and box plot.

c. Tasks Statistics t Tests

On the DATA tab, be sure that a one-sample t Test is selected.

Under Analysis Variables, select Air.

On the OPTIONS tab, set the alternative hypothesis to mu ^= 285.

Check tests for normality and Nonparametric tests (Sign test and Wilcoxon signed rank test).

All tests fail to reject the null hypothesis. With an n of 144 and only a moderate tail on the distribution, you could probably use the t Test, but it is a good idea to run the Wilcoxon signed rank test as well.
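
One way to get all three location tests in a single step is sketched below. With the MU0= option, PROC UNIVARIATE prints Student's t, the sign test, and the signed rank test together in its Tests for Location table; this is offered as an alternative to the task, not as the code the task generates.

```sas
/* t test, sign test, and Wilcoxon signed rank test of H0: location = 285, */
/* plus tests for normality                                                */
proc univariate data=sashelp.air mu0=285 normal;
   var Air;
run;
```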

5. Rerun the program with 20 changed to 200 and rerun the task in exercise 4.

All the tests for normality reject the null hypothesis. This is because you now have more power with a larger sample size. However, because this is a uniform distribution and n = 200, you can be comfortable running a t Test.

Chapter 8 Solutions

1. Tasks Statistics t Tests

On the DATA tab, select the Heart data set in the SASHELP library.

Select a two-sample t Test, Systolic Blood Pressure as your Analysis variable and Sex as your Groups variable.

On the OPTIONS tab be sure the box for Tests of normality is checked.

You reject the null hypothesis of normality for both males and females, but with an n of 5,000 and a reasonable distribution, it is OK to run a t Test. Neither of the two p-values for the t Test is significant.
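
A minimal sketch of the corresponding code follows. It assumes that the analysis variable labeled Systolic Blood Pressure is named Systolic in SASHELP.HEART. PROC TTEST prints two p-values for the difference, one assuming equal variances (Pooled) and one not (Satterthwaite).

```sas
/* Two-sample t test comparing systolic blood pressure by Sex */
proc ttest data=sashelp.heart;
   class Sex;
   var Systolic;
run;

/* Tests for normality within each level of Sex */
proc univariate data=sashelp.heart normal;
   class Sex;
   var Systolic;
run;
```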

3. If you are using the SAS University Edition, you will find the file Ttest_Data.xlsx in My Folders. If not, type in the data in Excel.

Tasks Utilities Import Data

Scroll down and click on the Change box and name your file Ttest_Data and leave it in the WORK library.

Tasks Statistics t Tests

On the DATA tab, select Ttest_Data in the WORK library and select a two-sample test.

Select Score1 as the Analysis variable and Method as the Groups variable. The variances are not significantly different so you can use the Pooled values. Score1 is significant at the .05 level.

5. Tasks Statistics t Tests

On the DATA tab, select the Cars data set in the SASHELP library.

Create a filter that reads: Cylinders = 4 or Cylinders = 6

On the DATA tab, select Two-sample t Test.

Select Invoice as the Analysis variable and Cylinders as the Groups variable.

On the OPTIONS tab select Tests for normality and the Wilcoxon rank sum test.

Leave the defaults on the PLOTS tab.

All statistical tests are highly significant.
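
The same comparison can be sketched in code as shown below. The WHERE statement stands in for the task filter, and PROC NPAR1WAY with the WILCOXON option supplies the rank sum test. This is an illustration rather than the exact code the task writes.

```sas
/* Two-sample t test on Invoice for 4- versus 6-cylinder cars */
proc ttest data=sashelp.cars;
   where Cylinders in (4, 6);
   class Cylinders;
   var Invoice;
run;

/* Wilcoxon rank sum test on the same subset */
proc npar1way data=sashelp.cars wilcoxon;
   where Cylinders in (4, 6);
   class Cylinders;
   var Invoice;
run;
```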

7. Change the value on the DO loop to 50 from Exercise 6.

If you are using SAS University Edition, this data set will be created when you run Create_Datasets.sas. You can also get it from the download (it’s part of the Create_Datasets.sas program) or you can type it in.

Tasks Statistics t Tests

On the DATA tab find data set TTest in the WORK library. Select a two-sample t Test.

Select variable X as the Analysis variable and Group as the Groups variable.

The p-value is .0094 (significant).

Chapter 9 Solutions

1. Tasks Linear Models One-Way ANOVA

On the DATA tab, select the High_School data set in the STATS library.

Select Vocab_Score as the Dependent variable and Grade as the Categorical variable.

On the OPTIONS tab, be sure that the Tukey multiple comparison test is checked.

Grade is highly significant. Pair-wise comparisons that are not significant are Junior vs Senior and Senior vs Sophomore (close but > .05). Levene’s test is not significant.
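
If you want to see this analysis as code, the sketch below is one way to write it. It assumes the STATS library has already been assigned; the task may generate somewhat different statements.

```sas
/* One-way ANOVA with Levene's test and Tukey pairwise comparisons */
proc glm data=stats.high_school;
   class Grade;
   model Vocab_Score = Grade;
   means Grade / hovtest=levene tukey;
run;
quit;
```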

3. Run the program shown. That program is also in Create_Datasets.sas if you don’t want to type it.

Tasks Linear Models One-Way ANOVA

On the DATA tab select data set Temp in the WORK library, select Weekly_Salary as your dependent variable and Gender_Age as your Categorical variable. The ANOVA is highly significant as are all the pairwise comparisons.

5. Tasks Linear Models Nonparametric One-Way ANOVA

On the DATA tab, select the Heart data set in the SASHELP library.

Enter Cholesterol as the Dependent variable and DeathCause as the Categorical variable.

On the OPTIONS tab, select multiple comparisons.

Pairwise differences less than .05 are: Other vs Coronary Heart Disease; Cancer vs Coronary Heart Disease; Cerebral Vascular Disease vs Coronary Heart Disease.

Parametric analysis gives identical results.
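
A possible code version of the nonparametric analysis is sketched below. The DSCF option requests Dwass, Steel, Critchlow-Fligner pairwise comparisons; whether the task uses this same multiple comparison method is an assumption of this sketch.

```sas
/* Kruskal-Wallis test with pairwise multiple comparisons */
proc npar1way data=sashelp.heart wilcoxon dscf;
   class DeathCause;
   var Cholesterol;
run;
```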

7. Tasks Linear Models One-Way ANOVA

On the DATA tab, select the Cars data set in the SASHELP library.

Enter the following in the filter box: Cylinders IN (4,6,8)

Select Weight as the Dependent variable and Cylinders as the Categorical variable.

All pairwise differences are highly significant.

Chapter 10 Solutions

1. Tasks Linear Models N-Way ANOVA

On the DATA tab, select the High_School data set in the STATS library.

Enter Vocab_Score as the Dependent variable, Gender and Grade as Factors.

On the MODEL tab, click on Edit, select Gender and Grade in the Variables list then click on Full Factorial and scroll down to OK.

The interaction term is not significant and both main effects are significant at the .05 level.

The only pair-wise comparison for Grade that is not significant is between Junior and Senior.
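
For comparison, a full factorial model like this one can be written in code roughly as follows. This is a sketch that assumes the STATS library is assigned and uses Tukey-adjusted LSMEANS comparisons, which may not match the task's default exactly.

```sas
/* Two-way ANOVA: both main effects plus their interaction */
proc glm data=stats.high_school;
   class Gender Grade;
   model Vocab_Score = Gender Grade Gender*Grade;
   lsmeans Grade / pdiff adjust=tukey;
run;
quit;
```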

3. Tasks Linear Models N-Way ANOVA

On the DATA tab, select the Interact data set in the STATS library, select Parts as the Dependent variable, Training and Seniority as Factors.

On the MODEL tab, select Training and Seniority as Variables, click Full Factorial and OK.

Notice the highly significant interaction term. Go back to the OPTIONS tab. Under Statistics, select All effects in the pull-down list under multiple comparisons. Rerun the program. Notice that training has an effect on beginning workers but not on the long-time workers.

5. Tasks Linear Models N-Way ANOVA

On the DATA tab, select the data set Salary in the STATS library.

Select Weekly_Salary as the Dependent variable, Age_Group, Education, and Gender as Factors.

On the MODEL tab, click on Edit, select Age_Group, Education, and Gender as Variables, click on N-Way Factorial, select two-way interactions and click OK.

On the OPTIONS tab, select Plots and Suppress Plots.

Chapter 11 Solutions

1. Tasks Statistics Correlation Analysis

On the DATA tab, select the data set High_School in the STATS library.

Select the three variables as Analysis variables.

On the OPTIONS tab, select a Matrix of Plots for Type of plot and check the box for including histograms on the diagonal.

3. Tasks Statistics Correlation Analysis

On the DATA tab, select the Heart data set in the SASHELP library. Select Height, Weight, Systolic blood pressure, and Diastolic blood pressure as Analysis Variables.

On the OPTIONS tab, be sure the selection of None is in the box under Plots.

When you run it again with a request for individual plots, you do not get any plots. You need to increase the Maximum number of points to “No limit.”
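
In code, the point limit is controlled by the MAXPOINTS= plot option of PROC CORR. The sketch below assumes the variable names Height, Weight, Systolic, and Diastolic in SASHELP.HEART and is meant only to show where the option goes.

```sas
/* Individual scatter plots with the point limit removed */
proc corr data=sashelp.heart plots(maxpoints=none)=scatter;
   var Height Weight Systolic Diastolic;
run;
```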

5. Tasks Statistics Correlation Analysis

On the DATA tab, select the High_School data set in the STATS library.

Select the variables Vocab_Score, Spelling_Score, and English_Grade in the Analysis Variables box. Select the variable Honor in the Correlate With box.

On the OPTIONS tab, under Plots, select None.

7. Tasks Statistics Correlation Analysis

On the DATA tab, select Physics_Test in the STATS library. Select the variables Ans1 through Ans10 for the Analysis Variables and Grade as the Correlate With variable.

On the OPTIONS tab, under Plots, select None.

Chapter 12 Solutions

1. Tasks Linear Models Linear Regression

On the DATA tab, select the data set High_School in the STATS library. Select Spelling_Score as the Dependent variable and Vocab_Score as the Continuous variable.

On the MODEL tab, click Edit and select Vocab_Score in the variable box and then click on Add. Finally, scroll down and click on OK.

The regression equation is Spelling_Score = 45.54663 + .74910 x Vocab_Score. For a Vocab_Score of 100, the predicted Spelling_Score is 45.54663 + .74910 x 100 = 120.46.
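
The model and the hand calculation can both be checked with a short program. The sketch below assumes the STATS library is assigned; the DATA _NULL_ step simply plugs 100 into the fitted equation using the coefficients quoted above and writes the result to the log.

```sas
/* Simple linear regression of Spelling_Score on Vocab_Score */
proc reg data=stats.high_school;
   model Spelling_Score = Vocab_Score;
run;
quit;

/* Predicted Spelling_Score for a Vocab_Score of 100 */
data _null_;
   Vocab_Score = 100;
   Predicted = 45.54663 + 0.74910 * Vocab_Score;
   put Predicted= 8.2;
run;
```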

3. Tasks Linear Models Linear Regression

On the DATA tab, select the High_School data set in the STATS library. Select English_Grade as the Dependent variable and Vocab_Score as a Continuous variable. In the box for Categorical variables, add Honor (0 or 1) and Gender.

On the MODEL tab, click on Edit and choose all four variables in the variable box. Then click Add and OK.

On the MODELS tab, select Default and additional statistics. Under collinearity, check the box for Variance Inflation Factor. Even though none of the VIFs are high, notice that Vocab_Score is no longer significant and its beta (slope) is negative.

5. Tasks Linear Models Linear Regression

On the DATA tab, select the Cars data set in the SASHELP library. Select MSRP (manufacturer's suggested retail price) as the Dependent variable and the three variables Horsepower, Weight, and Length as Continuous variables.

On the MODEL tab, click Edit, select the three predictor variables Horsepower, Weight, and Length in the Variables box, click Add and OK.

On the SELECTION tab, choose Stepwise selection and run the model. Only Horsepower and Length enter. It is interesting to note that the coefficient for Length is negative. That’s probably because of very expensive sports cars.
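
A stepwise model of this kind can be sketched with PROC GLMSELECT, as shown below. The choice of procedure and its default selection criterion are assumptions of this sketch; if the task uses different entry and stay rules, the variables selected may not match exactly.

```sas
/* Stepwise selection among the three candidate predictors */
proc glmselect data=sashelp.cars;
   model MSRP = Horsepower Weight Length / selection=stepwise;
run;
```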

Chapter 13 Solutions

1. Tasks Statistics Summary Statistics

On the DATA tab, select the Graduate data set in the STATS library. Select English_Grade and Math_Grade as Analysis variables, Study and Graduate as Classification variables.

If you wish, you can open the OPTIONS tab and select comparative box plots and/or histograms.

3. Tasks Linear Models Binary Logistic Regression

On the DATA tab, select the Graduate data set in the STATS library. Choose Graduate as the Analysis variable and 1 (yes) as the Event of interest. Select Study as a Classification variable and the two variables English_Grade and Math_Grade as Continuous variables. Select Reference coding as the parameterization method. Yes, the model improved, as shown by a smaller value of SC (Schwarz Criterion), more concordant pairs, and fewer discordant pairs. If you choose, you can edit the code to modify the CLASS statement to read:

   class Study (ref='0') / param=ref;

5. Tasks Linear Models Binary Logistic Regression

On the DATA tab, select the Graduate data set in the STATS library. Choose Graduate as the Analysis variable and 1 (yes) as the Event of interest. Select Study as a Classification variable and the two variables English_Grade and Math_Grade as Continuous variables. Select Reference coding as the parameterization method.

On the SELECTION tab, select Stepwise as the Selection method. If you wish, you can change the reference level for Study to '0'. All variables are selected. Your edited code should look like this:

proc logistic data=STATS.GRADUATE;

   class Study (ref='0') / param=ref;

   model Graduate(event='1')=Study English_Grade Math_Grade / link=logit 

      selection=stepwise slentry=0.05 slstay=0.05 hierarchy=single technique=fisher;

run;

Chapter 14 Solutions

1. Tasks Statistics One-way Frequencies

On the DATA tab, select the Graduate data set in the STATS library. Under Analysis variables, select Gender, Study, and Graduate.

On the OPTIONS tab, deselect Percentages and Cumulative frequencies and percentages. Under Plots, click on Suppress all plots.

3. Tasks Statistics Table analysis

On the DATA tab, select the Graduate data set in the STATS library. Select Study in the Row variables box and Graduate in the Column variables box.

On the OPTIONS tab, be sure Chi-square is checked on the Statistics menu.

Click on CODE and Edit so that you can write your format. Your final program should look like this:

proc format;

   value Graduate 1='1:Yes' 0='2:No';

proc freq data=STATS.GRADUATE order=formatted;

   format Graduate Graduate.;

   tables  (Study) *(Graduate) / chisq nopercent norow nocol nocum plots=none;

run;

5. Tasks Statistics One-way Frequencies

On the DATA tab, select the Fish data set in the SASHELP library. Select Species in the Analysis variables box.

On the OPTIONS tab, under Row Value Order, select Decreasing frequency. Make your own choices about cumulative statistics and plots.

Chapter 15 Solutions

1. Tasks Power and Sample Size t Tests

On the PROPERTIES tab, select Two-sample test under type of test. You want to solve for sample size per group. Leave the default selection of Pooled for Type of test. Under Select a Form, select Group means. Enter 200 for Group 1 and 210 for Group 2. Enter 10 as the Standard deviation. Under Power, enter .8, click the + sign, enter .85, click the + sign and enter .9.
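
These settings map fairly directly onto PROC POWER. The sketch below is one way to write the request by hand; the missing value (.) for NPERGROUP tells the procedure to solve for the sample size per group at each of the three power values. The code the task generates may include additional options.

```sas
/* Solve for sample size per group in a pooled two-sample t test */
proc power;
   twosamplemeans test=diff
      groupmeans = 200 | 210
      stddev     = 10
      power      = 0.8 0.85 0.9
      npergroup  = .;
run;
```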

3. Tasks Power and Sample Size One-way ANOVA

On the PROPERTIES tab, select Sample size per group. For the number of groups select 3. Under means enter the three means 50, 55, and 60, click the + sign and enter three values 50, 60, and 70.

Under Standard deviation, enter the two values 15 and 20 in the usual way. Enter a power of .8.

On the PLOTS tab, select Power by sample size and enter .7 as the minimum value and .9 as the maximum power.
