Part A – Fit statistics

Did you notice that the number of observations used in the regression model has decreased? Earlier, in the univariate procedure, we had 594 rows of data, and the model now uses 564. We have called this new data the build data. The last 30 observations in the data have been left out of the model-building process, and have been put in a dataset called validation.
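The split described above can be sketched in a few lines. This is only an illustration in Python, not the code used to produce the chapter's output; the function name `split_build_validation` and the placeholder rows are assumptions made for the example.

```python
def split_build_validation(rows, n_validation=30):
    """Hold out the last n_validation rows; the remaining rows form the build data."""
    if not 0 < n_validation < len(rows):
        raise ValueError("n_validation must be between 1 and len(rows) - 1")
    build = rows[:-n_validation]
    validation = rows[-n_validation:]
    return build, validation


# Placeholder for the 594 observations (the real rows are not reproduced here).
rows = list(range(1, 595))
build, validation = split_build_validation(rows)
print(len(build), len(validation))
```

With 594 rows and 30 held out, the build data contains the 564 observations that the regression reports.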

We will use the validation dataset later in the chapter:

Figure 2.10: Fit statistics for regression model

In the analysis of variance (ANOVA) table, the eight model degrees of freedom correspond to the eight independent variables used to predict the dependent variable. The model degrees of freedom equal the number of estimated parameters minus one. Since we did not exclude the intercept from the modeling statement, nine parameters are estimated (eight independent variables plus the intercept), which gives us 9-1=8 degrees of freedom. The corrected total degrees of freedom are denoted by N-1, where N is the number of observations used; with 564 observations, that is 563.

The error row in the ANOVA table is something that most modelers don't actively consider when assessing the results of regression. However, if the error degrees of freedom are equal to 0, then no F or P statistic will be produced in ANOVA, and the modeler will probably have to collect more data before the regression model can be run. The error row reports the residual degrees of freedom, that is, the corrected total degrees of freedom minus the model degrees of freedom, which in our case will be 563-8=555.

The sum of squares in ANOVA measures the variation associated with the model, the error, and the corrected total. The mean square is the sum of squares divided by its degrees of freedom. The F value is computed by dividing the model mean square by the error mean square. The F value, used in conjunction with its P value, informs us whether the independent variables, taken together, are reliable predictors of the dependent variable stock.
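The arithmetic behind the ANOVA table can be sketched as follows. The observation count (564) and the number of independent variables (8) come from the text; the two sums of squares are made-up illustrative values, not the figures in the chapter's output, and `anova_summary` is a hypothetical helper name.

```python
def anova_summary(n_obs, n_predictors, ss_model, ss_error):
    """Degrees of freedom, mean squares, and F value for a model with an intercept."""
    df_model = n_predictors            # parameters estimated minus one
    df_total = n_obs - 1               # corrected total
    df_error = df_total - df_model     # residual degrees of freedom
    if df_error <= 0:
        raise ValueError("no error degrees of freedom: F cannot be computed")
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "df_model": df_model,
        "df_error": df_error,
        "df_total": df_total,
        "ms_model": ms_model,
        "ms_error": ms_error,
        "f_value": ms_model / ms_error,
    }


# Illustrative sums of squares only (assumed, not taken from the output).
summary = anova_summary(n_obs=564, n_predictors=8, ss_model=8859.0, ss_error=1141.0)
print(summary["df_model"], summary["df_error"], summary["df_total"])
print(round(summary["f_value"], 2))
```

The guard on `df_error` mirrors the point made earlier: with no residual degrees of freedom, there is no error mean square to divide by, so no F value can be produced.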

The next important statistics that the modeler refers to in the ANOVA output are the values of the R square and the adjusted R square. The R square is the proportion of the variance in the variable stock that can be explained by the independent variables. In this case, 88.59% of the variance can be explained by the independent variables selected for modeling. As independent variables keep getting added to the model, the R square does not decrease, even when a new variable has no real predictive value. Some of the increase happens by chance. To negate the chance factor, we observe the adjusted R square, which penalizes the R square for the number of predictors. Having an R square and an adjusted R square that are close to each other is a good sign in the model. One of the reasons that the adjusted R square could vary significantly from the R square is a low number of observations combined with a high number of predictor variables. Apart from this, if the two values differ significantly, then the modeler should ensure that the model is thoroughly scrutinized. A higher adjusted R square value is a positive sign in interpreting the model, but it isn't the only measure that the modeler should rely on.
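As a rough check, the adjusted R square can be derived from the R square with the standard formula 1 - (1 - R²)(N - 1)/(N - p - 1), where N is the number of observations and p the number of independent variables. The sketch below applies it to the values quoted in the text (R square of 0.8859, 564 observations, 8 predictors); the result may differ from the reported figure in the last digits because the R square is rounded. The small-sample case of 20 observations is a made-up comparison.

```python
def adjusted_r_square(r_square, n_obs, n_predictors):
    """Standard adjusted R square for a model that includes an intercept."""
    df_total = n_obs - 1
    df_error = n_obs - n_predictors - 1
    if df_error <= 0:
        raise ValueError("not enough observations for this many predictors")
    return 1.0 - (1.0 - r_square) * df_total / df_error


# Values quoted in the text.
adj_build = adjusted_r_square(0.8859, n_obs=564, n_predictors=8)

# Hypothetical small sample: same R square, only 20 observations.
adj_small = adjusted_r_square(0.8859, n_obs=20, n_predictors=8)

print(round(adj_build, 4), round(adj_small, 4))
```

With 564 observations the penalty is small, so the two statistics stay close. With only 20 observations and the same eight predictors, the gap widens noticeably, which is the situation the paragraph above warns about.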

Figure 2.11: Parameter estimates for regression model

The parameter estimate table is critical for the model-building exercise, as this is the section that helps to identify the significant variables for prediction, and informs us of the value of the parameter estimate that is to be used in the regression equation. However, some modelers overlook the observations that can be made from looking at the diagnostic plots. Before accepting the results from the model, it is necessary to review the diagnostic plots. But for now, let's focus on the information in the parameter estimates table.

According to the table, there are six variables that are statistically significant in predicting the variable stock. All of these variables have a Pr > |t| of <0.001. We can set our confidence level at 95%, that is, a significance level (alpha level) of 0.05, to accept or reject the null hypothesis. All of the six significant variables have a P value less than 0.05. The null hypothesis tested here is that an explanatory variable doesn't have significant explanatory power for the response variable of stock price (that is, its regression coefficient is zero). In the case of these six variables, as the P value is <0.05 (our significance level), we can reject the null hypothesis and conclude that these variables can be used for explaining the relationship with the stock price movement.
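The decision rule amounts to comparing each P value with the chosen alpha. The following sketch uses invented variable names and P values purely for illustration; they are not the values in the parameter estimates table.

```python
ALPHA = 0.05  # significance level, corresponding to 95% confidence


def significant_variables(p_values, alpha=ALPHA):
    """Return the variables whose P value is below alpha (null hypothesis rejected)."""
    return [name for name, p in p_values.items() if p < alpha]


# Hypothetical P values, for illustration only.
p_values = {"var_a": 0.0004, "var_b": 0.0008, "var_c": 0.21, "var_d": 0.64}
selected = significant_variables(p_values)
print(selected)
```

Variables that fail the test are not necessarily useless; the result only says that, at this alpha, we cannot reject a zero coefficient for them.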

For every one-unit change in the significant variable of EPS, there is a 1.63450 unit change in the predicted value of the stock price, holding the other independent variables constant. This level of change per unit is provided to us by the parameter estimate. The sign of the parameter estimate also provides the direction of the relationship. Three of the six significant variables are inversely related to the stock price. These are the P/E ratio, the weighted inflation of the top 10 economies, and the weighted index of the M1 money supply for the top 10 economies. The modeler expected that the P/E ratio could be inversely related to the stock price, as a lower P/E ratio is what investors are typically looking for, with the expectation that investing in such a stock will lead to higher returns. Hence, the negative relationship between the P/E ratio and stock price makes intuitive sense.
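To make the reading of a parameter estimate concrete, the sketch below multiplies an estimate by a change in the variable, assuming all other variables are held constant. The EPS estimate of 1.63450 is the one quoted above; the negative estimate is a made-up value used only to show an inverse relationship.

```python
EPS_ESTIMATE = 1.63450  # parameter estimate for EPS quoted in the text


def predicted_change(estimate, delta_x):
    """Change in the predicted stock price when one variable moves by delta_x
    and every other independent variable stays fixed."""
    return estimate * delta_x


print(predicted_change(EPS_ESTIMATE, 1.0))  # one-unit increase in EPS
print(predicted_change(EPS_ESTIMATE, 2.0))  # two-unit increase in EPS
print(predicted_change(-0.5, 1.0))          # hypothetical negative estimate
```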

Inflation does impact consumer spending and corporate profits. As inflation rises, consumers may hold back their spending. The raw material costs may increase, and the corporations may be forced to either reduce profitability or pass the high costs onto the consumer, thereby further pushing up the cost of goods. A higher inflation rate may also be a sign of wage growth, giving consumers a bit of extra cash every month. Each company's stock price may react to changes in inflation differently, due to the many factors at play in the economy. However, for the mobile manufacturer in question, it seems that a rise in inflation would negatively impact the stock price.

The modeler expected that the M1 money supply would have a positive impact on the stock price. The assumption was that the higher the money supply in the top 10 economies of the world, the greater the chance that consumers would splurge on new mobile phones. However, here, a bit of recent history might come into play. Almost a decade ago, there was a major worldwide financial crisis, and, as a result, central banks started a program of quantitative easing. Although quantitative easing doesn't directly increase the M1 money supply, there is a chance that it could end up raising it indirectly.

In the current context, a reduction in the M1 index may be interpreted as a sign that some of the large economies are tightening or stopping quantitative easing. This may, in turn, be a positive indicator for the economy. The modeler was nevertheless surprised by the negative parameter estimate for the variable, and had to consider the reasons why the model assigned it an inverse relationship.

If you recollect the correlation output, you might have noticed that only the M1 money supply index showed an inverse relationship when we measured the strength of the relationship between each variable and the stock price. However, in the regression model output, we have three significant variables with an inverse relationship with the stock price. Remember, we also said earlier that the correlation doesn't test the significance, and we do not assume any independent or dependent variable relationship in the correlation. Furthermore, the correlation was just a measure of two variables. It didn't take into account the effects of other variables in measuring the strength of the relationship between two variables, whereas in the regression model, all independent variables are collectively trying to explain the variance seen in the stock price movement. This interplay between various independent variables is different from the correlation phenomenon. Hence, the output of the correlation and regression should be viewed in the correct perspective.
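A small synthetic example shows how a variable can correlate positively with the response on its own and still receive a negative parameter estimate once another variable is in the model. The data below is invented for the purpose (it is not the stock dataset), and the helper names `pearson` and `ols_two_predictors` are assumptions of this sketch.

```python
def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ols_two_predictors(x1, x2, y):
    """Least squares fit of y = b0 + b1*x1 + b2*x2 via centered normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [a - m1 for a in x1]
    c2 = [a - m2 for a in x2]
    cy = [a - my for a in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(a * a for a in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * b for a, b in zip(c1, cy))
    s2y = sum(a * b for a, b in zip(c2, cy))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Synthetic data: x2 tracks x1 closely, and y is built as 10 + 2*x1 - x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
wiggle = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
x2 = [a + w for a, w in zip(x1, wiggle)]
y = [10.0 + 2.0 * a - b for a, b in zip(x1, x2)]

r_x2_y = pearson(x2, y)
b0, b1, b2 = ols_two_predictors(x1, x2, y)
print(r_x2_y > 0, b2 < 0)
```

Here `x2` moves together with `x1`, so its pairwise correlation with `y` is positive, yet its regression coefficient is negative because the model already credits `x1` for the shared movement. The same kind of interplay can explain why the regression output and the correlation output disagree on direction.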

So far, in this section, we have analyzed the output of ANOVA and studied the relevance of the parameter estimates. We have also highlighted how any one statistic shouldn't be interpreted in isolation, and diagnostic plots should also be evaluated. Let's move on to assessing the diagnostic plots.
