How to do it... 

In the following exercise, we will work with a model containing several correlated regressors:

  1. First, we load the libraries we need and build a dataset to work with. It will contain two variables (V1, V2) that are highly correlated, another pair (V3, V4) that is also highly correlated, and a fifth variable (V5) that is independent of the other four. We want to compare Ridge against ordinary least squares and see what happens for the two correlated pairs and for the independent variable, so we wrap everything into a function that draws a boxplot of the estimated coefficients. Our preferred way of doing Ridge is via the glmnet package. Its glmnet function actually fits something slightly more general than Ridge, called elastic net, which is a mixture of Ridge and Lasso (another penalized regression technique that we will review in the next chapter). A parameter called alpha controls how much Ridge/Lasso we want (a short sketch after the function illustrates this):
library(MASS)
library(glmnet)
library(tidyr)
library(dplyr)
library(ggplot2)
get_results <- function(lambda){
  coeffs_total <- data.frame(V1=numeric(), V2=numeric(), V3=numeric(),
                             V4=numeric(), V5=numeric(), method=character())
  for (q in 1:100){
    # Two highly correlated pairs (V1-V2 and V3-V4) plus an independent regressor (V5)
    V1 <- runif(1000)*100
    V2 <- runif(1000)*10 + V1
    V3 <- runif(1000)*100
    V4 <- runif(1000)*10 + V3
    V5 <- runif(1000)*100
    Residuals <- runif(1000)*100
    Y <- V1 + V2 + V3 + V4 + V5 + Residuals
    # OLS coefficients (element 1 is the intercept, so V1-V5 are elements 2-6)
    coefs_lm <- lm(Y ~ V1 + V2 + V3 + V4 + V5)$coefficients
    # Ridge coefficients: alpha=0 requests a pure Ridge penalty; $beta has no intercept
    coefs_rd <- glmnet(cbind(V1, V2, V3, V4, V5), Y, lambda=lambda, alpha=0)$beta
    frame1 <- data.frame(V1=coefs_lm[2], V2=coefs_lm[3], V3=coefs_lm[4],
                         V4=coefs_lm[5], V5=coefs_lm[6], method="lm")
    frame2 <- data.frame(V1=coefs_rd[1], V2=coefs_rd[2], V3=coefs_rd[3],
                         V4=coefs_rd[4], V5=coefs_rd[5], method="ridge")
    coeffs_total <- rbind(coeffs_total, frame1, frame2)
  }
  transposed_data <- gather(coeffs_total, "variable", "value", 1:5)
  print(ggplot(transposed_data, aes(x=variable, y=value, fill=method)) +
    geom_boxplot())
  print(transposed_data %>% group_by(variable, method) %>%
    summarise(median=median(value)))
}
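
Before moving on, here is a quick illustration of the alpha argument mentioned previously (a minimal sketch on made-up data, not part of the recipe): alpha=0 gives a pure Ridge penalty, alpha=1 gives a pure Lasso penalty, and anything in between is an elastic net mix of the two:

library(glmnet)
set.seed(1)
x <- matrix(runif(200 * 3), ncol = 3)          # three synthetic regressors
y <- x[, 1] + x[, 2] + rnorm(200, sd = 0.1)    # only the first two matter
coef(glmnet(x, y, alpha = 0,   lambda = 0.1))  # alpha = 0: pure Ridge penalty
coef(glmnet(x, y, alpha = 1,   lambda = 0.1))  # alpha = 1: pure Lasso penalty
coef(glmnet(x, y, alpha = 0.5, lambda = 0.1))  # 0 < alpha < 1: elastic net mix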
  2. We call the function using lambda=8. There are two important things to note. Firstly, the Ridge estimates are much more stable than their lm counterparts (less variability, that is, shorter boxes in the boxplot). Secondly, most coefficients get slightly compressed toward zero, and thus away from 1, which is their true value, as shown in the following table: this is the bias introduced by Ridge (see the objective sketched after the figures). Note that we use alpha=0 in the glmnet call, meaning that we don't want any Lasso regularization here (we just want all of it to be Ridge). It's also worth noting that the independent variable (V5) gets compressed noticeably as well, although its coefficient does not end up at exactly zero:
get_results(8) 

The following screenshot shows the boxplots for the coefficients:

The following screenshot shows the medians for each coefficient:
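
For reference, the compression seen in the medians comes from the penalty term. For alpha=0, glmnet minimizes, roughly (a sketch of glmnet's parameterization rather than something shown in this recipe):

$$\frac{1}{2n}\sum_{i=1}^{n}\left(y_i - \beta_0 - x_i^{\top}\beta\right)^2 + \frac{\lambda}{2}\sum_{j=1}^{p}\beta_j^2$$

The larger the lambda, the harder the squared-coefficient term pulls every coefficient toward zero, which is the shrinkage visible in the table.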

  3. We check our results with lambda=0.1. In this case, the coefficients' variability is almost the same as with lm (there is essentially no Ridge compression). As before, the coefficients are smaller than the lm ones, but only slightly so, due to the smaller lambda (a cross-validation sketch for choosing lambda follows the figures):
get_results(0.1) 

The following screenshot shows the boxplots for the coefficients:

The following screenshot shows the medians for each coefficient:
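
In this recipe, the lambda values (8 and 0.1) are chosen by hand to illustrate their effect. If you prefer to let the data suggest a value, glmnet ships with cv.glmnet, which cross-validates over a grid of lambdas. The following is a minimal sketch (not part of the recipe), using one dataset simulated in the same way as inside get_results:

library(glmnet)
set.seed(123)
V1 <- runif(1000)*100; V2 <- runif(1000)*10 + V1
V3 <- runif(1000)*100; V4 <- runif(1000)*10 + V3
V5 <- runif(1000)*100
Y  <- V1 + V2 + V3 + V4 + V5 + runif(1000)*100
X  <- cbind(V1, V2, V3, V4, V5)
cv_fit <- cv.glmnet(X, Y, alpha = 0)   # alpha = 0: cross-validated Ridge
cv_fit$lambda.min                      # lambda minimizing the cross-validation error
coef(cv_fit, s = "lambda.min")         # coefficients at that lambda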
