How to do it...

In this example, we follow the same logic as before, but we set alpha=1, which makes glmnet fit a Lasso model. The coefficients are now penalized using the L1 norm, which means that some of them (the irrelevant ones) are shrunk to exactly zero. For this reason, some data scientists use Lasso as a variable selection tool.
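To see why an L1 penalty can produce exact zeros, it helps to look at the simplest case. The sketch below assumes a single standardized predictor (or an orthonormal design), where the Lasso solution reduces to soft-thresholding the unpenalized estimate, while a Ridge-style penalty only rescales it. The function name soft_threshold and the example values are illustrative; this is not how glmnet computes its estimates internally, and the exact scaling of lambda depends on how the objective is written.

```r
# Soft-thresholding: shrink each estimate towards zero by lambda and
# set it to exactly zero once its absolute value falls below lambda.
# (Illustrative helper, assuming an orthonormal design.)
soft_threshold <- function(z, lambda) {
  sign(z) * pmax(abs(z) - lambda, 0)
}

# Hypothetical unpenalized estimates, chosen only for illustration
ols_estimates <- c(2.5, -1.2, 0.3, -0.05)
lambda <- 0.5

lasso_like <- soft_threshold(ols_estimates, lambda)  # small values become 0
ridge_like <- ols_estimates / (1 + lambda)           # values shrink, never reach 0

print(rbind(ols = ols_estimates, lasso = lasso_like, ridge = ridge_like))
```

Estimates smaller than lambda in absolute value are removed entirely under the L1-style rule, whereas the Ridge-style rule keeps every estimate nonzero.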

We use the same code as before, but now with alpha=1:

library(MASS)
library(tidyr)
library(dplyr)    # provides %>%, group_by() and summarise()
library(ggplot2)
library(glmnet)

get_results <- function(lambda) {
  coeffs_total <- data.frame(V1 = numeric(), V2 = numeric(), V3 = numeric(),
                             V4 = numeric(), V5 = numeric())
  for (q in 1:100) {
    # V2 tracks V1 and V4 tracks V3; V5 does not enter Y (irrelevant variable)
    V1 <- runif(1000) * 100
    V2 <- runif(1000) * 10 + V1
    V3 <- runif(1000) * 100
    V4 <- runif(1000) * 10 + V3
    V5 <- runif(1000) * 100
    Residuals <- runif(1000) * 100
    Y <- V1 + V2 + V3 + V4 + Residuals
    # Ordinary least squares coefficients (element 1 is the intercept)
    coefs_lm <- lm(Y ~ V1 + V2 + V3 + V4 + V5)$coefficients
    # alpha = 1 selects the Lasso (L1) penalty
    coefs_lasso <- glmnet(cbind(V1, V2, V3, V4, V5), Y,
                          lambda = lambda, alpha = 1)$beta
    frame1 <- data.frame(V1 = coefs_lm[2], V2 = coefs_lm[3], V3 = coefs_lm[4],
                         V4 = coefs_lm[5], V5 = coefs_lm[6], method = "lm")
    frame2 <- data.frame(V1 = coefs_lasso[1], V2 = coefs_lasso[2],
                         V3 = coefs_lasso[3], V4 = coefs_lasso[4],
                         V5 = coefs_lasso[5], method = "lasso")
    coeffs_total <- rbind(coeffs_total, frame1, frame2)
  }
  transposed_data <- gather(coeffs_total, "variable", "value", 1:5)
  # A ggplot object is not drawn automatically inside a function, so print it
  print(ggplot(transposed_data, aes(x = variable, y = value, fill = method)) +
          geom_boxplot())
  print(transposed_data %>%
          group_by(variable, method) %>%
          summarise(median = median(value)))
}

We now run the code with lambda=8. The coefficients are slightly smaller than those obtained with Ridge. More importantly, the coefficient of the irrelevant variable (V5) is now exactly zero. This is an advantage over Ridge, because the model is literally telling us to discard that variable:

get_results(8)

The following screenshot shows the boxplots of the coefficients for lambda=8:

The following screenshot shows the median of each coefficient (these are the medians of the values used for the previous plot):
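When Lasso is used for variable selection, the practical step is to list which coefficients are nonzero. The following is a minimal sketch on a single simulated dataset built the same way as in the recipe. The seed and the names X, fit, beta, and selected are assumptions added for illustration, and which of the correlated variables are retained can vary from one simulated sample to another.

```r
library(glmnet)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
n <- 1000
V1 <- runif(n) * 100
V2 <- runif(n) * 10 + V1
V3 <- runif(n) * 100
V4 <- runif(n) * 10 + V3
V5 <- runif(n) * 100            # irrelevant: does not enter Y
Y <- V1 + V2 + V3 + V4 + runif(n) * 100

X <- cbind(V1 = V1, V2 = V2, V3 = V3, V4 = V4, V5 = V5)
fit <- glmnet(X, Y, lambda = 8, alpha = 1)

# coef() returns a sparse matrix; drop the intercept row and keep names
beta <- as.matrix(coef(fit))[-1, 1]
selected <- names(beta)[beta != 0]

print(beta)
print(selected)  # variables that the Lasso fit keeps in the model
```

Any variable missing from selected has a coefficient of exactly zero and can be dropped from the model.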

If we had used 0.1 instead of 8 for lambda, we would have obtained very similar results. This differs from Ridge, where different values of lambda produced very different results:

get_results(0.1)

The following screenshot shows the boxplots of the coefficients for lambda=0.1:
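To compare the two penalties directly on one dataset, glmnet accepts a vector of lambda values and returns one column of coefficients per value. The sketch below is an illustration under the same simulated setup; the seed and the names lambdas and coef_table are assumptions. It uses a single sample, so the numbers will not match the medians over 100 simulations exactly.

```r
library(glmnet)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
n <- 1000
V1 <- runif(n) * 100
V2 <- runif(n) * 10 + V1
V3 <- runif(n) * 100
V4 <- runif(n) * 10 + V3
V5 <- runif(n) * 100
Y <- V1 + V2 + V3 + V4 + runif(n) * 100
X <- cbind(V1 = V1, V2 = V2, V3 = V3, V4 = V4, V5 = V5)

# Fit both penalties in one call (lambda given in decreasing order)
lambdas <- c(8, 0.1)
fit <- glmnet(X, Y, lambda = lambdas, alpha = 1)

# One column per lambda; rows are the intercept followed by V1..V5
coef_table <- as.matrix(coef(fit))
colnames(coef_table) <- paste0("lambda=", fit$lambda)
print(round(coef_table, 3))
```

Reading across a row shows how much a given coefficient changes between the two values of lambda.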
