Regression models

Regression models range from the linear, logistic, and multiple regression algorithms commonly used in statistics to Ridge and Lasso regression, which penalize coefficients to improve model performance.

In our earlier examples, we saw an application of linear regression when we created trendlines. Multiple linear regression refers to a model built from more than one independent variable.

For instance:

Total Advertising Cost = β0 + β1 × Print Ads would be a simple linear regression; whereas

Total Advertising Cost = β0 + β1 × Print Ads + β2 × Radio Ads + β3 × TV Ads, due to the presence of more than one independent variable (Print, Radio, and TV), would be a multiple linear regression.
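
As a minimal sketch, the multiple linear regression above can be fitted in R with lm(); the data frame ads and its spend figures below are made up purely for illustration:

# Fit the multiple regression Total ~ Print + Radio + TV on illustrative data.
> ads <- data.frame(print_ads = c(12, 25, 31, 18, 44),
+                   radio_ads = c(7, 14, 9, 20, 11),
+                   tv_ads = c(55, 40, 70, 65, 90),
+                   total_cost = c(58, 71, 89, 80, 117))
> mlr_model <- lm(total_cost ~ print_ads + radio_ads + tv_ads, data = ads)
> summary(mlr_model)  # one estimated coefficient per channel, plus the intercept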

Logistic regression is another commonly used statistical regression modelling technique. It predicts a discrete categorical outcome, mainly in cases where the outcome variable is dichotomous (for example, 0 or 1, Yes or No, and so on). There can, however, be more than two discrete outcomes (for example, the states NY, NJ, and CT), and this type of logistic regression is known as multinomial logistic regression.
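
As a minimal sketch, a dichotomous logistic regression can be fitted in R with glm() and family = binomial; here we use the PimaIndiansDiabetes dataset from mlbench (also used in the example further below), whose diabetes column is a two-level factor (neg/pos). For multinomial outcomes, a function such as multinom() from the nnet package can be used instead:

# Logistic regression of the dichotomous outcome diabetes (neg/pos)
# on glucose and body mass index (mass).
> library("mlbench")
> data("PimaIndiansDiabetes")
> logit_model <- glm(diabetes ~ glucose + mass, data = PimaIndiansDiabetes, family = binomial)
> summary(logit_model)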

Ridge and Lasso Regression add a regularization term (λ) to the other aspects of Linear Regression. In Ridge Regression, the regularization term shrinks the β coefficients (thus 'penalizing' the coefficients). In Lasso, the regularization term tends to reduce some of the coefficients exactly to 0, thus eliminating the effect of those variables on the final model. The example below first fits an ordinary (unpenalized) linear model in R:

# Load mlbench and create a regression model of glucose (outcome/dependent
# variable) with pressure, triceps, and insulin as the independent variables.

> library("mlbench")
> data("PimaIndiansDiabetes")
> lm_model <- lm(glucose ~ pressure + triceps + insulin, data = PimaIndiansDiabetes[1:100, ])
> plot(lm_model)