How to do it...

In the following example, we will load data for the Los Angeles Lakers' 2019 season. We will have one dummy variable for each player, flagging whether that player was part of the starting lineup that night, plus an extra dummy that flags whether the team played at home or away. The objective is to predict whether the team won that night using just those dummies as explanatory variables. Ideally, we would like our model to capture the complicated interactions that might occur (for instance, Player X might perform much better when Player Y is present):
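The expected layout of the dataset can be sketched with a toy example. The player column names below are hypothetical placeholders, not the actual columns of `lakers_.csv`; the real file has one dummy per roster player:

```r
# Hypothetical illustration of the expected layout: one 0/1 column per
# player flagging whether they started that night, plus a home-game flag
# and the outcome we want to predict.
toy <- data.frame(
  player_a = c(1, 1, 0, 1),  # 1 = player started that night
  player_b = c(1, 0, 1, 1),
  home     = c(1, 0, 0, 1),  # 1 = home game
  win      = c(1, 0, 0, 1)   # target variable
)
head(toy)
```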

  1. We first load the dataset and propose our first logic regression model:
library(caret)
set.seed(10)
baseketball_data_2019 = read.csv("./lakers_.csv")
# Drop the first two columns, which are not used as features
baseketball_data_2019 = baseketball_data_2019[,-c(1,2)]
  2. The model has two hyperparameters that will be tuned by caret (treesize and ntrees). Not surprisingly, the results are not good, and the kappa is even negative, meaning that the model performs worse than one that classifies the data randomly:
rctrl1 <- trainControl(method = "cv", number = 5)
# The target must be a factor for caret to treat this as classification
baseketball_data_2019$win = as.factor(baseketball_data_2019$win)
# method = "logreg" fits a logic regression model (requires the LogicReg package)
model1 <- train(win ~ ., data = baseketball_data_2019, method = "logreg",
                metric = "Accuracy", trControl = rctrl1, tuneLength = 4)

The following screenshot shows the logic regression model results:
