How to do it...

In the following exercise, we will model the amount deposited by clients, based on just two variables:

The amount of time that each salesperson spent on the client
The number of salespeople involved with the specific client

We first load our dataset:

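# Load the mixed-model package and the dataset
library("lme4")
data = read.csv("./sample_random_regression.csv")
# Treat the client ID as a grouping factor, not as a number
data$clientid = as.factor(data$clientid)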
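
If the CSV file is not at hand, a stand-in dataset can be simulated so that the models in this recipe can still be run. This is only a sketch: the column names match the ones used in this recipe, but the number of clients, the coefficients, and the noise levels are assumptions made for illustration, not values taken from the original file.

```r
# Simulated stand-in for sample_random_regression.csv.
# All numeric settings below are hypothetical, chosen only for illustration.
set.seed(123)
n_clients <- 10   # assumed number of clients
n_obs     <- 30   # assumed number of deals per client

sim <- data.frame(
  clientid             = factor(rep(seq_len(n_clients), each = n_obs)),
  salespeople_involved = sample(1:5, n_clients * n_obs, replace = TRUE),
  time_spent_deal      = round(runif(n_clients * n_obs, min = 1, max = 40), 1)
)

# Client-specific deviations in the two slopes (the "random effects")
slope_people <- rnorm(n_clients, mean = 0, sd = 2)
slope_time   <- rnorm(n_clients, mean = 0, sd = 0.5)
id <- as.integer(sim$clientid)

# Deal size = average effects + client-specific slope shifts + noise
sim$deal_size <- 50 +
  (10 + slope_people[id]) * sim$salespeople_involved +
  (3 + slope_time[id]) * sim$time_spent_deal +
  rnorm(nrow(sim), sd = 5)

str(sim)
```

The simulated frame can be passed to lmer in place of data; the estimates will of course differ from those obtained with the real file.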

To practice with different formulations, let's start with the most complex one. Here, we have an intercept and two coefficients (one for each fixed effect). We then have a random slope for both the salespeople involved and the time spent on the deal; the -1 in each term removes the random intercept, so only the slopes vary by client. Because the two random slopes are specified in separate terms for the client IDs, we assume that the shocks impacting each variable are not correlated:

lmer(deal_size ~ salespeople_involved + time_spent_deal +
     (-1 + salespeople_involved | clientid) +
     (-1 + time_spent_deal | clientid), data = data)
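
Written out as an equation, this formulation can be sketched as follows. The symbols are notation introduced here for illustration (they do not appear in the recipe): i indexes deals, j indexes clients, the betas are the fixed effects, and the u terms are the client-specific slope deviations.

```latex
\begin{aligned}
\text{deal\_size}_{ij} &= \beta_0
  + (\beta_1 + u_{1j})\,\text{salespeople\_involved}_{ij}
  + (\beta_2 + u_{2j})\,\text{time\_spent\_deal}_{ij}
  + \varepsilon_{ij}, \\
u_{1j} &\sim \mathcal{N}(0, \sigma_1^2), \quad
u_{2j} \sim \mathcal{N}(0, \sigma_2^2), \quad
\operatorname{Cov}(u_{1j}, u_{2j}) = 0, \quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

The zero covariance is what the two separate clientid terms encode; putting both slopes inside a single term would add a correlation parameter between them.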

The following screenshot shows the estimated mixed model:

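The next model is similar: it keeps the intercept and the two fixed effects, but now has a random intercept and a random slope for only the salespeople involved. A subtle detail here is that, because the random intercept and the random slope appear in the same term, we assume that the shocks impacting the slope and the intercept might be correlated: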

lmer(data=data,deal_size ~ salespeople_involved + time_spent_deal + (1  + salespeople_involved|clientid))

The following screenshot shows the estimated model:

Another model could have these three random effects: one for the intercept, another one for the time spent, and a final one for the salespeople involved. Here, we allow for nonzero correlation between the three of them:

lmer(deal_size ~ salespeople_involved + time_spent_deal +
     (1 + time_spent_deal + salespeople_involved | clientid), data = data)
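
One way to check the estimated correlations between the random effects is to convert the output of VarCorr into a data frame. The sketch below assumes simulated stand-in data (the sizes and coefficients are hypothetical) so that it runs without the CSV file; the object names d, m, and vc are introduced here only for illustration.

```r
library(lme4)

# Simulated stand-in data with the recipe's column names (hypothetical values)
set.seed(42)
n_clients <- 20
n_obs     <- 25
d <- data.frame(
  clientid             = factor(rep(seq_len(n_clients), each = n_obs)),
  salespeople_involved = sample(1:5, n_clients * n_obs, replace = TRUE),
  time_spent_deal      = runif(n_clients * n_obs, min = 1, max = 10)
)
id <- as.integer(d$clientid)
u0 <- rnorm(n_clients, sd = 5)   # client-specific intercept shifts
u1 <- rnorm(n_clients, sd = 2)   # client-specific salespeople slope shifts
u2 <- rnorm(n_clients, sd = 1)   # client-specific time slope shifts
d$deal_size <- (50 + u0[id]) +
  (10 + u1[id]) * d$salespeople_involved +
  (3 + u2[id]) * d$time_spent_deal +
  rnorm(nrow(d), sd = 3)

# Random intercept and two random slopes, all allowed to correlate
m <- lmer(deal_size ~ salespeople_involved + time_spent_deal +
            (1 + time_spent_deal + salespeople_involved | clientid),
          data = d)

# Rows with a non-missing var2 describe a pair of random effects;
# for those rows the sdcor column holds the estimated correlation.
vc <- as.data.frame(VarCorr(m))
print(vc[, c("grp", "var1", "var2", "sdcor")])
```

With three correlated random effects, the table has three standard deviations, three pairwise correlations, and one residual row. On small or noisy datasets lme4 may report a singular fit, which usually suggests that the random-effects structure is richer than the data can support.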

Mixed model result—allowing for correlation between the variables and intercept:

Finally, let's suppose we choose the initial formulation and want to predict, for each client, how far its slope deviates from the average slope for each random effect. We can do this with the ranef function. For example, clientid = 1 has a weaker relationship between salespeople_involved and deal_size than the average client, and a very similar interpretation holds for the same client and time_spent_deal. For clientid = 8, the response to salespeople_involved is positive and unusually large: each extra person involved in dealing with this client raises the deal size by much more than for the other clients:

model = lmer(deal_size ~ salespeople_involved + time_spent_deal +
             (-1 + salespeople_involved | clientid) +
             (-1 + time_spent_deal | clientid), data = data)
ranef(model)
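
Because ranef returns deviations from the average coefficients, it can help to add the fixed effects back to obtain each client's own slope. The following is a minimal sketch under the assumption of simulated stand-in data (all numeric settings are hypothetical); the names d, re, fe, and client_slopes are introduced here for illustration.

```r
library(lme4)

# Simulated stand-in data with the recipe's column names (hypothetical values)
set.seed(7)
n_clients <- 15
n_obs     <- 30
d <- data.frame(
  clientid             = factor(rep(seq_len(n_clients), each = n_obs)),
  salespeople_involved = sample(1:5, n_clients * n_obs, replace = TRUE),
  time_spent_deal      = runif(n_clients * n_obs, min = 1, max = 10)
)
id <- as.integer(d$clientid)
u1 <- rnorm(n_clients, sd = 2)   # client-specific salespeople slope shifts
u2 <- rnorm(n_clients, sd = 1)   # client-specific time slope shifts
d$deal_size <- 50 +
  (10 + u1[id]) * d$salespeople_involved +
  (3 + u2[id]) * d$time_spent_deal +
  rnorm(nrow(d), sd = 3)

# The initial formulation: two uncorrelated random slopes, no random intercept
model <- lmer(deal_size ~ salespeople_involved + time_spent_deal +
                (-1 + salespeople_involved | clientid) +
                (-1 + time_spent_deal | clientid), data = d)

re <- ranef(model)$clientid   # per-client deviations from the average slopes
fe <- fixef(model)            # average (fixed) coefficients

# Client-specific slope = average slope + that client's deviation
client_slopes <- data.frame(
  clientid             = rownames(re),
  salespeople_involved = unname(fe["salespeople_involved"]) + re$salespeople_involved,
  time_spent_deal      = unname(fe["time_spent_deal"]) + re$time_spent_deal
)
head(client_slopes)
```

A client whose deviation is negative ends up with a slope below the average one, which is the situation described for client 1 above.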

Take a look at the following screenshot:
