Time for action - modelling interactions

Another way to explore the relationships in our data is to look at interaction effects. An interaction arises when the effect of one variable on the outcome depends on the value of another variable, so that their combined effect differs from what either variable contributes alone. Interaction variables can be created in R, although a specific procedure must be followed to use them properly.

Let us look at how an interaction variable can be created and incorporated into a regression model in R. We will accomplish this by including the interaction between Shu and Wei soldiers engaged as a variable in our multiple regression model:

  1. Center the two variables that you plan to interact:
    > #before creating an interaction variable, the component
    > #variables must first be centered
    > #center a variable by subtracting its mean from each of its
    > #values
    > #center the number of Shu soldiers engaged
    > centeredShuSoldiersHeadToHead <-
        subsetHeadToHead$ShuSoldiersEngaged -
        mean(subsetHeadToHead$ShuSoldiersEngaged)
    > #center the number of Wei soldiers engaged
    > centeredWeiSoldiersHeadToHead <-
        subsetHeadToHead$WeiSoldiersEngaged -
        mean(subsetHeadToHead$WeiSoldiersEngaged)
    
  2. Multiply the two centered variables to create the interaction variable:
    > #create an interaction variable by multiplying two or more
    > #centered variables
    > interactionSoldiersHeadToHead <-
        centeredShuSoldiersHeadToHead * centeredWeiSoldiersHeadToHead
    
  3. Create an interaction model that predicts Rating using the duration, Shu soldiers engaged, Wei soldiers engaged, and the interaction between the number of Shu and Wei soldiers engaged:
    > #predict the rating of a battle using the duration, number of
    > #Shu and Wei soldiers engaged, and the interaction between the
    > #number of Shu and Wei soldiers engaged
    > lmHeadToHeadRating_DurationSoldiersShuWeiInteraction <-
        lm(subsetHeadToHead$Rating ~ subsetHeadToHead$DurationInDays +
        subsetHeadToHead$ShuSoldiersEngaged +
        subsetHeadToHead$WeiSoldiersEngaged +
        interactionSoldiersHeadToHead, subsetHeadToHead)
    
  4. Create a summary of the model:
    > #model summary
    > lmHeadToHeadRating_DurationSoldiersShuWeiInteraction_Summary <-
        summary(lmHeadToHeadRating_DurationSoldiersShuWeiInteraction)
    
  5. Display your interaction model summary in the R console:
    > #display the summary
    > lmHeadToHeadRating_DurationSoldiersShuWeiInteraction_Summary
    
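The five steps above can also be sketched end to end on a small made-up data set. The data frame `battles`, its values, and the shortened variable names below are hypothetical stand-ins for `subsetHeadToHead`, chosen only so that the code runs on its own. Passing the data frame through the `data` argument lets the formula use bare column names.

```r
# hypothetical stand-in for subsetHeadToHead; all values are invented
set.seed(42)
battles <- data.frame(
  DurationInDays     = sample(1:30, 40, replace = TRUE),
  ShuSoldiersEngaged = sample(seq(1000, 50000, by = 1000), 40, replace = TRUE),
  WeiSoldiersEngaged = sample(seq(1000, 50000, by = 1000), 40, replace = TRUE),
  Rating             = sample(1:100, 40, replace = TRUE)
)

# step 1: center each component by subtracting its mean
battles$centeredShu <- battles$ShuSoldiersEngaged - mean(battles$ShuSoldiersEngaged)
battles$centeredWei <- battles$WeiSoldiersEngaged - mean(battles$WeiSoldiersEngaged)

# step 2: multiply the centered components to form the interaction variable
battles$interactionSoldiers <- battles$centeredShu * battles$centeredWei

# step 3: fit the interaction model
lmInteraction <- lm(Rating ~ DurationInDays + ShuSoldiersEngaged +
  WeiSoldiersEngaged + interactionSoldiers, data = battles)

# steps 4 and 5: create and display the model summary
lmInteractionSummary <- summary(lmInteraction)
lmInteractionSummary
```

Because the ratings here are random, the summary itself carries no meaning; the sketch only illustrates the order of operations.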

What just happened?

You have completed the process of creating and implementing an interaction variable. The resulting interaction model expanded upon our multiple regression model by factoring in the effect that the interplay between the number of Shu and Wei soldiers has on the performance rating of the Shu army. Let us review the two-step interaction variable creation process and discuss how such variables can be interpreted:

  1. Center the input variables:

    The initial step in creating an interaction variable is to center the input variables that you wish to interact. This is accomplished by subtracting the mean of all of the values from each data point. For example, in:

    centeredA <- A - mean(A)
    

    The centered version of variable A is created by subtracting the mean of A from each value of A.

    Centering is necessary because it mitigates the threat of multicollinearity, which occurs when two or more independent variables are highly correlated with one another. For instance, our interaction variable was composed of the number of Shu and Wei soldiers engaged in head-to-head combat. At the same time, our regression model used these variables as separate predictors. Naturally, multicollinearity is a threat in this situation, because our interaction variable is built from the same data as our other predictors and is therefore likely to be correlated with them. Thankfully, centering is effective in mitigating most of the ill effects that can be attributed to multicollinearity.

  2. Multiply the input variables:

    The second step in creating an interaction variable is to multiply the centered versions of the input variables, like so:

    interactionAB <- centeredA * centeredB
    

    Afterwards, your interaction variable can be used in the same manner as any other variable within a regression model.
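To see why centering helps, the following sketch compares the product of two raw variables with the product of their centered versions. The variables `A` and `B` are simulated and purely hypothetical; the point is only to show how strongly each product correlates with one of its own components.

```r
# hypothetical predictors with large positive means (values are simulated)
set.seed(1)
A <- rnorm(200, mean = 50, sd = 5)
B <- rnorm(200, mean = 80, sd = 8)

# product of the raw, uncentered variables
rawAB <- A * B

# product of the centered variables, as in the two-step process
centeredA <- A - mean(A)
centeredB <- B - mean(B)
interactionAB <- centeredA * centeredB

# correlation of each product with its component A
corRaw <- cor(A, rawAB)
corCentered <- cor(A, interactionAB)
corRaw
corCentered
```

With data like these, the raw product tends to be strongly correlated with `A`, whereas the centered product tends to be only weakly correlated with it. The exact figures depend on the data at hand.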

Interpreting interaction variables

The statistical significance of the interaction coefficient is an indication of whether or not an interaction is present in the data. When present, an interaction suggests that the relationship between the dependent variable and a predictor varies as the value of the interacting predictor changes. In our model, for example, an interaction would mean that the relationship between Rating and the number of Shu soldiers engaged varies with the number of Wei soldiers engaged. This phenomenon is sometimes referred to as a moderation effect, because it describes how one predictor moderates, or affects the strength or direction of, the relationship between another predictor and the dependent variable. When an interaction is absent, the relationship between the dependent variable and a given predictor is not believed to alter as the value of the interacting predictor changes.
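A moderation effect can be made concrete with simulated data in which an interaction is deliberately built in. Everything in this sketch is hypothetical: the variables `A`, `B`, and `y`, the coefficients used to generate them, and the helper function `slopeOfA`. It shows how the slope of one predictor can be read off the fitted model at different values of the other.

```r
# simulated data with a built-in interaction (all values are invented)
set.seed(7)
n <- 300
A <- rnorm(n)
B <- rnorm(n)
y <- 2 + 1.0 * A + 0.5 * B + 0.8 * A * B + rnorm(n, sd = 0.5)

# center the components and multiply them
centeredA <- A - mean(A)
centeredB <- B - mean(B)
interactionAB <- centeredA * centeredB

# fit the interaction model and extract its coefficients
fit <- lm(y ~ centeredA + centeredB + interactionAB)
coefs <- coef(fit)

# hypothetical helper: the slope of centeredA at a given value b of centeredB
slopeOfA <- function(b) {
  unname(coefs["centeredA"] + coefs["interactionAB"] * b)
}

# the relationship between y and A is weaker at low B than at high B
slopeLowB <- slopeOfA(-1)
slopeHighB <- slopeOfA(1)
slopeLowB
slopeHighB
```

When the interaction coefficient is close to zero, the two slopes are nearly identical, which corresponds to the absence of a moderation effect.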

The interaction term in our latest model was not statistically significant and did not increase the predictive power of the model. This is logical in our situation. If there were an interaction, then we would expect the relationship between Rating and the number of soldiers that one side engaged to differ across the range of soldiers that the other side deployed. For example, additional Shu soldiers might improve the rating considerably when the Wei deployed 10000 soldiers, but hardly at all when the Wei deployed 500000. In contrast, without the interaction, we would expect the relationship between Rating and the number of soldiers engaged by one side to stay the same across the range of soldiers that the other side deployed. Furthermore, the number of soldiers deployed may be better explained by situational attributes, such as the number of soldiers that happen to be available at a given place or time when a battle occurs. These explanations have more practical meaning than the interaction interpretation and are consistent with the absence of an interaction effect.
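One way to check whether an interaction term adds predictive power is to compare the models with and without it. The sketch below uses simulated, hypothetical data (`duration`, `shu`, `wei`, and `rating` are invented and contain no true interaction) and relies on `anova()` and the adjusted R-squared reported by `summary()`. It is an illustration of the idea rather than a reproduction of the model above.

```r
# simulated data without any true interaction (all values are invented)
set.seed(3)
n <- 100
duration <- rnorm(n, mean = 10, sd = 3)
shu <- rnorm(n, mean = 20, sd = 5)
wei <- rnorm(n, mean = 25, sd = 5)
rating <- 50 + 1.5 * duration + 0.8 * shu - 0.6 * wei + rnorm(n, sd = 5)

# interaction variable built from the centered components
interactionSoldiers <- (shu - mean(shu)) * (wei - mean(wei))

# nested models: without and with the interaction term
modelWithout <- lm(rating ~ duration + shu + wei)
modelWith <- lm(rating ~ duration + shu + wei + interactionSoldiers)

# F-test comparing the two nested models
comparison <- anova(modelWithout, modelWith)
comparison

# adjusted R-squared penalizes the extra predictor
adjWithout <- summary(modelWithout)$adj.r.squared
adjWith <- summary(modelWith)$adj.r.squared
adjWithout
adjWith
```

Because only one term is added, the p-value in the `anova()` table matches the p-value of the interaction coefficient in the model summary. A large p-value together with little or no gain in adjusted R-squared suggests that the interaction term does not improve the model.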

Pop quiz

  1. How is a variable centered in R?

    a. By adding its mean to each of its values.

    b. By subtracting its mean from each of its values.

    c. By multiplying its mean by each of its values.

    d. By dividing its mean by each of its values.

  2. How is an interaction variable created in R?

    a. By adding the two variables that are believed to interact.

    b. By multiplying the two variables that are believed to interact.

    c. By adding the centered versions of the two variables that are believed to interact.

    d. By multiplying the centered versions of the two variables that are believed to interact.

  3. Which of the following would be a viable interpretation of a statistically significant interaction between the variables A and B?

    a. The relationship between B and the dependent variable fluctuates based on the value of A.

    b. The relationship between A and B fluctuates based on the value of the dependent variable.

    c. The value of the dependent variable fluctuates based on the relationship between A and B.

    d. The value of the dependent variable fluctuates based on the values of A and B.

Have a go hero

Consider the data in one of your remaining battle method subsets (surround, ambush, or fire). Use the techniques that we have employed in this chapter to create a multiple regression model that incorporates an interaction variable. Then interpret the model. Be sure to address the meaning and significance of the interaction that you explored.
