Time for action - quantifying categorical variables

Categorical or nominal data is information that is classified into nonnumeric levels. Two pertinent columns in our battle history dataset, and subsequently our head to head combat subset, are represented by categorical data. These are the SuccessfullyExecuted (categorized as Y or N) and Result (categorized as Victory or Defeat) columns. A major benefit of categorical data is that it represents information in a very practical and understandable manner. However, categorical data is not well-suited for quantitative data analysis. Fortunately, R is able to recode categorical data in numeric form, thus allowing us to analyze it quantitatively.

Let us proceed through the steps required to recode our SuccessfullyExecuted and Result columns and save them as numeric variables:

  1. Recode the SuccessfullyExecuted column using as.numeric(data), as can be seen in the following:
    > #represent categorical data numerically using as.numeric(data)
    > #recode the SuccessfullyExecuted column into N = 1 and Y = 2
    > numericExecutionHeadToHead <-
    as.numeric(subsetHeadToHead$SuccessfullyExecuted)
    
  2. Display the contents of your numeric variable in the R console:
    > #display the contents of numericExecutionHeadToHead
    > numericExecutionHeadToHead
    [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
    

    Note

    Note that if you prefer your categorical variables to begin with a value of zero, as in N = 0 and Y = 1, then you should subtract one from our statement in step 1.

  3. Recode the SuccessfullyExecuted column so it begins with a value of zero:
    > #recode the SuccessfullyExecuted column into N = 0 and Y = 1
    > #by default, R recodes variables alphabetically from 1 to n,
    > #so subtract one to shift the coding to run from 0 to n - 1
    > numericExecutionHeadToHead <-
    as.numeric(subsetHeadToHead$SuccessfullyExecuted) - 1
    
  4. Display the contents of your revised variable in the R console:
    > #display the contents of numericExecutionHeadToHead
    > numericExecutionHeadToHead
    [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    
  5. Recode the Result column using as.numeric(data):
    > #recode the Result column into Defeat = 0 and Victory = 1
    > numericResultHeadToHead <-
    as.numeric(subsetHeadToHead$Result) - 1
    
  6. Display the contents of your numeric variable in the R console:
    > #display the contents of numericResultHeadToHead
    > numericResultHeadToHead
    [1] 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 1 1 1
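
One way to confirm that a recoding did what you intended is to cross-tabulate the original labels against the new numbers with table(). The following is only a sketch: toyHeadToHead and its four rows are made up for illustration and merely stand in for subsetHeadToHead, and the variable names ending in Toy and Check are likewise hypothetical.

```r
# made-up stand-in for subsetHeadToHead; the values are illustrative only
toyHeadToHead <- data.frame(
  SuccessfullyExecuted = factor(c("Y", "Y", "N", "Y")),
  Result = factor(c("Defeat", "Victory", "Defeat", "Victory"))
)

# same recoding as in steps 3 and 5
numericExecutionToy <- as.numeric(toyHeadToHead$SuccessfullyExecuted) - 1
numericResultToy <- as.numeric(toyHeadToHead$Result) - 1

# cross-tabulate the original labels (rows) against the new codes (columns)
executionCheck <- table(toyHeadToHead$SuccessfullyExecuted, numericExecutionToy)
resultCheck <- table(toyHeadToHead$Result, numericResultToy)
print(executionCheck)
print(resultCheck)
```

If the recoding is correct, each label should have counts in exactly one column, for instance every N under 0 and every Y under 1.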
    

What just happened?

You have represented your categorical columns (SuccessfullyExecuted and Result) from the head to head combat dataset as numeric variables, thereby preparing them for quantitative analysis. During this process, you encountered the as.numeric(data) function and exercised your ability to overwrite variables.

as.numeric(data)

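The as.numeric(data) function is used to represent nonnumeric data in numeric terms. When it is applied to a categorical column that R stores as a factor, it returns the integer code of each level, with the levels numbered in alphabetical order starting from 1. For example, we used as.numeric(data), less one, to convert the N and Y levels of the SuccessfullyExecuted column into the numbers 0 and 1 respectively, using the following: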

> numericExecutionHeadToHead <-
as.numeric(subsetHeadToHead$SuccessfullyExecuted) - 1

Similarly, we used as.numeric(data) to code our Result column text of Defeat and Victory into the numbers 0 and 1:

> numericResultHeadToHead <- as.numeric(subsetHeadToHead$Result) - 1
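
It may help to see what as.numeric(data) does when a column is not stored as a factor. The sketch below uses a small made-up vector, executionText, rather than our battle data, and all of its variable names are hypothetical. It assumes the column might have been read in as plain text, which is the default behavior of read.csv() from R 4.0.0 onward. In that case, as.numeric(data) yields NA values with a warning, and wrapping the column in factor() first restores the coding described above.

```r
# hypothetical stand-in for a Y/N column; not taken from the battle dataset
executionText <- c("Y", "N", "Y", "Y")

# stored as a factor, the levels are sorted alphabetically: N = 1, Y = 2
executionFactor <- factor(executionText)
codesFromFactor <- as.numeric(executionFactor)

# stored as plain text, as.numeric() cannot parse "Y" or "N" and returns NA
# (the warning is suppressed here only to keep the console output short)
codesFromText <- suppressWarnings(as.numeric(executionText))

# converting to a factor first, then subtracting one, gives N = 0 and Y = 1
codesZeroBased <- as.numeric(factor(executionText)) - 1

print(codesFromFactor)
print(codesFromText)
print(codesZeroBased)
```

If your own recoded variable comes back as all NA, checking the column with is.factor(data) is a reasonable first step.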

Note

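Although our data contained only two categories, note that the as.numeric(data) function is capable of handling any number of levels. For instance, a variable containing the levels high, low, and medium would be coded as 1, 2, and 3, or as 0, 1, and 2 once one is subtracted. Keep in mind that, by default, the numbers follow the alphabetical order of the level names rather than their meaning.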
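
To illustrate how the codes of a multi-level variable are assigned, consider the sketch below. The intensityText vector is invented for this example and does not appear in the battle dataset. It shows the default alphabetical coding alongside one possible way to make the codes follow a meaningful order, namely by passing an explicit levels argument to factor().

```r
# hypothetical three-level variable; not part of the battle dataset
intensityText <- c("low", "high", "medium", "low")

# default factor levels are sorted alphabetically: high, low, medium
intensityDefault <- factor(intensityText)
defaultCodes <- as.numeric(intensityDefault) - 1

# naming the levels explicitly makes the codes follow low, medium, high
intensityOrdered <- factor(intensityText, levels = c("low", "medium", "high"))
orderedCodes <- as.numeric(intensityOrdered) - 1

print(levels(intensityDefault))
print(defaultCodes)
print(orderedCodes)
```

With the default levels, high is coded as 0; with the explicit levels, low is coded as 0. Which coding is appropriate depends on the analysis you intend to run.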

Overwriting variables

In step 1 of our activity, we originally recoded our SuccessfullyExecuted column using values of N as 1 and Y as 2 and saved the results into a variable called numericExecutionHeadToHead. This was done with the following command:

> numericExecutionHeadToHead <-
as.numeric(subsetHeadToHead$SuccessfullyExecuted)

Then, in step 3, we recoded the column using values of N as 0 and Y as 1 and saved the results into a variable with the same name, numericExecutionHeadToHead:

> numericExecutionHeadToHead <-
as.numeric(subsetHeadToHead$SuccessfullyExecuted) - 1

While this was a seamless process that occurred without interruption, it demonstrates an important property of R variables. That is, R variables can be reassigned to new values. When a variable is overwritten in this manner, it assumes a new value and abandons its previous one. So, after step 3, our numericExecutionHeadToHead variable represented N and Y as 0 and 1 and ceased to depict the values as we had defined them in step 1.

To demonstrate this point, consider variable A, which has yet to be assigned a value. Once we execute the line:

> A <- 1

Variable A will take on a value of 1. If we were then to enter the line:

> A <- 2

Variable A would take on a value of 2. Its previous contents would be overwritten and therefore forgotten.
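
Because R overwrites a variable without any warning, you may want to copy a value to a second variable before reassigning it. The following sketch extends the A example; the name oldA is hypothetical and chosen only for illustration.

```r
# assign a value, then overwrite it
A <- 1
A <- 2
print(A)

# to keep an earlier value, copy it under a different name first
oldA <- A
A <- 3
print(oldA)
print(A)
```

Here oldA still holds the value that A had before the final assignment, whereas the original value of 1 is no longer stored anywhere.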

Pop quiz

  1. What values would represent N and Y in the SuccessfullyExecuted column if it were recoded using the following line?
    > as.numeric(subsetHeadToHead$SuccessfullyExecuted) + 5
    

    a. N = 0 and Y = 1

    b. N = 1 and Y = 2

    c. N = 5 and Y = 6

    d. N = 6 and Y = 7

  2. What would be the value of variable A after the following lines were executed in the R console?
    > A <- 0
    > A <- 1
    > A <- 2
    > A <- 3
    

    a. 3

    b. 2

    c. 1

    d. 0

Have a go hero

Now that you have quantified your first categorical variables, proceed to recode the SuccessfullyExecuted and Result columns for each of the remaining battle methods: surround, ambush, and fire. Follow the same console structure and naming convention that we used with our head to head combat data. For example, you should create the following variables with your ambush data:

  • numericExecutionAmbush
  • numericResultAmbush