Mixed data types

Data frames in R require a single data type per column. A column cannot hold both numeric and character data; when mixed values are supplied, R coerces the whole column to a common type using the following sequence:

Logical → Integer → Double → Character

This means that if, say, a column contains both numeric (integer or double) values and character strings, R coerces the entire column to character. We can see this by using the typeof function:

> typeof(c(1,2,"a")) 
[1] "character" 
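
To make the sequence concrete, the following sketch checks one step of the chain at a time and then shows the same effect on a data frame column. The variable names (step1, step2, step3, df) are illustrative assumptions and do not come from the original text.

```r
# Each call combines two adjacent types; c() promotes to the type
# further along the chain Logical -> Integer -> Double -> Character.
step1 <- typeof(c(TRUE, 1L))    # logical + integer
step2 <- typeof(c(1L, 2.5))     # integer + double
step3 <- typeof(c(2.5, "a"))    # double + character

print(c(step1, step2, step3))

# The same rule applies when a data frame column is built from mixed values.
# stringsAsFactors = FALSE keeps the column as character on older R versions.
df <- data.frame(amount = c(1, 2, "a"), stringsAsFactors = FALSE)
print(class(df$amount))
```

Under these assumptions the three steps should report "integer", "double", and "character", and the amount column should be of class "character".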

A dataset containing the $ symbol in an amount field, for instance, would be interpreted as a character column even though the column was intended to be numeric. In such cases, it is essential to use R's string operations to cleanse the dataset so that it can be read in with the appropriate data types. The string operations are discussed next.
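
As a minimal sketch of this kind of cleansing, the example below strips the currency symbol and thousands separators from a small, made-up amount column and converts it to numeric. The data frame raw and its values are hypothetical; gsub and as.numeric are base R functions, and other approaches may suit real data better.

```r
# Hypothetical sample data: amounts stored with "$" and "," are read as text.
raw <- data.frame(amount = c("$1,200.50", "$35", "$7.25"),
                  stringsAsFactors = FALSE)
before <- typeof(raw$amount)

# Remove every "$" and "," (both are literal inside a character class),
# then convert the cleaned strings to numeric.
raw$amount <- as.numeric(gsub("[$,]", "", raw$amount))
after <- typeof(raw$amount)

print(c(before, after))
print(sum(raw$amount))
```

Once converted, the column behaves like any other numeric column, so arithmetic such as sum works on it directly.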
