In this recipe, we will visualize multivariate data in which all the variables are continuous. The plot looks like a table of bar plots, one column per variable, and its important feature is that it makes the relationships among variables easy to understand.
Let's use the modified mtcars dataset (modified_mtcars) that we created in the introduction section, keeping only the continuous variables for this recipe:
# Taking subset with only continuous variables
con_dat <- modified_mtcars[c("mpg","disp","drat","wt","qsec")]
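Before plotting, it can help to confirm that every selected column really is numeric. The following is a minimal sketch of such a check. Because modified_mtcars is built in the introduction section, the sketch assumes the built-in mtcars dataset as a stand-in, and the name demo_dat is hypothetical:

```r
# Stand-in data (assumption): the built-in mtcars dataset replaces
# modified_mtcars, which is created in the introduction section
demo_dat <- mtcars[c("mpg", "disp", "drat", "wt", "qsec")]

# Check that each selected column is numeric
is_num <- vapply(demo_dat, is.numeric, logical(1))
print(is_num)

# Compact overview of the subset
str(demo_dat)
```

If any entry of is_num is FALSE, that column is not continuous and should be left out of the subset for this recipe.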
To produce this multivariate visualization, we need the tabplot library. If it is not already installed, we can install it using the following command and then load it:
# To install tabplot library
install.packages("tabplot")
# Loading the library
library(tabplot)
The primary command structure for this visualization is as follows:
tableplot(con_dat)
The resulting plot closely resembles a bar plot, but it shows all the variables at once. This lets us quickly examine patterns of relationship among variables, anomalies, and unusual observations.
By default, the tableplot function produces a visualization for each column of the input dataset. The rows are sorted on the first column (the default) and grouped into 100 bins (the default), and each column is then drawn as a horizontal bar plot with one bar per bin. If any column contains missing values, a separate category is added for them automatically and is shown in red.
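To make the sort-and-bin idea concrete, here is a rough base R approximation of the aggregation behind the plot. It is a sketch for intuition only, not tabplot's actual implementation. It assumes the built-in mtcars dataset as a stand-in for modified_mtcars, a decreasing sort order, and a deliberately small bin count; the names demo_dat, n_bins, bin_id, and bin_means are hypothetical:

```r
# Stand-in data (assumption): built-in mtcars instead of modified_mtcars
demo_dat <- mtcars[c("mpg", "disp", "drat", "wt", "qsec")]

n_bins <- 4  # small illustrative value; the tableplot default is 100

# Sort the rows on the first column (assumed decreasing order)
sorted <- demo_dat[order(demo_dat[[1]], decreasing = TRUE), ]

# Assign each sorted row to one of n_bins equal-sized groups of rows
n <- nrow(sorted)
bin_id <- ceiling(seq_len(n) * n_bins / n)

# Average every variable within each bin; each mean corresponds
# roughly to the length of one bar in that variable's column
bin_means <- aggregate(sorted, by = list(bin = bin_id), FUN = mean)
print(bin_means)
```

Reading across one row of bin_means is the same as reading across one horizontal band of the plot: it shows how the other variables behave for cars within a given range of mpg.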
By default, the tableplot function plots all the variables available in the input data, but we can change this and other behavior using the arguments of the function. The following is a list of arguments and their roles within the tableplot function:
nBins: This specifies the number of bins along the y axis. By default, the rows are split into 100 bins, but we can control this using this argument. It is similar in spirit to the nclass argument of the hist function.
select: If we are interested in plotting only a subset of variables, then we can specify the variable names using this argument, for example, select=c(mpg,wt). Note that variable names are written without quotation marks ("").
sortCol: The default sort order is based on the first variable in the input dataset, but we can change it using this argument, for example, sortCol=wt.
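The arguments above can be combined in a single call. The following is a minimal sketch that assumes the built-in mtcars dataset as a stand-in for modified_mtcars and an arbitrary bin count of 10; the call is guarded so that the code still runs when tabplot is not installed:

```r
# Stand-in data (assumption): built-in mtcars instead of modified_mtcars
con_dat <- mtcars[c("mpg", "disp", "drat", "wt", "qsec")]

if (requireNamespace("tabplot", quietly = TRUE)) {
  library(tabplot)
  # Plot only mpg and wt, sort the rows on wt, and use 10 bins
  tableplot(con_dat, select = c(mpg, wt), sortCol = wt, nBins = 10)
} else {
  message("tabplot is not installed; skipping the plot")
}
```

A smaller bin count suits a dataset as small as mtcars, which has only 32 rows, far fewer than the default of 100 bins.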