Multivariate continuous data visualization

In this recipe, we will visualize multivariate data in which all the variables are continuous. The plot looks like a table of bar plots, and its key feature is that it makes the relationships among the variables easy to see.

Getting ready

We will use the modified mtcars dataset (modified_mtcars) that we created in the introduction section, keeping only the continuous variables for this recipe:

# Taking subset with only continuous variables
con_dat <- modified_mtcars[c("mpg","disp","drat","wt","qsec")]

To produce the multivariate visualization, we need the tabplot package. If it is not already installed, install it with the following command and then load it:

# Install the tabplot package
install.packages("tabplot")

# Loading the library
library(tabplot)

How to do it…

The basic command for this visualization is as follows:

tableplot(con_dat)

The resulting plot is similar to a bar plot, but it shows all the variables at once. This lets us quickly examine patterns of relationship among the variables, as well as anomalies and unusual observations.

How it works…

By default, the tableplot function draws one panel for each column of the input dataset. The rows are sorted on the first column (the default) and then divided into 100 bins (the default); each numeric column is drawn as a horizontal bar plot with one bar per bin, showing the mean value within that bin. If any column contains missing values, the function automatically adds a category for them, shown in red.
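The sort-then-bin idea described above can be imitated in base R. The following is only a rough sketch of the concept, not the tabplot implementation: it uses the built-in mtcars as a stand-in for modified_mtcars, assumes that the largest values are placed first, and uses 8 bins because the data has only 32 rows. The names sorted, n_bins, bin, and bin_means are introduced purely for illustration.

```r
# Stand-in for the recipe's data: the built-in mtcars (an assumption; the
# recipe itself uses modified_mtcars from the introduction section).
con_dat <- mtcars[c("mpg", "disp", "drat", "wt", "qsec")]

# Sort the rows on the first column, largest values first (assumed order).
sorted <- con_dat[order(con_dat$mpg, decreasing = TRUE), ]

# Assign the sorted rows to equal-sized bins (8 bins of 4 rows each).
n_bins <- 8
bin <- ceiling(seq_len(nrow(sorted)) * n_bins / nrow(sorted))

# Mean of every variable within each bin; each column of this table
# corresponds to one panel of bars.
bin_means <- aggregate(sorted, by = list(bin = bin), FUN = mean)

# Draw one horizontal bar panel per variable, with the first bin on top.
op <- par(mfrow = c(1, ncol(con_dat)), mar = c(3, 1, 2, 1))
for (v in names(con_dat)) {
  barplot(rev(bin_means[[v]]), horiz = TRUE, main = v, border = NA)
}
par(op)
```

Because every panel shares the same row order, a trend in one panel can be read against the others: in this sketch, for instance, the bars of wt tend to grow as the bars of mpg shrink.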

There's more…

By default, the tableplot function plots all the variables in the input data, but we can change this and other defaults through its arguments. The following is a list of arguments and their roles within the tableplot function:

  • nBins: This specifies the number of bins along the y axis. By default, the rows are split into 100 bins, but we can control this using this argument. It plays a role similar to the nclass argument of the hist function.
  • select: If we are interested in plotting only a subset of the variables, we can specify their names using this argument, for example, select=c(mpg, wt). Note that the variable names are written without quotation marks ("").
  • sortCol: The default sort order is based on the first variable in the input dataset, but we can change it using this argument, for example, sortCol=wt.
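The three arguments above could be tried one at a time as in the following sketch. It again uses the built-in mtcars as a stand-in for modified_mtcars (an assumption), and it writes the bin-count argument as nBins, which is the spelling used in the package documentation as far as we know. The calls run only if tabplot is installed.

```r
# Stand-in data (assumption): built-in mtcars instead of modified_mtcars.
con_dat <- mtcars[c("mpg", "disp", "drat", "wt", "qsec")]

# Run the plots only when the tabplot package is installed.
if (requireNamespace("tabplot", quietly = TRUE)) {
  library(tabplot)

  # Plot only mpg and wt; the variable names are unquoted.
  tableplot(con_dat, select = c(mpg, wt))

  # Sort on wt rather than on the first column.
  tableplot(con_dat, sortCol = wt)

  # Use 8 bins; with only 32 rows, 100 bins would exceed the row count.
  tableplot(con_dat, nBins = 8)
}
```

The arguments can also be combined in a single call, for example to show two variables sorted on one of them with a reduced number of bins.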

See also

To produce similar visualizations for categorical variables, refer to the Multivariate visualization of categorical data recipe.
