Creating a conditional scatter plot

A scatter plot is one of the simplest ways to visualize the relationship between two numeric variables. In this recipe, we will see how to produce a scatter plot of two numeric variables conditional on a categorical variable.

Getting ready

The dataset used for this recipe is as follows:

# Set a seed value to make the data reproducible
set.seed(12345)
qqdata <- data.frame(disA=rnorm(n=100,mean=20,sd=3),
                     disB=rnorm(n=100,mean=25,sd=4),
                     disC=rnorm(n=100,mean=15,sd=1.5),
                     age=sample(c(1,2,3,4),size=100,replace=TRUE),
                     sex=sample(c("Male","Female"),size=100,replace=TRUE),
                     econ_status=sample(c("Poor","Middle","Rich"),
                                        size=100,replace=TRUE))

How to do it…

The basic call that produces a scatter plot with the lattice package is as follows:

# xyplot() is provided by the lattice package
library(lattice)
xyplot(disA~disB, data=qqdata)

However, in this recipe, we want to produce a conditional scatter plot. We can perform the conditioning in two different ways:

  • We can create a scatter plot and color the points based on the value of another variable
  • We can create a scatter plot with a separate panel for each unique value of another variable

The code for the first approach, where the points are colored by group, is as follows:

# colored scatter plot
xyplot(disA~disB,groups=sex,data=qqdata,auto.key=TRUE)

To create the panel scatter plot, we could use the following code, where the scatter plot will be created for each unique value of a categorical variable. In this case, sex is the categorical variable:

# panel scatter plot
xyplot(disA~disB|sex,data=qqdata)

How it works…

The formula in the xyplot() function has the form y~x: the variable on the left of the tilde is plotted on the vertical axis and the variable on the right on the horizontal axis, with one point drawn for each pair of values. The groups argument keeps all the points in a single panel and colors each point according to the level of the grouping variable. If we leave out the groups argument and instead place the categorical variable after a vertical bar in the formula, a separate panel is produced for each of its levels.
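The two forms of conditioning are not mutually exclusive. As a minimal sketch, assuming we want one panel per level of econ_status with the points colored by sex (a combination chosen here purely for illustration), the call could look like this. The dataset is recreated so that the code runs on its own, and the object name p is arbitrary:

```r
library(lattice)

# Recreate the recipe's dataset so that this sketch is self-contained
set.seed(12345)
qqdata <- data.frame(disA=rnorm(n=100,mean=20,sd=3),
                     disB=rnorm(n=100,mean=25,sd=4),
                     disC=rnorm(n=100,mean=15,sd=1.5),
                     age=sample(c(1,2,3,4),size=100,replace=TRUE),
                     sex=sample(c("Male","Female"),size=100,replace=TRUE),
                     econ_status=sample(c("Poor","Middle","Rich"),
                                        size=100,replace=TRUE))

# The vertical bar creates one panel per level of econ_status;
# groups colors the points within each panel by sex
p <- xyplot(disA~disB|econ_status,groups=sex,data=qqdata,auto.key=TRUE)

# Lattice functions return an object; printing it draws the plot
print(p)
```

Storing the plot in an object and printing it explicitly is needed when the call is made inside a function or a loop, where the plot is not drawn automatically.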

There's more…

As with other lattice plots, we can add a title using the main argument and a label for each axis using the xlab and ylab arguments. We can also specify the limit of values for each axis using the xlim and ylim arguments.
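The following sketch shows these arguments applied to the panel scatter plot. The title, the label text, and the axis limits are illustrative values chosen for this example, not values taken from the recipe, and the dataset is recreated so that the code runs on its own:

```r
library(lattice)

# Recreate the recipe's dataset so that this sketch is self-contained
set.seed(12345)
qqdata <- data.frame(disA=rnorm(n=100,mean=20,sd=3),
                     disB=rnorm(n=100,mean=25,sd=4),
                     disC=rnorm(n=100,mean=15,sd=1.5),
                     age=sample(c(1,2,3,4),size=100,replace=TRUE),
                     sex=sample(c("Male","Female"),size=100,replace=TRUE),
                     econ_status=sample(c("Poor","Middle","Rich"),
                                        size=100,replace=TRUE))

# Panel scatter plot with a title, axis labels, and axis limits.
# In disA~disB, disB is on the horizontal axis and disA on the vertical
# axis, so xlim applies to disB and ylim applies to disA.
p <- xyplot(disA~disB|sex,data=qqdata,
            main="disA versus disB by sex",  # illustrative title
            xlab="disB",
            ylab="disA",
            xlim=c(10,40),                   # illustrative limits
            ylim=c(10,30))
print(p)
```

Points that fall outside the chosen limits are not shown, so the limits should be wide enough to cover the range of the data we want to display.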
