In this recipe, we will learn how to show the density of data points on a scatter plot in the margin of the X or Y axes.
We will use the rug() function in the base graphics library. As a simple example to illustrate its use, let's look at the data density of a sample drawn from a normal distribution:
x <- rnorm(1000)
plot(density(x))
rug(x)
As this example shows, the rug() function adds a set of tick marks just above the x-axis. One tick is drawn at the X location of each data point, so the more closely packed the ticks are, the higher the data density around those X values. The result here is as expected: in a standard normal distribution, most values lie close to the mean (in this case, zero).
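To put a rough number on "most values lie close to the mean", the following sketch counts the share of simulated values within one standard deviation of zero. The seed and the variable names are arbitrary choices made for this illustration, not part of the recipe.

```r
# Illustrative sketch: how concentrated is a standard normal sample around its mean?
set.seed(123)                     # arbitrary seed, used only for reproducibility
x <- rnorm(1000)                  # same kind of sample as in the recipe

# Proportion of values within one standard deviation of the mean (zero)
share_near_mean <- mean(abs(x) < 1)
print(share_near_mean)            # theory suggests roughly two thirds

# The rug makes the same concentration visible as tightly packed ticks
plot(density(x))
rug(x)
```

The region where the ticks merge into a nearly solid band corresponds to this central share of the sample.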
The rug() function, in its simplest form, takes a single numeric vector as its argument. Note that it draws on top of an existing plot, so a plot must be created before rug() is called.
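Because rug() only annotates the current plot, it can be combined with any high-level plotting function, not just density plots. As a minimal sketch, assuming a simulated right-skewed sample (the data and seed are made up for this illustration), here it is layered on a histogram:

```r
# Illustrative sketch: rug() adds to the current plot, so a plot must exist first.
set.seed(42)                      # arbitrary seed, used only for reproducibility
x <- rexp(200)                    # simulated right-skewed sample (an assumption, not recipe data)

hist(x, breaks = 20, main = "Histogram with rug")  # high-level call creates the plot
rug(x)                                             # low-level call adds ticks on the bottom axis
```

The rug shows where individual observations fall inside each histogram bin, which the bars alone hide.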
Let's take another example and explore some of the additional arguments that can be passed to rug(). We will use the example metals.csv dataset:
metals <- read.csv("metals.csv")
plot(Ba ~ Cu, data = metals, xlim = c(0, 100))
rug(metals$Cu)
rug(metals$Ba, side = 2, col = "red", ticksize = 0.02)
We first read the metals.csv file and plotted barium (Ba) concentrations against copper (Cu) concentrations. Next, we added a rug of Cu values on the X axis using the default settings. Then, we added another rug for Ba values on the Y axis by setting the side argument to 2. The side argument takes one of four values:
1: Bottom axis (the default)
2: Left
3: Top
4: Right
We also set the color of the tick marks to red using the col argument. Finally, we adjusted the length of the tick marks using the ticksize argument, which takes a numeric value interpreted as a fraction of the width of the plotting region. Positive values draw ticks inward, and negative values draw them outward.
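For readers without metals.csv at hand, the side, col, and ticksize arguments can be tried on simulated data. In the sketch below, the cu and ba vectors are hypothetical stand-ins for the Cu and Ba columns, and the colors and tick sizes are arbitrary choices; the third rug uses a negative ticksize to show outward ticks.

```r
# Illustrative sketch with simulated data; cu and ba are hypothetical stand-ins
# for metals$Cu and metals$Ba, not values from the real dataset.
set.seed(1)                               # arbitrary seed, used only for reproducibility
cu <- runif(50, min = 0, max = 100)
ba <- 2 * cu + rnorm(50, sd = 20)

plot(ba ~ cu, xlab = "Cu", ylab = "Ba")

rug(cu)                                             # side = 1: bottom axis, default tick size
rug(ba, side = 2, col = "red", ticksize = 0.02)     # left axis, short inward ticks
rug(cu, side = 3, col = "blue", ticksize = -0.02)   # top axis, negative size draws outward ticks
```

Outward ticks keep the plotting region uncluttered, at the cost of drawing into the margin where axis labels and titles may also sit.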