Visualization in R

Visualization is a powerful tool for analyzing data and for presenting results. Many relationships and patterns that are obscured by summary statistics can be brought to light through visualization. We will see a potent example of this shortly. To begin with, let's look at some data that comes with R on the stopping distance of cars. The distances are stored in a variable called dist, in a dataset called cars. Histograms provide an informative way to visualize single variables. We can make a histogram with one line of code:

> hist(cars$dist)

R draws the histogram, decides how to bin the data, and provides default labels for the graph title and the x and y axes. Type ?hist to see the other arguments to this function, which change the number of bars, the labels, and other features of the histogram.
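As a minimal sketch of those arguments, the call below overrides a few of the defaults. The title, axis label, color, and number of cells are illustrative choices, not values from the text; the unit (feet) comes from the cars help page.

```r
# Sketch: customizing the default histogram of stopping distances.
# All argument values below are illustrative assumptions.
h <- hist(cars$dist,
          breaks = 10,                          # suggested number of cells; R may adjust it
          main   = "Stopping distance of cars", # plot title
          xlab   = "Stopping distance (ft)",    # x-axis label
          col    = "grey")                      # fill color of the bars

# hist() invisibly returns the bins it chose, which is handy for checking.
h$breaks   # cell boundaries
h$counts   # number of observations in each cell
```

Note that a single number passed to breaks is only a suggestion: R rounds the cell boundaries to "pretty" values, so the number of bars drawn may differ from the number requested.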

Anscombe's quartet comprises four small datasets with two variables each. The sets have nearly identical means and variances for both variables, and regressions of y on x in each dataset generate nearly identical regression estimates. From these summaries alone, we might be tempted to infer that the datasets are nearly identical. However, bivariate visualization (of the x and y variables from each dataset) using the generic plot() function shows otherwise. To draw a scatterplot, plot() takes at a minimum two arguments, each a vector of the same length. R ships the quartet as the anscombe data frame. To create the following four plots, enter the following commands for each pair of x and y (x1 and y1, x2 and y2, x3 and y3, and finally x4 and y4):

# par() sets or queries graphical parameters.
# With mfrow=c(nr, nc), subsequent figures are drawn in an nr-row-by-nc-column
# array; uncomment the next line for a 2-by-2 grid.
#par(mfrow=c(2,2))
> plot(anscombe$x1, anscombe$y1, xlab="x1", ylab="y1",
  main="Anscombe 1")
> abline(lm(anscombe$y1~anscombe$x1))

The preceding code shows how to set your own x and y axis labels and plot title. The abline() call adds a straight line to an existing plot, in this case, the best-fit (or regression) line of y on x. The output of the previous code is shown as follows:

[Figure: Anscombe's quartet scatterplots with fitted regression lines]
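Rather than typing the commands four times, the same steps can be sketched as a loop that also computes the summary statistics mentioned above. This is one possible approach, not the book's own code; anscombe_stats and old_par are hypothetical names introduced for illustration.

```r
# Sketch: summary statistics and a 2-by-2 grid of plots for Anscombe's quartet.
old_par <- par(mfrow = c(2, 2))   # draw the next four plots in a 2-by-2 array

# anscombe_stats is a hypothetical name for the table of results.
anscombe_stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)                # regression of y on x

  plot(x, y, xlab = paste0("x", i), ylab = paste0("y", i),
       main = paste("Anscombe", i))
  abline(fit)                     # add the fitted regression line

  c(mean_x = mean(x), var_x = var(x),
    mean_y = mean(y), var_y = var(y),
    intercept = unname(coef(fit)[1]), slope = unname(coef(fit)[2]))
})

par(old_par)                      # restore the previous layout
round(anscombe_stats, 2)          # one column per dataset
```

The rows of the resulting table should agree closely across the four columns, even though the four panels look quite different from one another.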

This exercise not only demonstrates some simple graphical commands, but also the importance of visualization generally. In later chapters, we elaborate on these graphical methods to enhance analyses and the presentation of analytical results.
