Introducing graphical functions

The graphical representation of data is a central feature, or even the main purpose, of data analysis in general and of spatial data analysis in particular. This section serves as a basic introduction to the procedure of creating graphical output in R. Such an introduction is necessary before moving on to the later chapters, where we would like to quickly be able to display intermediate products during various spatial data analysis steps. In Chapter 9, Advanced Visualization of Spatial Data, we will devote some additional time to the subject of visualization in R, and see how graphical output can be customized when producing publication-quality plots as the end product of spatial data analysis.

Displaying vectors using base graphics

We can graphically display a vector's values using the plot function. For example, the following expression opens a new window within the R environment with a plot of the vector values:

> plot(tmax)

The following screenshot shows what the graphical output looks like, and where it appears, when using RGui and RStudio:


This output is the default one for the plot function; the values of the tmax vector are plotted on the y axis as a function of their index on the x axis, with open circles marking data points.
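As a minimal sketch of this default behavior, the following code uses a short made-up tmax vector (the values are hypothetical and are not the data used in this chapter). It shows that plot(tmax) places the same points as plotting the values against their indices explicitly:

```r
# Hypothetical daily maximum temperatures (illustration only)
tmax <- c(31.2, 29.8, 33.1, 30.4, 28.9)

# The indices 1, 2, ..., n that plot(tmax) uses as x coordinates
idx <- seq_along(tmax)

# Default call: index on the x axis, values on the y axis, open circles
plot(tmax)

# Same points with the x coordinates given explicitly
# (only the default x axis label differs)
plot(idx, tmax)
```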

When plotting a time series, we would usually like to have the time of observation on the x axis (rather than the indices) and see a line connecting the data points from left to right (rather than unconnected circles). This can be done as follows:

> plot(time, tmax, type = "l")

When plotted, we will see what is shown in the following screenshot:


The "l" argument for the type parameter indicates we want a line plot, while the first and second arguments are treated as vectors of coordinates on the x and y axes, respectively. We also see that the time vector is automatically formatted so that year breakpoints are labeled on the x axis. There are many additional ways in which we can further customize this plot (and other types of plots we will produce in subsequent chapters). However, we will usually limit ourselves to the default plots until we reach Chapter 9, Advanced Visualization of Spatial Data, where we will elaborate on the subject of graphical output customization within the context of spatial data.
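The following sketch illustrates the effect of the type parameter. The time and tmax vectors here are made up for illustration (three years of monthly values following a simple seasonal curve), so that the example can be run on its own:

```r
# Hypothetical data: 36 monthly dates and a made-up seasonal temperature curve
time <- seq(as.Date("2006-01-01"), by = "month", length.out = 36)
tmax <- 20 + 10 * sin(2 * pi * (0:35) / 12)

# Line plot: the first argument gives x coordinates, the second gives y
plot(time, tmax, type = "l")

# Other common values of the type parameter
plot(time, tmax, type = "p")  # points only (the default)
plot(time, tmax, type = "b")  # both points and connecting lines
```

Since time is a vector of class Date, the x axis labels are derived from the dates rather than from the indices of the values.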

The last plot can also be produced using the following expression:

> plot(tmax ~ time, type = "l")

In this form of calling the plot function, the x and y axes are specified by the tmax ~ time expression. The ~ operator creates a special type of object, a formula object. In this particular case, the formula indicates that tmax is the dependent variable (to the left of the ~ operator and thus plotted on the y axis) and time is the independent variable (to the right of the ~ operator and thus plotted on the x axis). Formula objects are most common in statistical applications of R (we shall see an example of this in Chapter 8, Spatial Interpolation of Point Data), and in some other cases as well (as we shall see in the next chapter).
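To see that a formula is an ordinary R object, we can assign it to a name and inspect it. The following is a small sketch with made-up time and tmax vectors; the name f is chosen only for this illustration:

```r
# Hypothetical data (illustration only)
time <- seq(as.Date("2006-01-01"), by = "month", length.out = 12)
tmax <- 20 + 10 * sin(2 * pi * (0:11) / 12)

# The ~ operator creates a formula object; nothing is evaluated yet
f <- tmax ~ time

class(f)     # "formula"
all.vars(f)  # "tmax" "time": left-hand side first, then right-hand side

# The formula is only interpreted when passed to a function such as plot
plot(tmax ~ time, type = "l")
```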

Saving graphical output

With the graphical window selected, we can save the image we see in a file through the menus (by navigating to File | Save as in RGui). Several raster (such as *.png) and vector (such as *.pdf) file formats are available for the output. However, sometimes we would like to embed the instructions to save a graphical output within our code, to save ourselves the trouble of clicking on the menu buttons when constantly updating an image or when saving multiple images. This is possible by opening a different graphical device (a file) instead of the graphical window, and closing it afterward. For example, the following code creates a PDF file (named time_series.pdf) with the plot we just saw in the C:\Data directory:

> pdf("C:\\Data\\time_series.pdf")
> plot(tmax ~ time, type = "l")
> dev.off()

The last expression, dev.off(), turns the PDF graphics device off, thereby returning to the default device (which is the graphical window) for the subsequent plots.
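The same open, plot, and close sequence can be sketched in a form that runs on any operating system. Here the file is written to R's temporary directory instead of C:\Data; the out_file name and the data values are assumptions made only for this illustration:

```r
# Hypothetical data (illustration only)
time <- seq(as.Date("2006-01-01"), by = "month", length.out = 12)
tmax <- 20 + 10 * sin(2 * pi * (0:11) / 12)

# Build a path inside the session's temporary directory
out_file <- file.path(tempdir(), "time_series.pdf")

pdf(out_file)                  # open the PDF file as the active device
plot(tmax ~ time, type = "l")  # the plot is drawn into the file
dev.off()                      # close the device, which completes the file

file.exists(out_file)          # check that the file was written
```

Until dev.off() is called the file remains open, so it is good practice to close the device right after the plotting expressions.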

Note

Note that path indications in R are character values with directories separated by \\. The / symbol can also be used, but not a single \, the usual Windows separator (which is used for a different purpose in R: it begins an escape sequence, such as \t for a tab).
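The following short sketch shows the two accepted ways of writing the same Windows path as an R character value. The names p1 and p2 are used only for this illustration:

```r
# Two ways of writing the same Windows path in R code
p1 <- "C:\\Data\\time_series.pdf"  # each \\ stands for one backslash
p2 <- "C:/Data/time_series.pdf"    # forward slashes need no escaping

# cat shows the actual characters: the doubled backslashes are single
cat(p1, "\n")

nchar("\\")  # 1: the escape sequence \\ is a single character

# Replacing each backslash with a forward slash gives the second form
gsub("\\", "/", p1, fixed = TRUE) == p2  # TRUE
```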

There are several functions analogous to the pdf function to write graphical output in other formats, such as bmp, jpeg, png, and tiff. All of these functions have several parameters (in addition to the file path) to modify the output, such as specifying image width, height, and resolution; see the help pages of these functions for more information.
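As a sketch of how these parameters are used, the following code writes a PNG file with a specified size and resolution. The data, the output location (R's temporary directory), and the particular width, height, and res values are arbitrary choices made for illustration:

```r
# Hypothetical data (illustration only)
time <- seq(as.Date("2006-01-01"), by = "month", length.out = 12)
tmax <- 20 + 10 * sin(2 * pi * (0:11) / 12)

out_file <- file.path(tempdir(), "time_series.png")

# width and height are in pixels by default; res is the nominal
# resolution in pixels per inch
png(out_file, width = 1600, height = 1200, res = 200)
plot(tmax ~ time, type = "l")
dev.off()
```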

The main graphical systems in R

There are three main graphics systems in R: base graphics (which we just used to create the previous plot), lattice, and ggplot2. For example, the following code produces the previous plot as well as two analogous plots using lattice and ggplot2. The code includes some functions that will be made clear later, and requires installation of additional packages (which we will cover in the next chapter).

> dat = data.frame(time = time, tmax = tmax)
# Base graphics
> plot(tmax ~ time, dat, type = "l")
# lattice
> library(lattice)
> xyplot(tmax ~ time, data = dat, type = "l")
# ggplot2
> library(ggplot2)
> ggplot(dat, aes(x = time, y = tmax)) + 
+ geom_line()

The graphs this code produces are sequentially shown in the following screenshot, from left to right, with the name of the respective graphics system indicated at the top of each panel:


Many types of plots (such as the time series plot we just created) can be produced using any of the three systems. Therefore, choosing one is in many cases a matter of taste. However, some non-overlapping features do exist among the graphics systems. For example, faceting (which produces a series of plots for different portions of the data side by side) is built into lattice and ggplot2 but has no direct equivalent in base graphics, where the panels have to be arranged manually, while 3D plots cannot be produced using ggplot2 alone. As seen in the preceding screenshot, there are also some small differences in the default styling of the plots. Finally, as we can see in the preceding code section, the ggplot2 system has quite a different syntax compared to base graphics and lattice.
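To give a rough idea of what faceting looks like, here is a minimal sketch using lattice, where the | operator in the formula requests one panel per level of a grouping variable. The station column and all the values are made up for this illustration:

```r
library(lattice)

# Hypothetical data: one year of monthly values for two made-up stations
time <- seq(as.Date("2006-01-01"), by = "month", length.out = 12)
dat <- data.frame(
  time = rep(time, 2),
  tmax = c(20 + 10 * sin(2 * pi * (0:11) / 12),
           15 + 8 * sin(2 * pi * (0:11) / 12)),
  station = factor(rep(c("A", "B"), each = 12))
)

# One panel per station: the variable after | splits the data
p <- xyplot(tmax ~ time | station, data = dat, type = "l")
print(p)
```

Calling print explicitly is needed when the code is run as a script; at the interactive prompt, the plot is displayed automatically.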

In the upcoming chapters, we are going to use base graphics (and sometimes lattice) to quickly visualize the products we get at various steps of spatial data analysis. In Chapter 9, Advanced Visualization of Spatial Data, we are going to concentrate on customizing graphical output in R, mostly using ggplot2.
