Introducing ggplot2

The ggplot2 package is a frequently used graphics package in the R community. It provides a comprehensive, coherent, and consistent grammar for describing graphs, so the same few building blocks carry over from one plot type to the next. The package enhances R's built-in graphical capabilities and takes a layer-oriented approach to plotting: you build a graph by adding layers to it.

The following commands install the package and load it into memory:

install.packages("ggplot2"); 
library("ggplot2"); 

Here is the code for the first graph that uses the ggplot() function. The data used is the TM data frame, as created and modified in the previous section, with all of the factors properly defined:

ggplot(TM, aes(Region, fill = Education)) +
  geom_bar(position = "stack");

The ggplot() function defines the dataset used (TM in this case) and initializes the plot. Inside this function, the aes() (short for aesthetics) function defines the roles of the variables for the graph. In this case, the graph shows the frequencies of the Region variable, where the Region bars are filled with the Education variable, to show the number of cases with each level of education in each region. The geom_bar() function defines the plot type, in this case, a bar plot. There are many other geom_xxx() functions for other plot types. The following screenshot shows the results:

Education in regions
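
To make the division of work between ggplot(), aes(), and geom_bar() more concrete, here is a minimal sketch that builds the same kind of graph step by step. The demo data frame is a hypothetical stand-in for TM, with invented values, so that the code runs on its own; only the Region and Education column names come from the text.

```r
library(ggplot2)

# Hypothetical stand-in for the TM data frame (invented values).
demo <- data.frame(
  Region = factor(c("Europe", "Europe", "Pacific",
                    "North America", "North America", "North America")),
  Education = factor(c("Bachelors", "High School", "Bachelors",
                       "Graduate Degree", "Bachelors", "High School"))
)

# ggplot() only initializes the plot and records the aesthetic mappings;
# no bars exist until a geom layer is added.
p <- ggplot(demo, aes(x = Region, fill = Education))

# Adding geom_bar() returns a new plot object with one layer.
# position = "stack" is the default for geom_bar(), so it can be omitted.
bars <- p + geom_bar()

# Printing the object renders the graph on the active graphics device.
print(bars)
```

Because the plot is an ordinary R object, you can keep the initialized plot in a variable and try different layers on top of it without repeating the data and aes() definitions.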

You should also try to create graphs with other values of the position parameter. Use position = "fill" to stack the bars and scale every bar to the same height, so that you can easily compare the relative distribution of education inside each region, and position = "dodge" to place the education bars for each region side by side.
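
The two position variants can be sketched as follows. As before, the demo data frame is a hypothetical stand-in for TM with invented values, and the "Proportion" axis label is an illustrative choice, not something taken from the original graph.

```r
library(ggplot2)

# Hypothetical stand-in for the TM data frame (invented values).
demo <- data.frame(
  Region = factor(c("Europe", "Europe", "Europe", "Pacific", "Pacific",
                    "North America", "North America", "North America")),
  Education = factor(c("Bachelors", "High School", "Bachelors",
                       "Bachelors", "Graduate Degree",
                       "Graduate Degree", "Bachelors", "High School"))
)

# Shared starting point: data and aesthetic mappings, no layers yet.
base <- ggplot(demo, aes(x = Region, fill = Education))

# "fill": stacked bars rescaled so that every region reaches 1,
# which shows the share of each education level within a region.
filled <- base + geom_bar(position = "fill") + ylab("Proportion")

# "dodge": one bar per education level, placed side by side in each region.
dodged <- base + geom_bar(position = "dodge")

print(filled)
print(dodged)
```

With position = "fill" the y axis no longer shows counts, so it is worth relabeling it; with position = "dodge" the counts stay on the y axis, but the regional totals are harder to read.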

Recall the fourth graph in this chapter, from the linear regression section. It shows the data points for the first 100 cases of the YearlyIncome and Age variables, together with the linear regression line and the lowess line. The following code recreates that graph with the ggplot() function, this time combining multiple geom_xxx() functions. The second geom_smooth() call does not name a method, so for a dataset this small ggplot2 uses its default loess smoother:

TMLM <- TM[1:100, c("YearlyIncome", "Age")]; 
ggplot(data = TMLM, aes(x=Age, y=YearlyIncome)) + 
  geom_point() + 
  geom_smooth(method = "lm", color = "red") + 
  geom_smooth(color = "blue"); 

The result is shown in the following screenshot:

Linear and polynomial regression with ggplot
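
If you prefer not to rely on the defaults of geom_smooth(), you can name the smoothing method and formula explicitly. The following is a minimal sketch under stated assumptions: the demo data frame is simulated and merely stands in for TMLM, and switching off the confidence bands with se = FALSE is an illustrative choice rather than part of the original graph.

```r
library(ggplot2)

# Simulated stand-in for TMLM; the values are invented for illustration.
set.seed(1)
demo <- data.frame(Age = sample(25:70, 100, replace = TRUE))
demo$YearlyIncome <- 20000 + 1500 * demo$Age - 15 * demo$Age^2 +
  rnorm(100, sd = 8000)

p <- ggplot(demo, aes(x = Age, y = YearlyIncome)) +
  geom_point() +
  # Straight-line fit, requested explicitly.
  geom_smooth(method = "lm", formula = y ~ x, color = "red", se = FALSE) +
  # Local regression (loess), named explicitly instead of relying on
  # the default that geom_smooth() picks for small datasets.
  geom_smooth(method = "loess", formula = y ~ x, color = "blue", se = FALSE)

print(p)
```

Naming the method and formula also avoids the informational message that geom_smooth() prints when it has to choose them itself.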