As we saw in the previous recipe, the hist() function automatically computes the number of breaks and the width of the bins into which the values of the variable are grouped. In this recipe, we will learn how to control this and specify exactly how many bins we want or where the breaks between bars should fall.
Once again, we will use the airpollution.csv example dataset, so make sure that you have loaded it:
air<-read.csv("airpollution.csv")
First, let's see how to specify the number of breaks. Let's make 20 breaks in the nitrogen oxides histogram instead of the default 11:
hist(air$Nitrogen.Oxides, breaks=20,xlab="Nitrogen Oxide Concentrations", main="Distribution of Nitrogen Oxide Concentrations")
We used the breaks argument to request a number of bins for the histogram, setting breaks to 20. However, the graph shows more than 20 bars because R treats a single number only as a suggestion: it computes round, evenly spaced break points that give a bin count as close to the requested value as possible.
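To see what R actually did with the suggestion, you can ask hist() for its result without drawing it. The following is a minimal sketch that uses a synthetic vector x as a stand-in for air$Nitrogen.Oxides (the values are made up for illustration, not taken from the dataset); the bin count you get on the real data will differ:

```r
# Synthetic stand-in for air$Nitrogen.Oxides (hypothetical values)
set.seed(1)
x <- runif(200, min = 0, max = 600)

# plot = FALSE returns the histogram object instead of drawing it
h <- hist(x, breaks = 20, plot = FALSE)

# Number of bins R actually used; not guaranteed to equal 20
n_bins <- length(h$counts)
print(n_bins)

# The round break points R chose for those bins
print(h$breaks)
```

Comparing n_bins with the value passed to breaks is a quick way to check how closely R followed the request.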
We can also specify the exact values at which we want the breaks to occur. In this case, R uses the values exactly as we specify them. Once again, we use the breaks argument but, this time, we have to set it to a numerical vector that contains the values at which we want the breaks. The breaks vector must cover the full range of values of the X variable.
Let's say we want breaks at every 100 units of concentration:
hist(air$Nitrogen.Oxides, breaks=c(0,100,200,300,400,500,600), xlab="Nitrogen Oxide Concentrations", main="Distribution of Nitrogen Oxide Concentrations")
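Typing out every break by hand becomes tedious for finer bins, so the same vector can be built with seq(). The sketch below uses a small made-up vector x in place of air$Nitrogen.Oxides (an assumption for illustration) and also shows what happens when the breaks do not cover the full range of the data:

```r
# Hypothetical concentrations standing in for air$Nitrogen.Oxides
x <- c(12, 95, 180, 260, 340, 455, 590)

# Same break points as c(0,100,200,300,400,500,600)
brk <- seq(0, 600, by = 100)

# Check that the breaks span the data before calling hist()
stopifnot(min(x) >= min(brk), max(x) <= max(brk))

h <- hist(x, breaks = brk, plot = FALSE)
print(h$counts)

# Breaks that stop short of the largest value make hist() fail
res <- tryCatch(hist(x, breaks = seq(0, 500, by = 100), plot = FALSE),
                error = function(e) e)
print(inherits(res, "error"))
```

Checking the range first gives a clearer failure than the error raised inside hist().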
So, as you might have noticed, the breaks argument can take different types of values: a single value that suggests the number of breaks or a vector that specifies exact bin breaks. In addition, breaks can also take a function that computes the number of bins from the data.
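As a rough illustration of the function form, the sketch below passes a small helper that applies the square-root rule. The helper name sqrt_bins and the synthetic data are assumptions made for this example; they are not part of R or of the recipe's dataset. As with a single number, the value the function returns is treated as a suggestion:

```r
# Synthetic data standing in for air$Nitrogen.Oxides (hypothetical values)
set.seed(42)
x <- rnorm(500, mean = 300, sd = 80)

# Hypothetical helper: suggest sqrt(n) bins for a vector of n values
sqrt_bins <- function(v) ceiling(sqrt(length(v)))

# hist() calls the function on the data and uses the result
h <- hist(x, breaks = sqrt_bins, plot = FALSE)

print(sqrt_bins(x))       # number of bins suggested
print(length(h$counts))   # number of bins actually used
```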
Finally, breaks can also take a character string as a value that names an algorithm to calculate the number of bins. By default, it is set to "Sturges". Other names for which algorithms are supplied are "Scott" and "FD" (or "Freedman-Diaconis").
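To compare the named algorithms side by side, one option is to loop over the names and count the bins each one produces. This sketch again uses synthetic data rather than the air pollution dataset (an assumption for illustration), so the counts will not match those for air$Nitrogen.Oxides:

```r
# Synthetic data standing in for air$Nitrogen.Oxides (hypothetical values)
set.seed(7)
x <- rnorm(500, mean = 300, sd = 80)

methods <- c("Sturges", "Scott", "FD")

# Number of bins hist() ends up using for each named algorithm
bins <- sapply(methods, function(m) {
  length(hist(x, breaks = m, plot = FALSE)$counts)
})
print(bins)

# The underlying suggestions come from these grDevices functions
print(c(nclass.Sturges(x), nclass.scott(x), nclass.FD(x)))
```

The two printed vectors can differ, because the suggested bin count is still rounded to convenient break points.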