In this recipe, we will learn how to vary box widths according to the number of observations in each group.
Just like the previous recipe, we will continue to use the metals.csv example dataset for this recipe. So, let's first load it:
metals <- read.csv("metals.csv")
Let's build a box plot in which the width of each box reflects the number of observations in its group:
boxplot(Cu ~ Source, data = metals, varwidth = TRUE, main = "Summary of Copper concentrations by Site")
In the example, we set the varwidth argument to TRUE, which makes the width of the boxes proportional to the square roots of the number of observations in the groups.
We can see that the box for Site4 is the narrowest, as it has the fewest observations in the dataset. Differences in the other boxes' widths might not be so obvious, but this setting is useful when we are dealing with larger datasets. By default, varwidth is set to FALSE.
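To see the square-root scaling in numbers, the following is a minimal sketch that uses simulated data rather than metals.csv. The demo data frame, its group sizes, and the Cu values are invented for illustration only; the column names simply mirror the ones used above. Calling boxplot() with plot = FALSE returns the per-group counts in the n component, from which relative widths can be computed:

```r
# Simulated stand-in for metals.csv: group sizes and values are made up
set.seed(1)
sizes <- c(Site1 = 40, Site2 = 25, Site3 = 16, Site4 = 4)
demo <- data.frame(
  Source = factor(rep(names(sizes), times = sizes)),
  Cu     = rnorm(sum(sizes), mean = 5, sd = 1)
)

# plot = FALSE returns the box plot statistics without drawing anything
stats <- boxplot(Cu ~ Source, data = demo, plot = FALSE)

# Relative widths under varwidth = TRUE, scaled so the largest group is 1
rel_width <- sqrt(stats$n) / max(sqrt(stats$n))
names(rel_width) <- stats$names
print(rel_width)

# Draw the plot itself to compare the boxes with the numbers above
boxplot(Cu ~ Source, data = demo, varwidth = TRUE,
        main = "Simulated Cu by Source")
```

In this made-up data, Site4 has one tenth as many observations as Site1, yet its box is drawn at roughly a third of the width, because the square root of 0.1 is about 0.32. This compression is one reason why moderate differences in group size can be hard to spot by eye.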