Box and whisker plots, also called box plots, are charts that divide their data points into quartiles. Box plots are great at comparing distributions of data for different groups or categories side by side.
In this recipe, we will create a box and whisker plot that shows the spread of points garnered by NBA teams from 2000-2009.
To follow this recipe, open B05527_02 – STARTER.twbx. Use the worksheet called Box and Whisker, and connect to the Player Stats (NBA Players Regular Season 2009) data source.
The following are the steps to create a box and whisker plot:
Box and whisker plots allow you to show the spread of your data, and make it easy to see where the middle 50% of your data lies.
This recipe is one of the few that start with Show Me. In this case, we first select a dimension, Year, and a measure, Points, and then click on the box and whisker plot in Show Me. The following is what Show Me creates. Each circle represents the total points for a specific year.
If we want to show this by year, we can move Year from Detail to the Columns shelf, which gives us the following viz.
What's missing in this box and whisker plot is the scatter: we need to see each team's points as an individual mark. To do this, we can add the team name to Detail in the Marks card. Remember that adding a field to Detail changes the level of detail of the visualization; in this case, each mark is now the sum of points for a combination of Year and Team.
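The change in level of detail can be sketched outside Tableau. The following is a minimal Python illustration, not Tableau's implementation: the rows, team abbreviations, player names, and the `sum_points` helper are all hypothetical and are not taken from the recipe's data source. It shows how the same rows produce one mark per year when only Year defines the level of detail, and one mark per year and team once the team name is added.

```python
from collections import defaultdict

# Hypothetical player-level rows: (year, team, player, points).
# These values are invented for illustration only.
rows = [
    (2008, "PHO", "Player A", 1200),
    (2008, "PHO", "Player B", 900),
    (2008, "LAL", "Player C", 1500),
    (2009, "PHO", "Player A", 1100),
    (2009, "LAL", "Player C", 1400),
    (2009, "LAL", "Player D", 700),
]


def sum_points(rows, key):
    """Sum the points column for each distinct value of key(row)."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)


# Only Year defines the level of detail: one mark per year.
by_year = sum_points(rows, key=lambda r: r[0])

# Year and team define the level of detail: one mark per (year, team) pair.
by_year_team = sum_points(rows, key=lambda r: (r[0], r[1]))

print(by_year)
print(by_year_team)
```

Adding a field to the grouping key never changes the grand total; it only splits each mark into more, finer-grained marks.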
Once the team name is added, we can see the variation in team points from year to year, and we can note that the box is where the middle 50% of the points lie (and thus where the middle 50% of the teams are for that year).
As for the whiskers, the default option is to extend them to the furthest data points that are within 1.5 times the IQR (interquartile range, the distance between the first and third quartiles) of the edges of the box.
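There is another option to extend the whiskers to the maximum extent of the data. This stretches the whiskers all the way to the most extreme points, including any outliers. The box itself does not change, but the longer whiskers no longer set the outliers apart from the rest of the points, and they can make the plot look more spread out than most of the data really is.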
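The two whisker options can be compared with a small Python sketch. This is an illustration under stated assumptions, not a description of how Tableau computes its box plots: the `box_stats` helper and the sample values are hypothetical, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`. Quartile conventions differ between tools, so the exact numbers may not match what Tableau draws.

```python
import statistics


def box_stats(values, whisker_factor=1.5):
    """Return quartiles, IQR, and both whisker variants for a list of numbers.

    Hypothetical helper for illustration; requires at least two values.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1

    # Fences sit whisker_factor * IQR beyond the edges of the box.
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr

    # IQR-based whiskers stop at the furthest data points inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whiskers_iqr": (min(inside), max(inside)),
        # Maximum-extent whiskers reach the smallest and largest values.
        "whiskers_max": (min(values), max(values)),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Invented team totals for one season, with one unusually high value.
team_points = [7800, 7950, 8000, 8100, 8150, 8200, 8300, 8400, 9900]

stats = box_stats(team_points)
for name, value in stats.items():
    print(name, value)
```

With these sample values, the quartiles are the same under both options. Only the upper whisker differs: it stops at 8400 with the 1.5 times IQR rule, which leaves 9900 as a separate outlier, and it reaches 9900 when extended to the maximum extent.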
Box and whisker plots are useful whenever you want to compare variation across groups. Here is a box and whisker plot for the same data set used in this recipe, but this time narrowed down to players who played for the Phoenix Suns for at least three years between 2000 and 2009. Each point represents that player's points for a year. A shorter box may mean that the player produces points more consistently from year to year. A longer box means more variation: in some years the player does very well, and in other years not so much.