Time for action - creating a histogram

A histogram displays the frequency with which values occur in a dataset. Visually, a histogram looks similar to a bar chart, but it conveys different information: each bar covers a range of values, and its height shows how many data points fall within that range. Histograms help us see how varied our data are and how they are distributed. Let us create a histogram in R:

  1. Use the hist(...) function to create a histogram:
    > #create a histogram that depicts the frequency distribution of past fire attack durations
    > #get the histogram data
    > histFireDurationData <- subsetFire$DurationInDays
    > #customize the histogram
    > histFireDurationDataMain <- "Duration of Past Fire Attacks"
    > histFireDurationLabX <- "Duration in Days"
    > histFireDurationLimY <- c(0, 10)
    > histFireDurationRainbowColor <-
    rainbow(max(histFireDurationData))
    > #use hist(...) to create and display the histogram
    > hist(x = histFireDurationData,
    main = histFireDurationDataMain, xlab = histFireDurationLabX,
    ylim = histFireDurationLimY,
    col = histFireDurationRainbowColor)
    
  2. Your histogram will be displayed in the graphics window.
    [Figure: the resulting histogram, titled "Duration of Past Fire Attacks"]

What just happened?

We used the hist(...) function to generate a histogram that depicted the frequency distribution of our fire attack duration data.
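
The steps above rely on the subsetFire data prepared earlier. As a minimal sketch for readers who want to try the same call in isolation, the following uses a small invented data frame in its place. The name subsetFireExample and its values are assumptions made for illustration and are not the book's data.

```r
# Hypothetical stand-in for subsetFire; these durations are invented
# for illustration and are not the book's data
subsetFireExample <- data.frame(
  DurationInDays = c(3, 5, 5, 6, 7, 7, 7, 8, 8, 9, 10, 10, 11, 12, 14)
)

# get the histogram data
exampleDurations <- subsetFireExample$DurationInDays

# same customization arguments as in the steps above
hist(x = exampleDurations,
     main = "Duration of Past Fire Attacks (example data)",
     xlab = "Duration in Days",
     ylim = c(0, 10),
     col = rainbow(max(exampleDurations)))
```

Note that ylim only sets the visible range of the y-axis, so a limit such as c(0, 10) suits data in which no bar is taller than 10.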

hist(...)

In its simplest form, the hist(...) function is very similar to boxplot(...). At a minimum, it requires only the data that will be grouped into the chart's columns. A simple call looks like the following:

hist(x = dataset)
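
To see what hist(...) does with the data, it can help to look at the bins it computes. The following is a small sketch using invented values (exampleData is a hypothetical name, not part of the book's dataset). It relies on the plot = FALSE argument, which returns the binned data without drawing, and then recomputes the same counts by hand with cut(...) and table(...).

```r
# Invented example values; not taken from the book's dataset
exampleData <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 7)

# plot = FALSE computes the bins without drawing anything
binInfo <- hist(x = exampleData, plot = FALSE)

binInfo$breaks   # the boundaries between bins
binInfo$counts   # how many values fall into each bin

# the same counts, computed by hand with cut(...) and table(...)
manualCounts <- table(cut(exampleData,
                          breaks = binInfo$breaks,
                          include.lowest = TRUE))

# TRUE when the hand-computed counts match those from hist(...)
all(binInfo$counts == as.vector(manualCounts))
```

Each bar in the chart corresponds to one entry of binInfo$counts, which is why a histogram shows ranges of values rather than individual data points.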

As with our other graphics, the hist(...) function also accepts graphic customization arguments. We rescaled our y-axis with ylim, colored our bars with col, and added text to our histogram with main and xlab. Also note that we used the max(data) function within the rainbow(...) component of our col argument to generate one color for each day up to the longest duration in our dataset. Because hist(...) groups values into bins, the number of bars is set by the bin breaks rather than by the number of unique values; R ignores any surplus colors and would recycle them if there were too few:

hist(x = histFireDurationData, main = histFireDurationDataMain,
xlab = histFireDurationLabX, ylim = histFireDurationLimY,
col = histFireDurationRainbowColor)
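
An alternative way to size the palette, sketched here under the assumption that one color per bar is wanted, is to ask hist(...) how many bars it will draw before plotting. The data and the names exampleDurations, barCount, and barColors are invented for illustration.

```r
# Invented durations for illustration; not the book's data
exampleDurations <- c(3, 5, 5, 6, 7, 7, 7, 8, 8, 9, 10, 10, 11, 12, 14)

# compute the bins first, without drawing, to learn how many bars there are
binInfo <- hist(x = exampleDurations, plot = FALSE)
barCount <- length(binInfo$counts)

# generate exactly one rainbow color per bar
barColors <- rainbow(barCount)

# draw the histogram with the matched palette
hist(x = exampleDurations,
     main = "Duration of Past Fire Attacks (example data)",
     xlab = "Duration in Days",
     col = barColors)
```

Both calls use the default breaks, so the number of colors matches the number of bars that are drawn.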

Pop quiz

  1. Which of the following pieces of information can we not derive from a histogram?

    a. The most and least frequently occurring values in the dataset.

    b. The total number of data points in the dataset.

    c. The minimum and maximum values in the dataset.

    d. The exact value of each data point in the dataset.
