To plot the distribution as a density rather than as raw counts, we can pass density=1 (equivalent to density=True) to the plt.hist function. Let's go through the code. Note that there are changes in steps 1, 4, 5, and 6. The rest of the code is the same as the preceding example:

  1. Plot the distribution of group experience:
plt.figure(figsize=(10, 6))

nbins = 20
n, bins, patches = plt.hist(yearsOfExperience, bins=nbins, density=1)
  2. Add labels to the axes and a title:
plt.xlabel("Years of experience with Python Programming")
plt.ylabel("Frequency")
plt.title("Distribution of Python programming experience in the vocational training session")
  3. Draw a green vertical line in the graph at the average experience:
plt.axvline(x=yearsOfExperience.mean(), linewidth=3, color='g')
  4. Compute the mean and standard deviation of the dataset:
mu = yearsOfExperience.mean()
sigma = yearsOfExperience.std()
  5. Add a best-fit line for the normal distribution:
y = ((1 / (np.sqrt(2 * np.pi) * sigma)) * np.exp(-0.5 * (1 / sigma * (bins - mu))**2))
  6. Plot the normal distribution:
plt.plot(bins, y, '--')
  7. Display the plot:
plt.show()
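
As a rough illustration of what density=1 changes in step 1, the following sketch compares raw counts with density heights using np.histogram, which plt.hist relies on for binning. The sample array is hypothetical synthetic data standing in for yearsOfExperience, and the seed, range, and variable names are arbitrary assumptions made only for this example:

```python
import numpy as np

# Hypothetical sample standing in for yearsOfExperience (not the book's data).
rng = np.random.default_rng(seed=0)
sample = rng.uniform(low=0, high=20, size=200)

nbins = 20

# Raw counts per bin (what the histogram shows without density).
counts, count_edges = np.histogram(sample, bins=nbins)

# Density heights per bin (what the histogram shows with density enabled).
density, density_edges = np.histogram(sample, bins=nbins, density=True)

bin_widths = np.diff(density_edges)

# Raw counts add up to the number of observations.
total_count = counts.sum()

# Density heights multiplied by bin widths add up to an area of 1.
total_area = np.sum(density * bin_widths)

print("observations counted:", total_count)
print("area under the density histogram:", total_area)
```

Because the bars now enclose a unit area, they are on the same scale as a probability density curve, which is why the normal curve in steps 5 and 6 can be drawn on the same axes. With density enabled, the bar heights are densities rather than counts, so a y-axis label such as "Probability density" would arguably describe them more precisely than "Frequency".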

The generated histogram, overlaid with the normal distribution curve, is as follows:

The preceding plot clearly shows that the data does not follow a normal distribution: many of the vertical bars rise well above or fall well below the best-fit curve. Perhaps you are wondering where the formula in step 5 of the preceding code comes from. Well, there is a little theory involved here. The curve is the probability density function of the normal (Gaussian) distribution with mean mu and standard deviation sigma, f(x) = 1 / (sigma * sqrt(2 * pi)) * exp(-0.5 * ((x - mu) / sigma)^2). The expression ((1 / (np.sqrt(2 * np.pi) * sigma)) * np.exp(-0.5 * (1 / sigma * (bins - mu))**2)) evaluates this function at each bin edge stored in bins.
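
To check that the expression really is the Gaussian density, here is a minimal sketch that rewrites it as a scalar function and compares it with NormalDist.pdf from Python's standard statistics module. The function name gaussian_pdf and the values of mu, sigma, and points are hypothetical choices for illustration, not values from the dataset:

```python
import math
from statistics import NormalDist


def gaussian_pdf(x, mu, sigma):
    """Normal probability density at x for mean mu and standard deviation sigma."""
    coefficient = 1 / (math.sqrt(2 * math.pi) * sigma)
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coefficient * math.exp(exponent)


# Hypothetical parameters and evaluation points, for illustration only.
mu, sigma = 10.0, 4.0
points = [2.0, 6.0, 10.0, 14.0, 18.0]

# Densities from the hand-written formula and from the standard library.
ours = [gaussian_pdf(x, mu, sigma) for x in points]
reference = [NormalDist(mu, sigma).pdf(x) for x in points]

for x, a, b in zip(points, ours, reference):
    print(x, a, b)
```

The two columns should agree up to floating-point rounding. The density peaks at x = mu and is symmetric around it, which is the bell shape the dashed curve traces over the histogram.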