Sampling

Often, we want to create a representative dataset for some analysis or design of experiments by sampling from a population. This is particularly the case in Bayesian inference, as we will see in later chapters, where samples are drawn from the posterior distribution. It is therefore useful to learn, in this chapter, how to sample N points from some well-known distributions.

Before using any particular sampling method, readers should note that R, like any other computer program, uses a pseudorandom number generator for sampling. To get reproducible results, supply a starting seed with the set.seed(n) command, where n is an integer.
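The effect of the seed can be checked with a minimal sketch such as the following. The seed value 42 is an arbitrary choice for illustration, and runif() is the uniform sampling function described in the next section.

```r
# Resetting the same seed reproduces the same sequence of draws.
# The seed value 42 is arbitrary (an assumption for this example).
set.seed(42)
first <- runif(3)

set.seed(42)
second <- runif(3)

identical(first, second)  # the two vectors match exactly
```

Without the second call to set.seed(), the generator would simply continue its sequence and the two vectors would differ.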

Random uniform sampling from an interval

To generate n random numbers (numeric) that are uniformly distributed in the interval [a, b], one can use the runif() function:

>runif(5,1,10)  #generates 5 random numbers between 1 and 10
[1]  7.416    9.846    3.093   2.656   1.561

The first argument, the number of values to generate, is required. If the interval limits are omitted, as in runif(5), the function generates uniform random numbers between 0 and 1.
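As a rough check of this behavior, the sketch below draws with the default interval and with an explicit one. The sample size of 1000 and the seed are illustrative assumptions.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Only the count is given, so min = 0 and max = 1 are used by default
u <- runif(1000)
range(u)     # both values lie between 0 and 1

# Explicit interval [1, 10]
v <- runif(1000, min = 1, max = 10)
range(v)     # both values lie between 1 and 10
mean(v)      # should be close to the midpoint (1 + 10) / 2 = 5.5
```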

If we want to generate random integers uniformly distributed in an interval, the function to use is sample():

>sample(1:100,10,replace=T)   #generates 10 random integers between 1 and 100
[1]  24 51 46 87 30 86 50 45 53 62

The option replace=T (shorthand for replace=TRUE) indicates that sampling is done with replacement, so the same integer may appear more than once.
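The difference between sampling with and without replacement can be sketched as follows. The range 1:10 and the seed are assumptions chosen to keep the example small.

```r
set.seed(7)  # arbitrary seed

# With replacement: values may repeat
with_rep <- sample(1:10, 10, replace = TRUE)

# Without replacement (the default): each value appears at most once,
# so drawing 10 values from 1:10 yields a random permutation
without_rep <- sample(1:10, 10, replace = FALSE)
anyDuplicated(without_rep) == 0  # TRUE, no repeated values

# Asking for more values than the population holds fails
# unless replace = TRUE is set
too_many <- tryCatch(sample(1:5, 10), error = function(e) "error")
too_many  # "error"
```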

Sampling from a normal distribution

Often, we may want to generate data that follows a particular distribution, say, a normal distribution. For univariate distributions, R has several built-in functions for this. To sample from a normal distribution, the function to use is rnorm(). For example, consider the following code:

>rnorm(5,mean=0,sd=1)
[1]  0.759  -1.676   0.569  0.928 -0.609

This generates five random numbers distributed according to a normal distribution with mean 0 and standard deviation 1.
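With a larger sample, the sample mean and standard deviation should land close to the values passed to rnorm(). The sketch below illustrates this. The parameter values (mean 5, standard deviation 2), the sample size, and the seed are assumptions for illustration. Note that the second parameter is the standard deviation, not the variance.

```r
set.seed(123)  # arbitrary seed

# 10000 draws from a normal distribution with mean 5 and sd 2
x <- rnorm(10000, mean = 5, sd = 2)

mean(x)  # should be close to 5
sd(x)    # should be close to 2
```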

Similarly, one can use rbinom() to sample from a binomial distribution, rpois() for a Poisson distribution, rbeta() for a Beta distribution, and rgamma() for a Gamma distribution, to mention a few.
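A brief sketch of these functions is shown here. All parameter values are arbitrary assumptions chosen only to show the argument names; each call takes the number of draws as its first argument.

```r
set.seed(2024)  # arbitrary seed
n <- 5          # number of draws per distribution

# Binomial: number of successes in 10 trials, success probability 0.3
b <- rbinom(n, size = 10, prob = 0.3)

# Poisson: counts with rate parameter lambda = 4
p <- rpois(n, lambda = 4)

# Beta: values in (0, 1) with shape parameters 2 and 5
be <- rbeta(n, shape1 = 2, shape2 = 5)

# Gamma: positive values with shape 2 and rate 1
g <- rgamma(n, shape = 2, rate = 1)

b; p; be; g
```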
