As Heraclitus said, “No man ever steps in the same river twice”; the world surrounding us is fast moving and full of randomness. Either the world is created that way, so that everything is inherently random, or our knowledge is simply insufficient to understand the laws behind the apparent randomness. Either way, we have to deal with the randomness and uncertainty represented in the random data we observe. The principal tools we humans possess for dealing with randomness are probability and statistics. Probability is a branch of theoretical mathematics that studies the likelihood that a certain event will occur; statistics, on the other hand, is a branch of applied mathematics that measures events occurring in the world through observation and presents the data collected. Both fields are important for understanding the phenomena occurring in the world: probability provides the theoretical foundation for data analysis in statistics, while statistics uses probability theory to guide the measurement and analysis of the random data observed in the world.

A systems engineering effort lasts years, even decades; it involves many people and thus inevitably a great deal of randomness and uncertainty: for example, the number of defects appearing in a production batch, the market share of the developed system, or the errors occurring in the manufacturing process. We have seen many examples in the book’s chapters. Understanding the basic concepts of probability and statistics is essential for the successful development and implementation of a complex system, and probability and statistics should be included in any systems engineering curriculum as a required course. Here we include a brief review of the basic concepts of probability theory for readers to refresh their memory of these subjects.
As mentioned above, probability studies the randomness of the universe. For example, if we toss a coin, when it lands it will show either heads or tails. We cannot say for sure which it will be before it lands; the result of a coin toss is a typical random event, and one often used to illustrate probability. If the coin is fair, we know the chance of a head or a tail is 50% each. This simple example involves all the fundamental concepts of probability: experiment, event, sample, sample space, and probability itself.
In probability theory, an experiment can be loosely defined as an orderly action procedure carried out for a certain scientific purpose, such as verifying the chances of certain things occurring. Examples of experiments include tossing a coin, throwing a die, drawing a card from a deck of cards, testing a rocket launch, predicting tomorrow’s weather temperature, finding out the number of customers in a store, and so on.
The outcome of an experiment is called a sample. For example, when tossing a coin, the possible sample is either heads or tails; when throwing a die, the sample is 1, 2, 3, 4, 5, or 6 dots on the uppermost face. We denote a sample as ω, so for throwing a die, ω = 1, 2, 3, 4, 5, or 6. The set that includes all the possible samples is called the sample space (Ω); for example, for throwing a die, Ω = {1, 2, 3, 4, 5, 6}. An event is a collection of samples resulting from a certain experiment; for example, if we toss a coin twice, the possible events for this experiment include {H,H}, {H,T}, {T,H}, and {T,T}. From this perspective, we can see that events are subsets of the sample space: if we denote an event as Ai, then Ai ⊂ Ω. If an outcome ω of an experiment satisfies ω ∈ A, then we say that event A occurs. The event consisting of the entire sample space is the largest possible event; we also call it the universe.
Set theory can be used to perform operations on events, such as the intersection (∩) or union (∪) of events. For example, for events A and B, A ∪ B, A ∩ B, and A ∪ Bᶜ are all new events (Bᶜ denotes the complement of B). Some of the notations for events are:
A = B: A and B are the same events
A ∪ B: Union of A and B, meaning at least one event occurs
A ∩ B: Intersection of A and B, meaning both events occur
A – B: A occurs and B does not occur
All the computation rules of set operations will apply for events; for example,
A ∪ B = B ∪ A, A ∩ B = B ∩ A (commutative laws)    (I.1)

(A ∪ B) ∪ C = A ∪ (B ∪ C), (A ∩ B) ∩ C = A ∩ (B ∩ C) (associative laws)    (I.2)

A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) (distributive law)    (I.3)

A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) (distributive law)    (I.4)

(A ∪ B)ᶜ = Aᶜ ∩ Bᶜ (De Morgan’s law)    (I.5)

(A ∩ B)ᶜ = Aᶜ ∪ Bᶜ (De Morgan’s law)    (I.6)
Probability measures the likelihood of certain events occurring; that is, the probability of event A is denoted as P(A). Probability has the following characteristics:
0 ≤ P(A) ≤ 1    (I.7)

P(Ω) = 1    (I.8)

P(∅) = 0    (I.9)

P(Aᶜ) = 1 − P(A)    (I.10)

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)    (I.11)
Events A and B are said to be mutually exclusive if A and B cannot both occur at the same time. For example, if a coin is tossed, A (the event of a head showing) and B (the event of a tail showing) are two mutually exclusive events, since a head and a tail cannot both be showing at the same time; or P(A ∩ B) = 0.
For mutually exclusive events, the probability of either event occurring equals the sum of the probabilities of the individual events occurring, or

P(A ∪ B) = P(A) + P(B)
If this is a fair coin, meaning the probability of a head or a tail showing is 50%, then

P(A ∪ B) = P(A) + P(B) = 0.5 + 0.5 = 1
Event A is said to be independent of event B if the probability of A occurring does not depend on whether event B occurs; in other words, the occurrence of A has nothing to do with the occurrence of B, and vice versa.
For example, if we toss a coin twice, the event of having a head on the first toss (event A) is independent of the event of having a head on the second toss (event B). If two events are independent of each other, then the probability that both occur is the product of the two individual probabilities. So, the probability of two heads showing in a row if we toss a fair coin twice is

P(A ∩ B) = P(A)P(B) = 0.5 × 0.5 = 0.25
Suppose that we throw a fair die twice in a row; we know that the probability of a sum of 2 for the two throws is 1/36 (i.e., throwing 1 on the first throw and 1 on the second throw, the two throws being independent of each other). However, once we have observed the result of the first throw, the probability becomes a conditional probability, since the first result affects the probability of the sum. If we see that the first throw is 1, then the probability of a sum of 2 over the two throws is 1/6; if we observe that the first throw is 3, however, then the probability of a sum of 2 is zero, since there is no way to reach a total of 2 once the first throw exceeds 1. This example illustrates the concept of conditional probability. For two events A and B, the conditional probability of A occurring given that B has occurred is denoted P(A|B).
The conditional probability equals the probability of A and B both occurring divided by the probability of B occurring,
P(A|B) = P(A ∩ B)/P(B)    (I.12)
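The dice example above can be checked by brute-force enumeration. The sketch below (plain Python with exact arithmetic via fractions) applies Equation (I.12) to A = “the sum is 2” and B = “the first throw is 1”:

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two die throws.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) as an exact fraction over the 36-outcome sample space."""
    favorable = sum(1 for w in outcomes if event(w))
    return Fraction(favorable, len(outcomes))

A = lambda w: w[0] + w[1] == 2   # the sum of the two throws is 2
B = lambda w: w[0] == 1          # the first throw shows 1

p_A_and_B = prob(lambda w: A(w) and B(w))   # = 1/36
p_A_given_B = p_A_and_B / prob(B)           # Equation (I.12)

print(p_A_given_B)  # 1/6
```

The result agrees with the direct argument in the text: given a first throw of 1, a sum of 2 requires a 1 on the second throw, which has probability 1/6.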
In conditional probability, Bayes’ theorem (formula) is one of the most important and fundamental concepts that one should remember.
For two events A and B, the conditional probability P(A|B) is obtained by
P(A|B) = P(B|A)P(A)/P(B)    (I.13)
Given that the events B1, B2, …, Bn are mutually exclusive and together exhaust the sample space (B1 ∪ B2 ∪ … ∪ Bn = Ω), and supposing that event A has occurred, we can determine the probability that each Bj occurred by using the more general form of Bayes’ theorem:
P(Bj|A) = P(A|Bj)P(Bj) / [P(A|B1)P(B1) + P(A|B2)P(B2) + … + P(A|Bn)P(Bn)]    (I.14)
P(Bj) is called the prior probability and P(Bj|A) is the posterior probability.
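As an illustration of Equation (I.14), consider a hypothetical quality-control scenario (the numbers below are invented for this sketch, not taken from the text): two production lines B1 and B2, mutually exclusive and exhaustive, supply 60% and 40% of a batch, with assumed defect rates of 1% and 5%. Given a defective unit (event A), Bayes’ theorem gives the posterior probability of each line:

```python
from fractions import Fraction

# Hypothetical priors: shares of the batch supplied by each line.
prior = {"B1": Fraction(60, 100), "B2": Fraction(40, 100)}
# Hypothetical likelihoods: defect rate of each line, P(A|Bj).
p_A_given_B = {"B1": Fraction(1, 100), "B2": Fraction(5, 100)}

# Denominator of (I.14): total probability of a defect, P(A).
p_A = sum(p_A_given_B[b] * prior[b] for b in prior)

# Posterior P(Bj|A) for each line, per Bayes' theorem (I.14).
posterior = {b: p_A_given_B[b] * prior[b] / p_A for b in prior}

print(posterior["B2"])  # 10/13
```

Although line B2 supplies the minority of units, its higher defect rate makes it the likely source of a defective unit, with posterior probability 10/13.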
Using events to describe randomness is quite flexible; however, it is not well suited to numeric analysis. We need a more formal way to represent random events and the outcomes of experiments. Quantities that express the outcomes of random events as real values are known as random variables.
A random variable is a variable whose value is determined by the outcome of a random event. Each value is associated with a chance of occurring, which is the probability of the random variable taking that value. The function that describes the relation between the values of a random variable and their associated probabilities is called the probability distribution function; for a discrete random variable it is often called the probability mass function, and for a continuous random variable the probability density function (p.d.f.).
For example, if we let X be the random variable for the outcome of a throw of a fair die, then

X ∈ {1, 2, 3, 4, 5, 6}

and their probabilities are specified as follows:

P(X = i) = 1/6, i = 1, 2, …, 6
For another experiment of tossing two fair dice, if we define the random variable X as the sum of the values of the two dice, then

X ∈ {2, 3, 4, …, 12}

And the probability distribution function for this random variable is

P(X = 2) = P(X = 12) = 1/36
P(X = 3) = P(X = 11) = 2/36
P(X = 4) = P(X = 10) = 3/36
P(X = 5) = P(X = 9) = 4/36
P(X = 6) = P(X = 8) = 5/36
P(X = 7) = 6/36
The reader can verify that the events above are mutually exclusive and that the sum of all the probabilities equals 1, that is to say,

P(X = 2) + P(X = 3) + … + P(X = 12) = 1
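The verification is easy to automate by enumerating all 36 equally likely outcomes; a short Python sketch:

```python
from fractions import Fraction
from itertools import product
from collections import Counter

# Count how often each sum occurs among the 36 outcomes of two fair dice.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {s: Fraction(c, 36) for s, c in counts.items()}

print(pmf[7])             # 1/6 -- the most likely sum
print(sum(pmf.values()))  # 1  -- the probabilities sum to one
```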
Here, we formally define the probability distribution function.
A random variable is discrete if its possible values form a finite or countable set. (A countable set is one whose elements can be put in one-to-one correspondence with the natural numbers 1, 2, 3, …; a countable set may be finite or infinite.)
We denote the discrete random variable as X and its possible values as {x1, x2, x3, …, xn, …}; the p.d.f. of X is denoted as

p(xi) = P(X = xi), i = 1, 2, 3, …

and

p(x1) + p(x2) + p(x3) + … = 1
The cumulative distribution function, denoted as F(x), is defined as

F(x) = P(X ≤ x) = Σ_{xi ≤ x} p(xi)    (I.15)
For example, if we throw a fair die, the cumulative probability F(4) is obtained as

F(4) = P(X ≤ 4) = p(1) + p(2) + p(3) + p(4) = 4/6 = 2/3
If the random variable X takes real values from an interval of real numbers, then X is called a continuous random variable. The probability behavior of a continuous random variable is defined by its probability density function f(x), with f(x) ≥ 0 and ∫_{−∞}^{+∞} f(x) dx = 1, such that

P(a ≤ X ≤ b) = ∫_a^b f(x) dx
And for the cumulative probability function F(a),

F(a) = P(X ≤ a) = ∫_{−∞}^{a} f(x) dx    (I.16)
Interpreting this graphically, the cumulative probability of a continuous random variable is the area to the left of the value under the p.d.f.
The expected value of random variable X is also sometimes called the mean value of X, or the first moment of X, denoted as E[X], or sometimes μ.
If X is a discrete random variable with probability density function of p(x), then the expected value of X is defined as
E[X] = Σi xi p(xi)    (I.17)
We can see that the expected value of a discrete random variable is the sum of its possible values, each weighted by its corresponding probability.
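For the fair-die example, Equation (I.17) gives E[X] = (1 + 2 + … + 6)/6 = 7/2; a quick check in Python with exact fractions:

```python
from fractions import Fraction

# pmf of a fair die: each face has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Expected value via Equation (I.17): the probability-weighted sum of values.
mean = sum(x * p for x, p in pmf.items())
print(mean)  # 7/2
```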
If X is a continuous random variable, then its expected value is defined as
E[X] = ∫_{−∞}^{+∞} x f(x) dx    (I.18)
Expectation of a function of a random variable:
If X is a discrete random variable with probability density function of p(x), and suppose g is a real-value function defined on X, then the expected value of the composite function is
E[g(X)] = Σi g(xi) p(xi)    (I.19)
If X is a continuous random variable with probability density function of f(x), and suppose g is a real-value function defined on X, then the expected value of the composite function is
E[g(X)] = ∫_{−∞}^{+∞} g(x) f(x) dx    (I.20)
The variance of the random variable X is another important measure of the variable, usually denoted as σ². It is defined as the expected squared deviation from the mean μ = E[X]:

Var(X) = E[(X − μ)²]    (I.21)
So for a discrete random variable

Var(X) = Σi (xi − μ)² p(xi)    (I.22)
For a continuous random variable,

Var(X) = ∫_{−∞}^{+∞} (x − μ)² f(x) dx    (I.23)
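Continuing the fair-die example, Equations (I.17) and (I.22) give μ = 7/2 and a variance of 35/12; verified below with exact fractions:

```python
from fractions import Fraction

# pmf of a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Mean via (I.17), then variance via (I.22).
mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())
print(var)  # 35/12
```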
In the next section, we will review some of the popular random variables and distribution functions.
The Bernoulli random variable has the following distribution. For an experiment, there are two possible cases, success or failure (such as tossing a coin, if we call a head success and a tail failure). If we let X = 1 if a success occurs and X = 0 if a failure occurs, then X is called a Bernoulli random variable with parameter p, 0 ≤ p ≤ 1:

P(X = 1) = p, P(X = 0) = 1 − p
The expected value E[X] for a Bernoulli random variable is

E[X] = 1 · p + 0 · (1 − p) = p
And the variance is

Var(X) = E[X²] − (E[X])² = p − p² = p(1 − p)
Suppose we perform the above Bernoulli experiment n times. If we denote X as the number of successes occurring in the n trials, then X is called a binomial random variable. The p.d.f of a binomial random variable is
P(X = k) = C(n, k) p^k (1 − p)^(n−k), k = 0, 1, …, n    (I.24)

where C(n, k) is the binomial coefficient,

C(n, k) = n!/(k!(n − k)!)
The expected value for a binomial random variable is
E[X] = np    (I.25)
and the variance of a binomial random variable is
Var(X) = np(1 − p)    (I.26)
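The closed-form mean and variance can be confirmed by summing directly over the pmf (I.24); the sketch below uses assumed illustrative values n = 10 and p = 3/10, with exact fractions:

```python
from fractions import Fraction
from math import comb

# Assumed illustrative parameters: 10 trials, success probability 3/10.
n, p = 10, Fraction(3, 10)

def binom_pmf(k):
    """Binomial pmf (I.24): C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Mean and variance by direct summation over k = 0, ..., n.
mean = sum(k * binom_pmf(k) for k in range(n + 1))
var = sum((k - mean) ** 2 * binom_pmf(k) for k in range(n + 1))
print(mean, var)  # 3 21/10 -- i.e., np and np(1-p)
```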
A random variable X is said to be a Poisson random variable with parameter λ > 0 if X takes a value from the set of nonnegative integers, {0, 1, 2, 3, …}, and the probability function is

P(X = k) = e^(−λ) λ^k / k!, k = 0, 1, 2, …    (I.27)
The expected value of a Poisson distribution is
E[X] = λ    (I.28)
and the variance of a Poisson distribution is the same, that is
Var(X) = λ    (I.29)
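Because the Poisson pmf decays very rapidly, truncating the infinite sums gives an excellent numeric check of both moments; the sketch below uses an arbitrarily chosen λ = 4:

```python
from math import exp, factorial

# Assumed illustrative rate parameter.
lam = 4.0

def pmf(k):
    """Poisson pmf (I.27): e^(-lam) lam^k / k!."""
    return exp(-lam) * lam**k / factorial(k)

# Truncate the sums at k = 100; the tail beyond that is negligible for lam = 4.
mean = sum(k * pmf(k) for k in range(100))
var = sum((k - mean) ** 2 * pmf(k) for k in range(100))
print(mean, var)  # both approximately 4.0
```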
The general uniform random variable defined on the interval [a,b] has the following p.d.f given by
f(x) = 1/(b − a) for a ≤ x ≤ b; f(x) = 0 otherwise    (I.30)
Figure I.1 illustrates the p.d.f. for a uniform random variable.
The cumulative probability distribution for a uniform random variable is
F(x) = 0 for x < a; F(x) = (x − a)/(b − a) for a ≤ x ≤ b; F(x) = 1 for x > b    (I.31)
The expected value for the uniform distribution is
E[X] = (a + b)/2    (I.32)
and the variance for the uniform distribution is
Var(X) = (b − a)²/12    (I.33)
In the chapter on queuing theory, we saw that the exponential distribution was used widely. An exponential random variable with parameter λ (λ > 0) has the following distribution function:
f(x) = λe^(−λx) for x ≥ 0; f(x) = 0 for x < 0    (I.34)
Figure I.2 illustrates the exponential function.
The cumulative distribution function F is
F(x) = 1 − e^(−λx), x ≥ 0    (I.35)
The expected value of the exponential random variable is

E[X] = 1/λ    (I.36)
and the variance of the exponential variable is
Var(X) = 1/λ²    (I.37)
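These closed forms can be sanity-checked by numeric integration of the pdf (I.34); the midpoint-rule integrator below is a rough sketch in which λ = 2 and the integration limits are arbitrary choices (the density is negligible beyond x = 40 for this rate):

```python
from math import exp

# Assumed illustrative rate parameter.
lam = 2.0

def pdf(x):
    """Exponential pdf (I.34): lam * e^(-lam * x) for x >= 0."""
    return lam * exp(-lam * x)

def integrate(f, a, b, n=200_000):
    """Midpoint-rule numeric integration of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

cdf_at_1 = integrate(pdf, 0.0, 1.0)                # F(1) = 1 - e^(-2), per (I.35)
mean = integrate(lambda x: x * pdf(x), 0.0, 40.0)  # E[X] = 1/lam, per (I.36)
print(cdf_at_1, mean)
```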
The probability distribution function for a normal random variable with mean μ and variance σ² is

f(x) = (1/(√(2π) σ)) e^(−(x − μ)²/(2σ²)), −∞ < x < +∞    (I.38)
The normal distribution is illustrated in Figure I.3.
As is easily seen, the normal density is a bell-shaped curve, symmetric about its mean value μ.
The expected value is
E[X] = μ    (I.39)
The variance is
Var(X) = σ²    (I.40)
X is said to be a lognormal random variable if its logarithm ln X is a normal random variable. The lognormal distribution has the following form with parameters μ and σ:

f(x) = (1/(xσ√(2π))) e^(−(ln x − μ)²/(2σ²)), x > 0    (I.41)
When the parameters μ and σ change, the shape of the lognormal density also changes: e^μ acts as a scale parameter and σ as a shape parameter. Figures I.4 and I.5 illustrate the lognormal function with different parameters.
The expected value of the lognormal distribution is
E[X] = e^(μ + σ²/2)    (I.42)
and the variance of the lognormal distribution is
Var(X) = (e^(σ²) − 1) e^(2μ + σ²)    (I.43)
A Weibull random variable has the following distribution function:
f(x) = (β/α)(x/α)^(β−1) e^(−(x/α)^β) for x ≥ 0; f(x) = 0 for x < 0    (I.44)

where α > 0 is the scale parameter and β > 0 is the shape parameter.
Figure I.6 illustrates the Weibull distribution. The Weibull distribution is a more general form: when β = 1 it reduces to the exponential distribution with λ = 1/α, and varying β changes the shape of the density considerably, which makes the Weibull distribution widely used in reliability analysis.
The mean for the Weibull distribution is
E[X] = αΓ(1 + 1/β)    (I.45)
and the variance is
Var(X) = α²[Γ(1 + 2/β) − (Γ(1 + 1/β))²]    (I.46)
Γ() is the gamma function, which is
Γ(z) = ∫_0^∞ t^(z−1) e^(−t) dt, z > 0    (I.47)
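Python’s standard library exposes the gamma function as math.gamma, so the Weibull mean formula (I.45) can be cross-checked against a numeric integral of the pdf (I.44); the parameter values below are arbitrary illustrative choices:

```python
from math import gamma, exp

# Assumed illustrative parameters: scale alpha = 2, shape beta = 1.5.
alpha, beta = 2.0, 1.5

def pdf(x):
    """Weibull pdf (I.44): (beta/alpha)(x/alpha)^(beta-1) e^(-(x/alpha)^beta)."""
    return (beta / alpha) * (x / alpha) ** (beta - 1) * exp(-((x / alpha) ** beta))

# Closed-form mean from (I.45): alpha * Gamma(1 + 1/beta).
mean_formula = alpha * gamma(1 + 1 / beta)

def integrate(f, a, b, n=200_000):
    """Midpoint-rule numeric integration of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# The density is negligible beyond x = 30 for these parameters.
mean_numeric = integrate(lambda x: x * pdf(x), 0.0, 30.0)
print(mean_formula, mean_numeric)
```

The two values agree to several decimal places, confirming that the closed-form moment matches the density it was derived from.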