Simulation has had an on-again, off-again history in actuarial practice. For example, in the 1970s, aggregate loss calculations were commonly done by simulation, because the analytic methods available at the time were not adequate. However, the typical simulation often took a full day on the company's mainframe computer, a serious drag on resources. In the 1980s, analytic methods such as the recursive formula discussed in Chapter 9 and others were developed and found to be significantly faster and more accurate. Today, desktop computers have sufficient power to run complex simulations that allow for the analysis of models not amenable to analytic approaches.
In a similar vein, as investment vehicles become more complex, insurance contracts may include interest-sensitive components and financial guarantees related to stock market performance. These products must be analyzed on a stochastic basis. To accommodate these and other complexities, simulation has become the technique of choice.
In this chapter, we provide some illustrations of how simulation can be used to address some complex modeling problems in insurance, or as an alternative to other methods. It is not our intention to cover the subject in great detail but, rather, to give you an idea of how simulation can help. Study of simulation texts such as Ripley [105] and Ross [108] provides many important additional insights. Simulation can also be an aid in evaluating some of the statistical techniques covered in earlier chapters. This use of simulation is covered here, with an emphasis on the bootstrap method.
The beauty of simulation is that once a model is created, little additional creative thought is required.1 When the goal is to determine values relating to the distribution of a random variable S, the entire process can be summarized in the following four steps:

1. Build a model for S that depends on random variables X, Y, Z, ..., where their distributions and any dependencies are known.
2. For $j = 1, \ldots, n$, generate pseudorandom values $x_j, y_j, z_j, \ldots$ and then compute $s_j$ from the model.
3. The cdf of S may be approximated by $F_n(s)$, the empirical cdf based on the pseudosample $s_1, \ldots, s_n$.
4. Compute quantities of interest, such as the mean, variance, percentiles, or probabilities, using the empirical cdf.
Two questions remain. One is postponed until Section 19.3. The other one is: What does it mean to generate a pseudorandom variable? Consider a random variable X with cdf $F_X(x)$. This is the real random variable produced by some phenomenon of interest. For example, it may be the result of the experiment “collect one automobile bodily injury medical payment at random and record its value.” We assume that the cdf is known. For example, it may be the Pareto cdf
$$F_X(x) = 1 - \left(\frac{1000}{x+1000}\right)^{3}, \qquad x > 0.$$
Now consider a second random variable, Y, resulting from some other process but with the same Pareto distribution. A random sample from Y, say $y_1, \ldots, y_n$, is impossible to distinguish from one taken from X. That is, given the n numbers, we could not tell if they arose from automobile claims or something else. Thus, instead of learning about X by observing automobile claims, we could learn about it by observing Y. Obtaining a random sample from a Pareto distribution is still probably difficult, so we have not yet accomplished much.
We can make some progress by making a concession. Let us accept, as a replacement for a random sample from Y, a sequence of numbers $x_1, \ldots, x_n$, which is not a random sample at all, but simply a sequence of numbers that may not be independent, or even random, but was generated by some known process that is related to the random variable Y. Such a sequence is called a pseudorandom sequence because anyone who did not know how the sequence was created could not distinguish it from a random sample from Y (and, therefore, from X). Such a sequence is satisfactory for our purposes.
The field of developing processes for generating pseudorandom sequences of numbers is well developed. One fact that makes it easier to provide such sequences is that it is sufficient to be able to generate them for the uniform distribution on the interval (0, 1). That is because, if U has the uniform(0,1) distribution, then $X = F_X^{-1}(U)$, when the inverse exists, will have $F_X(x)$ as its cdf. Therefore, we simply obtain uniform pseudorandom numbers $u_1, \ldots, u_n$ and then let $x_i = F_X^{-1}(u_i)$. This is called the inversion method of generating random variates. Specific methods for particular distributions have been developed, and some are discussed here. There is a considerable literature on the best ways to generate pseudorandom uniform numbers, and a variety of tests have been proposed to evaluate them. Make sure the method you use is a good one.
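As a concrete illustration, the inversion method for the Pareto cdf above takes only a few lines. The sketch below (Python, with NumPy assumed available; the function name is ours) inverts $u = 1 - [\theta/(x+\theta)]^{\alpha}$ to obtain $x = \theta[(1-u)^{-1/\alpha} - 1]$, using the same parameter values as the Pareto example.

```python
import numpy as np

def simulate_pareto(n, alpha=3.0, theta=1000.0, rng=None):
    """Simulate n Pareto observations by inverting F(x) = 1 - (theta/(x+theta))^alpha."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(size=n)                              # pseudouniform values on [0, 1)
    return theta * ((1.0 - u) ** (-1.0 / alpha) - 1.0)   # x = F^{-1}(u)

x = simulate_pareto(10_000)
print(x.mean())   # compare with the true mean theta/(alpha - 1) = 500
```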
Table 19.1 The chi-square test of simulated Pareto observations.
| Interval | Observed | Expected | Chi-square |
|---|---|---|---|
| 0–100 | 2,519 | 2,486.85 | 0.42 |
| 100–250 | 2,348 | 2,393.15 | 0.85 |
| 250–500 | 2,196 | 2,157.04 | 0.70 |
| 500–750 | 1,071 | 1,097.07 | 0.62 |
| 750–1,000 | 635 | 615.89 | 0.59 |
| 1,000–1,500 | 589 | 610.00 | 0.72 |
| 1,500–2,500 | 409 | 406.76 | 0.01 |
| 2,500–5,000 | 192 | 186.94 | 0.14 |
| 5,000–10,000 | 36 | 38.78 | 0.20 |
| 10,000 and above | 5 | 7.51 | 0.84 |
| Total | 10,000 | 10,000 | 5.10 |
When the distribution function of X is continuous and strictly increasing, the equation $u = F_X(x)$ will have a unique value of x for any given value of u and a unique value of u for any given x. In that case, the inversion method reduces to solving the equation for x. In other cases, some care must be taken. Suppose that $F_X(x)$ has a jump at $x = c$, so that $F_X(c-) = a$ and $F_X(c) = b > a$. If the uniform number u is such that $a \le u < b$, the equation $u = F_X(x)$ has no solution. In that situation, choose c as the simulated value.
It is also possible for the distribution function to be constant over some interval. In that case, the equation $u = F_X(x)$ will have multiple solutions for x if u corresponds to the constant value of $F_X(x)$ over that interval. Our convention (to be justified shortly) is to choose the largest possible value in the interval.
Discrete distributions have both features. The distribution function has jumps at the possible values of the variable and is constant in between.
Many random number generators can produce a value of 0 but not a value of 1 (though some produce neither one). This is the motivation for choosing the largest value in an interval where the cdf is constant.
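For a discrete distribution, these conventions combine to return the value $x_j$ whenever $F(x_{j-1}) \le u < F(x_j)$, so that each possible value is produced with exactly its probability. A sketch in Python (NumPy assumed; the support and probabilities in the example are illustrative):

```python
import numpy as np

def simulate_discrete(values, probs, n, rng=None):
    """Inversion for a discrete distribution: return the value x_j for which
    F(x_{j-1}) <= u < F(x_j).  Because u can equal 0 but never 1, each value
    x_j is produced with exactly probability p_j."""
    rng = np.random.default_rng() if rng is None else rng
    cdf = np.cumsum(probs)                        # F(x_1), ..., F(x_m)
    u = rng.uniform(size=n)                       # pseudouniform values on [0, 1)
    idx = np.searchsorted(cdf, u, side="right")   # index j with F(x_{j-1}) <= u < F(x_j)
    idx = np.minimum(idx, len(cdf) - 1)           # guard against round-off in the last cdf value
    return np.asarray(values)[idx]

# illustrative distribution on {0, 1, 2, 3}
sims = simulate_discrete([0, 1, 2, 3], [0.50, 0.25, 0.15, 0.10], n=10_000)
```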
The second question is: What value of n should be used? This will be answered after some special simulation cases are discussed.
In this section, we will look at a few special cases where either the inversion method may not be the best (or easiest) choice or the situation warrants some additional thought.
Recall from Section 4.2.3 that the distribution function for a discrete mixture can be written as
$$F_X(x) = a_1 F_{X_1}(x) + a_2 F_{X_2}(x) + \cdots + a_k F_{X_k}(x),$$
where the mixing weights $a_1, \ldots, a_k$ are positive and sum to 1.
It may be difficult to invert this function, but it may be easy to invert the individual cdfs. This suggests a two-step process for simulating from a mixture distribution: first, simulate the component, using the discrete distribution with probabilities $a_1, \ldots, a_k$; then, simulate a value from the cdf of the selected component.
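A sketch of this two-step process in Python (NumPy assumed; the mixture of an exponential and a Pareto component is purely illustrative):

```python
import numpy as np

def simulate_mixture(weights, inverse_cdfs, n, rng=None):
    """Two-step simulation from a discrete mixture:
    (1) pick a component j with probability a_j;
    (2) simulate from that component by inverting its own cdf."""
    rng = np.random.default_rng() if rng is None else rng
    cum = np.cumsum(weights)
    out = np.empty(n)
    for i in range(n):
        j = min(np.searchsorted(cum, rng.uniform(), side="right"), len(cum) - 1)
        out[i] = inverse_cdfs[j](rng.uniform())   # inversion within component j
    return out

# illustrative mixture: 70% exponential (mean 500), 30% Pareto (alpha = 3, theta = 1000)
inv_exp = lambda u: -500.0 * np.log(1.0 - u)
inv_par = lambda u: 1000.0 * ((1.0 - u) ** (-1.0 / 3.0) - 1.0)
sims = simulate_mixture([0.7, 0.3], [inv_exp, inv_par], n=10_000)
```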
When following the progress of a portfolio of life insurance or annuity policies, it is necessary to simulate the time or age of death or other decrements. Two approaches will be discussed here.
First, suppose that the portfolio of policies is to be followed from period to period. This may be necessary because other random factors, such as investment earnings, may need to be updated as well. If we are looking at a single policy, then there is a probability for each decrement that applies. For example, a policy at a certain age and duration may have, in the next period, a death probability of 0.01, a lapse probability of 0.09, a disability probability of 0.03, and a probability of continuing as a healthy policyholder of 0.87. Simulating the outcome is simply obtaining a value from a discrete random variable with four possible outcomes. Now suppose that the portfolio contains 250 policyholders with these probabilities and that the individual outcomes are independent. The counts then have a multinomial distribution, but the simulation can be broken down into three steps using a result about this distribution, as sketched below.
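The result that can be used here is the conditional decomposition of the multinomial into successive binomial draws: each count is binomial on the policies not yet assigned an outcome, with the remaining probability renormalized. A sketch (Python with NumPy; the function name is ours, and the probabilities are those of the illustration above):

```python
import numpy as np

def simulate_decrements(n_policies=250, p_death=0.01, p_lapse=0.09,
                        p_disab=0.03, rng=None):
    """Break the four-outcome multinomial into three successive binomials."""
    rng = np.random.default_rng() if rng is None else rng
    deaths = rng.binomial(n_policies, p_death)
    remaining = n_policies - deaths
    lapses = rng.binomial(remaining, p_lapse / (1 - p_death))
    remaining -= lapses
    disabled = rng.binomial(remaining, p_disab / (1 - p_death - p_lapse))
    healthy = remaining - disabled
    return deaths, lapses, disabled, healthy

print(simulate_decrements())
```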
The second case is where all that is needed is the age or time at which an event occurs. In a single-decrement setting, this is a discrete distribution and a single lookup will provide the answer. The multiple-decrement setting is a bit more complicated. However, simulation can be accomplished with a single lookup provided that some care is taken.
Consider Example 19.6, where it was necessary to simulate a value from a binomial distribution with m = 250 and q = 0.01. The creation of a table requires 251 rows, one for each possible outcome. While some computer applications make the table lookup process easy, writing code for this situation requires a looping process in which each value in the table is checked until the desired answer is reached. In some cases, there may be a more efficient approach based on a stochastic process. Such a process generates a series of events and the time of each event. Counting the number of events in a fixed time period, such as one year, produces the simulated value. The simulated event times may themselves be of use in some situations. However, it should be noted that it is the timing mechanism that produces the modeled count distribution, not the reverse. That is, for example, it is possible to have a binomially distributed number of events in a given time period with a different process (from the one used here) generating the random times. While the random variable is being interpreted as the number of events in a fixed time period, no such time period is required for the actual situation being simulated. The theory that supports this method is available in Section 6.6.3 of the third edition of this book [73].
The process that creates the timing of the events starts with an exponential distribution for the time of the first event. The time to the second event also has an exponential distribution, with the mean depending on the number of previous events. To simplify matters, the time period in question will always be of length 1; if the period is other than 1, the simulated times can be viewed as proportions of the relevant period. The process is, in general (noting that the first event carries an index of 0, the second event an index of 1, and so on):

1. Set the running time t = 0 and the event index k = 0.
2. Generate an exponential value with the mean appropriate for index k and add it to t.
3. If t > 1, stop; the simulated value is k. Otherwise, increase k by 1 and return to step 2.
All that remains is to determine the formulas for the exponential means.
This is the simplest case. To simulate values from a Poisson distribution with mean $\lambda$, set the mean of the exponential distribution equal to $1/\lambda$ for all k.
As usual, let the binomial parameters be m and q. From them, calculate $c = -m\ln(1-q)$ and $d = \ln(1-q)$. Then, the mean of the exponential distribution used when k events have already been simulated is $1/(c + dk)$.
It should be noted that, because the binomial distribution cannot produce a value greater than m, if the number of simulated events reaches m, then the simulation stops and the simulated value is set equal to m. Note that $c + dm = 0$ for all binomial distributions, so if the algorithm were to continue, the next simulated time would be at infinity, regardless of the value of the uniform number.
The process is the same as for the binomial distribution, but with different formulas for c and d. With parameters r and $\beta$, the formulas are $c = r\ln(1+\beta)$ and $d = \ln(1+\beta)$.
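The following sketch implements this event-time approach for all three cases, using the formulas above (Python, NumPy assumed; the parameter values at the end are purely illustrative):

```python
import numpy as np
from math import log

def simulate_count(c, d, max_events=None, rng=None):
    """Simulate one count by generating exponential interevent times whose
    means are 1/(c + d*k), where k is the number of prior events, and
    counting the events that occur before time 1."""
    rng = np.random.default_rng() if rng is None else rng
    t, k = 0.0, 0
    while True:
        if max_events is not None and k == max_events:
            return k                        # binomial case: c + d*m = 0, so stop at m
        rate = c + d * k
        t += rng.exponential(1.0 / rate)    # exponential with mean 1/(c + d*k)
        if t > 1.0:
            return k
        k += 1

# Poisson(lambda = 4):              c = 4,                d = 0
# binomial(m = 250, q = 0.01):      c = -250*log(0.99),   d = log(0.99), max_events = 250
# negative binomial(r = 2, b = 3):  c = 2*log(4),         d = log(4)
poisson_draw  = simulate_count(4.0, 0.0)
binomial_draw = simulate_count(-250 * log(0.99), log(0.99), max_events=250)
negbin_draw   = simulate_count(2 * log(4.0), log(4.0))
```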
This procedure provides additional insight into the negative binomial distribution. In Section 6.3, the distribution was derived as a gamma mixture of Poisson distributions. A motivation is that the distribution results from a portfolio of drivers, each of whom has a Poisson-distributed number of claims, but whose Poisson means vary according to a gamma distribution. However, there is also some evidence that individual drivers have a negative binomial distribution. In the above simulation, the value of d is always positive, so with each claim the quantity c + dk increases, which reduces the expected time to the next claim. Thus, if we believe that drivers who have claims are more likely to have further claims, then the negative binomial distribution may be a reasonable model.2
It is always sufficient to be able to simulate Z, a standard normal random variable. Then, if $X \sim N(\mu, \sigma^2)$, let $X = \mu + \sigma Z$. If X is lognormal with parameters $\mu$ and $\sigma$, let $X = \exp(\mu + \sigma Z)$. The inversion method is usually available (for example, the NORM.INV function in Excel®). However, this method is not as good in the tails as it is in the central part of the distribution, being likely to underrepresent more extreme values. A simple alternative is the Box–Muller transformation [16]. The method begins with the generation of two independent pseudouniform random numbers $u_1$ and $u_2$. Then, two independent standard normal values are obtained from $z_1 = \sqrt{-2\ln u_1}\,\cos(2\pi u_2)$ and $z_2 = \sqrt{-2\ln u_1}\,\sin(2\pi u_2)$. An improvement is the polar method, which also begins with two pseudouniform values. The steps are as follows:

1. Generate two pseudouniform values $u_1$ and $u_2$, and let $v_1 = 2u_1 - 1$ and $v_2 = 2u_2 - 1$, so that $v_1$ and $v_2$ are pseudouniform on $(-1, 1)$.
2. Calculate $w = v_1^2 + v_2^2$.
3. If $w \ge 1$, reject the pair and return to step 1; otherwise, proceed to step 4.
4. Calculate $y = \sqrt{-2(\ln w)/w}$. Then $z_1 = v_1 y$ and $z_2 = v_2 y$ are two independent standard normal values.
The polar method requires more programming work because of the possible rejection at step 3, but it avoids the trigonometric evaluations required by the Box–Muller transformation and is generally superior.
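A sketch of the polar method in Python (NumPy assumed; the normal and lognormal parameter values at the end are illustrative):

```python
import numpy as np

def polar_normal_pair(rng=None):
    """Generate two independent standard normal values by the polar method,
    rejecting points that fall outside the unit circle."""
    rng = np.random.default_rng() if rng is None else rng
    while True:
        v1 = 2.0 * rng.uniform() - 1.0        # step 1: map uniforms to (-1, 1)
        v2 = 2.0 * rng.uniform() - 1.0
        w = v1 * v1 + v2 * v2                 # step 2: squared radius
        if 0.0 < w < 1.0:                     # step 3: reject if outside the unit circle
            break
    factor = np.sqrt(-2.0 * np.log(w) / w)    # step 4: transform to normals
    return v1 * factor, v2 * factor

z1, z2 = polar_normal_pair()
x = 100.0 + 25.0 * z1          # e.g. a normal value with mu = 100, sigma = 25
y = np.exp(5.0 + 0.4 * z2)     # e.g. a lognormal value with mu = 5, sigma = 0.4
```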
A question asked at the beginning of this chapter remains unanswered: How many simulations are needed to achieve a desired level of accuracy? We know that any consistent estimator will be arbitrarily close to the true value with high probability as the sample size is increased. In particular, empirical estimators have this attribute. With a little effort, we should be able to determine the number of simulated values needed to get us as close as we want with a specified probability. Often, the central limit theorem will help, as in the following example.
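For instance, suppose the quantity of interest is the mean of S and we want the estimate to be within $\varepsilon$ of the true value with probability at least $1-\alpha$. A rough sketch of the calculation, with $\sigma$ denoting the standard deviation of S (in practice replaced by the sample standard deviation from the simulations run so far), is
$$\Pr\left(\bigl|\bar S_n - \mathrm{E}[S]\bigr| \le \varepsilon\right) \approx 2\Phi\!\left(\frac{\varepsilon\sqrt{n}}{\sigma}\right) - 1 \;\ge\; 1-\alpha
\quad\Longleftrightarrow\quad
n \;\ge\; \left(\frac{z_{\alpha/2}\,\sigma}{\varepsilon}\right)^{2},$$
where $z_{\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution.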
The method for working with percentiles is not as satisfying as that of the other two examples. When the goal is to estimate the mean or a probability, we were able to work directly with a normal approximation and an estimate of the standard deviation of the estimator. A similar approach can be taken with estimated percentiles. However, the formula for the asymptotic variance of the estimated 100pth percentile is
$$\mathrm{Var}(\hat{\pi}_p) \approx \frac{p(1-p)}{n\,[f(\pi_p)]^{2}},$$
where $\pi_p$ is the true percentile and f is the density function of the simulated variable.
The problem is that, while p is known and $\pi_p$ can be replaced by its estimate, the density function of the simulated variable is not known (recall that we are performing simulations because basic quantities such as the pdf and cdf are not available). Thus, it is likely to be difficult to obtain an estimated value of the variance that can, in turn, be used to estimate the required sample size.
The recursive method for calculating the aggregate loss distribution, presented in Chapter 9, has three features. First, the recursive method is exact up to the level of the approximation introduced: the only approximation involves replacing the true severity distribution with an arithmetized approximation, and the approximation error can be reduced by increasing the number of points (that is, reducing the span). Second, it assumes that aggregate claims can be written as $S = X_1 + X_2 + \cdots + X_N$ with $N, X_1, X_2, \ldots$ independent and the $X_j$ identically distributed. Third, the recursive method assumes that the frequency distribution is in the $(a,b,0)$ or $(a,b,1)$ class.
There is no need to be concerned about the first feature because the approximation error can be made as small as desired, though at the expense of increased computing time. However, the second restriction may prevent the model from reflecting reality. The third restriction means that if the frequency distribution is not in one of these classes (or can be constructed from them, such as with compound distributions), we will need to find an alternative to the recursive method. Simulation is one such alternative.
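For reference, when the i.i.d. assumptions do hold, a direct simulation of the compound model takes only a few lines. In the sketch below (Python, NumPy assumed), the frequency and severity choices, Poisson with mean 3 and Pareto with $\alpha = 3$, $\theta = 1000$, are purely illustrative.

```python
import numpy as np

def simulate_aggregate(n_sims, lam=3.0, alpha=3.0, theta=1000.0, rng=None):
    """Simulate n_sims values of S = X_1 + ... + X_N with N ~ Poisson(lam)
    and the X_j i.i.d. Pareto(alpha, theta), each simulated by inversion."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.empty(n_sims)
    for i in range(n_sims):
        n_claims = rng.poisson(lam)
        u = rng.uniform(size=n_claims)
        s[i] = np.sum(theta * ((1.0 - u) ** (-1.0 / alpha) - 1.0))
    return s

s = simulate_aggregate(10_000)
print(s.mean(), np.quantile(s, 0.95))   # estimated mean and 95th percentile of S
```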
In this section, we indicate some common ways in which the independence or identical distribution assumptions may fail to hold and then demonstrate how simulation can be used to obtain numerical values of the distribution of aggregate losses.
When the $X_j$ are i.i.d., it does not matter how we go about labeling the losses, that is, which loss is called $X_1$, which one $X_2$, and so on. With the assumption removed, the labels become important. Because S is the aggregate loss for one year, time is a factor. One way of identifying the losses is to let $X_1$ be the first loss, $X_2$ be the second loss, and so on. Then let $T_j$ be the random variable that records the time of the jth loss. Without going into much detail about the claims-paying process, we do want to note that $T_j$ may be the time at which the loss occurred, the time it was reported, or the time payment was made. In the latter two cases, it may be that $T_j > 1$, which occurs when the report of the loss or the payment of the claim takes place at a time subsequent to the end of the time period of the coverage, usually one year. If the timing of the losses is important, we will need to know the joint distribution of $(T_1, X_1, T_2, X_2, \ldots)$.
There are two common situations in which the assumption of independence does not hold. One is through accounting for time (and, in particular, the time value of money) and the other is through coverage modifications. The latter may have a time factor as well. The following examples provide some illustrations.
We now complete the two examples using the simulation approach. The models have been selected arbitrarily, but we should assume they were determined by a careful estimation process using the techniques presented earlier in this text.
If the distribution of interest is too complex to admit an analytic form, simulation may be used to estimate risk measures such as VaR and TVaR. Because VaR is simply a specific percentile of the distribution, this case has already been discussed. The estimation of TVaR is also fairly straightforward. Suppose that $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ is an ordered simulated sample from the random variable of interest. If the percentile being used is p, let $k = [np] + 1$, where [·] indicates the greatest integer function. Then, the two estimators are
$$\widehat{\mathrm{VaR}}_p(X) = x_{(k)} \qquad\text{and}\qquad \widehat{\mathrm{TVaR}}_p(X) = \frac{1}{n-k+1}\sum_{j=k}^{n} x_{(j)}.$$
We know that the variance of a sample mean can be estimated by the sample variance divided by the sample size. While $\widehat{\mathrm{TVaR}}_p(X)$ is a sample mean, this estimator will underestimate the true value. This is because the observations being averaged are dependent and, as a result, there is more variability than is reflected by the sample variance. Let the sample variance be
$$s_p^2 = \frac{1}{n-k}\sum_{j=k}^{n}\left[x_{(j)} - \widehat{\mathrm{TVaR}}_p(X)\right]^2.$$
Manistre and Hancock [85] show that an asymptotically unbiased estimator of the variance of the estimator of TVaR is
$$\widehat{\mathrm{Var}}\left[\widehat{\mathrm{TVaR}}_p(X)\right] = \frac{s_p^2 + p\left[\widehat{\mathrm{TVaR}}_p(X) - \widehat{\mathrm{VaR}}_p(X)\right]^2}{n-k+1}.$$
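A minimal sketch in Python (NumPy assumed), following the index convention and formulas stated above; the 95% security level and the function name are illustrative:

```python
import numpy as np

def tvar_estimates(sample, p=0.95):
    """Estimate VaR_p and TVaR_p from a simulated sample, together with the
    variance estimate for the TVaR estimator given above."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    k = int(np.floor(n * p)) + 1            # k = [np] + 1
    var_hat = x[k - 1]                      # VaR estimate: x_(k), converted to 0-based indexing
    tail = x[k - 1:]                        # x_(k), ..., x_(n); there are n - k + 1 of them
    tvar_hat = tail.mean()                  # TVaR estimate
    s2 = np.sum((tail - tvar_hat) ** 2) / (len(tail) - 1)          # s_p^2 with divisor n - k
    var_of_tvar = (s2 + p * (tvar_hat - var_hat) ** 2) / len(tail)
    return var_hat, tvar_hat, var_of_tvar
```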
Simulation can help in a variety of ways when analyzing data. Two are discussed here, both of which have to do with evaluating a statistical procedure. The first is the determination of the p-value (or critical value) for a hypothesis test. The second is to evaluate the MSE of an estimator. We begin with the hypothesis testing situation.
When testing hypotheses, p-values and significance levels are calculated assuming the null hypothesis to be true; because that hypothesis specifies the population, it provides a distribution from which to simulate. In other situations, there is no known population distribution from which to simulate. For such situations, a technique called the bootstrap may help (for thorough coverage of this subject, see Efron and Tibshirani [34]). The key is to use the empirical distribution from the data as the population from which to simulate values. Theoretical arguments show that the bootstrap estimate will converge asymptotically to the true value. This result is reasonable because, as the sample size increases, the empirical distribution becomes more and more like the true distribution. The following example shows how the bootstrap works and also indicates that, at least in the case illustrated, it gives a reasonable answer.
From Example 10.12, the MSE of this unbiased estimator was shown to be $\sigma^2/n$. In many situations, determination of the MSE is not so easy, and then the bootstrap becomes an extremely useful tool. While simulation was not needed for the example, note that an original sample of size 3 led to 27 possible bootstrap values. Once the sample size gets beyond 6, it becomes impractical to enumerate all the cases. In that case, simulating observations from the empirical distribution becomes the only feasible choice.
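A sketch of the simulation version of the bootstrap (Python, NumPy assumed), using a hypothetical three-point sample and the sample mean as the estimator:

```python
import numpy as np

def bootstrap_mse(data, estimator, n_boot=10_000, true_value=None, rng=None):
    """Approximate the bootstrap estimate of the MSE of an estimator by
    resampling with replacement from the empirical distribution.  The target
    is the quantity computed from the empirical distribution itself
    (for the mean, that is simply the sample mean)."""
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data, dtype=float)
    if true_value is None:
        true_value = estimator(data)        # target under the empirical distribution
    boots = np.empty(n_boot)
    for b in range(n_boot):
        resample = rng.choice(data, size=len(data), replace=True)
        boots[b] = estimator(resample)
    return np.mean((boots - true_value) ** 2)

# e.g. the bootstrap MSE of the sample mean for a hypothetical sample
print(bootstrap_mse([1.0, 3.0, 8.0], np.mean))
```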