To better understand mathematical simulation, let us turn immediately to a simple
example.
Say you want to model a queuing system in a retail environment because you wish to
optimize how many tills to open and which type of queuing system to use. You could
build mathematical equations for each major part of the experience. For instance,
the time to process each person at the till could be represented as a function of
the number of items they are buying and the payment method (cash or card). A simple
equation in this regard could be:
(Time to process person “i”) = Constant + β1(No. items) + β2(Card payment) + ϵ.
Here, the time to process a given person is defined as a linear function of:
- a constant intercept, plus
- the number of items being bought multiplied by “β1”, which is a time parameter, plus
- the 0/1 dummy variable representing payment method (where card = 1 and cash = 0) multiplied by time parameter “β2”, plus
- a certain random error “ϵ” included to represent deviations from this average model.
This is essentially the same as a regression equation. Say that the constant
is 23.56 seconds, β1 = 4.3 (which means that a cashier processes an item every 4.3 seconds on average),
and β2 = 10.85 (which means that it takes the cashier 10.85 seconds longer to process a card
payment than a cash payment).
Then, if a simulated person has ten items and pays with a card, we can expect the
time at the till to be:
(Time to process person “i”) = 23.56 + (4.3 seconds x 10 items) + (10.85 seconds to
represent the extra time it takes to process a card) = 77.41 seconds.
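The arithmetic above can be checked with a short script. This is a minimal sketch written in Python rather than SAS, purely for illustration; the parameter values are those quoted in the text, while the function and constant names are hypothetical.

```python
# Parameter values are the ones quoted in the text; the names are illustrative.
CONSTANT = 23.56    # intercept, in seconds
BETA_ITEMS = 4.3    # seconds per item (β1)
BETA_CARD = 10.85   # extra seconds for a card payment (β2)


def expected_till_time(n_items: int, pays_by_card: bool) -> float:
    """Deterministic part of the till-time equation (error term left out)."""
    card_dummy = 1 if pays_by_card else 0   # card = 1, cash = 0
    return CONSTANT + BETA_ITEMS * n_items + BETA_CARD * card_dummy


# Ten items, paid by card, as in the worked example.
print(round(expected_till_time(10, True), 2))
```

Calling the function with `pays_by_card=False` drops the 10.85-second card term, so the same ten-item basket paid in cash comes to 66.56 seconds.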
So far, this is an example of a deterministic model: the parameters and data are fixed
(and the random error ϵ is left out), so we get one specific outcome.
However, simulation models are usually most useful when we build stochastic models,
which include the element of probability. Say, for instance, that we program SAS to
use the above equation but to generate a random, hypothetical sample of shoppers drawn
from a specified distribution of number of items and with a certain randomized chance
of being either a cash or card payer. The distribution of items could be specified
as a particular shape, say the normal or lognormal distributions as seen in Chapter
7, so that a random person could have any number of shopping items in the range but
the total distribution of shoppers is determined by the shape. (For example, you could
specify that most simulated shoppers be allocated close to the average number of items
and only a few shoppers be given an extremely low or high number of items.) Both the
distribution of items and of payment methods could be drawn from observed, known,
historical distributions seen in past shopper patterns.
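A stochastic version of the till-time equation might be sketched as follows. The example is in Python rather than SAS, and every distribution setting (the lognormal median and spread, the card-payment probability, the standard deviation of the error term) is an assumed, illustrative value, not a figure from the text. In practice these would come from historical shopper data.

```python
import math
import random

CONSTANT, BETA_ITEMS, BETA_CARD = 23.56, 4.3, 10.85  # values from the text

# Everything below is an assumed, illustrative setting.
MEDIAN_ITEMS = 10.0   # median basket size of the lognormal distribution
SIGMA_ITEMS = 0.6     # spread of log(items)
P_CARD = 0.7          # chance that a shopper pays by card
ERROR_SD = 5.0        # standard deviation of the error term, in seconds


def simulate_shopper(rng: random.Random) -> dict:
    """Draw one hypothetical shopper and their time at the till."""
    # Lognormal basket size, rounded to a whole number of at least one item.
    n_items = max(1, round(rng.lognormvariate(math.log(MEDIAN_ITEMS), SIGMA_ITEMS)))
    card = rng.random() < P_CARD          # True = card, False = cash
    noise = rng.gauss(0.0, ERROR_SD)      # the random error term
    seconds = CONSTANT + BETA_ITEMS * n_items + BETA_CARD * card + noise
    return {"items": n_items, "card": card, "seconds": max(0.0, seconds)}


rng = random.Random(42)  # fixed seed so the sample is reproducible
shoppers = [simulate_shopper(rng) for _ in range(10_000)]
mean_seconds = sum(s["seconds"] for s in shoppers) / len(shoppers)
card_share = sum(s["card"] for s in shoppers) / len(shoppers)
print(f"mean till time: {mean_seconds:.1f} s, card share: {card_share:.2f}")
```

Because each shopper is drawn at random, the output is a distribution of till times rather than a single number, which is what distinguishes this from the deterministic calculation.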
Now, what happens if we decide to look at multiple queues and types of tills, set
up on different systems? For instance, do we have one queue feeding multiple tills,
or one queue per till? What happens if we have cash-only tills, and if so, how many
should we have? We could add some more elements and some more equations using the
same principles, update the equations to express the interrelationships of elements
across the whole experience of the queuing system, and perhaps compare different systems.
Telling SAS to simulate distributions of shoppers would allow us to look for a system
that – according to the simulated distribution – minimizes the average person’s time
in the queue. We could also test the effect of changing the time parameters on these
systems.
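As a rough illustration of such a comparison, the sketch below simulates one shared queue feeding all tills against one queue per till. It is written in Python rather than SAS, and it rests on several assumptions that are not in the text: shoppers arrive at random with an assumed mean gap, basket sizes follow an assumed normal distribution, a shopper facing separate queues joins the line with the fewest people and never switches, and the error term is left out to keep the code short.

```python
import heapq
import random
from collections import deque

CONSTANT, BETA_ITEMS, BETA_CARD = 23.56, 4.3, 10.85  # values from the text
N_TILLS = 4                        # assumed number of open tills
MEAN_GAP = 25.0                    # assumed mean seconds between arrivals
P_CARD = 0.7                       # assumed share of card payers
MEAN_ITEMS, SD_ITEMS = 10.0, 4.0   # assumed normal distribution of items


def make_shoppers(n, seed):
    """Return a list of (arrival_time, service_time) pairs."""
    rng = random.Random(seed)
    clock, shoppers = 0.0, []
    for _ in range(n):
        clock += rng.expovariate(1.0 / MEAN_GAP)
        items = max(1, round(rng.gauss(MEAN_ITEMS, SD_ITEMS)))
        card = rng.random() < P_CARD
        shoppers.append((clock, CONSTANT + BETA_ITEMS * items + BETA_CARD * card))
    return shoppers


def mean_wait_shared_queue(shoppers, n_tills):
    """One queue: the next shopper goes to whichever till frees up first."""
    free_at = [0.0] * n_tills      # min-heap of times at which tills become free
    total = 0.0
    for arrival, service in shoppers:
        start = max(arrival, heapq.heappop(free_at))
        total += start - arrival
        heapq.heappush(free_at, start + service)
    return total / len(shoppers)


def mean_wait_queue_per_till(shoppers, n_tills):
    """One queue per till: join the line with the fewest people, no switching."""
    lines = [deque() for _ in range(n_tills)]  # departure times per line
    total = 0.0
    for arrival, service in shoppers:
        for line in lines:                     # drop shoppers who have left
            while line and line[0] <= arrival:
                line.popleft()
        line = min(lines, key=len)
        start = max(arrival, line[-1]) if line else arrival
        total += start - arrival
        line.append(start + service)
    return total / len(shoppers)


shoppers = make_shoppers(20_000, seed=1)
print("shared queue  :", round(mean_wait_shared_queue(shoppers, N_TILLS), 1), "s")
print("queue per till:", round(mean_wait_queue_per_till(shoppers, N_TILLS), 1), "s")
```

Both systems are fed the same simulated shoppers, so any difference in average waiting time comes from the queuing rule and not from sampling luck. Changing the time parameters or the number of tills and rerunning shows how sensitive each system is to them.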
As you can see, data and analysis elements appear in multiple places in this very
simple example:
- We probably used historical data analysis to derive the average time parameters for the equation.
- We need to derive the simulated data of shopper items, payment details, and other such elements. As noted above, observed distributions based on actual, past shopper data would typically be used.
- We need an algorithm to estimate when the time in queue has been minimized.
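The last point can be illustrated with a very simple search. Opening more tills always shortens the queue, so the sketch below reframes the problem as finding the smallest number of tills whose simulated average wait meets a target. This is one possible approach, not a prescribed method. It is written in Python rather than SAS, and the arrival rate, item distribution, card share, and 60-second target are all assumed values.

```python
import heapq
import random

CONSTANT, BETA_ITEMS, BETA_CARD = 23.56, 4.3, 10.85  # values from the text
MEAN_GAP, P_CARD = 25.0, 0.7        # assumed arrival gap (s) and card share
MEAN_ITEMS, SD_ITEMS = 10.0, 4.0    # assumed normal distribution of items
TARGET_WAIT = 60.0                  # assumed acceptable mean wait, in seconds


def simulated_mean_wait(n_tills, n_shoppers=20_000, seed=1):
    """Mean wait in one shared queue served by n_tills tills."""
    rng = random.Random(seed)       # same seed, so every till count sees the same shoppers
    free_at = [0.0] * n_tills       # min-heap of times at which tills become free
    clock = total_wait = 0.0
    for _ in range(n_shoppers):
        clock += rng.expovariate(1.0 / MEAN_GAP)
        items = max(1, round(rng.gauss(MEAN_ITEMS, SD_ITEMS)))
        card = rng.random() < P_CARD
        service = CONSTANT + BETA_ITEMS * items + BETA_CARD * card
        start = max(clock, heapq.heappop(free_at))
        total_wait += start - clock
        heapq.heappush(free_at, start + service)
    return total_wait / n_shoppers


def smallest_till_count(target, max_tills=12):
    """Try 1, 2, 3, ... tills and stop at the first count that meets the target."""
    for n_tills in range(1, max_tills + 1):
        wait = simulated_mean_wait(n_tills)
        if wait <= target:
            return n_tills, wait
    return None                     # target not reachable within max_tills


print(smallest_till_count(TARGET_WAIT))
```

A fuller treatment would weigh the cost of staffing each till against the cost of waiting, and would repeat the simulation over several seeds to gauge how stable the answer is.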
Such mathematical simulation models can be used to optimize many processes and systems
across organizations.