10.6. Equation Formulation Tests and the Soap Industry Model

Equation formulation tests build further confidence in the model by demonstrating that equations and numerical parameters in the model are consistent with available facts and data from the mental, written and numerical databases.

There are three common tests, mentioned earlier, and for convenience repeated here:

  • Dimensional consistency – Are all equations dimensionally correct without the use of parameters that have no real-world counterpart?

  • Parameter verification – Are parameters consistent with descriptive and numerical knowledge?

  • Extreme conditions – Does each equation make sense even when its inputs take on extreme values?

To illustrate the tests, we review a selection of equation formulations describing consumer behaviour which are taken from the market sector of the soap industry model.

10.6.1. Substitution of Bar Soap by Shower Gel

The substitution rate of bar soap by shower gel is expressed as the product of Traditional English Bar Soap volume and the fractional rate of substitution, as shown in Figure 10.11. The equation is a standard stock depletion formulation algebraically similar to the workforce departure rate in Chapter 5's simple factory model and to independents' capacity loss in the oil producers' model. Identical formulations apply to the substitution rates of Moisturising Bar Soap and Me Too Bar Soap. The fractional rate of substitution is set at a fixed proportion of bar soap volume, 0.005 per month. This proportion captures two shared beliefs among managers: one is that remaining bar soap consumers will gradually convert to shower gels (unless they first adopt liquid soap); and the second is that all players in bar soap are experiencing a similar substitution process.

Note that the equation is dimensionally balanced, though it requires careful consideration of the units of measure to demonstrate the balance. On the right, the units of bar soap volume are thousands of tonnes per month, which is a measure of demand. On the left, the units of the substitution rate are thousands of tonnes per month per month, which is a measure of the rate of change of demand. To achieve a balance it is necessary to introduce an extra parameter on the right-hand side of the equation called the fractional rate of substitution. This parameter has a real-world meaning. It corresponds to the fraction of remaining bar soap demand substituted each month by shower gel. Its unit of measure is 'proportion per month' (or 1/month, since the proportion itself is a dimensionless quantity). Therefore, the dimensions of the multiplicative expression on the right reduce to thousands of tonnes per month per month, which balances with the left.

The management team were aware of consumers' gradual conversion to shower gels and, when pressed to think hard about the process, were able to estimate a loss rate of about 6 per cent per year, which re-scales (6 per cent divided by 12 months) to the monthly figure of 0.005 per month shown.
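The depletion formulation and the re-scaling of the annual estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the initial volume of 10 thousand tonnes per month and the simple one-month Euler step are hypothetical choices, and all other flows affecting the stock (such as brand switching) are left out.

```python
# Sketch of a stock depleted by a fractional substitution rate.
# Only the 6 per cent per year estimate comes from the text; the initial
# volume and the one-month time step are hypothetical.

FRACTIONAL_RATE_PER_YEAR = 0.06                              # management estimate
FRACTIONAL_RATE_PER_MONTH = FRACTIONAL_RATE_PER_YEAR / 12    # 0.005 per month


def substitution_rate(bar_soap_volume, fractional_rate=FRACTIONAL_RATE_PER_MONTH):
    """Units: (thousand tonnes/month) * (1/month) = thousand tonnes/month/month."""
    return bar_soap_volume * fractional_rate


def simulate_bar_soap_volume(initial_volume, months, dt=1.0):
    """Euler integration of the stock with substitution as the only outflow."""
    volume = initial_volume
    path = [volume]
    for _ in range(int(round(months / dt))):
        volume -= substitution_rate(volume) * dt
        path.append(volume)
    return path


# Hypothetical starting demand of 10 thousand tonnes per month.
path = simulate_bar_soap_volume(initial_volume=10.0, months=12)
print(f"Volume after 12 months: {path[-1]:.3f} thousand tonnes/month")
```

With a constant fractional rate the stock declines exponentially, so after one year roughly 94 per cent of the starting volume remains, in line with the managers' estimate of a loss of about 6 per cent per year.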

Figure 10.11. Equations for bar soap volume and substitution by shower gel

10.6.2. Brand Switching Between Competing Bar Soap Products

Brand switching is driven by promotions and advertising. However, management believed that the personal care market is commoditised with little long-term loyalty except the inertia of buying habits. Hence, although promotional price discounts and advertising campaigns win customers, any volume gains are reversible if (as usually happens) rivals respond with equivalent or better discounts and more advertising. In Figure 10.12, this to-and-fro of consumer demand is shown as a bi-flow, called the 'Branded Bar Soap Users' Switch Rate', and is formulated as the sum of two independent terms: volume change due to price promotions and volume change due to advertising. Note that the units of measure of the users switch rate are thousands of tonnes per month per month – identical to the units of the substitution rate mentioned above, because both belong in the same stock and flow network. An identical formulation is used for brand switching between liquid soaps. The formulation can also be adapted for consumers switching between branded and private label soaps by simply removing the term for volume change due to advertising (since Me Too products are not advertised and tend to attract price sensitive consumers who pay close attention to relative prices).

Figure 10.12. Equations for brand switching

Consumers' responses to promotions and advertising are important assumptions in the model. Are the formulations plausible? In this case, a combination of descriptive knowledge and numerical data was used to build confidence in the equations. For brevity we will focus on the formulations for promotions and leave aside the independent effect of advertising. How do promotions work? Despite the wide variety of soaps on offer, Old English's management team believed the differences between each player's soaps were small and that consumers' choice between two brands is based on relative price including promotions. To capture this effect, the model uses the concept of 'consumer response to promotional price', which is formulated in Figure 10.12 as a function of the ratio of the promotional price of Traditional English Bar Soap to the promotional price of Moisturising Bar Soap. In this formulation the promotional price of one brand is used as a reference point for comparison with the other brand, which is an empirical generalisation often used in modelling consumer choice (Meyer & Johnson, 1995).

Figure 10.13. Consumer response to promotional price in branded bar soaps

The price ratio affects volume change according to the graphical function shown in Figure 10.13. The function was calibrated using time series data for relative price and volume obtained from A.C. Nielsen's report of volumes and sales by distribution channel – a reliable and well-respected source of information in fast-moving consumer goods. For confidentiality reasons, the scale of the horizontal axis is suppressed. Nevertheless, it is clear that when the price ratio takes a value '1', meaning that competing branded products are priced the same as Old English, then the change in volume is zero, as expected (unless the Old English brand has a higher reputation among consumers, leading them to pay a premium, which is represented as a price ratio greater than 1). Also the function is downward sloping as common sense would suggest. Hence, when the price ratio is less than '1' (Traditional English Bar Soap is priced below Moisturising Bar Soap) the function takes a value greater than zero and there is a net flow of volume into Traditional English Bar Soap. Moreover, this volume change is proportional to the stock of Moisturising Bar Soap Volume since a price discount on Traditional English Bar Soap attracts consumers of Moisturising Bar Soap. Conversely when the price ratio is greater than '1' (Traditional English Bar Soap is priced above Moisturising Bar Soap) the function takes a value less than zero and there is a net flow of volume in the opposite direction into Moisturising Bar Soap. In this case, the volume change is proportional to the stock of Traditional English Bar Soap Volume. The switch in the stock that drives volume change is formulated as a logical if-then-else function.[]

[] It is unusual to find if-then-else equation formulations in system dynamics models of decision making. In fact, such formulations should normally be avoided since they suggest the modeller is adopting a perspective on the problem situation that is too close to operational detail and therefore in danger of missing the flux of pressures that come to bear on operating policy. In this case, an if-then-else equation formulation seemed unavoidable and the modeller accepted a pragmatic compromise. (Review the comments in Chapter 7 about the modeller's quest to view the firm from the perspective of the CEO or member of the board.)
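The if-then-else switch described above can be sketched as follows. The lookup points are entirely hypothetical, since the scale of the real graphical function is suppressed for confidentiality. They are chosen only to be downward sloping and to pass through zero at a price ratio of 1. The function names are likewise illustrative and not taken from the model.

```python
from bisect import bisect_right

# HYPOTHETICAL lookup points: (price ratio, fractional volume change per month).
# Downward sloping and zero at a ratio of 1, as described in the text.
RESPONSE_POINTS = [(0.8, 0.04), (0.9, 0.02), (1.0, 0.0), (1.1, -0.02), (1.2, -0.04)]


def consumer_response(price_ratio, points=RESPONSE_POINTS):
    """Piecewise-linear graphical function, held flat beyond its end points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    if price_ratio <= xs[0]:
        return ys[0]
    if price_ratio >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, price_ratio)
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (price_ratio - x0) / (x1 - x0)


def volume_change_due_to_promotions(trad_price, moist_price, trad_volume, moist_volume):
    """Net flow into Traditional English Bar Soap (thousand tonnes/month/month).

    Positive values mean Traditional gains volume from Moisturising;
    negative values mean volume flows the other way.
    """
    response = consumer_response(trad_price / moist_price)
    # The if-then-else switch: the stock that drives the flow is the one losing volume.
    source_volume = moist_volume if response > 0 else trad_volume
    return response * source_volume


# Traditional discounted by 10 per cent relative to Moisturising (hypothetical volumes).
print(volume_change_due_to_promotions(0.9, 1.0, trad_volume=8.0, moist_volume=5.0))
```

Selecting the losing brand's stock as the multiplier means the outflow shrinks as that stock shrinks, so neither stock can be pushed below zero by promotions alone.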

Figure 10.14. Consumer response to promotional price in branded and private label soaps

Source: From Supporting Strategy: Frameworks, Methods and Models, Edited by Frances O'Brien and Robert Dyson, 2007, © John Wiley & Sons Limited. Reproduced with permission.

Equivalent downward sloping functions were used to depict competition between branded and private label products. The effect of different value perceptions for competing products can be deduced from the relative slope of the functions shown in Figure 10.14. For example, the slope of the dashed line for two products with similar perceived value (Traditional English and Moisturising Bar Soaps) is much steeper than the slope of either of the solid lines for two products with different perceived value (Traditional English and Me Too Bar Soaps or Moisturising and Me Too Bar Soaps). Consumers are more likely to switch between two products perceived similarly than two products perceived differently, which implies that supermarkets need to sustain bigger price differentials with respect to branded products in order to lure customers from branded products or to avoid losing customers. The relative slopes made sense to the management team and coincided with their judgemental knowledge of the market, thereby helping to build their confidence in the formulations.

A similar price response curve was devised for liquid soap and is shown in Figure 10.15. For comparison, the equivalent curve for bar soap is shown as a dotted line. Once again the relative slopes made sense to the management team.

Figure 10.15. Consumer response to promotional price for bar and liquid soaps

Source: From Supporting Strategy: Frameworks, Methods and Models, Edited by Frances O'Brien and Robert Dyson, 2007, © John Wiley & Sons Limited. Reproduced with permission.

To summarise, this review of formulations for consumer behaviour illustrates how a combination of mental, written and numerical data helps to build confidence in the model's equations, even among people who do not normally think about their business algebraically. Wherever possible, the same process of grounding was used with the other equations. The parameterised model was then simulated and anomalous behaviour rectified by modifying equations and parameters while ensuring they were still consistent with the available judgemental and factual information. The model was then ready for evaluating the new product strategy.

At this stage of a modelling project a great deal has been accomplished. A complex situation has been represented in a visual conceptual model and in matching equations. The individual equations have been calibrated to be consistent with available knowledge. If all this work has been carried out with integrity, rigour and professionalism then confidence in the model will already be well established.

10.6.3. Model Behaviour Tests and Fit to Data

Model behaviour tests help to assess how well a model reproduces the dynamic behaviour of interest. The proper use of such tests is to uncover flaws in the structure or parameters of the model and to determine whether they matter relative to the model purpose. A useful starting point is to ask whether model simulations fit observed historical behaviour. One way to assess goodness-of-fit is to devise formal metrics. Intuitively, such metrics compute, point-by-point, the discrepancy between simulated and real data and then take an average over the relevant time horizon (Sterman, 1984). An example is the mean absolute error (MAE) defined as the sum of the absolute differences between model generated data points Xm and actual data points Xa divided by the total number of data points n. Division by n ensures the metric is normalised.

$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|X_{m,t} - X_{a,t}\right|$$

The lower the mean absolute error, the better the fit. Hence, if the simulated data perfectly matches the actual data, point-by-point, then the mean absolute error is zero. If the simulated data is displaced in any way from the actual data (systematically or randomly) then the mean absolute error takes a positive value. Of course the temptation is to ask 'what is a good number for MAE?', but there is no simple answer to this question. In practice it is best to use the metric alongside visual comparisons of actual and simulated data as a way to assess whether or not fit is improving when parameter and structural changes are made to the model.
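As a minimal sketch, the mean absolute error can be computed directly from its verbal definition. The function name and the sample series below are illustrative only and are not taken from the soap industry data.

```python
def mean_absolute_error(simulated, actual):
    """Average of |Xm - Xa| over n paired data points."""
    if len(simulated) != len(actual) or not simulated:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(m - a) for m, a in zip(simulated, actual)) / len(simulated)


# Illustrative series only.
simulated = [1.0, 2.0, 3.0, 4.0]
actual = [1.5, 1.5, 3.5, 3.0]
print(mean_absolute_error(simulated, actual))   # 0.625
print(mean_absolute_error(actual, actual))      # 0.0 for a perfect fit
```

Re-running such a calculation after each parameter or structural change gives a simple indication of whether fit is improving, alongside visual comparison of the trajectories.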

An alternative metric is the mean square error (MSE), defined as the sum of the squares of the differences between the model generated data points Xm and actual data points Xa divided by the total number of data points n.

$$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}\left(X_{m,t} - X_{a,t}\right)^{2}$$

The lower the mean square error the better the fit. There is no particular criterion for deciding which of these two metrics, MAE or MSE, is better. Either can help modellers to objectively assess whether the fit of a given model to data is improved or worsened as its parameters are varied within plausible bounds. However, the mean square error has the advantage that it can be usefully decomposed into three separate components, known as Theil's Inequality Statistics (Theil, 1966), which correspond to informal visual measures of fit that modellers often use. The inequality statistics measure how much of the mean square error between simulated and actual trajectories can be explained by: (1) bias UM resulting from a difference between means; (2) unequal variation or 'stretch/shrinkage' US due to a difference in standard deviation; and (3) unequal covariation UC caused by phase shift or unexplained variability. The components can each be expressed in terms of standard statistical measures as shown below and are defined in such a way that their sum (UM + US + UC) is identically equal to 1:

$$U^{M} + U^{S} + U^{C} = 1$$

The unequal mean or bias statistic UM is the square of the difference between the mean of model-generated data points and the mean of the actual data points divided by the mean square error MSE.

$$U^{M} = \frac{\left(\bar{X}_{m} - \bar{X}_{a}\right)^{2}}{\mathrm{MSE}}$$

The unequal variation or stretch/shrinkage statistic US is the square of the difference between the standard deviation of the model-generated data points sm and the standard deviation of the actual data points sa, divided by the mean square error MSE.

$$U^{S} = \frac{\left(s_{m} - s_{a}\right)^{2}}{\mathrm{MSE}}$$

The unequal covariation statistic UC is the product of the standard deviation of model-generated data points sm and the standard deviation of actual data points sa multiplied by double the complement of the correlation coefficient (1 − r), divided by the mean square error MSE.

$$U^{C} = \frac{2\,(1 - r)\,s_{m}\,s_{a}}{\mathrm{MSE}}$$
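The three definitions can be checked numerically with a short sketch. It assumes population (divide-by-n) standard deviations and the Pearson correlation coefficient, which is the convention under which the three components sum exactly to 1. The function name and sample series are illustrative and not part of the soap industry model.

```python
import math


def theil_decomposition(simulated, actual):
    """Return (MSE, UM, US, UC) using population (divide-by-n) statistics."""
    n = len(simulated)
    if n == 0 or n != len(actual):
        raise ValueError("series must be non-empty and of equal length")
    mean_m = sum(simulated) / n
    mean_a = sum(actual) / n
    mse = sum((m - a) ** 2 for m, a in zip(simulated, actual)) / n
    if mse == 0:
        raise ValueError("perfect fit: the decomposition is undefined")
    s_m = math.sqrt(sum((m - mean_m) ** 2 for m in simulated) / n)
    s_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual) / n)
    cov = sum((m - mean_m) * (a - mean_a) for m, a in zip(simulated, actual)) / n
    u_m = (mean_m - mean_a) ** 2 / mse
    u_s = (s_m - s_a) ** 2 / mse
    # 2(1 - r) * s_m * s_a equals 2(s_m * s_a - cov) because r = cov / (s_m * s_a);
    # this form avoids dividing by zero when one series is constant.
    u_c = 2 * (s_m * s_a - cov) / mse
    return mse, u_m, u_s, u_c


# Illustrative series only.
mse, u_m, u_s, u_c = theil_decomposition([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.0])
print(f"MSE={mse:.4f}  UM={u_m:.2f}  US={u_s:.2f}  UC={u_c:.2f}  sum={u_m + u_s + u_c:.2f}")
```

A simulated series that differs from the actual series only by a constant offset yields UM equal to 1 and the other two components equal to 0, which is a convenient sanity check on any implementation.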

Consider the inequalities between the two trajectories shown in Figure 10.16, which are taken from the soap industry model. Line 1 shows model-generated behaviour, while line 2 is actual time series data for the same variable plotted on a scale from zero to six over a 72-month time horizon. A visual comparison suggests the two lines follow a similar path of growth and saturation. However, the actual time series contains a short-term cycle that is absent from the simulation. Also, the simulated line has a slight upward bias. The mean square error (MSE) between the two trajectories is 0.2, reflecting the broad similarity in the shape and timing of the growth trajectories. Using the Theil inequality statistics to decompose the error we find the following results. The bias UM is 0.39, so 39 per cent of the mean square error arises from the difference of means, reflecting the upward displacement of the simulated data points. The stretch US is 0.04, so only 4 per cent of the mean square error arises from stretch or shrinkage, as there is no discernible cyclicality in the simulated data points to mimic, at smaller amplitude, the cycle in the actual data. The unequal covariation UC is 0.58, which means that 58 per cent of the mean square error arises from point-by-point differences caused by the unexplained cyclicality in the data. (The three reported values sum to 1.01 rather than exactly 1 because each is rounded to two decimal places.) We will return later to an interpretation of the practical significance of these differences.

Figure 10.16. Comparing actual and simulated data

Qualitative tests of fit are also widely used in practice – eyeballing the magnitude, shape, periodicity and phasing of simulated trajectories and comparing to past behaviour. Such tests may appear ad hoc at first glance, but they are well suited to situations in which observations about real-world behaviour arise from a patchwork of mental, verbal and written data sources. In many modelling projects, time series data is sparse and incomplete. However, it is still possible to build confidence in model fit by ensuring that simulated trajectories are correctly scaled, pass through recognised data points, have the appropriate periodicity and relative phasing and are consistent with reliable anecdotal information about past behaviour.

An interesting example of such confidence building with limited data is provided by the World Dynamics model mentioned in Chapter 1. The model's purpose was to understand the long-term dynamics of population in a world of limited resources and the problems facing humanity when industrial activity exceeds the finite carrying capacity of the earth. The time horizon of the study was two hundred years, from 1900 to 2100, and the model contained highly aggregated stock variables such as population, natural resources, capital investment, quality of life and pollution as well as a host of other inter-locking auxiliary variables. The behaviour over time of population was central to the model's arguments, so it was important that its historical behaviour looked realistic. The simulated trajectory of population was calibrated using just two data points and a logical argument about the dynamics of reinforcing feedback. The following quote from World Dynamics illustrates the reasoning:

Taking a world population of 1.6 billion in 1900 (the start of the simulation) and a world population of 3.6 billion in 1970 (at the time of the study), the cumulative growth rate has averaged 1.2 per cent per year. This is the difference between the birth rate and the death rate which we will here take as the difference between the coefficients BRN (birth rate normal) and DRN (death rate normal). A value of 0.04 for BRN (meaning that the annual birth rate is 4 per cent of the population) and a value of 0.028 for DRN (2.8 per cent of the population per year) would satisfy this 1.2 per cent difference and would be compatible with observed demographic rates for the first three-quarters of the 20th century. The reciprocal of death rate normal DRN of 0.028 implies a life expectancy at birth of 36 years (including infant mortality).

Effective calibration of the population trajectory was achieved with sparse but reliable data. By knowing population at two widely separated points in time it was possible to estimate the net growth rate of population. Then, by choosing plausible coefficients for the birth rate and death rate (the two parameters that control exponential growth), it was possible to ensure that the behaviour of population was realistic and correctly scaled over the entire historical period of the simulation from 1900 to 1970.
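The arithmetic behind this two-point calibration can be reproduced in a few lines. The sketch assumes continuously compounded growth at a constant net rate, which is a simplification made here for illustration. The World Dynamics model itself modifies its birth and death rates through other variables, so this is not a replica of its equations.

```python
import math

# Data points and coefficients quoted from World Dynamics.
POPULATION_1900 = 1.6   # billion
POPULATION_1970 = 3.6   # billion
YEARS = 1970 - 1900
BRN = 0.04              # birth rate normal, fraction per year
DRN = 0.028             # death rate normal, fraction per year

# Average net growth rate implied by the two data points
# (assumes continuous compounding at a constant rate).
implied_net_growth = math.log(POPULATION_1970 / POPULATION_1900) / YEARS

# Net growth rate implied by the chosen coefficients.
chosen_net_growth = BRN - DRN

# Life expectancy implied by the death rate coefficient.
life_expectancy = 1 / DRN

# Population projected to 1970 with the chosen constant coefficients.
projected_1970 = POPULATION_1900 * math.exp(chosen_net_growth * YEARS)

print(f"Implied net growth: {implied_net_growth:.4f} per year")
print(f"Chosen net growth:  {chosen_net_growth:.4f} per year")
print(f"Life expectancy:    {life_expectancy:.1f} years")
print(f"Projected 1970:     {projected_1970:.2f} billion")
```

Under this simplification the two data points imply a net growth rate of just under 1.2 per cent per year, and the rounded coefficients carry the 1900 population to within a few per cent of the 1970 figure.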

10.6.4. Tests of Fit on Simulations of the Soap Industry Model – The Base Case

No simulation model of a business or social system will ever replicate historical data perfectly. Nevertheless, if reliable time series data are available they should be used to scale the model as accurately as possible and to correct erroneous or poorly known parameter estimates and inappropriate assumptions. The objective should be to remove discrepancies that are important to the model's purpose. Any remaining discrepancies (and there will always be some) are those the model cannot explain. Remember, too, that the aim of the model is to provide an endogenous structural explanation of behaviour. Hence, it is not acceptable to 'introduce fudge factors or exogenous variables whose sole function is to improve the historical fit of the model' (Sterman, 2000, Chapter 21).

The soap industry project used historical data to build confidence in the base case simulations. The base case replicates the decision-making processes followed by the management team during the development of the liquid soap business. For comparison there was monthly data available on sales volume and price by product and by company over a period of six years. The intention in showing the base case to managers was to help them understand how their normal way of running the business led to the actual situation they were facing. In other words, the simulation moved them from actors in the business to spectators of their strategies, similar to playing a videotape of the performance of a football team after a match. Hence, it was important to compare the simulations with real data.

Figure 10.17 shows the simulated and actual sales volume for branded liquid soaps over a period of 72 months. Although the trajectories do not match perfectly, the magnitude and main trends are similar. Old English's new Antibacterial Liquid Soap volume (line 1) grows quickly during the first 36 months, as actually happened (line 2) and as the management team had hoped. From the simulator we can infer that this growth is due to two managerial actions: trialing and price reductions. The trialing effort is complemented with a large reduction in the retail price of liquid soaps that boosts the adoption rate. Meanwhile, the branded competitor's new Creamy Liquid Soap volume (line 3) grows much more slowly, as also happened in real life (line 4).

Figure 10.17. Branded liquid soaps simulated and real volumes

Source: From Supporting Strategy: Frameworks, Methods and Models, Edited by Frances O'Brien and Robert Dyson, 2007, © John Wiley & Sons Limited. Reproduced with permission.

After month 36, two factors reduce the growth rate of Antibacterial Liquid Soap. First, Old English's management stops reducing the price of the new product due to the early success of the launch. Sales volume after three years matches the expected market size and managers do not want to further erode the revenues from liquid soap. Second, the steady reduction in the number of bar soap users begins to slow market growth, despite the intensity of marketing actions.

It is clear from the comparison of simulated and actual data that the model realistically captures the growth trend in sales volume during the early years of liquid soap and the subsequent levelling-off. These were strategically important features of the developing liquid soap market with practical lessons for management. One lesson from the simulation is that Old English's managers might have been able to further exploit the potential of the new market with more intense marketing actions at the beginning of the process. A corollary is that later marketing action is much less effective.

However, the model cannot explain everything about movements in the sales volume of liquid soaps. There is more volatility in the actual time series (lines 2 and 4) than in the simulated time series (lines 1 and 3). Specifically, the real data for Antibacterial Liquid Soap volume (line 2) contains cyclicality with a period of about one year superimposed on S-shaped growth. In contrast, the simulation exhibits smooth S-shaped growth. However, failure to capture the short-term cycle is not a serious deficiency, because the model's purpose is to examine long-term growth strategy.

In principle, the fit could be improved by adjusting the model's parameters, but in practice the model had been carefully tuned, and so the remaining discrepancies show the limited scope, in a well-grounded model such as this, for modellers to alter trajectories. Remember that important parameters such as consumers' response to promotional price (in Figures 10.14 and 10.15) have been obtained from reliable sources, independent of the time series data. These relationships cannot be modified simply to improve the fit between simulations and data. They are part of the feedback structure of the model that pre-determines its dynamic behaviour and limits the range of trajectories it can generate. Good fit is not easy to achieve in closed-loop models and is not merely a matter of twiddling parameters.

Continuing with the base case, Figure 10.18 shows simulated and real retail prices in the branded liquid soap market. Old English reduces price at an early stage to stimulate growth (lines 1 and 2). Some time later, Global Personal Care also reduces liquid soap price as a reaction to erosion of market share (lines 3 and 4). Global Personal Care's price falls until it slightly undercuts Old English's price, in an effort to sustain market share. When Global Personal Care reduces its prices, there are two effects. The first effect is to attract more bar soap users into liquid soap, which expands the liquid soap market as a whole. The second effect is to reverse the flow of customers switching from Creamy Liquid Soaps to Antibacterial Liquid Soaps.

Figure 10.18. Branded liquid soaps simulated and real prices

Source: From Supporting Strategy: Frameworks, Methods and Models, Edited by Frances O'Brien and Robert Dyson, 2007, © John Wiley & Sons Limited. Reproduced with permission.

Supermarkets also reduce their prices as lines 1 and 2 (simulated and real price) in Figure 10.19 show. Even though supermarkets are obtaining more income from trade margin (line 4) due to growth in branded liquid soap sales, the retailers' desire to maintain market share (line 3) is reducing supermarkets' prices. When supermarkets' market share increases, prices stabilise.

Figure 10.19. Me Too Liquid Soap: Simulated and real price; simulated market share and income from trade margin

Source: From Supporting Strategy: Frameworks, Methods and Models, Edited by Frances O'Brien and Robert Dyson, 2007, © John Wiley & Sons Limited. Reproduced with permission.

The base case simulations explain the development of the liquid soap segment and the credibility of this explanation is reinforced by the visual fit with historical data. Two particular features stand out. First, an equilibrium price for Antibacterial and Creamy Liquid Soaps is established once both firms satisfy their evolving market performance goals. Supermarkets also achieve an equilibrium price once they acquire adequate bargaining power (represented here as a market share goal). Second, Old English's volume in the new product segment reaches a plateau after about four years, suggesting that growth potential is less than management had expected. This plateau is partly due to the high equilibrium price of liquid soap that reduces its attractiveness to more price-sensitive bar soap users. In addition, Global Personal Care stops losing customers to Old English when it matches Old English's price. Meanwhile, supermarkets' liquid soap volume grows strongly in the last two years of the simulation, which can be inferred from the steady rise in market share (line 3 in Figure 10.19). This growth in sales is driven by the simulated price gap between branded and Me Too products, which can be seen by comparing the price trajectories in Figures 10.18 and 10.19. However, close inspection of the time series data shows the real price gap is smaller than the simulated price gap, suggesting the need to revise the pricing formulation for Me Too liquid soap. As the price gap is made smaller, supermarkets' sales volume and market share will reach a plateau and the liquid soap market will settle into an equilibrium similar to the mature bar soap market.
