
2. Setting Up the Analysis


The purpose of analysis is to produce helpful insights about a problem and to communicate them clearly and effectively. Good analysis begins with good architecture and good housekeeping of the analysis structure, elements, relationships, and style.

The Case Study

The following is a simple case study that I often use when I teach seminars on quantitative business analysis. It’s simple enough to convey a big-picture idea, yet challenging enough to teach important concepts related to structuring the thoughts and communications about complex business interactions, representing uncertainty, and managing risk. There is no doubt the business case could contain many more important details than are presented here. The idea, however, is not to develop an exhaustive template to use in every case; rather, the key idea is to demonstrate the use of the R language in just enough detail to build out your own unique cases as needed.

We will think through the case study in two sections. The first section, “Deterministic Base Case,” describes the basic problem for the business analysis with single value assumptions. These assumptions are used to set up the skeletal framework of the model. Once we set up and validate this deterministic framework, we then expand the analysis to include the consideration of uncertainties presented in the second section, “The Risk Layer,” that might expose our business concept to undesirable outcomes.

Deterministic Base Case

ACME-Chem Company is considering the development of a chemical reactor and production plant to deliver a new plastic compound to the market. The marketing department estimates that the market can absorb a total of 5 kilotons a year when it is mature, but it will take five years to reach that maturity from halfway through construction, which occurs in two phases. They estimate that the market will bear a price of $6 per pound.

Capital spending on the project will be $20 million per year for each year of the first phase of development and $10 million per year for each year of the second phase. Management estimates that development will last four years, two years for each phase. After that, a maintenance capital spending rate will be $2 million per year over the life of the operations. Assume a seven-year straight-line depreciation schedule for these capital expenditures. (To be honest, a seven-year schedule is too short in reality for these kinds of expenditures, but it will help illustrate the results of the depreciation routine better than a longer schedule.)

Production costs are estimated to be fixed at $3 million per year, but will escalate at 3% annually. Variable component costs will be $3.50 per pound, but cost reductions are estimated to be 5% annually. General, sales, and administrative (GS&A) overhead will be around 20% of sales.

The tax rate is 38%. The cost of capital is 12%. The analytic horizon is 20 years. Determine the following:
  1. Cash flow profile.

  2. Cumulative cash flow profile.

  3. Net present value (NPV) of the cash flow.

  4. Pro forma table.

  5. Sensitivity of the NPV to low and high changes in assumptions.

The Risk Layer

The market intelligence group has just learned that RoadRunner Ltd. is developing a competitive product. Marketing believes that there is a 60% chance RoadRunner will also launch a production facility in the next four to six years. If they get to market before ACME-Chem, half the market share will be available to ACME-Chem. If they are later, ACME-Chem will maintain 75% of the market. In either case, the price pressure will reduce the monopoly market price by 15%.

What other assumptions should be treated as uncertainties in the base case?

Show the following:
  1. Cash flow and cumulative cash flow with confidence bands.

  2. Histogram of NPV.

  3. Cumulative probability distribution of NPV.

  4. Waterfall chart of the pro forma table line items.

  5. Tornado sensitivity chart of 80th percentile ranges in uncertainties.

Abstract the Case Study with an Influence Diagram

When I ask people what the goal of business analysis should be, they usually respond with something like logical consistency or correspondence to the real world. Of course, I don’t disagree that those descriptors are required features of good business analysis; however, no one ever commissions business case analysis to satisfy intellectual curiosity alone. Few business decision makers ever marvel over the intricate nature of code and mathematical logic. They do, however, marvel at the ability to produce clarity while everyone else faces ambiguity and clouded thinking.

I assert that the goal of business case analysis is to produce and effectively communicate clear insights to support decision making in a business context. Clear communication about the context of the problem at hand and how insights are analytically derived is as important, if not more so, than logical consistency and correspondence. Although it is certainly true that clear communication is not possible without logical consistency and correspondence, logical consistency and correspondence are almost useless unless their implications are clearly communicated.

I think the reason many people forget this issue of clear communication is that, as analysts who love to do analysis, we tend to assume lazily that our analysis speaks for itself in the language we are accustomed to using among each other. We tend to forget that the output of our thinking is a product to be used by others at a layer that does not include the underlying technical mechanisms. Imagine being given an iPhone that requires the R&D laboratory to operate. An extremely small number of users would ever be able to employ such a device. Instead, Apple configures the final product to present a much simpler user interface to the target consumers than design engineers employ, yet they still have access to the power that the underlying complexity supports. Likewise, good business case analysis–that which supports the clear communication of insights to support decision making in a business context–should follow principles of good product design.

Before we write the first line of R code (or any code in any construct, even a spreadsheet) I recommend that we translate the context of the problem we have been asked to analyze to a type of flowchart that communicates the essence of the problem. This flowchart is called an influence diagram.

Influence diagrams are not procedural flowcharts. They do not tell someone the order of operations that would be implemented in code. They do, however, graphically reveal how information (both fixed and uncertain) and decisions coordinate through intermediate effects to affect the value of some target key figures of merit on which informed decisions are based. In most business case analyses, the target key figures of merit (or objective functions) tend to be a cash flow profile and its corresponding NPV.

The influence diagram shown in Figure 2-1 captures the essence of the deterministic base case. It reveals how starting assumptions work through interdependent calculations to the objective function, NPV. The structure of the influence diagram is built by connecting the assumptions to other nodes with solid arcs to indicate the flow of causality or conditionality. Intermediate nodes are then connected to other dependent nodes in sequence until some set of nodes converge on the objective or value node.
Figure 2-1. The influence diagram of the deterministic base case

Notice that assumptions are represented both as light blue ellipses and purple trapezoids. The light blue ellipses represent assumptions that are not necessarily innately fixed in value. They can change in the real world due to effects beyond our control. For example, notice that the Phase 1 construction duration is represented as an ellipse. Although it is true that this outcome in the real world might be managed with project management efforts, at the beginning of the project, no one can declare that the outcome will conform to an exactly known value. Many events, such as weather, supply chain failures, mechanical failures, and so on, could work against the desired duration. Likewise, some events could influence a favorable duration. For this reason, Phase 1 construction duration and the other light blue ellipses will eventually be treated as uncertainties. For now, though, as we set up the problem, we treat their underlying values as tentatively fixed. The other assumptions displayed as trapezoids represent facts that are fixed by constraints of the situation, by definition (e.g., the tax rate or the depreciation schedule of capital expense), or are known physical constants or unit conversion constants.

Intermediate values are displayed as double-lined yellow ellipses. These nodes represent some operation performed on assumptions and other intermediate values along the value chain to the objective function. The objective function (or value function), usually a red hexagon or diamond, is the ultimate target of all the preceding operations.

The dark blue parallelogram in the influence diagram labeled Year represents an index. In this case, it is an index for the time axis, defined by the initial assumption about the life span of the analysis, and it serves as the basis for all the other time-dependent values. Because this node can affect many or all of the other nodes, we don’t connect it to them, which reduces the visual complexity of the diagram. We do, however, need to explain this convention to the consumers of our analysis.

Historically, we observe the convention that canonical influence diagrams do not contain feedback loops or cyclic dependencies. As much as possible, the diagram should represent a snapshot of the flow of causality or conditionality. However, if the clarity of the abstraction of the problem is enhanced by showing a feedback loop in a time domain, we can use a dotted line from the intermediate node back to an appropriate predecessor node. Keep in mind, though, that in the actual construction of your code, some variable in the loop will have to be dependent on another variable with a lagged subscript of a form that conceptually looks like the function

Var2 = f(Var1[t-n])

where t is the temporal index, and n is the lag constant.
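
For instance, a minimal sketch of such a lagged relationship in R might look like the following. The variable names and the 0.8 response factor here are purely illustrative and are not part of the case study code.

# Illustrative only: a response that follows a driver with a lag of kLag years.
kLag <- 2
driver <- c(5, 6, 7, 8, 9, 10)
# Pad the first kLag years with 0, then shift the driver forward by kLag years.
lagged.response <- c(rep(0, kLag), 0.8 * head(driver, -kLag))
lagged.response
# [1] 0.0 0.0 4.0 4.8 5.6 6.4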

Again, the influence diagram is supposed to capture the essence of the problem in an abstract form that does not reveal every detail of the procedural line code you will employ. It simply gives the big picture. Constructed carefully, though, the diagram will provide you as the analyst an inventory of the classes of key variables required in your code. You should also make a habit of defining these variables in a table that includes their units and subject matter experts (SMEs) or other resources consulted that inform the definitions or values used.

The influence diagram displayed in Figure 2-1 represents the base case deterministic model; that is, the model that does not include the risk considerations. Once we develop the base case code structure and produce our preliminary analysis, we will modify the influence diagram to include those elements that capture our hypotheses about the sources of risk in the problem.

Set Up the File Structure

Again, we should think of good business case analysis as a product for some kind of customer. As much as possible, all aspects of the development of our analysis should seek to satisfy the customer’s needs in regard to the analysis. Usually, the customers we have in mind are the decision makers who will use our analysis to make commitments or allocate resources. Decision makers might not be the only consumers of our analysis, though. There are a number of reasons why other analysts might use our analysis as well, but if we are not immediately available, or no longer available at all, to clarify ambiguous code, their job becomes that much more difficult. There is a social cost that accompanies ambiguity, and my personal ethics guide me to minimize those costs as much as possible. With that in mind, I recommend setting up a file structure for your analysis code that will provide guidance about the content of various files.

The file directory that contains your analysis project should be appropriately and clearly named (e.g., BizSimWithR). Within that directory, create at least three other subdirectories:
  1. Data: Contains text and comma-separated value (CSV) files that store key assumptions for your analysis.

  2. Libraries: Contains text files of functions and other classes you might import that support the operation of your code.

  3. Results: Contains output files of results you export from within your analysis (maybe for postprocessing with other tools) as well as graphic files that might be used in reporting the results of your analysis.

To avoid the problem of resetting the working directory of your projects, I also recommend that you build your file structure in the R application working directory. You can do this manually the way you would create any other file directory or you can use the dir.create() function of R. Consult the R Help function for the details of dir.create(), but to quickly and safely set up your project file directory, use the following commands in the R console:

dir.create("BizSimWithR", recursive = F)
dir.create("BizSimWithR/data", recursive = F)
dir.create("BizSimWithR/libraries", recursive = F)
dir.create("BizSimWithR/results", recursive = F)

Of course, you should create any other subdirectories that are needed to partition categories of files for good housekeeping.
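
If you rerun your setup script, dir.create() will warn that the directories already exist. One way to silence that noise, offered here as a suggestion rather than as part of the original listing (dir.exists() requires R 3.2.0 or later), is to test for each directory first.

# Create each project directory only if it does not already exist.
project.dirs <- c("BizSimWithR", "BizSimWithR/data",
                  "BizSimWithR/libraries", "BizSimWithR/results")
for (d in project.dirs) {
  if (!dir.exists(d)) dir.create(d, recursive = FALSE)
}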

Style Guide

Keeping with the theme of clear communication out of consideration for those who consume your code directly, I also recommend that you follow a style guide for syntax, file, variable, and function names within your R project. Google’s R Style Guide1 provides an excellent basis for naming conventions. Following this guide provides one more set of clues to the nature of the code others might need to comprehend.

In short, the Google convention follows these guidelines:
  1. Separate words in the name of the file with an underscore.

  2. End file names with .R.

  3. Name identifiers as follows.

     • Separate words in variable names with a period, as in cash.flow. (This convention generally conflicts with that of Java and Python variable naming; consequently, its practice is forbidden in some coding shops. If you think your code might be reused in those environments, stick with the most universally applicable naming convention to avoid later frustrations and heartbreaks.)

     • Start each word in a function name with a capital letter. Make the function name a verb, as in CalculateNPV.

     • Name constants like the words in functions, but append a lowercase k at the beginning, as in kTaxRate.

  4. Use the following syntax.

     • The maximum line length should be 80 characters.

     • Use two spaces to indent code. (Admittedly, I violate this one, as I like to use the tab to indent. Some rules are just meant to be broken. Most IDEs will now let you define a tab as two spaces in the Preferences pane.)

     • Place a space around operators.

     • Do not place a space before a comma, but always place one after a comma.

     • Do not place an opening curly brace on its own line. Do place a closing curly brace on its own line.

     • Use <-, not =, for variable assignment. (Arguments about the choice of assignment operator in R can approach religious levels of zeal. However, the <- operator is still the most widely adopted convention in R programming circles.)

     • Do not use semicolons to terminate lines or to put multiple commands on the same line.

Of course, always provide clear comments within your code to describe routines for which operation might not be immediately obvious (and never assume they are immediately obvious) and to tie the flow of the code back to the influence diagram. The complete Google R Style Guide presents more complete details and examples as well as other recommendations for file structuring.
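
To make the conventions concrete, the following short snippet, a hypothetical example rather than part of the case study code, touches each of the naming rules.

# Constant: prefix with a lowercase k and capitalize each word.
kSalesTaxRate <- 7  # %

# Variable: separate words with a period.
monthly.sales <- c(120, 135, 150)

# Function: capitalize each word and name it as a verb.
CalculateTaxedSales <- function(sales, tax.rate) {
  sales * (1 + tax.rate / 100)
}

taxed.sales <- CalculateTaxedSales(monthly.sales, kSalesTaxRate)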

I learned the value of these guidelines the hard way several years ago while working on a very complex financial analysis for a new petroleum production field. The work was fast paced, and my client was making frequent change requests. In an effort to save time (and not possessing the maturity to regulate change orders effectively with a client), I quit following a convention for clear communication, namely adding thorough comments to my code. As I was certainly tempting the Fates, I got hit by the bus of gall bladder disease and exited the remainder of the project for surgery and extended recuperation. Although my files were turned over to a very capable colleague, he could make little sense of many important sections of my code. Consequently, our deliverables fell behind schedule, costing us both respect and some money. I really hated losing the respect, even more than the money.

Write the Deterministic Financial Model

We need to write our code in the order of events implied by our influence diagram. Studying the diagram, we can see that the nodes directly related to the annual capital expense (CAPEX) are involved in driving practically everything else. This makes sense because no production occurs until the end of the first phase of construction. No revenues or operational expenses occur, either, until production occurs. Therefore, the first full code block we address will be the CAPEX block. Just before we get to that, though, we need to set up a data file to hold the key assumptions that will be used in the model.

Data File

To begin the coding, we create a data file named, say, global_assumptions.R and place it in the /BizSimWithR/data subdirectory of our coding project. The contents of the data file look like this:

# Global Model Assumptions
kHorizon <- 20 # years, length of time the model covers.
year <- 1:kHorizon # an index for temporal calculations.
kSampsize <- 1000 # the number of iterations in the Monte Carlo simulation.
run <- 1:kSampsize # the iteration index.
kTaxRate <- 38 # %
kDiscountRate <- 12 # %/year, used for discounted cash flow calculations.
kDeprPer <- 7 # years, the depreciation schedule for the capital.

The values in this file are those that we will treat as constants, constraints, or assumptions that are used throughout the model. Notice that next to each assumption we comment on the units employed and add a brief description. Also, we choose variable identifiers that are descriptive and will permit easy interpretation by another user who might need to pick up the analysis after becoming familiar with the context of the problem.

Next, we import this file into our main R text file (Determ_Model.R) using the source() command.

source("~/BizSimWithR/data/global_assumptions.R")

If you are familiar with the concept of the server-side include in HTML programming, you will easily understand the source() command. It essentially reads the contents as written in the file into the R session for processing.

For the variables that we will perform sensitivity analysis around or ultimately treat as uncertainties, we create a CSV data file named risk_assumptions.csv that contains the values of the operational variables used in the deterministic and risk model (Figure 2-2).
../images/461101_1_En_2_Chapter/461101_1_En_2_Fig2_HTML.jpg
Figure 2-2

The data table used for this business case in .csv format rendered in a spreadsheet

We create columns in this file not only to capture the names of the variables and their values, but their units and explanatory notes, as well.

We read the contents of the CSV file using the read.csv() command and assign it to a variable.

d.data <- read.csv("~/BizSimWithR/data/risk_assumptions.csv")
The read.csv() function reads the data in the table structure of the file, then creates a data frame from it such that the header row names the columns, and the rows are created from the case values in the table. Anytime we need to refer to the contents of the assumption file while in the R console, we can do so quickly by calling d.data.

The data frame looks pretty much the same as the CSV file structure as it appears in Microsoft Excel; however, we need only the p50 values for the initial deterministic analysis, so we slice them out of d.data and place them in their own vector. For the deterministic analysis, we will use only the values of the first 13 variables. We will use the last five variables in the risk layer of the model.

# Slice the p50 values from data frame d.data.
d.vals <- d.data$p50[1:13]

The contents of the p50 vector look like this:

[1] 4.0e+07 2.0e+00 2.0e+07 2.0e+00 2.0e+06 3.0e+06 3.0e+00 3.5e+00 5.0e+00 2.0e+01 5.0e+00 5.0e+00 6.0e+00

The final step in the process of setting up the data for our deterministic analysis is to assign the p50 values in d.vals to variable names in the R file.

 # Assign p50 values to variables.
p1.capex <- d.vals[1]
p1.dur <- d.vals[2]
p2.capex <- d.vals[3]
p2.dur <- d.vals[4]
maint.capex <- d.vals[5]
fixed.prod.cost <- d.vals[6]
prod.cost.escal <- d.vals[7]
var.prod.cost <- d.vals[8]
var.cost.redux <- d.vals[9]
gsa.rate <- d.vals[10]
time.to.peak.sales <- d.vals[11]
mkt.demand <- d.vals[12]
price <- d.vals[13]

At this point, you might be wondering what the p10, p50, and p90 values represent, as the deterministic base case description of the model only included single point values. Remember, though, that in the description of the influence diagram, I described the ellipses as representations of uncertain variables. For the deterministic base case, we treat these values in a fixed manner. Ultimately, however, for the risk layer of the business case to be thoroughly considered, ranges for each of the variables will be assessed by SMEs as their 80th percentile prediction interval. The three characteristic points of those prediction intervals are the 10th, 50th, and 90th percentiles. We use the median values, the p50s, for the deterministic analysis. The p50s were the numbers used in the business case descriptions.

CAPEX Block

The first thing we need to establish in the actual calculation of the model is when the capital gets expended by phase. The phase calculation accomplishes this by telling R when the construction and operating phases occur. We can use 1 and 2 for the construction phases, and 3 for the maintenance phase.

# CAPEX Module
phase <- (year <= p1.dur) * 1 +
    (year > p1.dur & year <= (p1.dur + p2.dur)) * 2 +
    (year > (p1.dur + p2.dur)) * 3

The result is a vector across the year index.

[1] 1 1 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

From our influence diagram , we know that capex is a vector conditionally related to phase, captured by the following expression.

capex <- (phase == 1) * p1.capex / p1.dur +
    (phase == 2) * p2.capex / p2.dur +
    (phase == 3) * maint.capex

It produces the following vector.

[1] 2e+07 2e+07 1e+07 1e+07 2e+06 2e+06 2e+06 2e+06 2e+06 2e+06 2e+06 2e+06 2e+06 2e+06 2e+06 2e+06 2e+06 2e+06 2e+06 2e+06

Next, we need to handle the depreciation that will be subtracted from the gross profit to find our taxable income. The depreciation is based on the capital emplaced at the time of construction or when it is incurred (i.e., maintenance); however, it is not taken into account until it can be applied against taxable profit. This means that the capital incurred in Phase 1 won’t be amortized until Phase 1 is over and the plant begins generating revenue. Capital incurred in the Phase 2 construction and maintenance phases can be amortized starting in the year following each year of expenditure.

One way to handle this operation would be to use a for loop. The following code performs this process as already described.

We start with

depr.matrix <- matrix(rep(year, kHorizon), nrow = kHorizon,
      ncol = kHorizon, byrow = TRUE)
to initialize the following matrix.
(The initialized depr.matrix is a 20 × 20 matrix in which every row contains the year sequence 1 through 20.)

Then, we run a loop to set up a depreciation schedule programmatically (instead of typing each one manually):

for (y in year) {
    if (y <= p1.dur) {
      depr.matrix[y, ] <- 0
    } else if (y == (p1.dur + 1)) {
      depr.matrix[y, ] <- (depr.matrix[y, ] >= (1 + p1.dur)) *
        (depr.matrix[y, ] < (y + kDeprPer)) * p1.capex / kDeprPer
    } else {
      depr.matrix[y, ] <- (depr.matrix[y, ] >= y) *
        (depr.matrix[y, ] < (y + kDeprPer)) * capex[y - 1] / kDeprPer
    }
}

This loop incorporates the logic that, in general, for a straight-line depreciation schedule of n years, capital C incurred in year Y is spread over years Y + 1, Y + 2, …, Y + n, as C/n in each year. So, if the project incurs $10 million of construction capital in Year 3, a seven-year depreciation schedule would spread that out evenly over seven years ($10 million/7 years) starting in Year 4 and running through Year 10.
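
That rule is easy to verify in isolation. The following standalone sketch uses the numbers from the example in the preceding paragraph rather than the model's variables.

# $10 million incurred in year 3, depreciated straight-line over 7 years.
n.years <- 20
capital <- 10e6
incur.year <- 3
depr.years <- 7
schedule <- rep(0, n.years)
schedule[(incur.year + 1):(incur.year + depr.years)] <- capital / depr.years
round(schedule / 1e6, 2)
# [1] 0.00 0.00 0.00 1.43 1.43 1.43 1.43 1.43 1.43 1.43 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00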

More specifically to our earlier depreciation code snippet, R looks in the initialization table and replaces the numbers in rows for the Phase 1 duration years with 0s. Starting in the row representing the first year after the end of Phase 1, when sales are produced and a taxable profit can be made, our code replaces any value less than or equal to the Phase 1 duration with a 0. Then it looks for values in the row that are greater than the Phase 1 duration and less than or equal to the Phase 1 duration plus the depreciation period, and it replaces those values with the appropriate capital value divided by the depreciation period. Finally, the remainder of the values in the row are set to 0. This process is repeated for each subsequent year of capital expenditure. The result of our code block for depreciation produces the matrix shown in Figure 2-3.
Figure 2-3. The depreciation table for the capital costs over the model time horizon

Now, if we sum the columns across the years of depr.matrix, we get the total annual depreciation.

# Sum the columns of the depreciation matrix to find the annual
# depreciation.
depr <- colSums(depr.matrix)

Because R is a vectorized language, though, many people frown on the use of for loops. Instead, we might consider replacing the loop with a variant of the apply() function combined with the ifelse() function. Here, we use the sapply() function. The sapply() function is a type of functional for loop that steps across the elements of an index, applying the value of the step at the appropriate place in the expression defined after the function() statement.

A general for loop does this:

for (i in index) {
    expr[..., i, ...]
}

The sapply() function does this instead:

sapply(index, function(i) expr[..., i, ...]).

It’s a great way to reduce the visual complexity of the expressions we write as well as accelerate the speed of the code. We use this function more in the sections to come. In the meantime, to learn more about sapply(), simply call the help function in the R console using ?sapply or help(sapply).
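
As a quick illustration of the equivalence, with a toy example that is not drawn from the model, both of the following produce the same vector of squares.

# A for loop that fills a preallocated vector...
squares.loop <- numeric(10)
for (i in 1:10) {
  squares.loop[i] <- i ^ 2
}

# ...and the sapply() equivalent in a single expression.
squares.sapply <- sapply(1:10, function(i) i ^ 2)

identical(squares.loop, squares.sapply)
# [1] TRUE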

Now, we replace the depreciation code block that employs the for loop with this sapply() function and nested ifelse() functions to accommodate the conditional requirements in the loop. Note that we transpose the results of sapply() with t() because the sapply() function produces a matrix result that is transposed from the for loop approach.

# Depreciation Module
depr.matrix <-
  t(sapply(year, function(y)
    ifelse(
      y <= p1.dur & year > 0,
      0,
      ifelse(
        y == (p1.dur + 1) & year < y + kDeprPer & year >= y,
        p1.capex / kDeprPer,
        ifelse((year >= y) & (year < (y + kDeprPer)),
               capex[y - 1] / kDeprPer, 0)
      )
    )
  )
)
depr <- colSums(depr.matrix)

We can now see what the depreciation calculation looks like graphically (Figure 2-4) with the plot() function using

plot(year,
       depr / 1e6,
       xlab = "Year",
       ylab = "Depreciation [$000,000]",
       type = "b")
Note that we scaled the depr values by 1 million to make the plot a little more readable, using the "b" parameter to plot both lines and points.
Figure 2-4. The total depreciation over time

Sales and Revenue Block

Sales begin in the year following the end of construction in Phase 1. Here we model the market adoption as a straight line over the duration from the start of Phase 2 to the time of peak sales.2

mkt.adoption <- pmin(cumsum(phase > 1) / time.to.peak.sales, 1).

(phase > 1) produces a vector of TRUEs starting from the position in the phase vector where Phase 2 starts, and FALSEs prior to that.

The cumsum(phase > 1) phrase coerces the logical states to integers (1 for TRUE, and 0 for FALSE), then cumulates the vector to produce a result that looks like this:

[1] 0 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18.

By dividing this vector by time.to.peak.sales, we get a vector that normalizes this cumulative sum to the duration of time required to reach peak sales.

[1] 0.0 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.4 2.6 2.8 3.0 3.2 3.4 3.6

The pmin() function performs a pairwise comparison between each number in this vector and the maximum adoption rate of 100%, keeping the smaller value of each pair.

[1] 0.0 0.0 0.2 0.4 0.6 0.8 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0

Multiplying this adoption curve by the maximum market demand gives us the sales in each year.

sales <- mkt.adoption * mkt.demand * 1000 * 2000

Because the price of the product is given in “$/lb,” we convert the sales to pounds by multiplying it by the conversion factors 1,000 tons/kiloton and 2,000 lbs/ton.
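
Following the constant-naming convention from the style guide, we could also give that conversion factor its own name so that the units are explicit at the point of use. This is a suggested variant of the sales line, not the original listing above, and kLbsPerKiloton is a name introduced here only for illustration.

kLbsPerKiloton <- 1000 * 2000  # 1,000 tons/kiloton * 2,000 lbs/ton
sales <- mkt.adoption * mkt.demand * kLbsPerKiloton  # lbs/year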

By plotting sales with

plot(year, sales / 1000,
       xlab = "Year",
       ylab = "Sales [000 lbs]",
       type = "b")
we observe the unit sales profile (Figure 2-5) and confirm that it is behaving as we expect.
Figure 2-5. The annual unit sales response

Finally, revenue is given by

revenue <- sales * price.

By plotting revenue with

plot(year,
       revenue / 1000,
       xlab = "Year",
       ylab = "Revenue [$000]",
       type = "b")
we also observe the revenue profile (Figure 2-6) and confirm that it is behaving properly.
Figure 2-6. The annual revenue response

OPEX Block

The operational costs of the business case don’t start until after Phase 1. As in the case of the market adoption formula, we establish this with the term (phase > 1) for fixed costs.

fixed.cost <- (phase > 1) * fixed.prod.cost *
  (1 + prod.cost.escal / 100) ^ (year - p1.dur - 1)

We represent the escalation of the fixed costs with the compounding rate term

(1 + prod.cost.escal / 100) ^ (year - p1.dur - 1).

The phrase in the exponent term ensures that the value of the exponent is 0 in Year 3. The combined effect of these two terms produces a series of factors that looks like this:

[1] 0.000000 0.000000 1.000000 1.030000 1.060900 1.092727 1.125509 1.159274 1.194052 1.229874 1.266770 1.304773 1.343916 1.384234 1.425761 1.468534 1.512590 1.557967 1.604706 1.652848.

Multiplying by the initial fixed cost value, we get

[1] 0 0 3000000 3090000 3182700 3278181 3376526 3477822 3582157 3689622 3800310 3914320 4031749 4152702 4277283 4405601 4537769 4673902 4814119 4958543.

Because the variable cost is a function of the sales, and we have already defined sales to start after the end of Phase 1, we won’t need the (phase > 1) term in the variable cost definition. We do, however, need another compounding factor to account for the decay in the variable cost structure.

(1 - var.cost.redux / 100) ^ (year - p1.dur - 1)

This series looks like

[1] 0 0 1.0000000 0.9500000 0.9025000 0.8573750 0.8145062 0.7737809 0.7350919 0.6983373 0.6634204 0.6302494 0.5987369 0.5688001 0.5403601 0.5133421 0.4876750 0.4632912 0.4401267 0.4181203.

The final expression we use for the variable cost takes the form

var.cost <- sales * var.prod.cost *
  (1 - var.cost.redux / 100) ^ (year - p1.dur - 1)

We finish up the remaining expense equations with the following expressions.

gsa <- (gsa.rate / 100) * revenue
opex <- fixed.cost + var.cost

Pro Forma Block

The pro forma calculations are rather straightforward. According to generally accepted accounting principles (GAAP), more or less, we have
  • Gross profit = revenue - GS&A

  • OPEX = fixed cost + variable cost

  • Operating profit before tax = gross profit - OPEX - depreciation

  • Operating profit after tax = operating profit before tax - tax

  • Cash flow = operating profit after tax + depreciation - CAPEX

Implemented as R code, we have the following:

gross.profit <- revenue - gsa
op.profit.before.tax <- gross.profit - opex - depr
tax <- op.profit.before.tax * kTaxRate/100
op.profit.after.tax <- op.profit.before.tax - tax
cash.flow <- op.profit.after.tax + depr - capex
cum.cash.flow <- cumsum(cash.flow)

We can see the cash flow (Figure 2-7) in thousands of dollars.

plot(year,
       cash.flow / 1000,
       xlab = "Year",
       ylab = "Cash Flow [$000]",
       type = "b")
Figure 2-7. The annual cash flow response

We can also show the cumulative cash flow (Figure 2-8) in thousands of dollars.

plot(year,
       cum.cash.flow / 1000,
       xlab = "Year",
       ylab = "Cash Flow [$000]",
       type = "b")
Figure 2-8. The annual cumulative cash flow

Net Present Value

To obtain the NPV of the cash flow, we simply need to find the sum of the discounted cash flows over the year index. Following the convention that we count payments as occurring at the end of a time period, we first create a vector of discount factors to be applied to the cash flow.

discount.factors <- 1/(1 + kDiscountRate / 100) ^ year

which produces

[1] 0.8928571 0.7971939 0.7117802 0.6355181 0.5674269 0.5066311 0.4523492 0.4038832 0.3606100 0.3219732 0.2874761 0.2566751 0.2291742 0.2046198 0.1826963 0.1631217 0.1456443 0.1300396 0.1161068 0.1036668.

Figure 2-9 shows us the plot of the discount.factors.

plot(year,
       discount.factors,
       xlab = "Year",
       ylab = "Discount Factors",
       type = "b")
Figure 2-9. The discount factor profile

The discounted cash flow is then given by

discounted.cash.flow <- cash.flow * discount.factors

which produces

[1] -17857143 -15943878 -5748744 -3523004 3058044 4183361 5186660 5006359 4788540 3846642 3507746 3187060 2985618 2783313 2583740 2389570 2202730 2024555 1855908 1697283.

Figure 2-10 shows us the plot of the discounted.cash.flow.

plot(year,
       discounted.cash.flow,
       xlab = "Year",
       ylab = "Discounted Cash Flow [$000]",
       type = "b")
Figure 2-10. The annual discounted cash flow

Finally, we sum the values in this vector

npv <- sum(discounted.cash.flow)

to obtain NPV = $8,214,363.

It is very important to remember at this point that the NPV reported here is based on the deterministic median assumptions. It does not include any of the effects from the underlying uncertainty in our assumptions or that competition might impose. This NPV is unburdened by risks.

The NPV calculation is so important to financial analysis and used so repeatedly that we should make a function out of it and keep it in a functions file that we import every time we start a new analysis, just as we did at the beginning of this analysis with the data file.

CalcNPV <- function(series, time, dr, eotp = TRUE) {
  # series = the cash flow series
  # time = index over which series is allocated
  # dr = discount rate
  # eotp = end of time period calculation (default is TRUE)
  #   or beginning of time period calculation (FALSE)
  this.npv <- sum(series / (1 + dr) ^ (time - (1 - eotp)))
  return(this.npv)
}

Using this function, we would write the following in R with our currently defined assumptions.

npv <- CalcNPV(cash.flow, year, kDiscountRate / 100)

Remember that this business case deals with a project spanning two decades. The likelihood that the company will use the same discount rate over the entire horizon of concern here is small, as the company might take on or release debt, face changing interest rates on debt, experience changing volatility in the price of similar equity, and so on. More realistically, then, we might consider using an evolving discount rate for projects with long horizons. Treating that evolution as uncertain would require stochastic simulation, which is more of a topic for Chapter 3. If a question arises about whether the discount rate affects whether a project is worth pursuing, we can always test the sensitivity of the accept-or-reject decision by testing the sensitivity of the NPV to the potential evolution of the discount rate.
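
As a rough sketch of how an evolving rate could be applied deterministically, we could replace the single constant with a year-by-year rate vector and compound it with cumprod(). The upward drift of 0.1 percentage point per year assumed here is purely illustrative.

# Hypothetical path: 12%/year drifting up by 0.1 percentage point per year.
dr.path <- (kDiscountRate + 0.1 * (year - 1)) / 100
# The discount factor for year t compounds the period rates for years 1 to t.
evolving.discount.factors <- 1 / cumprod(1 + dr.path)
npv.evolving <- sum(cash.flow * evolving.discount.factors)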

As a bonus, we can also find the internal rate of return (IRR) of the cash flow. The IRR is the discount rate that gives an NPV of $0. However, there is no general closed-form method for finding the IRR, so we use an iterative while routine to find that value.

CalcIRR <- function(series, time, irr0 = 0.1, tolerance = 0.00001) {
  # Calculates the discount rate that produces a 0 NPV
  # of a series of end-of-year cash flows.
  # series = a vector of cash flows
  # time = index over which series is allocated
  # irr0 = the initial guess
  # tolerance = the error around 0 at which the goal seek stops
  my.irr <- c(irr0)
  my.npv <- CalcNPV(series, time, my.irr[1])
  i <- 1
  while (abs(my.npv[i]) > tolerance) {
    if (i == 1) {
      # Take a first exploratory step up or down from the initial guess.
      if (my.npv[i] > 0) {
        my.irr[i + 1] <- my.irr[i] * 2
      } else {
        my.irr[i + 1] <- my.irr[i] / 2
      }
    } else {
      # Uses the secant (finite-difference Newton-Raphson) method of convergence.
      slope.my.npv <- (my.npv[i] - my.npv[i - 1]) /
        (my.irr[i] - my.irr[i - 1])
      my.irr[i + 1] <- my.irr[i] - my.npv[i] / slope.my.npv
    }
    my.npv[i + 1] <- CalcNPV(series, time, my.irr[i + 1])
    i <- i + 1
  }
  return(my.irr[i])
}

Note that the CalcIRR() function calls CalcNPV(). Therefore, we need to make sure to include the CalcIRR() function in our functions file somewhere after the CalcNPV() function.

Be aware that the IRR most likely won’t have a unique solution if the cash flow switches over the $0 line multiple times. Regardless of the several cautions that many people offer for using IRR, it is a good value to understand if the conditions for finding it are satisfactory. Namely, by comparison to the discount rate, it tells us the incremental rate by which value is created (i.e., IRR greater than the discount rate) or destroyed (i.e., IRR less than the discount rate).
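
A quick illustration of that caution, added here as a toy example rather than as part of the case study: the cash flow below crosses the $0 line twice, and both a 10% and a 20% discount rate drive its NPV to zero, so a goal-seeking routine like CalcIRR() could converge on either root depending on the initial guess.

# Two sign changes produce two valid IRRs.
test.flow <- c(-100, 230, -132)
test.time <- 1:3
npv.at.10 <- CalcNPV(test.flow, test.time, 0.10)
npv.at.20 <- CalcNPV(test.flow, test.time, 0.20)
c(npv.at.10, npv.at.20)  # both are effectively zero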

Using the CalcIRR function now in our model, we write

irr <- CalcIRR(cash.flow, year)

and get back 0.1416301.

The IRR, at 14.2%, is about 2.2 percentage points better than our 12% discount rate, which should not surprise us because we have a positive NPV.

Write a Pro Forma Table

To create a pro forma table , we’d like to have all of the elements from the GAAP flow described earlier with the names of the GAAP elements in a column on the left, and the values for each element in a column by year.

The first thing we do is create an array composed of the pro forma elements, remembering to place a negative sign in front of the cost values.

pro.forma.vars <- array( c(sales, revenue, -gsa, gross.profit, -fixed.cost,
  -var.cost, -opex, -depr, op.profit.before.tax, -tax, op.profit.after.tax,
  depr, -capex, cash.flow ), dim = c(kHorizon, 14))

The 14 in the dim parameter refers to how many pro forma elements we put in the array.

We then assign that array to a data frame from the existing calculations.

pro.forma <- data.frame(pro.forma.vars)
This gives the table shown in Figure 2-11.
Figure 2-11. The raw pro forma table

Notice that R has oriented our pro forma with the result values in column form, and it uses a sequentially numbered X.n for the pro forma elements. Obviously, this is a little unsightly and uninformative, so we might want to replace the default column headers with header names we like better. Furthermore, we might prefer to transpose the table so that pro forma elements are on the left as opposed to across the top.

To accomplish this we first define a vector with names we like for columns.

pro.forma.headers <- c("Sales [lbs]", "Revenue", "GS&A", "Gross Profit", "Fixed Cost", "Variable Cost", "OPEX", "-Depreciation", "Operating Profit Before Tax", "Tax", "Operating Profit After Tax", "+Depreciation", "CAPEX", "Cash Flow")

Next, we coerce the default headers to be reassigned to the preferred names.

colnames(pro.forma) <- pro.forma.headers
rownames(pro.forma) <- year

Then, using the transpose function t(), we reassign the pro forma to a transposed form with the column headers now appearing as row headers.

pro.forma <- t(pro.forma)
A few columns of the finished form look like the table in Figure 2-12.
Figure 2-12. The pro forma table formatted now with row names and time column headers
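
Because we created a results subdirectory at the outset, this is also a convenient point to save the finished table for reporting or postprocessing in other tools. The following is a minimal sketch; the file name is arbitrary.

# Write the transposed pro forma to the results folder.
write.csv(pro.forma, file = "~/BizSimWithR/results/deterministic_pro_forma.csv")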

Conduct Deterministic Sensitivity Analysis

Deterministic sensitivity represents the degree of leverage a variable exerts on one of our objective functions, like the NPV. By moving each of our variables across a range of variation, we can observe the relative strength each variable exerts on the objective. However, there is an extreme danger in relying on this simplistic type of sensitivity analysis: it can lead one to think that the deterministic sensitivity is even close to the likely sensitivity. It most likely will not be if it uses the ±x% type of variation that is all too common in business case analysis.

Understand that the primary reason for conducting the deterministic analysis at this point is mainly to confirm the validity of the logic of the model’s code, not necessarily to draw deep conclusions from it just yet. Our primary goal is to ensure that the objective function operates directionally as anticipated.3 For example, if price goes up, and we don’t use conditional logic that makes revenue do anything other than go up, too, then we should see the NPV go up. If the construction duration lasts longer, we should see the NPV go down due to the time value of money. We really should not draw deep conclusions from our analysis until we consider all the deeper effects of uncertainty and risk, which we’ll discuss in Chapter 3.

The best way to conduct deterministic sensitivity analysis would be to use the low and high values associated with the range obtained from SME elicitations for each variable that will eventually be treated as an uncertainty. Although the uncertain sensitivity analysis is a little more involved than the deterministic type, it does build on the basis of the deterministic analysis. In Chapter 3, not only will we use the full range of potential variation supplied by SMEs for all the variables (including those that explore the effects of competition), but each variable will run through its full range of variation as we test each sensitivity point.

Before we set up the sensitivity analysis, recall that after we read the data from the CSV file, we stored the variables’ p50 values in a vector named d.vals. We’d like to do this:
  1. Iterate across the d.vals vector.

  2. Replace each value with a variant based on the range of sensitivity.

  3. Calculate the variant NPV.

  4. Store the variant NPV in an array that has as many rows as variables and as many columns as sensitivity points.

  5. Replace the tested variable with its original value.

In pseudocode, this might look like this:

 # iterate across our list of variable values
 for (i in index of d.vals) {
 # iterate across the list of sensitivity points
     for (k in index of sensitivity points) {
 # replace the variable fixed value with each sensitivity point
       d.vals[i] <- d.vals.sens.points[i, k]
# calculate the NPV
      npv <- {...}
# record the value of the movement of the NPV
      npv.sens[i, k] <- npv
    }
# restore the current variable's base value
    d.vals[i] <- d.vals.original[i]
}

The way we implement this in real R code is to duplicate the original R file. Then we wrap the for loops around the original code. Before we do this, though, we set up parameters that control the looping after the variable values are assigned to the d.vals vector and before the individual assignment of value to each variable.

# Slice the values from data frame d.data.
d.vals <- d.data$p50[1:13]
d.vals.sens.points <- d.data[1:13, 2:4]
sens.point <- 1:3
len.d.vals <- length(d.vals)
len.sens.range <- length(sens.point)
# Sets up an initialized 13x3 array of 0s.
npv.sens <- array(0, c(len.d.vals, len.sens.range))

The for loop code blocks surround our original code that found the NPV.

for (i in 1:len.d.vals) {
  for (k in 1:len.sens.range) {
    d.vals[i] <- d.vals.sens.points[i, k]
        # Assign values to variables.
        ...
        npv.sens[i, k] <- npv
  }
  # Restore the tested variable's base (p50) value.
  d.vals[i] <- d.data$p50[i]
}

The resultant npv.sens matrix looks like this.

           [,1]    [,2]        [,3]
 [1,]  11451979 8214363   3681699.6
 [2,]  13077695 8214363    288329.9
 [3,]  10748136 8214363   3146815.1
 [4,]   6684537 8214363  10858019.4
 [5,]   9938517 8214363   4766053.3
 [6,]  11421355 8214363   1800378.4
 [7,]   8786129 8214363   6949780.0
 [8,]  25703301 8214363 -26763514.8
 [9,]   2357020 8214363  18256935.0
[10,]   9888497 8214363   4866094.4
[11,]  11656823 8214363   1193338.6
[12,]  -7779381 8214363  40201850.8
[13,] -25268320 8214363  75179728.2

To make this table more understandable, we can give it column names equal to the sensitivity points and row names equal to the variable identifiers that drive the NPV, using the following coercions.

var.names <- d.data[1:13, 1]
sens.point.names <- c("p10", "p50", "p90")
rownames(npv.sens) <- var.names
colnames(npv.sens) <- sens.point.names

The result is a table for npv.sens that looks like this.

                         p10     p50         p90
p1.capex            11451979 8214363   3681699.6
p1.dur              13077695 8214363    288329.9
p2.capex            10748136 8214363   3146815.1
p2.dur               6684537 8214363  10858019.4
maint.capex          9938517 8214363   4766053.3
fixed.prod.cost     11421355 8214363   1800378.4
prod.cost.escal      8786129 8214363   6949780.0
var.prod.cost       25703301 8214363 -26763514.8
var.cost.redux       2357020 8214363  18256935.0
gsa.rate             9888497 8214363   4866094.4
time.to.peak.sales  11656823 8214363   1193338.6
mkt.demand          -7779381 8214363  40201850.8
price              -25268320 8214363  75179728.2

The way we interpret this table is that each value is the NPV for our business case when the corresponding variable is varied through its p10, p50, and p90 values independently of the others. For example, if the price were to go to its low (or p10) value, the NPV would shift to -$25.3 million. On the other hand, if the price were to go to its high value, the NPV would shift to $75.2 million. Accordingly, the price, market demand, and variable production costs are the factors that could cause the NPV to go negative or become financially undesirable. If we delude ourselves into thinking there will be no competition, the other variables don’t seem to cause enough change to worry about. We’ll leave the competitive effects to the risk layer of the model, which accounts for the uncertainties associated with them.

The following code produces the sensitivity chart using the barplot() function.

 npv.sens.array <- array(0, c(len.d.vals, 2))
 npv.sens.array[, 1] <- (npv.sens[, 1] - npv.sens[, 2])
 npv.sens.array[, 2] <- (npv.sens[, 3] - npv.sens[, 2])
 rownames(npv.sens.array) <- var.names
 colnames(npv.sens.array) <- sens.point.names[-2]
# Calculates the rank order of the NPV sensitivity based on the
# absolute range caused by a given variable. The npv.sens.array
# is reindexed by this rank ordering for the bar plot.
npv.sens.rank <- order(abs(npv.sens.array[, 1] - npv.sens.array[, 2]),
                       decreasing = FALSE)
ranked.npv.sens.array <- npv.sens.array[npv.sens.rank, ]
ranked.var.names <- var.names[npv.sens.rank]
rownames(ranked.npv.sens.array) <- ranked.var.names
par(mai = c(1, 1.75, 0.5, 0.5))
barplot(t(ranked.npv.sens.array) / 1000,
            main = "Deterministic NPV Sensitivity",
            names.arg = ranked.var.names,
            col = "light blue",
            xlab = "NPV [$000]",
            beside = TRUE,
            horiz = TRUE,
            offset = npv.sens[, 2] / 1000,
            las = 1,
            space = c(-1, 1),
            cex.names = 1,
            tck = 1)
The result of the code graphically reveals the information in our sensitivity table. The inherent pattern in the numbers, which remains difficult to discern in its tabular form, becomes immediately obvious in Figure 2-13. We typically refer to this kind of chart as a tornado chart.
Figure 2-13. The deterministic tornado chart showing the ordered sensitivity of the NPV to the range of inputs used for the uncertainties

Although we can look up the meaning of the parameters for the barplot() function in R Help, I think it’s worth commenting on the last few parameters in the function as used here.
  • The beside = TRUE parameter sets the bar plot columns adjacent to each other as opposed to being stacked (i.e., beside = FALSE).

  • horiz = TRUE draws the bars horizontally, setting the tornado chart in its vertical orientation, because the data frame for npv.sens.array has the names of the variables in the position that barplot() treats as the x axis.

  • offset moves the baseline of the bar plot, which is 0 by default, to some desired bias position. In our tornado chart, we want the baseline to be the NPV of the middle value of the sensitivity points.

  • las = 1 sets the names of the variables to a horizontal orientation.

  • space = c(-1,1) makes the bars in a row overlap. Namely, the -1 moves the upper bar in a row down by the distance of its width.

  • cex.names = 1 scales the height of the names of the variables as a multiple of their base height. I usually leave this in place so that I can quickly modify the value to test the visual aesthetics of changing their size.

Finally, the par(mai = c(1, 1.75, 0.5, 0.5)) statement, which we placed before the barplot() function, sets the margins of the graphics rendering window (the second value is the left margin in inches) so that the long variable names fit. You might want to adjust that second parameter, here set to 1.75, to account for different variable names if you choose to expand the complexity of this model.
