An example of making a value decision

To illustrate, we will walk through a business situation and then some R code that simulates a cost-benefit curve for that decision. The code uses a fitted predictive model to calculate the net savings (or lack thereof) at each cutoff and so generate a cost curve. The cost curve can then inform the business decision on what proportion of units with predicted failures should receive a proactive replacement.

Imagine you work for a company that builds diesel-powered generators. Each generator has a coolant control valve that normally lasts for 4,000 hours of operation, at which point it is replaced as planned. From analysis, your company has realized that the generators built two years prior are experiencing earlier-than-expected failures of the valve.

When the valve fails, the engine overheats and several other components are damaged. The cost of failure, including labor rates for repair personnel and the cost to the customer for downtime, is an average of $1,000 USD. The cost of a proactive replacement of the valve is $253 USD.

Should you replace all coolant valves in the population? It depends on how high a failure rate is expected. In this case, about 10% of the current non-failed units are expected to fail before the scheduled replacement. Just as importantly, it depends on how well you can predict the failures.
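Before bringing in a model, the two extreme policies (replace nothing or replace everything) can be compared with simple arithmetic. The sketch below uses the figures from the scenario; the variable names are illustrative, and it assumes that a proactive replacement always prevents the failure.

```r
# Figures from the scenario above; variable names are illustrative
failure_rate   <- 0.10   # share of units expected to fail early
failure_cost   <- 1000   # average cost of an in-service failure (USD)
proactive_cost <- 253    # cost of a proactive valve replacement (USD)

# Expected cost per unit if no valve is replaced proactively
cost_do_nothing <- failure_rate * failure_cost

# Cost per unit if every valve is replaced
# (assumes a proactive replacement always prevents the failure)
cost_replace_all <- proactive_cost

cat("Do nothing:  $", cost_do_nothing, " per unit\n", sep = "")
cat("Replace all: $", cost_replace_all, " per unit\n", sep = "")
```

Under these assumptions, blanket replacement costs more per unit than doing nothing, so any savings have to come from targeting the units most likely to fail.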

The following R code simulates this situation and uses a simple predictive model (logistic regression) to estimate a cost curve. The model has an AUC close to 0.75. The exact value will vary each time you run the code, since the dataset is randomly simulated:

#make sure all needed packages are installed
if(!require(caret)){
  install.packages("caret")
}
if(!require(pROC)){
  install.packages("pROC")
}
if(!require(dplyr)){
  install.packages("dplyr")
}
if(!require(data.table)){
  install.packages("data.table")
}

#Load required libraries
library(caret)
library(pROC)
library(dplyr)
library(data.table)

#Generate sample data
simdata = function(N=1000) {
  #simulate 4 features
  X = data.frame(replicate(4,rnorm(N)))
  #create a hidden data structure to learn
  hidden = X[,1]^2+sin(X[,2]) + rnorm(N)*1
  #10% TRUE, 90% FALSE
  rare.class.probability = 0.1
  #simulate the true classification values
  y.class = factor(hidden<quantile(hidden,c(rare.class.probability)))
  return(data.frame(X,Class=y.class))
}

#simulate a population of 50,000 units
model_data = simdata(N=50000)

#train a logistic regression model on the simulated data
training <- createDataPartition(model_data$Class, p = 0.6, list=FALSE)
trainData <- model_data[training,]
testData <- model_data[-training,]
glmModel <- glm(Class~ . , data=trainData, family=binomial)
testData$predicted <- predict(glmModel, newdata=testData, type="response")

#calculate AUC
roc.glmModel <- pROC::roc(testData$Class, testData$predicted)
auc.glmModel <- pROC::auc(roc.glmModel)
print(auc.glmModel)

#Pull together test data and predictions
simModel <- data.frame(trueClass = testData$Class,
                    predictedClass = testData$predicted)

# Reorder rows and columns
simModel <- simModel[order(simModel$predictedClass, decreasing = TRUE), ]
simModel <- select(simModel, trueClass, predictedClass)
simModel$rank <- 1:nrow(simModel)

#Assign costs for failures and proactive repairs
proactive_repair_cost <- 253   # Cost of proactively repairing a part
failure_repair_cost  <- 1000   # Cost of a failure of the part (include all costs such as lost production, etc not just the repair cost)

# Define each predicted/actual combination
fp.cost <- proactive_repair_cost # The part was predicted to fail but did not (False Positive)
fn.cost <- failure_repair_cost  # The part was not predicted to fail and it did (False Negative)
tp.cost <- (proactive_repair_cost - failure_repair_cost) # The part was predicted to fail and it did (True Positive). This will be negative for a savings.
tn.cost <- 0.0                     # The part was not predicted to fail and it did not (True Negative)

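#incorporate probability of future failure
#prob_failure is not defined elsewhere in this listing; 0.10 is an assumed value taken from the ~10% expected failure rate in the scenario
prob_failure <- 0.10
simModel$future_failure_prob <- prob_failure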

#Function to assign costs for each instance (the prob argument is accepted but not used in the cost calculation)
assignCost <- function(pred, outcome, tn.cost, fn.cost, fp.cost, tp.cost, prob){
 cost <- ifelse(pred == 0 & outcome == FALSE, tn.cost, # No cost since no action was taken and no failure
                    ifelse(pred == 0 & outcome == TRUE, fn.cost, # The cost of no action and a repair resulted
                           ifelse(pred == 1 & outcome == FALSE, fp.cost, # The cost of proactive repair which was not needed
                                  ifelse(pred == 1 & outcome == TRUE, tp.cost, 999999999))))   # The cost of proactive repair which avoided a failure
 return(cost)
}

# Initialize list to hold final output
master <- vector(mode = "list", length = 101) # one slot per threshold from 1.00 down to 0.00

#use the simulated model. In practice, this code can be adapted to compare multiple models
test_model <- simModel

# Create a loop to step through a dynamic threshold (starting at 1.0 [only the top percentile is repaired] down to 0.0 [all proactive repairs])
threshold <- 1.00
for (i in 1:101) {
 #Add predicted class with percentile ranking
 test_model$prob_ntile <- ntile(test_model$predictedClass, 100) / 100
 # Dynamically determine if proactive repair would apply based on incrementing threshold
 test_model$glm_failure <- ifelse(test_model$prob_ntile >= threshold, 1, 0)
 test_model$threshold <- threshold

 # Compare to actual outcome to assign costs
 test_model$glm_impact <- assignCost(test_model$glm_failure, test_model$trueClass, tn.cost, fn.cost, fp.cost, tp.cost, test_model$future_failure_prob)

 # Compute cost for not doing any proactive repairs
 test_model$nochange_impact <- ifelse(test_model$trueClass == TRUE, fn.cost, tn.cost) # *test_model$future_failure_prob)

 # Running sum to produce the overall impact
 test_model$glm_cumul_impact <- cumsum(test_model$glm_impact) / nrow(test_model)
 test_model$nochange_cumul_impact <- cumsum(test_model$nochange_impact) / nrow(test_model)

 # Count the # of classified failures
 test_model$glm_failure_ct <- cumsum(test_model$glm_failure)

 # Create new object to house the one row per iteration output for the final plot
 master[[i]] <- test_model[nrow(test_model),]

 # Reduce the threshold by 1% and repeat to calculate new value
 threshold <- round(threshold - 0.01, 2) # round to avoid floating-point drift in the >= comparison
}

finalOutput <- rbindlist(master)
finalOutput <- subset(finalOutput,
                     select = c(threshold,
                               glm_cumul_impact,
                               glm_failure_ct,
                               nochange_cumul_impact)
)

# Set baseline to costs of not doing any proactive repairs
baseline <- finalOutput$nochange_cumul_impact

# Plot the cost curve
par(mfrow = c(2,1))
plot(row(finalOutput)[,1],
    finalOutput$glm_cumul_impact,
    type = "l",
    lwd = 3,
    main = paste("Net Costs: Proactive Repair Cost of $", proactive_repair_cost, ", Failure cost $", failure_repair_cost, sep = ""),
    ylim = c(min(finalOutput$glm_cumul_impact) - 100,
            max(finalOutput$glm_cumul_impact) + 100),
    xlab = "Percent of Population",
    ylab = "Net Cost ($) / Unit")

# Plot the cost difference of proactive repair program and a 'do nothing' approach
plot(row(finalOutput)[,1],
    baseline - finalOutput$glm_cumul_impact,
    type = "l",
    lwd = 3,
    col = "black",
    main = paste("Savings: Proactive Repair Cost of $", proactive_repair_cost, ", Failure cost $", failure_repair_cost,sep = ""),
    ylim = c(min(baseline - finalOutput$glm_cumul_impact) - 100,
             max(baseline - finalOutput$glm_cumul_impact) + 100),
    xlab = "Percent of Population",
    ylab = "Savings ($) / Unit")
abline(h=0,col="gray")

As the resulting net cost and savings curves show, based on the model's predictions, savings peak when the proactive repair program covers roughly the top 30% of units ranked by predicted failure probability. Savings decrease beyond that point, although you would still save money when replacing up to about 75% of the population. Past that point, you should expect to spend more than you save. The following set of charts is the output from the preceding code:

Cost and savings curves for the proactive repair $253 and failure cost at $1,000 scenario
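Reading the peak and the break-even point off a savings curve can also be done programmatically. The following is a minimal sketch, not part of the original listing: `summarize_savings` is a hypothetical helper, and the input vector is made up purely for illustration (it is not output from the simulation). Element i of the vector is assumed to hold the per-unit savings at the i-th cutoff.

```r
# Hypothetical helper: summarize a savings curve where element i is the
# per-unit savings at the i-th cutoff (cutoffs ordered from few to many repairs)
summarize_savings <- function(savings) {
  best <- which.max(savings)          # cutoff with the largest savings
  profitable <- which(savings > 0)    # cutoffs where the program saves money
  list(
    best_index      = best,
    best_savings    = savings[best],
    last_profitable = if (length(profitable) > 0) max(profitable) else NA_integer_
  )
}

# Made-up values for illustration only
toy_savings <- c(5, 12, 18, 15, 9, 2, -4, -11)

result <- summarize_savings(toy_savings)
print(result)
```

With the simulation's output, the same helper could in principle be applied to the savings series plotted in the second chart (the baseline minus the cumulative model impact), where each index corresponds to roughly one more percent of the population being repaired.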

Note how this changes in the following charts when the failure cost drops to $300 USD. At no point do you save money, because the cost of the proactive repairs always outweighs the reduced failure cost they avoid. This does not mean you should not do a proactive repair; you may still want to do so in order to satisfy your customers. Even in such a case, this cost curve method can help in decisions on how much you are willing to spend to address the problem. You can rerun the code with proactive_repair_cost set to 253 and failure_repair_cost set to 300 to generate the following charts:

Cost and savings curves for the proactive repair $253 and failure cost at $300 scenario

Finally, notice how the savings curve changes when the failure cost moves to $5,000. The spread between the proactive repair cost and the failure cost largely determines when a proactive repair makes business sense. You can rerun the code with proactive_repair_cost set to 253 and failure_repair_cost set to 5000 to generate the following charts:

Cost and savings curves for the proactive repair $253 and failure cost at $5,000 scenario
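The effect of the cost spread across the three scenarios can be summarized with a break-even calculation. In the cost model used in the code, repairing a unit that would have failed saves the failure cost minus the repair cost, while repairing a healthy unit costs the repair cost. The program therefore breaks even when the share of repaired units that would actually have failed equals the repair cost divided by the failure cost. The sketch below computes that share; the variable names `failure_costs`, `break_even`, and `scenarios` are illustrative.

```r
proactive_repair_cost <- 253
failure_costs <- c(300, 1000, 5000)   # the three scenarios discussed above

# Break-even condition per repaired unit:
#   p * failure_cost = proactive_repair_cost
# where p is the share of repaired units that would otherwise have failed
break_even <- proactive_repair_cost / failure_costs

scenarios <- data.frame(
  failure_cost         = failure_costs,
  break_even_precision = round(break_even, 3)
)
print(scenarios)
```

At a $300 failure cost, more than about 84% of the repaired units would need to be true failures, which is far more than a model with an AUC near 0.75 can be expected to deliver on a 10% failure rate. At $5,000, the break-even share is about 5%, which is below the assumed 10% failure rate itself.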

Ultimately, the decision is a business case based on the expected costs and benefits. ML modeling can help optimize savings under the right conditions. Utilizing cost curves helps to determine the expected costs and savings of proactive replacements.
