1 Introduction and objectives

This series of books aspires to be a practical reference guide to a range of numerical techniques and models that an estimator might wish to consider in analysing historical data in order to forecast the future. Many of the examples and techniques discussed relate to cost estimating in some way, as the term ‘estimator’ is frequently used synonymously with ‘cost estimator’. However, many of these numerical or quantitative techniques can be applied in areas other than cost where estimating is required, such as scheduling, or in forecasting a physical attribute such as weight, length or some other technical parameter.

This volume is a little bit of a mixed bag, but essentially consists of modelling with some known statistical distributions, including a liberal dose of estimating with random numbers. (Yes, that’s what a lot of project managers say.)

1.1 Why write this book? Who might find it useful? Why five volumes?

1.1.1 Why write this series? Who might find it useful?

The intended audience is quite broad, ranging from the relative ‘novice’ who is embarking on a career as a professional estimator, to those already seasoned in the science and dark arts of estimating. Somewhere between these two extremes of experience, there will be those who just want to know what tips and techniques they can use, and those who really want to understand the theory of why some things work and other things don’t. As a consequence, the style of this book aims to attract and provide signposts for both (and all those in between).

This series of books is not just aimed at cost estimators (although there is a natural bias there). There may be some useful tips and techniques for other number jugglers, among whom we might include professionals such as engineers or accountants who estimate but do not consider themselves to be estimators per se. Also, in using the term ‘estimator’, we should not constrain our thinking to those whose estimates are expressed in a currency of cost or hours, but include those who estimate in different ‘currencies’, such as time, physical dimensions or some other technical characteristics.

Finally, writing this series of guides has been a personal voyage of discovery, cathartic even, reminding me of some of the things I once knew but seem to have forgotten or mislaid somewhere along the way. In researching the content, I have also discovered many things that I didn’t know and now wish I had known years ago when I started my career, having fallen into it rather than chosen it (does that sound familiar to other estimators?)

1.1.2 Why five volumes?

There are two reasons:

Size … there was too much material for the single printed volume that was originally planned … and that might have made it rather heavy reading, so to speak. That brings out another point: the attempts at humour will remain at around that level throughout.

Cost … even if it had been produced as a single volume (printed or electronic), the cost may have proved to be prohibitive without a mortgage, and the project would then have been unviable.

So, a decision was made to offer it as a set of five volumes, such that each volume could be purchased and read independently of the others. There is cross-referencing between the volumes, just in case any of us want to dig a little deeper, but by and large the five volumes can be read independently of each other. There is a common Glossary of Terms across the five volumes which covers terminology that is defined and assumed throughout. This was considered essential in setting the right context, as there are many different interpretations of some words in common use in estimating circles. Regrettably, there is a lack of common understanding of what these terms mean, so the glossary clarifies what is meant in this series of volumes.

1.2 Features you'll find in this book and others in this series

People’s appetites for practical knowledge vary from the ‘How do I?’ to the ‘Why does that work?’ This book will attempt to cater for all tastes.

Many textbooks are written quite formally, using the third person, which can give a feeling of remoteness. In this book, the style used is the first person plural, ‘we’ and ‘us’. Hopefully this will give the sense that this is a journey on which we are embarking together, and that you, the reader, are not alone, especially when it gets to the tricky bits! On that point, let’s look at some of the features in this series of Working Guides to Estimating & Forecasting.

1.2.1 Chapter context

Perhaps unsurprisingly, each chapter commences with a very short dialogue about what we are trying to achieve or the purpose of that chapter, and sometimes we might include an outline of a scenario or problem we are trying to address.

1.2.2 The lighter side (humour)

There are some who think that an estimator with a sense of humour is an oxymoron. (Not true, it’s what keeps us sane.) Experience gleaned from developing and delivering training for estimators has highlighted that people learn better if they are enjoying themselves. We will discover little ‘asides’ here and there, sometimes at random but usually in italics, to try and keep the attention levels up. (You’re not falling asleep already, are you?) In other cases the humour, sometimes visual, is used as an aide-memoire. Those of us who were hoping for a high level of razor-sharp wit should prepare ourselves for a level of disappointment!

1.2.3 Quotations

Here we take the old adage ‘A Word to the Wise …’ and give it a slight twist so that we can draw on the wisdom of those far wiser and more experienced in life than I. We call these little interjections ‘A word (or two) from the wise’. You will spot them easily by the rounded shadow boxes. Applying the wisdom of Confucius to estimating, we should explore alternative solutions to achieving our goals rather than automatically de-scoping what we are trying to achieve. The corollary to this is that we should not artificially change our estimate to fit the solution if the solution is unaffordable.

A word (or two) from the wise?

'When it is obvious that the goals cannot be reached, don't adjust the goals, adjust the action steps.'

Confucius
Chinese Philosopher
551-479 BC

1.2.4 Definitions

Estimating is not just about numbers but requires the context of an estimate to be expressed in words. There are some words that have very precise meanings; there are others that mean different things to different people (estimators often fall into this latter group). To avoid confusion, we proffer definitions of key words and phrases so that we have a common understanding within the confines of this series of working guides. Where possible we have highlighted where we think that words may be interpreted differently in some sectors, which, regrettably, is all too often the case. I am under no illusion that back in the safety of the real world we will continue to refer to them as they are understood in those sectors, areas and environments.

As the title suggests, this volume is about random models with or without random events, so let’s look at what we mean by the term ‘random’.

Definition 1.1 Random

Random is an adjective that relates to something that:

  1) occurs by chance, without method, conscious decision or intervention; or
  2) behaves in a manner that is unexpected, unusual or odd.

Which do we mean in the context of this volume? Answer: both.

I dare say that some of the definitions given may be controversial with some of us. However, the important point is that they are discussed, considered and understood in the context of this book, so that everyone accessing these books has the same interpretation; we don’t have to agree with the ones given here forevermore (what estimator ever did that?). The key point here is that we are able to appreciate that not everyone has the same interpretation of these terms. In some cases, we will defer to the Oxford English Dictionary (Stevenson & Waite, 2011) as the arbiter.

1.2.5 Discussions and explanations with a mathematical slant for Formula-philes

These sections are where we define the formulae that underpin many of the techniques in this book. They are boxed off with a header to warn off the faint-hearted. We will, within reason, provide justification for the definitions and techniques used. For example:

For the Formula-philes: Exponential inter-arrivals are equivalent to Poisson arrivals

Consider a Poisson Distribution and an Exponential Distribution with a common scale parameter of λ,

… which shows that the Average Inter-Arrival Time in a Time Period is given by the reciprocal of the Average Number of Customer Arrivals in that Time Period
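To give a flavour of the sort of justification these boxed sections contain, here is a minimal sketch of that standard result in my own notation (with λ as the average number of customer arrivals per unit time):

```latex
% Poisson arrivals imply Exponential inter-arrival times
% N(t) = number of arrivals in an interval of length t, Poisson distributed with mean \lambda t
\Pr\left(N(t) = k\right) = \frac{(\lambda t)^{k} e^{-\lambda t}}{k!}, \qquad k = 0, 1, 2, \dots
% T = time from one arrival to the next; no arrival has occurred by time t exactly when N(t) = 0
\Pr(T > t) = \Pr\left(N(t) = 0\right) = e^{-\lambda t}
% ... which is the Exponential Distribution with scale parameter \lambda, whose mean is
E(T) = \frac{1}{\lambda}
```

In other words, the Average Inter-Arrival Time is the reciprocal of the Average Number of Customer Arrivals per Time Period, as stated above.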

1.2.6 Discussions and explanations without a mathematical slant for Formula-phobes

For those less geeky than me, who don’t get a buzz from knowing why a formula works (yes, it’s true, there are some estimators like that), there are the Formula-phobe sections, with a suitable header to give you more of a warm comforting feeling. These are usually wordier, with pictorial justifications and specific examples where that helps understanding and acceptance.

For the Formula-phobes: One-way logic is like a dead lobster

An analogy I remember coming across as a fledgling teenage mathematician, but whose creator I sadly can no longer recall, relates to the fate of lobsters. It has stuck with me, and I recreate it here with my respects to whoever taught it to me.

Sad though it may be to talk of the untimely death of crustaceans, the truth is that all boiled lobsters are dead! However, we cannot say that the reverse is true: not all dead lobsters have been boiled!

One-way logic reflects a many-to-one relationship: there are many circumstances that lead to a single outcome, but from that outcome we cannot say which circumstance led to it.

Please note that no real lobsters were harmed in the making of this analogy.

1.2.7 Caveat augur

Based on the fairly well-known warning to shoppers, ‘Caveat Emptor’ (let the buyer beware), these call-out sections provide warnings to all soothsayers (or estimators) who try to predict the future that in some circumstances we may encounter difficulties in using some of the techniques. They should not be considered foolproof, or a panacea to cure all ills.

Caveat augur

These are warnings to the estimator that there are certain limitations, pitfalls or tripwires in the use or interpretation of some of the techniques. We cannot profess to cover every particular aspect, but where they come to mind these gentle warnings are shared.

1.2.8 Worked examples

There is a proliferation of examples of the numerical techniques in action. These are often tabular in form to allow us to reproduce the examples in Microsoft Excel (other spreadsheet tools are available). Graphical or pictorial figures are also used frequently to draw attention to particular points. The book advocates that we should ‘always draw a picture before and after analysing data’. In some cases, we show situations where a particular technique is unsuitable (i.e. it doesn’t work) and try to explain why. Sometimes we learn from our mistakes; nothing and no-one is infallible in the wondrous world of estimating. The tabular examples follow the spirit and intent of Best Practice Spreadsheet Modelling (albeit limited to black and white in the absence of affordable colour printing), the principles and virtues of which are summarised in Volume I Chapter 3.

1.2.9 Useful Microsoft Excel functions and facilities

Embedded in many of the examples are some of the many useful special functions and facilities found within Microsoft Excel (often, but not always, the estimator’s toolset of choice because of its flexibility and accessibility). Together we explore how we can exploit these functions and features in using the techniques described in this book.

We will always provide the full syntax, as we recommend not allowing Microsoft Excel to fall back on its default settings for parameters that have been left unspecified. This avoids unexpected and unintended results in modelling and improves transparency, an important concept that we discussed in Volume I Chapter 3.

Example:

The SUMIF(range, criteria, sum_range) function sums the values in sum_range for which the corresponding cells in range satisfy the criteria, and excludes from the sum those values where the condition is not met. Note that sum_range is an optional parameter of the function in Excel; if it is not specified, Excel sums the qualifying values in range itself. We recommend specifying it even when it is the same as range. This is not because we don’t trust Excel, but because a person interpreting our model may not be aware that a default has been assumed, and we will not always be by their side to explain it.
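By way of a hypothetical illustration (the cell references are invented purely for this example), suppose column A holds cost categories and column B holds their values:

```
=SUMIF(A2:A10, "Labour", B2:B10)    sums the values in B2:B10 where the corresponding
                                    cell in A2:A10 contains "Labour"

=SUMIF(B2:B10, ">100")              relies on the default: with no sum_range given,
                                    Excel sums the qualifying values in B2:B10 itself

=SUMIF(B2:B10, ">100", B2:B10)      gives the same result, but makes the default explicit
```

The last two formulae return the same total; the third simply makes the assumption visible to the next person who opens the model.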

1.2.10 References to authoritative sources

Every estimate requires a documented Basis of Estimate. In common with that principle, which we discussed in Volume I Chapter 3, every chapter provides references for researchers, technical authors, writers and those of a curious disposition, indicating where an original, more authoritative or more detailed source of information can be found on particular aspects or topics.

Note that an Estimate without a Basis of Estimate becomes a random number in the future. On the same basis, a claim made without reference to an authoritative source, prior research or empirical observation is little more than a spurious, unsubstantiated comment.

1.2.11 Chapter reviews

Perhaps not unexpectedly, each chapter summarises the key topics that we will have discussed on our journey. Where appropriate we may draw a conclusion or two just to bring things together, or to draw out a key message that may run throughout the chapter.

1.3 Overview of chapters in this volume

Volume V begins in earnest with a discussion in Chapter 2 on how we can model research and development, concept demonstration, or design and development tasks when we only know the objectives of the development, but not how we are going to achieve them in detail. One possible solution is to explore the use of a Norden-Rayleigh Curve, which describes a recurring pattern of resource and cost consumption over time that has been shown empirically to follow the natural pattern of problem discovery and resolution over the life of such ‘solution development’ projects. However, using a Norden-Rayleigh Curve is not without its pitfalls and is not a panacea for this type of work. Consequently, we will explore some alternative options in the shape (as it were) of Beta, PERT-Beta, Triangular and Weibull Distributions.
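For those who like to see the shape in symbols, the form usually quoted for the Norden-Rayleigh Curve (the notation here is mine, not necessarily that used in Chapter 2) expresses the cumulative effort or cost E(t) in terms of the total development effort K and a scale parameter α:

```latex
% Norden-Rayleigh profile (standard two-parameter form)
E(t) = K\left(1 - e^{-\alpha t^{2}}\right)                 % cumulative effort consumed by time t
\frac{dE}{dt} = 2 K \alpha\, t\, e^{-\alpha t^{2}}         % effort per unit time
```

The effort-per-unit-time curve rises to a peak and then tails away gradually, which is the empirical ‘problem discovery and resolution’ pattern referred to above.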

Chapter 3 is based fundamentally on the need to derive 3-Point Estimates; we discuss how we can use Monte Carlo Simulation to model and analyse Risk, Opportunity and Uncertainty variation. As Monte Carlo Simulation software is generally proprietary in nature, and is often only partially understood by its users, we discuss some of the ‘do’s and don’ts’ in the context of Risk, Opportunity and Uncertainty Modelling, not least of which is how and when to apply correlation between apparently random events! However, Monte Carlo Simulation is not a technique that is the sole preserve of Risk Managers and the like; it can also be used to test other assumptions in a more general modelling and estimating sense. We conclude the chapter with the warning that the use of a Bottom-up Monte Carlo Simulation for Risk, Opportunity and Uncertainty analysis is fundamentally optimistic, a weakness that is often overlooked.
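One statistical reason why correlation matters so much here can be sketched briefly (my notation, not the book’s): the spread of a simulated total depends on how the individual task costs co-vary.

```latex
% Variance of a total built bottom-up from task costs X_1, ..., X_n
\operatorname{Var}\left(\sum_{i=1}^{n} X_i\right)
  = \sum_{i=1}^{n} \operatorname{Var}(X_i) + 2 \sum_{i<j} \operatorname{Cov}(X_i, X_j)
```

If every task is treated as independent, the covariance terms vanish and the simulated total clusters artificially tightly around its mean; introducing positive correlation widens the spread and pushes the upper confidence levels outwards, which is one reason an uncorrelated Bottom-up simulation tends towards optimism.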

To overcome this weakness, we take a more holistic view of Risk, Opportunity and Uncertainty in Chapter 4, adopting a top-down perspective and exploring the use of the ‘Marching Army’ technique and Uplift Factors to give us a more pessimistic view of our estimate. We offer up the Slipping and Sliding Technique as a simple way of combining the Top-down and Bottom-up perspectives to get a more realistic view of the likely range of outcomes. Finally, we throw the Estimate and Schedule Maturity Assessments that we introduced in Volume I Chapter 3 back into the mix, as a means of gauging the maturity of our final estimate.

In Chapter 5 we warn about the dangers of the much-used Risk Factoring or Expected Value Technique. We discuss and demonstrate how it is often innocently but ignorantly abused to quantify risk contingency budgets, and we discuss better ways of doing this using a variation on the same simple concept.
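As a simple illustration of the arithmetic involved (the numbers are mine, invented for the example), the factored or Expected Value of a set of discrete risks is the sum of probability times impact:

```latex
% Expected Value (Risk Factoring) of n discrete risks with probabilities p_i and cost impacts c_i
EV = \sum_{i=1}^{n} p_i \, c_i
% e.g. a single risk with p = 0.1 and c = 1,000,000 contributes 100,000 to the factored total,
% yet the outcome that actually occurs is either 0 or 1,000,000, never 100,000
```

The gap between that factored figure and the outcomes that can actually occur is one reason why such values can mislead when used on their own as contingency budgets.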

The need for planning resurfaces in Chapter 6 with an introduction to Critical Path Analysis, where we discuss how this can be linked with Monte Carlo Simulation and Correlation to perform Schedule Risk Analysis. This chapter concentrates on how we can determine the Critical Path using a simple technique based on Binary Numbers.
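The binary-number technique itself is left to Chapter 6; as general background (this is the conventional Critical Path Method forward pass, not the book’s specific technique), the values involved can be written as:

```latex
% Forward pass for an activity a with duration D_a and set of predecessors P(a)
ES_a = \max_{p \in P(a)} EF_p \quad (\text{or } 0 \text{ if } a \text{ has no predecessors}),
\qquad EF_a = ES_a + D_a
% After the matching backward pass, the Critical Path is the chain of activities with zero float,
% i.e. those whose Late Start equals their Early Start
```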

In the last chapter of this final volume we discuss Queueing Theory (it just had to be last, didn’t it?) and how we might use it in support of achievable solutions where we have random arisings (such as spares or repairs) against which we need to develop a viable estimate. We discuss what we mean by a Memoryless System and how with Monte Carlo Simulation we can generate meaningful solutions or options.
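To give a flavour of why ‘memoryless’ matters, consider the standard single-server (M/M/1) results, which may or may not be the exact cases developed in the chapter (the notation is mine): with exponential inter-arrival times, the chance of waiting a further time t does not depend on how long we have already waited, and the steady-state behaviour follows from just the arrival rate λ and the service rate μ.

```latex
% Memoryless property of the Exponential Distribution
\Pr(T > s + t \mid T > s) = \Pr(T > t)
% Standard steady-state M/M/1 results (requires \lambda < \mu)
\rho = \frac{\lambda}{\mu}, \qquad L = \frac{\rho}{1-\rho}, \qquad W = \frac{1}{\mu - \lambda}
% \rho = server utilisation, L = average number in the system, W = average time in the system
```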

1.4 Elsewhere in the ‘Working Guide to Estimating & Forecasting’ series

Whilst every effort has been made to keep each volume independent of others in the series, this would have been impossible without some major duplication and overlap.

Figure 1.1 Principal Flow of Prior Topic Knowledge Between Volumes

Whilst there is quite a lot of cross-referral to other volumes, this is largely for those of us who want to explore particular topics in more depth. There are some more fundamental potential pre-requisites. For example, in relation to the evaluation of Risks, Opportunities and Uncertainties discussed in this volume, the relevance of 3-Point Estimates was covered in Volume I, and a more thorough discussion of the probability and statistics that underpin Monte Carlo Simulation can be found in Volume II.

Figure 1.1 indicates the principal linkages or flows across the five volumes, not all of them.

1.4.1 Volume I: Principles, Process and Practice of Professional Number Juggling

This volume clarifies the differences in what we mean by an Estimating Approach, Method or Technique, and how these can be incorporated into a closed-loop Estimating Process. We discuss the importance of TRACEability and the need for a well-documented Basis of Estimate that differentiates an estimate from what would appear in the future to be little more than a random number. Closely associated with a Basis of Estimate is the concept of an Estimate Maturity Assessment, which in effect gives us a Health Warning on the robustness of the estimate that has been developed. IRiS is a companion tool that allows us to assess the inherent risk in our estimating spreadsheets and models if we fail to follow good practice principles in designing and compiling those spreadsheets or models.

An underlying theme we introduce here is the difference between accuracy and precision within the estimate, and the need to check how sensitive our estimates are to changes in assumptions. We go on to discuss how we can use factors, rates and ratios in support of Data Normalisation (to allow like-for-like comparisons to be made) and in developing simple estimates using an Analogical Method.

All estimating basically requires some degree of quantitative analysis, but we will find that there will be times when a more qualitative judgement may be required to arrive at a numerical value. However, in the spirit of TRACEability, we should strive to express or record such subjective judgements in a more quantitative way. To aid this we discuss a few pseudo-quantitative techniques of this nature.

Finally, to round off this volume, we will explore how we might use Benford’s Law, normally used in fraud detection, to highlight potential anomalies in third party inputs to our Estimating Process.
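For reference, Benford’s Law itself is compact enough to quote here in its standard form: in many naturally occurring data sets the expected proportion of values with leading digit d is

```latex
% Benford's Law: expected frequency of leading digit d
\Pr(d) = \log_{10}\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \dots, 9
% so a leading 1 is expected roughly 30.1% of the time, a leading 9 only around 4.6%
```

Marked departures from this pattern in submitted figures can be a prompt to ask further questions, which is the spirit in which it is used in Volume I.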

1.4.2 Volume II: Probability, Statistics and Other Frightening Stuff

Volume II is focused on the Statistical concepts that are exploited throughout Volumes III to V (and to a lesser extent in Volume I). It is not always necessary to read the associated detail in this volume if you are happy just to accept and use the various concepts, principles and conclusions. However, a general understanding is always better than blind acceptance, and this volume is geared around making these statistical topics more accessible and understandable to those who wish to venture into the darker art and science of estimating. There are also some useful ‘Rules of Thumb’ that are not used directly in the other volumes but may still be helpful to estimators and other number jugglers.

We explore the differences between the different statistics that are collectively referred to as ‘Measures of Central Tendency’ and why they are referred to as such. In this discussion, we consider four different types of Mean (Arithmetic, Geometric, Harmonic and Quadratic) in addition to Modes and the ‘one and only’ Median, all of which are, or might be, used by estimators, sometimes without our conscious awareness.
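For completeness, the four Means referred to are defined as follows (standard definitions, for n values x₁ to xₙ):

```latex
\text{Arithmetic Mean:}\quad \bar{x}_A = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad
\text{Geometric Mean:}\quad \bar{x}_G = \left(\prod_{i=1}^{n} x_i\right)^{1/n}
\text{Harmonic Mean:}\quad \bar{x}_H = \frac{n}{\sum_{i=1}^{n} 1/x_i}
\qquad
\text{Quadratic Mean (RMS):}\quad \bar{x}_Q = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^{2}}
% the Geometric and Harmonic Means assume positive values
```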

However, the Measures of Central Tendency only tell us half the story about our data; we should really understand the extent of scatter around the Measures of Central Tendency that we use, as this gives us valuable insight into the sensitivity and robustness of our estimate based on the chosen ‘central value’. This is where the ‘Measures of Dispersion and Shape’ come into their own. These measures include various ways of quantifying the ‘average’ deviation around the Arithmetic Mean or Median, as well as how we might recognise ‘skewness’ (where data is asymmetric or lop-sided in its distribution), and where our data exhibits high levels of Excess Kurtosis, which measures how spikey our data scatter is relative to the absolute range of scatter. The greater the Excess Kurtosis, and the more symmetrical our data, the greater the confidence we should have in the Measures of Central Tendency being representative of the majority of our data. Talking of ‘confidence’, this leads us to explore Confidence Intervals and Quantiles, which are frequently used to describe the robustness of an estimate in quantitative terms.

Extending this further, we also explore several probability distributions that may describe more completely the potential variation in the data underpinning our estimates. We consider a number of key properties of each that we can exploit, often as ‘Rules of Thumb’ that are accurate enough without being precise.

Estimating in principle is based on the concept of Correlation, which expresses the extent to which the value of one ‘thing’ varies with another, the value of which we know or have assumed. This volume considers how we can measure the degree of correlation, what it means and, importantly, what it does not mean! It also looks at the problem of a system of variables that are partially correlated, and how we might impose that relationship in a multi-variate model.

Estimating is not just about making calculations; it requires judgement, not least of which is whether an estimating relationship is credible and supportable, or ‘statistically significant’. We discuss the use of Hypothesis Testing to support an informed decision when making these judgement calls. This approach leads naturally on to Tests for ‘Outliers’. Knowing when and where we can safely and legitimately exclude what looks like unrepresentative or rogue data from our thoughts is always a tricky dilemma for estimators. We wrap up this volume by exploring several Statistical Tests that allow us to ‘Out the Outliers’; be warned, however, that these various Outlier tests do not always give us the same advice!

1.4.3 Volume III: Best Fit Lines and Curves, and Some Mathe-Magical Transformations

This volume concentrates on fitting the ‘Best Fit’ Line or Curve through our data, and creating estimates through interpolation or extrapolation and expressing the confidence we have in those estimates based on the degree of scatter around the ‘Best Fit’ Line or Curve.

We start this volume off quite gently by exploring the properties of a straight line that we can exploit, including perhaps a surprising non-linear property. We follow this by looking at simple data smoothing techniques using a range of ‘Moving Measures’ and stick a proverbial toe in the undulating waters of exponential smoothing. All these techniques can help us to judge whether we do in fact have an underlying trend that is either linear (straight line) or non-linear (curved).

We begin our exploration of the delights of Least Squares Regression by considering how and why it works with simple straight-line relationships before extending it out into additional ‘multi-linear’ dimensions with several independent variables, each of which is linearly correlated with our dependent variable that we want to estimate. A very important aspect of formal Regression Analysis is measuring whether the Regression relationship is credible and supportable.

Such is the world of estimating, that many estimating relationships are not linear, but there are three groups of relationships (or functions) that can be converted into linear relationships with a bit of simple mathe-magical transformation. These are Exponential, Logarithmic and Power Functions; some of us will have seen these as different Trendline types in Microsoft Excel.
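As a reminder of why the transformation works (these are the standard trendline forms, with a and b as generic constants), taking logarithms turns each function type into a straight line in the transformed variables:

```latex
% The three standard trendline forms and their linearisations
\text{Exponential: } y = a e^{bx} \;\Rightarrow\; \ln y = \ln a + b x \quad (\text{linear in } x)
\text{Power: } y = a x^{b} \;\Rightarrow\; \ln y = \ln a + b \ln x \quad (\text{linear in } \ln x)
\text{Logarithmic: } y = a + b \ln x \quad (\text{already linear in } \ln x)
```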

We then demonstrate how we can use this mathe-magical transformation to convert a non-linear relationship into a linear one, to which we can subsequently apply the power of Least Squares Regression.

Where we have data that cannot be transformed into a simple or multi-linear form, we explore the options open to us to find the ‘Best Fit’ curve, using Least Squares from first principles and exploiting the power of Microsoft Excel’s Solver.

Last, but not least, we look at Time Series Analysis techniques in which we consider a repeating seasonal and/or cyclical variation in our data over time around an underlying trend.

1.4.4 Volume IV: Learning, Unlearning and Re-Learning Curves

Where we have recurring or repeating activities that exhibit a progressive reduction in cost, time or effort we might want to consider Learning Curves, which have been shown empirically to work in many different sectors.

We start our exploration by considering the basic principles of a learning curve and the alternative models that are available, which are almost always based on Crawford’s Unit Learning Curve or the original Wright’s Cumulative Average Learning Curve. Later in the volume we will discuss the lesser used Time-based Learning Curves and how they differ from Unit-based Learning Curves. This is followed by a healthy debate on the drivers of learning, and how this gave rise to the Segmentation Approach to Unit Learning.
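To illustrate the basic principle in symbols (this is the standard form of Crawford’s Unit Learning Curve, in my notation): the cost or hours of the nth unit fall by a constant percentage every time the cumulative quantity doubles.

```latex
% Crawford's Unit Learning Curve: value of the n-th unit in terms of the first
T_n = T_1 \, n^{b}, \qquad b = \frac{\ln(\text{Learning Rate})}{\ln 2}
% e.g. an 80% learning rate gives b = ln(0.8)/ln(2), roughly -0.322,
% so every doubling of the cumulative quantity multiplies the unit value by 0.8
% Wright's model applies the same functional form to the cumulative average rather than the unit value
```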

One of the most difficult scenarios to quantify is the negative impact of breaks in continuity, causing what we might term Unlearning or Forgetting, and subsequent Re-learning. We discuss options for how these can be addressed in a number of ways, including the Segmentation Approach and the Anderlohr Technique.

There is perhaps a misconception that Unit-based Learning means that we can only update our Learning Curve analysis when each successive unit is completed. This is not so, and we show how we can use Equivalent Units Completed to give us an ‘early warning indicator’ of changes in the underlying Unit-based Learning.

We then turn our attention to shared learning across similar products or variants of a base product through Multi-Variant Learning, before extending the principles of the segmentation technique to a more general transfer of learning between different products using common business processes.

Although it is perhaps a somewhat tenuous link, this is where we explore the issue of Collaborative Projects in which work is shared between partners, often internationally with workshare being driven by their respective national authority customers based on their investment proportions. This generally adds cost due to duplication of effort and an increase in integration activity. There are a couple of models that may help us to estimate such impacts, one of which bears an uncanny resemblance to a Cumulative Average Learning Curve (I said that it was a tenuous link.)

1.4.5 Volume V: Risk, Opportunity, Uncertainty and Other Random Models

This is where we are now. This section is included here just to make sure that the paragraph numbering aligns with the volume numbers! (Estimators like structure; it’s engrained; we can’t help it.)

We covered this in more detail in Section 1.3, so we will not repeat or summarise it further here.

1.5 Final thoughts and musings on this volume and series

In this chapter, we have outlined the contents of this volume and, to some degree, the others in this series, and have described the key features included throughout the five volumes to ease our journey through the various techniques and concepts, and hopefully make it less painful or traumatic. We have also discussed the broad outline of each chapter of this volume, and reviewed the other volumes in the series to whet our appetites.

One of the key themes in this volume is how estimators can use random number simulation models such as Monte Carlo to estimate and forecast costs and schedules. Some of the main uses of these techniques include the generation of 3-Point Estimates for Risk, Opportunity and Uncertainty, for both cost and schedule. The same techniques can be used to develop models underpinned by Queueing Theory, which help us to establish operational models and support Service Availability Contracts.

However, the danger with toolsets that offer bespoke Monte Carlo Simulation capability is that a ‘black-box’ mentality can set in: estimators, forecasters and schedulers often plug numbers in at one end and extract them from the other without really understanding what is happening by default in between. One of the intents of this volume is to take that ‘black-box’ and give it a more transparent lid, so that we better understand what is going on inside!

A word (or two) from the wise?

'No matter how many times the results of experiments agree with some theory, you can never be sure that the next time the result will not contradict the theory.'

Stephen Hawking
British Physicist
b.1942

However, we must not delude ourselves into thinking that if we follow these techniques slavishly we won’t still get it wrong some of the time, often because assumptions have changed or were misplaced, or because we made a judgement call that perhaps, with hindsight, we wouldn’t have made. A recurring theme throughout this volume, and others in the series, is that it is essential that we document what we have done and why; TRACEability is paramount. The techniques in this series are here to guide our judgement through an informed decision-making process, and to remove the need to resort to ‘guesswork’ as much as possible.

As Stephen Hawking (1988, p.11) reminds us, the fact that a model appears to work well is no guarantee that it is correct, or that we won’t get freak results!

TRACE: Transparent, Repeatable, Appropriate, Credible and Experientially-based

References

Hawking, S (1988) A Brief History of Time, London, Bantam Press.

Stevenson, A & Waite, M (Eds), (2011) Concise Oxford English Dictionary (12th Edition), Oxford, Oxford University Press.
