We have completed two-thirds of our journey through sustainability engineering for systems engineers, and our final push is through the landscape of supportability engineering. While we strive to avoid failures through design for reliability, failures are nonetheless inevitable, if only because it is possible, and indeed likely, that we cannot anticipate every failure mode in a system. So a well-designed system, anticipating that outages will occur, incorporates features that help minimize their duration. In the extract from a system history diagram shown in Figure 10.1, the outage period is divided into two parts: an initial period of time devoted to preparation for repair and a subsequent period of time devoted to execution of repair. This distinction is made so that each duration may be minimized separately by the application of techniques particular to the needs of the two activities of preparation and execution. Maintainability engineering, the body of knowledge connected with execution of repairs, was the subject of Part II of this book. We now turn to supportability engineering, the body of knowledge connected with preparation for repair, in Part III.
This chapter begins the study of supportability engineering by examining support requirements. To do this, we need to know
We begin with a discussion of supportability as a system attribute and how systems engineers influence supportability by creating appropriate requirements and undertaking or commissioning quantitative studies of system features and attributes that directly affect supportability. As we did previously with reliability and maintainability, we list and discuss various supportability effectiveness criteria and figures of merit that are routinely used in the creation of support requirements. These help us understand what makes a good support requirement and form the basis for interpreting support requirements and comparing support performance against requirements. The chapter closes with a review of current best practices in support requirements development.
Time was when support was considered synonymous with logistics. When, for example, multilevel repair schemes were first implemented, logistics constituted a clearly visible large cost. However, continued operation of more and more sophisticated repair operations made it clear that there were many other factors involved in ensuring that repairs could be executed quickly. These other factors are grouped under the heading of supportability.
Supportability is a system property comprehending preparation for effective and efficient repair. We may say that supportability is the degree to which a system contains features and procedures that enable rapid, inexpensive, and error-free repair to be carried out. In other words, supportability addresses the degree to which the system is prepared to execute maintenance. Supportability comprehends the properties and operations that make a system more or less ready to have maintenance performed on it in a speedy, low-cost, and error-free manner. Sometimes, the synonym serviceability is used.
When a system failure occurs, users are vitally concerned with rapid restoration of system functioning. In addition, while an outage persists, external failure costs mount up:
Through study of work, industrial engineers have determined that the activity of repair is fruitfully divided into two parts: one part concerning preparation for repair and the other part concerning execution of repair. This division promotes more effective and efficient service restoration because it allows for the study of each part separately and application of techniques and tools adapted to each part separately. This chapter concerns the first part, preparation for repair. Chapters 10 and 11 concerned the second part, execution of repair.
Language tip: Refer to Figure 10.1. The figure shows a typical outage interval divided into two parts: a part involving preparation for repair and a subsequent part involving execution of repair. The first part, the preparation for repair, is a support interval, and its duration is a support time or supportability time. This interval is normally considered to include activities directly connected with preparation for the particular repair to be proximately undertaken. Such activities include collecting the tools, instructions, and spare parts required for this repair, clearing the workstation of clutter from a previous repair, assembling the staff necessary for the repair, etc., but as pertaining to only the individual repair that is about to be begun. We will describe these as “online” supportability activities. On the other hand, supportability as a system property may include other factors that promote more efficient, speedier repair as a general property of the repair facility and processes. Such factors may involve the design of repair workstations and their layout in the repair facility, creation of broadly applicable documentation and workflow management policies and supporting tools, spares inventory provisioning and management, and other more general design features beneficial to repair execution overall. Supportability actions taken for the general improvement of the repair facility, without respect to any particular repair, will be called “off-line” supportability and are discussed in Chapter 13, Design for Supportability. In the literature, supportability is used to describe both of these types of activities, usually without further distinction.
Language tip: Support and supportability are not the same thing. We use support to refer to specific activities undertaken to prepare for efficient and effective repair. Support includes things like spares inventory size determination, provision of tools and documentation, repair workstation ergonomics, and other specific processes and procedures needed to make repairs go smoothly. Supportability, on the other hand, is the result of support procedures and processes. It is the degree to which the system is prepared to be repaired quickly, inexpensively, and in an error-free manner. Effective support procedures and processes make for good supportability. Haphazard, unplanned, or otherwise inadequate support procedures and processes mean that supportability will not be as good. In this book, we sometimes use “degree of supportability” and “supportability” as the same thing even though “degree of supportability” is redundant. Finally, not every document or discussion makes a distinction between “support” and “supportability.” It is a contention of this book that being careful about language is a necessary condition for systems engineers to be better able to do their jobs effectively.
Certain factors under the control of the system development team can improve supportability. These should be considered as part of the team’s efforts to produce a system that meets customer needs for short outages. Relevant factors include
This list contains only off-line factors in the sense introduced in Section 12.2.1. Appropriate attention paid to these factors can help improve the supportability of the design. Section 13.3 describes implementation issues, including some quantitative models that would be useful in the optimal allocation of resources to these tasks.
The scope of supportability engineering includes any activities that take place before execution of repair begins that promote faster, more efficient repair. In keeping with the factors cited in Section 12.2.2, these may include
As usual, the task of measuring and monitoring supportability falls to the creation of suitable support effectiveness criteria and figures of merit, developing requirements for these, and collecting data from repair and maintenance support operations to enable determination of the degree to which support requirements are helpful and are being met. We discuss support effectiveness criteria and figures of merit in Section 12.4.
Once effectiveness criteria and figures of merit have been defined for the support factors that are important to the system or service operation, requirements for these may be developed. Requirements form part of the overall support concept for the system or service. Not all support concepts need be alike. Support requirements focus development and management attention on those particular supportability issues that are important for the design of the product, system, or service. We introduce the notion of the support concept in Section 12.3. Sections 12.5 and 12.6 discuss design and interpretation of support requirements.
We refer to the overall plan to support a system or service as the system (or service) support concept. This plays the same role in system or service design for supportability as the system (or service) maintenance concept does for maintainability. The system support concept comprehends
At the very early stages of design, when the support concept is first defined, one should not expect firm answers for all these issues. However, it is in keeping with the spirit of prevention and quality engineering to begin thinking about these issues as soon as is practical. The support concept should be continually updated and become more precisely specified as greater understanding and specificity of the design and how it is to be repaired are attained. When the system development is complete, the support concept, or plan to support the system, should contain specific elements addressing each of the items listed earlier.
While services are intangible, it is still necessary to prepare for service restoration when a service outage occurs. In addition to system support issues that need to be addressed for elements of the service delivery infrastructure, support for a service also comprehends
The most basic support effectiveness criterion is the overall amount of time required to complete all tasks preliminary to, or in preparation for, a particular repair. Because it is related to a particular repair, and not the processes connected with repairs in general, this is an online supportability effectiveness criterion. Call this the support time and denote it by W. This time is represented by the first part of the outage interval pictured in Figure 10.1. As usual, we treat W as a random variable because of the many potential noise factors influencing this duration. A figure of merit reflecting the complete supportability picture is the probability that W does not exceed x (x ≥ 0 is a discretionary variable). This is the distribution of W and as such is often difficult to estimate. As usual, abbreviations are employed: the mean support time is the most common, and the median support time is also useful if a figure of merit is needed that is less influenced by extremely large or extremely small values.
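The figures of merit just described can be estimated from recorded support times. The following is a minimal sketch, not a prescribed procedure: the function name `support_time_figures` and the five sample values are hypothetical and chosen only to show how a single long support time pulls the mean well above the median.

```python
import statistics

def support_time_figures(times_hours, x):
    """Return empirical figures of merit for a sample of support times W."""
    n = len(times_hours)
    return {
        "mean": statistics.mean(times_hours),
        "median": statistics.median(times_hours),
        # Empirical estimate of P(W <= x): fraction of observations not exceeding x.
        "prob_not_exceeding_x": sum(1 for w in times_hours if w <= x) / n,
    }

# Hypothetical support times in hours (illustrative values only, not from the text).
sample_w = [0.5, 0.75, 1.0, 1.25, 6.0]
figures = support_time_figures(sample_w, x=1.0)
print(figures)
```

With these illustrative values the mean is 1.9 hours while the median is 1.0 hour, which is the sense in which the median is less influenced by extreme observations.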
Support time comprises other individual activity durations, and when it is desirable to focus attention on these other durations, individual effectiveness criteria may be defined for them. These include
Creation of support effectiveness criteria for these contributory components of support time is in keeping with the key concept in systems engineering that if it is necessary to focus attention on a particular supportability issue, then one or more quantitative measurements related to that issue should be devised, monitored, and publicly tracked. The variables covered by this reasoning typically are arrived at during the “check” portion of the Deming Cycle when procedures have been established but unsatisfactory performance is taking place. When unsatisfactory performance is noted, it is necessary to determine
Should a special cause be identified as negatively affecting an operation, Table 12.1 may be helpful in diagnosing underlying ailments. That is, all the symptoms listed in column 1 of the table happen from time to time. It is up to the process manager to determine whether their occurrence is part of normal statistical fluctuation within the process capability or is an extraordinary occurrence pointing to a change in underlying process conditions that needs to be tracked down and remedied. A control chart [2] is a good, simple approach to this discrimination task.
Table 12.1 Supportability Tracking Variables
Symptom | Possibly Indicating |
Repair workstation buffer overflows | Inadequate workstation staffing or facilities, failure to adequately characterize the time needed for a repair, or other workstation deficiency |
Frequent stockouts (a) | Improper inventory sizing |
Replenishment order delays | Inadequate management of suppliers and/or logistics functions |
Excess time consumed in a particular repair operation | Inadequate documentation, training, tools, etc., and inadequate characterization of activities needed for a repair |
Failure to meet a fault location time requirement | Improper design of, or error(s) in, fault location routines |
Lost shipment(s) | Lack of robustness in the supply network |
(a) A stockout is an instance in which the inventory of a spare part is depleted (empty) at a time the spare part is needed.
Support effectiveness criteria may be developed for each symptom listed. For instance, if frequent stockouts are noted, a support effectiveness criterion “number of stockouts per week” may be implemented. If finer control were needed, this effectiveness criterion could be defined for each spare part type in the facility. Successive values of effectiveness criteria may be tracked on control charts to ensure that design and/or process changes in the repair facility are undertaken only in response to true signals (special causes) rather than to normally expected statistical fluctuation within the process capability. While a good deal of benefit may be obtained from even the simplest control charts, more complicated situations may benefit from some of the more advanced control chart techniques [2].
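As one illustration of tracking a count criterion on a control chart, the sketch below computes limits for a c-chart, a common simple chart for counts that assumes the counts are roughly Poisson. The choice of a c-chart is an assumption made here; the text does not prescribe a particular chart type. The data are the 24 monthly stockout counts used in the stockout example later in this chapter, and the function name `c_chart_limits` is hypothetical.

```python
import math

def c_chart_limits(counts):
    """Center line and 3-sigma limits for a c-chart (Poisson count assumption)."""
    c_bar = sum(counts) / len(counts)
    sigma = math.sqrt(c_bar)          # Poisson: variance equals the mean
    lcl = max(0.0, c_bar - 3 * sigma) # counts cannot be negative
    ucl = c_bar + 3 * sigma
    return c_bar, lcl, ucl

# Monthly stockout counts from the chapter's stockout example.
stockouts = [0, 0, 0, 3, 0, 1, 1, 0, 0, 0, 0, 0,
             2, 0, 5, 0, 0, 0, 0, 0, 1, 0, 1, 0]

c_bar, lcl, ucl = c_chart_limits(stockouts)

# Months (1-indexed) whose count falls outside the control limits.
signals = [(i + 1, c) for i, c in enumerate(stockouts) if c > ucl or c < lcl]
print(c_bar, lcl, ucl, signals)
```

Under these assumptions, months 4 and 15 fall above the upper limit and would be candidates for a special-cause investigation, while the month with two stockouts would be treated as ordinary fluctuation. In practice the limits would usually be recomputed after any confirmed special-cause points are set aside.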
In addition to time- or duration-related support effectiveness criteria, effectiveness criteria relating to other important supportability factors may be found useful. Some examples include
Some of these are discrete (count) effectiveness criteria. As is true with maintainability, there are a great many supportability variables that can potentially be tracked. A good principle is to choose a minimal set of useful variables that will give an adequate picture of supportability given the particular needs of the system or service. As noted previously, you can measure anything you like as long as there is a demonstrated need for and benefit from the measurement (outweighing its cost), you give the measurement a descriptive name, and it is used consistently throughout the development.
As with reliability requirements and maintenance requirements, support requirements may be written in terms of either relevant effectiveness criteria or related figures of merit. The choice determines how the requirement will be interpreted and how supportability data will be analyzed to verify conformance with the requirement. In addition, guiding principles for support requirements are similar to those for reliability and maintenance requirements. Requirements should be written
In addition, it is beneficial to be able to make some judgment about whether a requirement is being met without having to wait for an extended period of time. Neither the customer’s nor the supplier’s interest is served if it takes 20 years to determine whether a requirement has been met or not. This is sometimes a problem for reliability requirements, where durations of operating times tend to be (and it is better if they are) long. It is easier to accomplish this objective in maintenance and support requirements because the durations of maintenance and support events tend to be shorter.
The most basic online support elapsed time requirement places a limit on the support time itself, as, for example, “the support time shall not exceed 1 hour for a repair of type A, 2 hours for a repair of type B, and 3 hours for all other repair types, when repair is undertaken using the specified procedures.” In this example, the requirement is written in terms of a support effectiveness criterion, and as we have done previously for reliability and maintenance requirements, we interpret a requirement written for an effectiveness criterion as applying to every relevant instance (in this case, every repair commenced). Elapsed time requirements may also be written in terms of appropriate figures of merit, such as the mean support time, median support time, 90th percentile support time, etc. In that case, the requirement is interpreted as pertaining to a population of relevant instances (in this case, repairs commenced). As before, if a census of the population is available, simple arithmetic allows one to see whether the requirement is being met. Otherwise, standard statistical inference procedures, similar to those considered in Chapters 2, 5, and 10, are needed to estimate the probability that the requirement is being met.
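The difference between the two interpretations can be made concrete with a small sketch. The per-type limits below are those quoted in the example requirement (1 hour for type A, 2 hours for type B, 3 hours otherwise). The repair records, the 2-hour mean limit, and the function names are hypothetical and serve only as illustration.

```python
import statistics

# Limits from the example requirement in the text (hours).
LIMITS_HOURS = {"A": 1.0, "B": 2.0}
DEFAULT_LIMIT_HOURS = 3.0  # applies to all other repair types

def violations(records):
    """Effectiveness-criterion interpretation: every repair must meet its limit."""
    return [(rtype, w) for rtype, w in records
            if w > LIMITS_HOURS.get(rtype, DEFAULT_LIMIT_HOURS)]

def mean_requirement_met(records, limit_hours):
    """Figure-of-merit interpretation, valid as stated only for a census."""
    return statistics.mean([w for _, w in records]) <= limit_hours

# Hypothetical census of (repair type, support time in hours).
records = [("A", 0.8), ("A", 1.2), ("B", 1.5), ("C", 2.9), ("B", 2.5)]

bad = violations(records)
mean_ok = mean_requirement_met(records, limit_hours=2.0)  # hypothetical limit
print(bad, mean_ok)
```

Here two individual repairs exceed their limits even though the mean support time is within the hypothetical 2-hour limit, which shows why the choice between an effectiveness criterion and a figure of merit changes what conformance means.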
Other important online support elapsed time effectiveness criteria include
The total time elapsed between when spare parts are ordered and the delivery of those parts is an important example of an off-line support duration effectiveness criterion. Other off-line support effectiveness criteria that are continuum variables but are not necessarily duration-based include
Here are some additional examples of support duration requirements:
In addition to monitoring important elapsed times as indicators of process efficiency, support effectiveness criteria involving counting discrete items may be the subject of requirements. Some of these include
Our framework for interpreting support requirements is the same as we have used for reliability and maintenance requirements. If a support requirement is written in terms of an effectiveness criterion, then
If a support requirement is written in terms of a supportability figure of merit, then
These considerations apply whether the variable in the requirement is continuum (e.g., an elapsed time, duration, or cost variable) or discrete (e.g., a count variable). Here are two examples.
Example: The requirement is “there shall be at most one stockout in the facility, over all inventoried part types, in each month.” The following data were collected over a period of 2 years: 0, 0, 0, 3, 0, 1, 1, 0, 0, 0, 0, 0, 2, 0, 5, 0, 0, 0, 0, 0, 1, 0, 1, 0. Each observation is the number of stockouts in the facility over 1 month. The requirement is written on a supportability effectiveness criterion. The data form a census of the facility’s stockouts over a period of 2 years. The number of months in which the requirement is not met is 3. If no changes are made to the inventory operation, we may estimate the probability that the requirement will be met in a given month by treating these observations as a sample from some (currently invisible) future stream of data. The sample proportion of months in which the requirement is met is 21/24 = 0.88 with a standard error of approximately 0.07. A 95% confidence interval for the probability that the requirement will be met in a future month is
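The arithmetic in this example can be sketched as follows. The normal-approximation (Wald) interval used here is an assumption, since the interval method is not stated in the surrounding text; with only 24 months and a proportion near 1, a Wilson or exact binomial interval may be preferable. Under the usual binomial formula sqrt(p(1 − p)/n), the standard error evaluates to roughly 0.07.

```python
import math

# Monthly stockout counts from the example (24 months).
stockouts = [0, 0, 0, 3, 0, 1, 1, 0, 0, 0, 0, 0,
             2, 0, 5, 0, 0, 0, 0, 0, 1, 0, 1, 0]

n = len(stockouts)
met = sum(1 for c in stockouts if c <= 1)   # months with at most one stockout
p_hat = met / n                              # sample proportion meeting the requirement

# Binomial standard error of the sample proportion.
se = math.sqrt(p_hat * (1 - p_hat) / n)

# Assumed method: normal-approximation 95% interval, clipped to [0, 1].
z = 1.96
lower = max(0.0, p_hat - z * se)
upper = min(1.0, p_hat + z * se)
print(met, p_hat, se, lower, upper)
```

With this method the interval runs from about 0.74 up to the upper bound of 1, so the data are consistent with a fairly wide range of monthly conformance probabilities.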
Example: The requirement is “the mean logistics delay time for repair jobs in the facility shall not exceed 1 hour.” The following data were collected in the facility on logistics delays (in hours): 0, 0, 4, 0.5, 0, 0, 0, 0.3, 14, 0, 0, 1, 0, 0, 0, 0, 2.1, 0, 0, 0.7. If these 20 observations represent a census of the jobs in the facility over some period of time, the fact that the sample mean of these data is 1.13 shows that the requirement is not met for these 20 jobs. If instead the data represent a sample from some period of operation of the facility (not all logistics delays were measured over that period of time), we may estimate the probability that the requirement is met over that period of time. The sample standard deviation is 3.18, so the probability that the requirement is met over this period of time is 0.48, the probability that a normal random variable having mean 1.13 and standard deviation 3.18 is less than or equal to 1. In this example, we may wish to examine the 14-hour observation to determine whether it is an unusually large value, or whether the process commonly throws out values that large or larger. If we eliminate this observation from the data, we now have a sample mean of 0.45 and a sample standard deviation of 1.01. The probability that the requirement is met over this time period is now 0.71, quite a bit higher than before. The decision about whether the 14-hour observation is routine (within the process capability) or due to a special cause has important consequences.
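The calculation in this example can be reproduced with a short sketch. It follows the approach described in the text, a normal distribution with the sample mean and sample standard deviation, and uses the n − 1 divisor for the standard deviation throughout. The function name `prob_requirement_met` is hypothetical.

```python
from statistics import NormalDist, mean, stdev

# Logistics delay times in hours, from the example.
delays = [0, 0, 4, 0.5, 0, 0, 0, 0.3, 14, 0,
          0, 1, 0, 0, 0, 0, 2.1, 0, 0, 0.7]

def prob_requirement_met(data, limit_hours=1.0):
    """Sample mean, sample standard deviation (n - 1 divisor), and the
    probability that a normal variable with those parameters is <= limit."""
    m = mean(data)
    s = stdev(data)
    return m, s, NormalDist(mu=m, sigma=s).cdf(limit_hours)

# All 20 observations.
m_all, s_all, p_all = prob_requirement_met(delays)

# Same calculation with the 14-hour observation set aside.
trimmed = [d for d in delays if d != 14]
m_trim, s_trim, p_trim = prob_requirement_met(trimmed)

print(round(m_all, 2), round(s_all, 2), round(p_all, 2))
print(round(m_trim, 2), round(s_trim, 2), round(p_trim, 2))
```

Running the sketch shows how strongly one observation drives the result: the estimated probability moves from about 0.48 to about 0.71 when the 14-hour delay is excluded.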
After significant efforts to design for reliability and assure that failures and outages in high-consequence systems are infrequent, it is still of interest to complete as speedily as possible corrective actions on any failures that do occur. Section 10.7 offered some suggestions for minimizing the time required for repair execution in high-consequence systems; a capsule summary of those recommendations is essentially to consider implementing all design for maintainability practices in a high-consequence system. In other words, while profitability may still be an important factor in high-consequence systems, the serious consequences of failures and outages make it reasonable to assert that systems engineers need to justify exclusions of these practices, rather than having to justify inclusions as would be the case for ordinary (not high-consequence) systems. The same holds true for supportability. Chapter 13 discusses many online and off-line design for supportability practices. In most systems that are not high consequence, only a few of these practices will be implemented, depending on the particular needs of the system and its business case. Because external failure costs in a high-consequence system are large, consider requiring that the omission of any design for supportability practice be justified in the system’s business case, whereas in a system that is not high consequence, it is the inclusion of design for supportability practices that may need to be justified.
As was the case with maintenance, recommended practices for developing support requirements are based on contemporary quality engineering principles and quantitative reasoning. Maintainability is concerned with speedy, low-cost, and error-free repair. Supportability engineering prepares the way so that it becomes possible to execute speedy, low-cost, and error-free repair. Depending on whether maintenance is performed by the customer, the supplier, or a third party, various opportunities for optimizing parts of the support environment are created.
Once support requirements are in place, use the design for supportability techniques in Chapter 13 to arrange the system and its supporting infrastructures to meet the requirements. Success here creates a stronger value proposition for the customer when the customer is responsible for repairs. In case the supplier performs repairs, the requirements influence the cost and profitability of the repair operation.
Three cases need to be considered: for products and systems, maintenance may be performed by the customer (owner/purchaser) of the equipment, by the supplier, or by a third party under contract. A service is normally maintained by the service provider.
Better supportability may be used to improve the value proposition for a prospective system purchaser when the purchaser will be responsible for support. This is common in the defense and similar industries in which standards prevail and requirements are negotiated between customers and suppliers. Support needs in this case include assistance with various support factors, such as
When the supplier performs maintenance, there is incentive for cutting cost while remaining effective. Good supportability is needed here so that repair operations, if not actually turning a profit, remain low in cost so they do not degrade the profitability of the system. Because repair operations are under the control of the supplier, the supplier has the opportunity to optimize all facets of repair operations, including a robust FRACAS implementation that offers the opportunity to learn from failures with a short feedback loop time. A supplier in this case should carefully consider the factors covered in Chapter 13 and implement optimization models based on the introductory material in Chapter 13 to promote a low-cost and effective repair enterprise.
In case maintenance is performed by a third party unrelated to the supplier or to the customer, contractual provisions determine how effective and efficient the operation can be for the customer (system purchaser). The third party also becomes a potential customer of the supplier for materials, tools, spare parts, etc., needed to carry out repairs. Quality of repairs may be more difficult to control unless incentives are arranged so that the third party has a stake in not only speed and low cost but also quality (error-free repairs).
It may seem that only in the case of maintenance performed under contract by a supplier should cost and labor-hour factors be significant. However, even in the case where maintenance is to be performed by the customer, it is in the supplier’s interest to understand and factor into support requirements the cost and labor-hour burdens created by maintenance procedures. Customers want to see that the supplier understands their needs and is acting to help them succeed. When possible, the supplier should work with customers to understand their processes and arrange the system to better align with those processes to improve supportability.
The business case for the system should be considered when developing the system or service support concept. Support for low-value products or products that become obsolete quickly may not need to be as extensive as support for systems with long useful lives or for high-consequence systems. It may be possible to contemplate building an optimization model to help decide the appropriate level of support. Such a model would require specific information on costs, profitability objectives, projected sales, etc., that might be known only probabilistically, so stochastic optimization might be required to carry out models like these. The mathematics and even the application of these models are beyond the scope of this book. Suffice it to say that even informal consideration of these factors will add value to system or service development.
Many of the factors affecting supportability may be fruitfully studied quantitatively with a variety of operations research models. Logistics, transportation, inventory sizing and management, staff sizing, and other factors bearing on the ability to repair a system or service effectively and efficiently are widely studied, and results for even quite complicated scenarios can be found in the literature. In this book, Chapter 13 discusses some of these in some detail. In keeping with the spirit of the book, not all topics are covered in detail, and references are provided for you to start your own exploration of these models where desired. Topics included here are discussed because systems engineers need to know the kinds of things that need to be done in the sustainability disciplines without necessarily needing to know the details needed to carry out the studies themselves. In practice, this means that support requirements need to be guided by quantitative models of relevant support operations, and the systems engineer will normally be a supplier and a customer for the specialists who develop and use the models. As a supplier, be prepared to communicate customer needs for support that were determined from a formal or informal balance between those needs and the business case for the system or service. As a customer, be prepared to use the results of quantitative modeling to develop requirements on a rational basis.
We have recommended writing requirements in quantitative form so that it is possible to collect data and verify whether they are being met. Routine verification using a systematic, repeatable process approach is recommended so that a realistic understanding of realized supportability may be acquired. As we discussed for maintainability, we do not want to take action every time we see a requirement not being met because measurements on any process subject to noise factors will exhibit some degree of fluctuation. It’s important not to waste resources responding to every fluctuation in measurement; you want to reserve corrective action for those cases where a real change in the process is indicated. Maintenance requirements and support requirements are similar in this regard. They both concern time durations that are relatively short and count events that happen fairly frequently. So many of the ideas concerning management by fact apply to support requirements just as they do to maintenance requirements. The ideas discussed in Section 10.8.4 may be applied here equally well.
This chapter is concerned with the creation of effective support requirements. The property of the system that comprehends its readiness for rapid, low-cost, and error-free repair is called supportability. The chapter stresses beginning to create a support plan as soon as a design concept is developed and continually updating the support plan as the design concept matures. Several support effectiveness criteria and figures of merit are reviewed, including continuum variables and count variables. Interpretation and verification of support requirements are discussed through the use of statistical sampling and analysis techniques for support-related data. The chapter prepares for the discussion of design for supportability to be covered in Chapter 13.