16

WHEN AND HOW TO USE METRICS

A CHECKLIST

There is nothing intrinsically pernicious about counting and measuring human performance. We all tend to project broad-ranging conclusions based on our inevitably limited experience, and measured data can serve as a useful counterpoint to those subjective judgments. The sort of measurements with which this book is concerned are performance metrics that quantify human achievement and failure. There are legitimate metrics of performance in almost every organization.

In our case studies, we’ve seen many instances in which metrics has been used to good effect.

In policing, computerized statistics of the incidence of crimes (Compstat) were used to good purpose, to discover where problems were greatest and where police resources were best deployed. It ran into problems only when officials used the threat of demotion or lack of promotion against those lower in the hierarchy to try to bring down the reported crime rates.

In universities, faculty evaluations can be enhanced by numerical data about publications and teaching. The metrics go awry when they are used mechanically by those who are not in a position to evaluate the accuracy and significance of the data.

In primary and secondary education, standardized tests can be used to inform teachers of how much or how little their students are learning in particular subjects. Teachers can consult with their colleagues, and adjust their methods and curriculum as a result. Problems arise when the tests become the primary basis on which teachers and schools are rewarded or punished.

In medicine, Peter Pronovost’s Keystone project demonstrates how effective diagnostic metrics can be in lowering the incidence of medical errors, when what is measured accords with the professional values of practitioners. The success of the Geisinger medical system illustrates the remarkable improvements made possible by computerized measurement when integrated into an institutional culture based on cooperation, where the setting of measurement criteria and the evaluation of performance are done by teams that include physicians as well as administrators. In both cases, metrics were used in ways that appealed to intrinsic motivation and to professionalism. But elsewhere in the medical system, as we’ve seen, the use of reward for measured performance sometimes proved fruitless or led to perverse outcomes.

Reflections on the best use of performance metrics by the U.S. Army in its counterinsurgency campaigns showed that while standardized metrics are often deceptive, metrics developed to fit the specific case, especially by practitioners with local experience, could be genuinely informative. The challenge in such cases is to abandon universal templates and discover what is worth counting, and what the numbers actually mean in their local context.

As we’ve seen time and again, measurement is not an alternative to judgment; measurement demands judgment: judgment about whether to measure, what to measure, how to evaluate the significance of what’s been measured, whether rewards and penalties will be attached to the results, and to whom to make the measurements available.

Should you find yourself in a position to set policy, here are the questions you should ask, and the factors you should keep in mind, in considering whether to use measured performance, and if so, how to use it. They constitute a checklist for successful performance measurement. Given what we’ve said about the hazards of metric fixation, consider at every point that the best use of metrics may be not to use it at all.

THE CHECKLIST

  1.  What kind of information are you thinking of measuring? The more the object to be measured resembles inanimate matter, the more likely it is to be measurable: that is why measurement is indispensable in the natural sciences and in engineering. When the objects to be measured are influenced by the process of measurement, measurement becomes less reliable. Measurement becomes much less reliable the more its object is human activity, since the objects—people—are self-conscious, and are capable of reacting to the process of being measured. And if rewards and punishments are involved, they are more likely to react in a way that skews the measurement’s validity. By contrast, the more they agree with the goals of those rewards, the more likely they are to react in a way that enhances the measurement’s validity.

  2.  How useful is the information? Always begin by reminding yourself that the fact that some activity is measurable does not make it worth measuring; indeed, the ease of measuring may be inversely proportional to the significance of what is measured. To put it another way, ask yourself, is what you are measuring a proxy for what you really want to know? If the information is not very useful or not a good proxy for what you’re really aiming at, you’re probably better off not measuring it.

  3.  How useful are more metrics? Remember that measured performance, when useful, is most effective in identifying outliers, especially poor performers or outright misconduct. It is likely to be less useful in distinguishing between those in the middle or near the top of the ladder of performance. Moreover, the more you measure, the greater the likelihood that the marginal costs of measuring will exceed the benefits. So, the fact that metrics is helpful doesn’t mean that more metrics is more helpful.

  4.  What are the costs of not relying upon standardized measurement? Are there other sources of information about performance, based on the judgment and experience of clients, patients, or parents of students? In a school setting, for example, the degree to which parents request a particular teacher for their children is probably a useful indicator that the teacher is doing something right, whether or not the results show up on standardized tests. In the case of charities, it may be most useful to allow the beneficiaries to judge the results.

  5.  To what purposes will the measurement be put, or to put it another way, to whom will the information be made transparent? Here a key distinction is between data to be used for purposes of internal monitoring of performance by the practitioners themselves versus data to be used by external parties for reward and punishment. For example, is crime data being used to discover where the police ought to deploy more squad cars or to decide whether the precinct commander will get a promotion? Or is a surgical team using data to discover which procedures have worked best or are administrators using that same data to decide whether the hospital will be financially rewarded or penalized for its scores? Measurement instruments, such as tests, are invaluable, but they are most useful for internal analysis by practitioners rather than for external evaluation by public audiences who may fail to understand their limits. Such measurement can be used to inform practitioners of their performance relative to their peers, offering recognition to those who have excelled and offering assistance to those who have fallen behind. To the extent that they are used to determine continuing employment and pay, they will be subject to gaming the statistics or to outright fraud.

Remember that, as we’ve seen, performance metrics that link reward and punishment may actually help reinforce intrinsic motivation when the goals to be rewarded accord with the professional goals of the practitioners.1 If, on the other hand, the scheme of reward and punishment is meant to elicit behavior that the practitioners consider useless or harmful, the metrics are more likely to be manipulated in the many ways we’ve explored. And if the practitioners are too geared toward extrinsic reward, they may well react by focusing their activity on what is measured and rewarded, at the expense of other facets of their work that may be equally important. For all these reasons, low-stakes metrics are often more effective than high-stakes ones.

Recall that direct pay-for-performance works best to the degree that people are motivated by extrinsic reward rather than intrinsic motivation, that is, when they care about making more money rather than about the other potential benefits of their work, social and intellectual. That may be because they are in a field, such as finance, in which people measure their own vocational success almost entirely in terms of the amount they earn. (As we’ve noted, that doesn’t preclude them from using their earnings for a wide range of purposes, including selfless ones.) It is when the job offers few other attractions—when it is repetitious and leaves little room for the exercise of choice, for example replacing windshields or preparing hamburgers—that pay for measured performance is more likely to work.

  6.  What are the costs of acquiring the metrics? Information is never free, and often it is expensive in ways that rarely occur to those who demand more of it. Collecting data, processing it, analyzing it—all of these take time, and their expense is in the opportunity costs of the time put into them. To put it another way, every moment you or your colleagues or employees are devoting to the production of metrics is time not devoted to the activities being measured. If you’re a data analyst, of course, producing metrics is your primary activity. For everyone else, it’s a distraction. So, even if the performance measurements are worth having, their worth may be less than the costs of obtaining them. Remember, too, that those costs in human time and effort are themselves almost impossible to calculate—another reason to err on the side of caution.

  7.  Ask why the people at the top of the organization are demanding performance metrics. As we’ve noted, the demand for performance measures sometimes flows from the ignorance of executives about the institutions they’ve been hired to manage, and that ignorance is often a result of parachuting into an organization with which one has little experience. Since experience and local knowledge matter, lean toward hiring from within. Even if there is someone smarter and more successful elsewhere, that person’s lack of particular knowledge of your company, university, government agency, or other organization may outweigh whatever talents he or she brings from outside.

  8.  How and by whom are the measures of performance developed? Accountability metrics are less likely to be effective when they are imposed from above, using standardized formulas developed by those far from active engagement with the activity being measured. Measurements are more likely to be meaningful when they are developed from the bottom up, with input from teachers, nurses, and the cop on the beat. That means asking those with the tacit knowledge that comes from direct experience to provide suggestions about how to develop appropriate performance standards.2 Try to involve a representative group of those who will have a stake in the outcomes.3 In the best of cases, they should continue to be part of the process of evaluating the measured data.

Remember that a system of measured performance will work to the extent that the people being measured believe in its worth. So far, in this chapter, we’ve taken the perspective of those in a position to decide whether and how to institute metrics. But what if you are not in such a position, if you’re further down in the organizational hierarchy, where you are expected to carry out a metrics regime—a mid-level manager, say, or the head of an academic department? Then, you face a choice. If you believe in the goals for which the information is being collected, then your challenge is to provide accurate data in the most efficient way possible, one that demands the least time of you and those you manage. If, by contrast, you believe that the goals are dubious and the process wasteful, you might try to convince your superiors of that (perhaps by giving them a copy of this book). If that fails, then your task is to provide data in a way that takes the least time, meets minimal standards of acceptability, and won’t harm your unit.

If you’re near the top of the organization, making decisions about metrics, reread the previous paragraph, keeping in mind the different ways in which those below you might react. Metrics works best when those measured buy into its purposes and validity.4

  9.  Remember that even the best measures are subject to corruption or goal diversion. Insofar as individuals are agents out to maximize their own interests, there are inevitable drawbacks to all schemes of measured reward. If, as is currently still the case, doctors are remunerated based on the procedures they perform, that creates an incentive for them to perform too many procedures that have high costs but produce low benefits. But pay doctors based on the number of patients they see, and they have an incentive to see as many patients as possible, and to skimp on procedures that are time-consuming but potentially useful. Compensate them based on successful patient outcomes, and they are more likely to cream, avoiding the most problematic patients.5

That doesn’t mean that performance measures should be abandoned just because they have some negative outcomes. Such metrics may still be worth using, despite their anticipatable problems: it’s a matter of trade-offs. And that too is a matter of judgment.

10.  Remember that sometimes, recognizing the limits of the possible is the beginning of wisdom. Not all problems are soluble, and even fewer are soluble by metrics. It’s not true that everything can be improved by measurement, or that everything that can be measured can be improved. Nor is making a problem more transparent necessarily a step to its solution. Transparency may make a troubling situation more salient, without making it more soluble.

In the end, there is no silver bullet, no substitute for actually knowing one’s subject and one’s organization, which is partly a matter of experience and partly a matter of unquantifiable skill. Many matters of importance are too subject to judgment and interpretation to be solved by standardized metrics. Ultimately, the issue is not one of metrics versus judgment, but metrics as informing judgment, which includes knowing how much weight to give to metrics, recognizing their characteristic distortions, and appreciating what can’t be measured. In recent decades, too many politicians, business leaders, policymakers, and academic officials have lost sight of that.
