9

MEDICINE

Nowhere are metrics in greater vogue than in the field of medicine. Nowhere, perhaps, are they more promising. And the stakes are high.

But here too, metrics play a variety of roles—some genuinely useful, some of more dubious worth.

One role is informational and diagnostic: the process of keeping track of various methods and procedures, and then comparing the outcomes, makes it possible to determine which are most successful. The successful methods and procedures can then be followed by others.

Another is publicly reported metrics, intended to provide transparency to consumers, and a basis for comparison and competition among providers.

Yet another is pay-for-performance, in which accountability is backed up with monetary rewards or penalties. Advocates of the use of metrics in medicine often discuss these very different roles in the same breath.

The great push in recent decades has been for metrics to be used not only to improve safety and effectiveness but also to contain costs.

THE FINANCIAL PUSH TO CONTROL COSTS

The impetus to employ metrics to control costs has come from a number of directions, and arises from a variety of motives. For years, medical costs have been rising more quickly than national income, and they are projected to continue to do so for at least the next decade: in 2014, the health sector made up 17.5 percent of the American economy, and is expected to reach 20.1 percent by 2025. There are some good reasons for that: health expenditure is what economists call a “luxury good”—the richer people are, the more they are willing to spend on it. Then there is the fact that as the baby boom generation ages, that large cohort of the population is reaching the age of maximal medical expenditures. Add to that the availability of more specialty drugs and the faster growth in drug prices. The adoption of the Affordable Care Act meant that an ever higher percentage of healthcare spending in the United States would be by the government, with the share of total health expenditures paid for by federal, state, and local governments projected to increase to 47 percent by 2025.1

The increasing cost of healthcare has led both private insurers and government insurers (the National Health Service in Britain; and Medicare, Medicaid, and the Veterans Administration in the United States) to press doctors and hospitals to accept lower reimbursement rates and to improve outcomes. At the same time as pressure to control costs has escalated, the new technology of electronic health records has made medical data more readily obtainable, creating a temptation to exploit those data to identify problems. The upshot has been a huge increase in public reporting and in pay-for-performance, both of which were hailed as cures for the ills of the healthcare system in the United States and abroad. The problems are real enough: third-party payers, whether insurance companies or government agencies such as Medicaid and Medicare, do need reliable evidence that doctors and hospitals are providing services in an effective and cost-efficient manner. But the touted cures have sometimes proved almost as bad as the diseases they were meant to treat.

RANKING THE AMERICAN MEDICAL SYSTEM

But before we examine those purported cures, it is worth visiting the most influential performance metrics used to characterize the American healthcare system, metrics frequently cited as evidence of the need for more accountability and for paying for measured performance. They come from the World Health Organization’s “World Health Report 2000,” which ranked the United States healthcare system as thirty-seventh among the nations of the world, and stated, “It is hard to ignore that … the United States was number 1 in terms of healthcare spending per capita but ranked 39th for infant mortality, 43rd for adult female mortality, 42nd for adult male mortality, and 36th for life expectancy.”2 Scott W. Atlas, a physician and healthcare analyst, has scrutinized and contextualized these claims, which turn out to be more than a little misleading.

Most of us assume that the WHO rankings measured the overall level of health. But actual health outcomes accounted for only 25 percent of the ranking scale. Half of the points awarded were for egalitarianism: 25 percent for “health distribution,” and another 25 percent for “financial fairness,” where “fairness” was defined as having everyone pay the same percent of their income for healthcare. That is, only a system in which the richer you are, the more you pay for healthcare, was deemed fair. The criterion, in short, was ideological. The fact that there was a number attached (37th) gave it the appearance of objectivity and reliability.3 But in fact, the overall performance ranking is deceptive.

What about the figures for mortality and life expectancy? These, it turns out, are influenced in large part by factors outside the medical system, factors having to do with culture and styles of life. Obesity tends to foster chronic and debilitating illnesses such as type-II diabetes and heart disease—and Americans are, on average, more obese than citizens of other nations (though some of the others are catching up quickly). Cigarette smoking also contributes mightily to heart disease, cancer, and other ailments, and may do so decades after a person gives up the habit. Americans, it turns out, were heavy smokers by international standards for generations, up through the 1980s. Americans have disproportionately high rates of death from gunshot wounds, another factor that is lamentable, but has almost nothing to do with the medical system.4 Moreover, the United States is an ethnically heterogeneous country, and some ethnic groups (such as African Americans) have disproportionately high rates of infant mortality, reflecting social, cultural, and possibly genetic factors.5 In short, many of the problems of American health are a function not of the medical system but of social and cultural factors beyond the medical system. When it comes to diagnosing and treating disease, Atlas notes, American medicine is among the best in the world.6

Here, as in other areas such as education and public safety, many of the most important factors making for relative success or failure lie beyond the formal systems that we try to measure and hold accountable. Getting enough exercise; eating right; keeping firearms out of irresponsible hands; and refraining from smoking, overconsumption of alcohol, drugs, and hazardous sex—these are the main factors contributing to health and longevity. Physicians and public health officials should try to influence them—and try they do. But these life-style patterns are largely matters beyond their control. We must keep that in mind in evaluating the purported failures of American medicine. Yet even if we take the alarmist metrics of the WHO report with a grain of salt, it is still true that healthcare in the United States is expensive and open to improvement.

METRICS AS SOLUTION

Perhaps the most popular trend in American health policy is the promotion of performance metrics, accountability, and transparency. Measured performance is supposed to allow practitioners to better assess clinical practices and to track their implementation; allow insurers to reward success and penalize failure; and through ratings and report cards, create transparency in ways that will allow patients to make more informed choices about medical providers.

One booster is Michael E. Porter of the Harvard Business School, whose “value agenda” includes the application of management metrics to medicine. Porter claims,

Rapid improvement in any field requires measuring results—a familiar principle in management. Teams improve and excel by tracking progress over time and comparing their performance to that of peers inside and outside their organization. Indeed, rigorous measurement of value (outcomes and costs) is perhaps the single most important step in improving health care. Wherever we see systematic measurement of results in health care—no matter what the country—we see those results improve.7

Porter is a great believer in public reporting of outcomes, which is thought to provide a powerful incentive for improving performance. That makes sense—in theory.

THREE TALES OF SUCCESS

Porter points to the Cleveland Clinic as a pioneer of his recommended approach. The clinic annually publishes fourteen “outcome books” that document its performance in treating a remarkable variety of ailments. A look at those documents (which are available online) indicates a high rate of success in each category. And the Cleveland Clinic attracts patients from around the world.

A convincing example of the potential virtues of medical metrics, also touted by Michael Porter, comes from the Geisinger Health System, a physician-led, not-for-profit, integrated system that serves some 2.6 million people in Pennsylvania, many of them rural and poor. Geisinger is a showcase for progressive healthcare in the United States—and with good reason.8 A pioneer in the use of electronic health records, Geisinger in 1995 began to invest more than $100 million in its electronic health records system, and gave doctors an incentive to have their patients sign up for an online portal. That system allows for the ready transmission of information to providers in the system, and for the monitoring of performance of the units, including individual physicians. The system employs nurse case-managers for patients at high risk, who educate patients about their condition, monitor them, review their care plans and medications, and make follow-up appointments. The two most costly and widespread conditions in American healthcare are diabetes and heart disease. In the Geisinger system, patients with such conditions are treated by an integrated team of physicians and physician assistants, pharmacists, dieticians, and more. Rather than parceling out treatment to a series of providers, whose contact with one another might be minimal, Geisinger employs a more holistic approach. Some 20 percent of physician compensation is tied to goals related to cutting costs, improving quality of care, and patient satisfaction, while the other 80 percent of compensation is based on fee-for-service. Through its panoply of innovative programs, Geisinger has succeeded in lowering costs and improving patient outcomes.

One of the more unequivocally successful uses of metrics in medicine is the use of performance measures to reduce hospital-induced infections acquired from “central lines.” Central lines are the flexible catheter tubes inserted into a large vein through the neck or chest, as a conduit for medicines, nutrients, and fluids. Central lines are among the most common elements of modern hospital medicine—and, until recently, one of those that contributed most to complications. That is because the catheters provide a ready avenue for infections, which are deadly in the worst cases and costly to treat even in the best. In 2001 it was estimated that in the United States there were some 82,000 blood infections associated with central lines. The costs per infection ranged from $12,000 to $56,000. Almost 32,000 people died.9

Since then, the rate of acquired infections has dropped dramatically, thanks in no small part to the efforts of Peter J. Pronovost, a critical-care specialist at Johns Hopkins University hospital in Baltimore. Together with his colleagues, he developed a program based on a checklist of five standard yet simple procedures that in combination reduced the likelihood of central-line-induced infection. After applying his program at Johns Hopkins, Pronovost supervised its application at a hospital system in Michigan, in what was known as the “Michigan Keystone ICU Project.” Similar programs have since been implemented throughout the United States, as well as in England and Spain. The results have been dramatic: blood stream infections dropped by 66 percent, saving thousands of lives and millions of dollars.

The Keystone project includes gathering monthly data on infection rates, which are reported to the leaders of intensive care units and to top hospital officials. The results are discussed with the larger staff, with an eye to learning from mistakes. This is an instance of diagnostic metrics: data that can be used by a practitioner (a physician), used internally within an institution (a hospital), or shared among practitioners and institutions to discover what is working and what is not, and to use that information to improve performance.
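
To make the diagnostic use concrete, the sketch below shows the sort of simple monthly calculation such reporting involves. Expressing infections per 1,000 central-line days is the conventional form of the rate; the ICU and all of the figures here are invented for illustration, not drawn from the Keystone data.

```python
# A minimal sketch of a monthly central-line infection-rate report.
# The convention of expressing infections per 1,000 central-line days is
# standard; the unit name and numbers below are hypothetical.

def infection_rate(infections: int, line_days: int) -> float:
    """Central-line infections per 1,000 central-line days."""
    return 1000 * infections / line_days

# Hypothetical monthly data for one ICU: (month, infections, central-line days)
monthly_data = [
    ("January", 4, 620),
    ("February", 2, 580),
    ("March", 1, 610),
]

for month, infections, line_days in monthly_data:
    print(f"{month}: {infection_rate(infections, line_days):.1f} "
          f"infections per 1,000 line-days")
```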

The Keystone project involved extensive use of diagnostic metrics, as well as some psychic incentives in the form of peer pressure. Pronovost himself attributes its success to the fact that the project worked through clinical communities pursuing common professional goals and treating central line–induced infections as a solvable social problem. Seeing their infection rates compared with those of other hospitals also created peer pressure to keep up with, or exceed, the success rates of peer institutions.

WHAT SHOULD WE CONCLUDE FROM THESE SUCCESSES?

The Cleveland Clinic, Geisinger, and the Keystone project are frequently cited as proof of the efficacy of measuring performance, and with reason. Yet when we dig more deeply, we find that the metrics matter because of the way they are embedded into a larger institutional culture.

Is the success of the Cleveland Clinic a function of the fact that the Clinic publishes its outcomes? Or is the Clinic eager to publicize its outcomes precisely because they are so impressive? In fact, the Cleveland Clinic was one of the world’s great medical institutions before the rise of performance metrics, and it maintains that standing in the age of performance metrics. But to conclude that there is a causal relationship between the clinic’s quality and the publication of its performance metrics is to fall prey to the fallacy of post hoc ergo propter hoc. The success may have far more to do with local conditions—the ways in which the organizational culture of the Cleveland Clinic makes use of metrics—than with quality measurement per se.10

Metrics at Geisinger are effective because of the way in which they are embedded in a larger system. Crucially, the establishment of measurement criteria and the evaluation of performance are done by teams that include physicians as well as administrators. The metrics of performance, therefore, are neither imposed nor evaluated from above by administrators devoid of firsthand knowledge. They are based on collaboration and peer review. Geisinger also uses its metrics to continuously improve its performance in outpatient care for a variety of conditions. Here is how Glenn D. Steele, a physician who presided over the transformation of the Geisinger system as CEO, accounts for its successes: “Our new care pathways were effective because they were led by physicians, enabled by real-time data-based feedback, and primarily focused on improving the quality of patient care,” which “fundamentally motivated our physicians to change their behavior.” Crucial too was the fact that “the men and women who actually work in the service lines themselves chose which care processes to change. Involving them directly in decision making secured their buy-in and made success more likely.” What we can learn from the Geisinger example is the importance of having providers develop and monitor performance measures. The fact that the measures were in keeping with their own professional sense of mission was crucial.

Peter Pronovost, who spearheaded the reduction of central line infections, believes that “The Keystone ICU project demonstrated the potential of voluntary efforts that rely on intrinsic motivation through peer norms and professionalism.” He’s not opposed to supplementing these appeals with public reporting and monetary incentives. But his own interpretation is that the improvement in medical outcomes was brought about primarily by “a shift in clinicians’ belief—by showing them that the rate of infection was not inevitable and could be controlled, in a way that appealed to their professional ethos as doctors and nurses.”

However, the U.S. government’s Centers for Medicare and Medicaid Services responded by initiating public reporting of infection rates in 2011 and, a year later, by beginning to penalize hospitals with higher infection rates through withheld reimbursements. That created a structure of incentives very different from the one behind the institutional successes we’ve examined so far, which relied more on intrinsic than on extrinsic motivations.

THE BROADER PICTURE: METRICS, PAY-FOR-PERFORMANCE, RANKINGS, AND REPORT CARDS

When we dig deeper into the record of performance metrics in the field of medicine, the successes of the Cleveland Clinic, Geisinger, and Keystone seem more the exception than the rule.

Most of the professionals who write about medical metrics have a vested interest in the effectiveness of measuring performance. Their careers are based in no small part on the efficacy of gathering and analyzing data. Thus the many studies demonstrating the lack of efficacy or very limited efficacy of publicly released accountability metrics should be read as testimony against interest. The healthcare journals and academic literature are replete with such studies, as we’ll see. To be sure, they more often end with a plea for more data, more studies, and more refined metrics, rather than a bald declaration that metrics have proved futile.11 But the fact that these studies in failure come from those who are by no means antipathetic to measured performance makes them all the more significant.12

The argument for accountability and transparency is based on the premise that the public release of metrics of success and failure will influence the behavior of patients, professionals, and organizations. Patients will act as consumers, comparing the cost of care with relative success rates. Doctors will recommend patients to specialists with high performance scores. Insurers will flock to hospitals and providers who supply the best care at the lowest price. Doctors and hospitals will feel pressure to improve their scores, lest their reputation and their income suffer.13

To test whether the theory holds true in practice, a group of experts from the Scientific Institute for Quality of Healthcare (IQ Healthcare) at the Radboud University Nijmegen Medical Centre in the Netherlands examined the existing evidence on how publicly accessible performance information, covering a variety of health issues, affected the behavior of providers and of patients/consumers, as well as patient outcomes. They included controlled before-and-after studies, which compare behavior before and after the introduction of publicly available medical metrics for a wide range of conditions, such as heart attacks. The Dutch experts found that in some cases hospitals did indeed initiate improvements in their processes. But, contrary to the predictions of accountability advocates, there was no lasting effect on patient outcomes.

That may be a product of the relationship between medical research and medical practice. The populations upon which medical research is based differ from the real populations that doctors and hospitals treat. Plausible medical interventions (such as controlling blood sugar to try to prevent diabetes) are tested on relatively small groups of patients, and to isolate the effects of the intervention, such studies deliberately exclude patients with multiple medical problems. But in the real world, patients often do have multiple medical problems (comorbidities), so that the effect of the tested intervention often disappears. That might explain why simply following the recommended procedures does not necessarily lead to improved outcomes.14

Nor, according to the Dutch experts, did the publication of metrics affect patient behavior in choosing a provider or hospital. Their conclusion: “The small body of evidence available provides no consistent evidence that the public release of performance data changes consumer behavior or improves care.”15

Another prominent use of metrics is in pay-for-performance (P4P) schemes. Here the incentive structure is straightforward: physicians receive some substantial part of their remuneration for having reached some measured target, such as following recommended procedures (checklists), or cutting costs, or improving outcomes.

In the United Kingdom, the National Health Service (NHS) began to adopt P4P as a key feature of its compensation arrangements with primary care physicians in the mid-1990s, a feature that was extended by the Tony Blair administration. In the United States, private health plans and employer groups have increasingly adopted P4P programs, as have state governments. And P4P provisions are an important part of the remuneration that physicians receive from Medicare as part of the Affordable Care Act of 2010.16 Medicare administrators have tried to reward a variety of measured outcomes, including surgical results, using as a criterion the rate of survival until thirty days after surgery.

Another prominent form of medical metrics is the public ranking of doctors and hospitals in the form of “medical report cards.” New York State pioneered the publication of such data; in England, the Department of Health began in 2001 to publish annual “star ratings” for public healthcare organizations; and England recently became the first country to mandate the publication of “outcome data” for surgeons across nine surgical specialties. In 2015 the American news organization ProPublica published the complication rates of some 17,000 surgeons across the United States.17 Report cards and rankings are also published by the Joint Commission, the nonprofit medical accreditation body, and by private, for-profit outlets such as the website Healthgrades and US News and World Report. The notion behind all of these efforts is that doctors and hospitals will have an incentive to perform better in order to improve their reputations for safety and efficacy, and ultimately their market share of the potential patient population. For hospitals, these rankings are important for status and “brand management.”18

There is now a large social scientific literature on the impact of pay-for-performance and public performance metrics in the United States, the United Kingdom, and elsewhere. What is quite astonishing is how often these techniques—so obviously effective according to economic theory—have no discernible effect on outcomes.19

A recent study in the Annals of Internal Medicine, for example, looked at the fate of Medicare patients in the years since public reporting of hospital mortality rates began in 2009. According to the authors, “We found that public reporting of mortality rates has had no impact on patient outcomes. We looked at every subgroup. We even examined those that were labeled as bad performers to see if they would improve more quickly. They didn’t. In fact, if you were going to be faithful to the data, you would conclude that public reporting slowed down the rate of improvement in patient outcomes.”20 As if that were not enough of a problem, many of these public rankings, such as ProPublica’s surgical report card, are based on what experts regard as dubious criteria, as likely to be misleading as genuinely illuminating.21

Another recent report, this time from the Rand Corporation, came to similar conclusions. Most studies of pay-for-performance, it noted, examined process and intermediate outcomes rather than final outcomes, that is, whether the patient recovered. “Overall,” it reports, “studies with stronger methodological designs were less likely to identify significant improvements associated with pay-for-performance programs. And identified effects were relatively small.”22 Nor was this finding new. Social scientists who studied pay-for-performance schemes in the public sector in the 1990s concluded that they were ineffective. Yet such schemes keep getting introduced: a triumph of hope over experience, or of consultants peddling the same old nostrums.23

When metrics used for public rankings or pay-for-performance do affect outcomes, it is often in ways that are unintended and counterproductive. And whether productive or unproductive, they typically involve huge costs, costs that are rarely considered by the advocates of pay-for-performance or transparency metrics.

Among the intrinsic problems of P4P and public rankings is goal diversion. As a report from Britain notes, P4P programs “can reward only what can be measured and attributed, a limitation that can lead to less holistic care and inappropriate concentration of the doctor’s gaze on what can be measured rather than what is important.” The British P4P program led to lower quality of care for those medical conditions that were not part of the program. In short, it leads to “treating to the test.” And it is simply impossible to provide reliable criteria of measurement for the treatment of many patients, such as the frail elderly, who suffer from multiple, chronic conditions.24

Physician report cards create as many problems as they solve. Take the phenomenon of risk-aversion. Numerous studies have shown that cardiac surgeons became less willing to operate on severely ill patients in need of surgery after the introduction of publicly available metrics. In New York State, for example, the report cards for surgeons report on postoperative mortality rates for coronary bypass surgery, that is, what percentage of the patients operated upon remain alive thirty days after the procedure. After the metrics were instituted, the mortality rates did indeed decline—which seems like a positive development. But only those patients who were operated upon were included in the metric. The patients whom surgeons declined to operate on because they were higher-risk—and hence would bring down the surgeon’s score—were not included in the metrics. Some of these sicker patients were referred to the Cleveland Clinic, and so the outcomes of their procedures did not show up in the New York metrics. As a result of this “case selection bias” (that is, creaming), some sicker patients were simply not operated on. Nor is it clear that the improvement in postoperative outcomes in New York State was a result of the publication of the metrics. It turns out that the same improvement occurred in the neighboring state of Massachusetts, where there was no public reporting of data.25
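
A toy calculation makes the arithmetic of creaming plain. The patient counts and outcomes below are invented; the point is only that dropping the riskiest cases from the denominator improves a surgeon’s reported rate without improving anyone’s care.

```python
# A toy illustration of case selection bias ("creaming") in a thirty-day
# postoperative mortality metric. All patient counts and outcomes are
# hypothetical.

def reported_mortality(deaths: int, operations: int) -> float:
    """Share of operated-on patients who die within thirty days."""
    return deaths / operations

# Suppose a surgeon sees 100 candidates for bypass surgery:
#   90 routine cases, of whom 2 would die within thirty days of surgery
#   10 high-risk cases, of whom 2 would die within thirty days of surgery
routine_ops, routine_deaths = 90, 2
high_risk_ops, high_risk_deaths = 10, 2

# Operating on everyone: all 100 cases enter the metric.
rate_all = reported_mortality(routine_deaths + high_risk_deaths,
                              routine_ops + high_risk_ops)

# Declining the high-risk cases: they vanish from the denominator, and
# their outcomes, good or bad, are never measured at all.
rate_selected = reported_mortality(routine_deaths, routine_ops)

print(f"Operating on all candidates:  {rate_all:.1%} reported mortality")
print(f"Turning away high-risk cases: {rate_selected:.1%} reported mortality")
```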

The phenomenon of risk-aversion means that some patients whose lives might be saved by a risky operation are simply never operated upon. But there is also the reverse problem, that of overly aggressive care to meet metric targets. Patients whose operations are not successful may be kept alive for the requisite thirty days to improve their hospital’s mortality data, a prolongation that is both costly and inhumane.26

To be sure, there are some real advantages to publicly available metrics of surgeon success and of hospital mortality rates. In the case of surgeons, publication can expose very poor performers, who may then cease practicing—a sifting process all the more valuable in a profession in which practitioners are reluctant to dismiss incompetent fellow members of the guild. In the case of hospitals, the lower-level performers can take steps to improve their measured performance. But the tendency here, as with so many performance metrics, is to glean the low-hanging fruit, and then expect a continuingly bountiful harvest. That is to say, there are immediate benefits to discovering poorly performing outliers.27 The problem is that the metrics continue to be collected from everyone. And at some point the marginal costs exceed the marginal benefits.

Just how costly and burdensome the pursuit of ever more medical metrics has become is evident in a recent report from the Institute of Medicine.28 At major medical centers, the cost of reporting quality measures to government regulators and insurers amounted to 1 percent of net revenue. Administrative costs for measurement and related activities are estimated at $190 billion per year. Then there is the unmeasurable cost of providers entering data into the government’s Physician Quality Reporting System. Larger medical practices must pay external firms to enter the data; in smaller practices, it is sometimes left to the physicians themselves. In addition to the tangible costs of gathering, inputting, and processing this tsunami of data, there are the incalculable opportunity costs of what doctors and other clinicians might have done with the time they must devote to inputting data. Moreover, the time invested is largely uncalculated and uncompensated. It typically falls out of consideration when medical costs are discussed.29 “Ironically,” the Institute of Medicine study reports, “the rapid proliferation of interest, support, and capacity for new measurement efforts for a variety of purposes—including performance assessment and improvement, public and funder reporting, and internal improvement initiatives—has blunted the effectiveness of those efforts.”

Reporting requirements have become so burdensome and redundant that Donald M. Berwick, a leading advocate of improvement through measurement who served as Administrator of the Centers for Medicare and Medicaid Services from 2010 to 2011, recently declared, “We need to stop excessive measurement…. I vote for a 50 percent reduction in all metrics currently being used.”30

Add to this the psychic costs of treating medicine as if it were primarily a profit-making enterprise. Berwick captured this brilliantly in his article, “The Toxicity of Pay for Performance”:

“Pay for performance” reduces intrinsic motivation. Many tasks, especially in health care, are potentially intrinsically satisfying. Relieving pain, answering questions, exercising manual dexterity, being confided in, working on a professional team, solving puzzles, and experiencing the role of a trusted authority—these are not at all bad ways to spend part of one’s day at work. Pride and joy in the work of caring is among the many motivations that do result in “performance” among health care professionals. In the rancorous debates about compensation, fees, and reimbursement that so occupy the time of health care leaders and clinicians today, it is all too easy to neglect, or even to doubt, the fact that nonfinancial and intrinsic rewards are important in the work of medical care. Unfortunately, neglecting intrinsic satisfiers in work can inadvertently diminish them.31

Berwick’s article appeared more than two decades ago. It seems to have had no effect. The tidal wave of pay-for-performance continues to rise.

A TEST CASE: REDUCING READMISSIONS

Among the most touted uses of measurement are Medicare’s metrics for unplanned readmissions to hospitals within thirty days of discharge, which demonstrate both the promise and the problems of metrics. Hospital admissions are expensive, and one motive has been to reduce costs. Readmissions were also thought to be a result of inadequate patient care, so lowering the number of readmissions would be a sign of improved care. In 2009, Medicare began the public reporting of thirty-day readmission rates for all acute care hospitals, a form of transparency metrics. The readmission metric covered patients who had been treated for major medical conditions (heart attacks, heart failure, strokes, pneumonia, chronic obstructive pulmonary disease, coronary artery bypass) and for two common surgical procedures, hip or knee replacements. (The metrics are publicized on Medicare’s “Hospital Compare” website.) Then in 2012, Medicare went from public reporting to paying for performance, imposing financial penalties on hospitals with higher-than-average rates.32 The public reporting of performance and the monetary penalization of failure served as a stimulus for hospitals to take measures to limit readmissions and thereby to cut costs. Hospitals began taking additional steps to try to ensure that patients leaving the hospital would not have to return, including better coordination with primary care providers and efforts to ensure that patients had access to the medicines prescribed to them. The fines levied upon low-performing hospitals were intended to motivate them to provide better care for their patients, so that they would not have to return to the hospital.

Hospital readmissions have indeed declined, a much-touted success for performance metrics. But how much of that success is real?

The falling rate of reported readmissions was due in part to gaming the system: instead of formally admitting returning patients, hospitals placed them on “observation status,” under which the patient stays in the hospital for a period of time (up to several days), and is billed for outpatient services rather than an inpatient “admission.” Alternatively, the returning patients were treated in the emergency room. Between 2006 and 2013, such observation stays for Medicare patients increased by 96 percent. That meant that about half the drop in readmissions was actually due to patients who had in fact returned to the hospital but were treated as outpatients. (To complicate matters, a later analysis indicated that the hospitals that lowered their readmission rates were not the ones that increased the number of patients under observation.) The metrics of readmission thus improved, but not necessarily the quality of patient care.
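
A similarly hypothetical calculation shows how reclassification alone can move the readmission metric. The figures below are invented and simply mirror the mechanism just described: returning patients who are booked as observation stays or emergency-room visits drop out of the readmission count, though they are back in the hospital all the same.

```python
# A toy illustration of how "observation status" flatters a thirty-day
# readmission rate. All figures are hypothetical.

def readmission_rate(readmissions: int, discharges: int) -> float:
    """Share of discharged patients formally readmitted within thirty days."""
    return readmissions / discharges

discharges = 1000          # index discharges during the measurement period
returned_within_30 = 200   # patients who came back within thirty days

# Before: every returning patient is formally readmitted.
rate_before = readmission_rate(returned_within_30, discharges)

# After: half of the returning patients are kept on observation status or
# treated in the emergency room, so they never enter the readmission count,
# even though they are back in the hospital.
reclassified = 100
rate_after = readmission_rate(returned_within_30 - reclassified, discharges)

print(f"Reported readmission rate before reclassification: {rate_before:.1%}")
print(f"Reported readmission rate after reclassification:  {rate_after:.1%}")
print("Number of patients actually returning to the hospital: unchanged")
```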

Not all hospitals gamed the system: some really did examine and refine their procedures to actually improve patient outcomes and lower Medicare costs by reducing readmissions. But others simply improved their ability to manipulate the labels under which patients were categorized in judging performance.33

There were other negative consequences. As of 2015, about three-quarters of the reporting hospitals were penalized by Medicare. Tellingly, major teaching hospitals—which tend to see more difficult patients—were disproportionately affected.34 So were hospitals in poverty-stricken areas, where patients were less likely to be well taken care of (or to take care of themselves) after their initial discharge from the hospital.35 Attaining the goal of reduced readmissions depends not only on the steps that the hospital takes to educate the patient and provide necessary medications, but also on many factors over which the hospital has little control: the patient’s underlying physical and mental health, social support system, and behavior. Such factors point to another recurrent issue with medical metrics: hospitals serve very different patient populations, some of whom are more prone to illness and less able to take care of themselves once discharged. Pay-for-performance schemes try to compensate for this by what is known as “risk adjustment.” But calculations of the degree of risk are at least as prone to mismeasurement and manipulation as other metrics. In the end, hospitals that serve the most challenging patient population are most likely to be penalized.36 As in the case of schools punished for the poor performance of their students on standardized tests, by penalizing the least successful hospitals, performance metrics may end up exacerbating inequalities in the distribution of resources—hardly a contribution to the public health they are supposed to improve.
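
To see what the “risk adjustment” mentioned above amounts to in its simplest form, the sketch below compares a hospital’s observed readmissions with the number expected given its patient mix, using an observed-to-expected ratio. The patient categories, risk weights, and counts are invented; real schemes rely on far more elaborate statistical models, which is precisely where the room for mismeasurement and manipulation lies.

```python
# A minimal sketch of risk adjustment via an observed-to-expected ratio.
# The patient categories, risk weights, and counts are invented; real
# schemes (such as Medicare's) use far more elaborate statistical models.

# Hypothetical expected thirty-day readmission risk by patient category.
expected_risk = {"low": 0.08, "medium": 0.18, "high": 0.35}

def expected_readmissions(patient_mix: dict) -> float:
    """Sum of expected risks over a hospital's discharged patients."""
    return sum(count * expected_risk[category]
               for category, count in patient_mix.items())

# A hospital serving a sicker-than-average patient population.
patient_mix = {"low": 300, "medium": 450, "high": 250}
observed = 190  # readmissions actually observed (hypothetical)

expected = expected_readmissions(patient_mix)
print(f"Expected readmissions given patient mix: {expected:.1f}")
print(f"Observed readmissions:                   {observed}")
print(f"Observed-to-expected ratio:              {observed / expected:.2f}")
# A ratio above 1.0 reads as "worse than expected," but only if the expected
# risks themselves have been estimated and coded accurately.
```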

A BALANCE SHEET

Most healthcare delivery organizations now use metrics for quality improvement purposes, from bettering outcomes for specific procedures to optimizing operations for an entire institution. This internal use of performance metrics is of great value in helping hospitals and other medical institutions to enhance the safety and efficacy of their medical care. But metrics tend to be most successful for those interventions and outcomes that are almost entirely controlled by and within the organization’s medical system, as in the case of checklists of procedures to minimize central line–induced infections. When the outcomes depend upon more wide-ranging factors (such as patient behavior outside the doctor’s office and the hospital), they become more difficult to attribute to the efforts or failures of the medical system. Geisinger’s success in managing population health offers hope. But it does so in a context in which diagnostic metrics form part of a larger institutional culture, one in which such metrics are developed and evaluated by practitioners, in keeping with their professional ethos.

The use of metrics to reward performance, whether monetarily or reputationally, is much more problematic. There is increasing resort to metrics tied to monetary incentives and public rankings. Whether, on balance, they add more to the benefits of healthcare than to its costs remains an open question.
