CHAPTER 10

Converging Data for Better Healthcare

Digital healthcare data have become abundant due to the conversion from paper to electronic health records (EHRs) in the past 10 years. In addition to EHRs, much more data are being captured from medical devices, remote monitoring, sensors, fitness devices, and health-related apps on mobile devices. There are also more sources of nonmedical data on individuals, such as financial records, shopping history, and social media activity. Together, these multiple sources of data provide a wealth of information on an individual that can be analyzed to determine the factors that contribute the most to a person’s health and illness or potential illness. This chapter describes market research, big data versus small data, how data are used for predictions, and precision medicine.

The analysis of big data needs to be guided by a strategy and purpose, which will also determine what data sources are likely to contribute the most to the new insights. While both qualitative and quantitative data are present in healthcare, it is the quantitative data now being captured in EHRs, together with patient-generated data, device and sensor data, genomic data, and medication data, that, combined with other data sources, hold the promise of delivering actionable information.

Super-Sized Data

Big data is the revolution du jour in both the commercial marketplace and healthcare. People in the industry describe it as delivering solutions and insights that were never possible in the small data world.

What is big data? It’s not simply a single large dataset; it is connecting multiple large data sources, some novel to healthcare, to solve problems and provide insights.

There are multiple definitions of big data. The traditional Oxford English Dictionary (OED) definition is “data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges.”

The most common definition1 of big data is that it exhibits the “3Vs”—volume, velocity, and variety. A fourth V, veracity, is often added.

Volume—scale of the data. The sheer amount of data being generated and stored.

Velocity—analysis of streaming data. Sensors collecting real-time data and transmitting it for storage.

Variety—different forms of data. Social media has multiple forms, from Facebook to YouTube videos. Every wireless monitor and sensor might have its own data form.

Veracity—uncertainty of the data. Moving from small samples to big data means that structured and unstructured data are collected and combined. Data sources may contain imprecise data that are included in analysis.

Another definition, which is relevant to this chapter, comes from authors of the book Big Data by Viktor Mayer-Schönberger and Kenneth Cukier: “The ability of society to harness information in novel ways to produce useful insights or goods and services of significant value” and “… things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value.”

To put some perspective around big data, one needs to become familiar with units of storage. Most of us are used to gigabytes (109) and terabytes (1012) in terms of storing data on PCs and servers. Big data is referenced in exabytes (1018), zettabytes (1021), and yottabytes (1024). Simply calculated, each increase is 1,000 times the previous unit—1,000 exabytes = 1 zettabyte, and 1,000 zettabytes = 1 yottabyte.
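
To make these magnitudes concrete, the short Python sketch below converts between the decimal (SI) storage units just mentioned. It is illustrative only; the unit table and example quantities are assumptions for the sake of the arithmetic.

    # Illustrative only: convert between decimal (SI) storage units.
    UNITS = {"GB": 10**9, "TB": 10**12, "EB": 10**18, "ZB": 10**21, "YB": 10**24}

    def convert(value, from_unit, to_unit):
        """Convert a quantity expressed in one SI storage unit into another."""
        return value * UNITS[from_unit] / UNITS[to_unit]

    print(convert(1, "ZB", "EB"))    # 1 zettabyte = 1,000 exabytes -> 1000.0
    print(convert(2.5, "EB", "TB"))  # 2.5 exabytes expressed in terabytes -> 2,500,000.0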

Every day, 2.5 exabytes of data are created. Ninety percent of the world’s data has been created in the past two years. This mind-boggling amount of data is collected and stored in social network profiles, blogs, mobile devices, cloud applications, data warehouses, legacy documents, and more. Healthcare’s data, like that of other industries, will grow at an annual rate of 40 percent, meaning it won’t take long to reach zettabyte capacity.

The following are some examples of how big data are being used in other industries:

  • IBM’s Watson computer became famous for winning the game show Jeopardy! Banks now use Watson to detect fraud. Through partnerships with 14 of the top cancer clinics in the country, Watson will help doctors select the best possible treatment based on the cancer’s DNA.

  • Amazon created algorithms from customer data to recommend purchases customized for each shopper.

  • Google used big data to improve spell-checking software by mining word searches of millions of users.

Some drivers pushing the big data movement in healthcare are:

  • Need for health information exchange (HIE) to meet CMS meaningful use incentives.

  • Cost control through value-based purchasing by Medicare, Medicaid, and private payers.

  • Emphasis on patient engagement and patient-centered care.

  • Changes in technology that enable access to and utilization of big data sources.

  • A need to address population health.

Big Data Attack

We live in a world of hyperdata, the evolution of big data. Big data comes from new and nontraditional sources—thanks to advancements in technologies that can process vast quantities of data faster than ever before. Not only can we capture more data, but it also can be stored in ever smaller and cheaper physical spaces. “Hyperdata comes to you on the spot, and you can analyze it and act on it on the spot,” said Bernt Wahl, an industry fellow at the Center for Entrepreneurship and Technology at the University of California, Berkeley. “It will be in regular business soon, with everyone predicting and acting the way Amazon instantaneously changes its prices around.”

The ability to link data from many sources and utilize real-time data can provide insights and answer key problems. Where are some of the big sources of healthcare data? In addition to EHRs’ clinical data and payers’ claim data, data come from genomics and proteomics, patient-held devices such as Fitbits that generate real-time data, and remote monitoring devices.

Apps are another huge source of rapidly growing data for healthcare. Due to the high number of free or relatively inexpensive apps available for diet and exercise, 20 percent of smartphone users have at least one health app on their phone. About 30 percent of adults share their health information on social media with other patients, and a little less than 50 percent share their information with their doctors and hospitals.

Information shared by patients can be a source of timely feedback on the side effects and effectiveness of drugs. For example, the website Iodine provides expert medical information on specific conditions; lists easy-to-understand upsides and downsides of a medication; compares treatments; finds top-rated medications; describes side effects; shows pictures of medicines; suggests alternatives; and collects overall patient experiences that can be filtered by gender and age, as well as detailed reviews. As consumers demand more information, sites like Iodine that provide feedback based on personal experiences offer valuable supplemental information on healthcare products and services.

Qualitative and Quantitative Research

Market research has an increasingly important role in our healthcare system. If a healthcare company wants to find out what a customer wants, test its advertising, gauge reaction to a new product, or measure the strength of its brand, it can use a variety of market research techniques to build the story.

Market research is commonly divided into two types of techniques: qualitative and quantitative.

Qualitative research is generally exploratory, while quantitative research provides a measure of how many people think, feel, or behave in a certain way and uses statistical analysis to determine the results. In qualitative research, focus group discussions and in-depth interviews have always been very popular. Traditionally, qualitative research has been conducted in person, although numerous qualitative approaches to data collection can now be employed online.

Over the past decade, ethnography has become increasingly popular as a qualitative tool in market research to observe the behavior of people in a natural context in order to identify unmet needs and new product opportunities. Some companies now offer approaches to “mobile ethnography” in which the participant records what they do on their smartphone—including taking photos and video clips.

As computing power has increased and technology has enabled new means of data collection, quantitative market research has seen an explosion of advanced analytic methods. At its simplest level, most quantitative research consists of a structured survey conducted among the target market. Closed-end and open-end questions are administered with complex skip patterns via web-hosted surveys that people take on their computer or smartphone. Responses are then tallied and the analysis forms the basis of a report.

Research participants are provided by companies who specialize in recruiting people who are willing to participate and fit the study criteria.

Many different types of advanced quantitative tools are now available for predicting consumer preference and choice.

  • Discrete choice analysis. Respondents are presented with a choice of products that vary systematically on different product features. In the case of computers, this might be screen size, battery life, processor type, brand, or price. Analysis of these data enables a company to predict consumer preference, and even expected revenue and profit when other variables are included, for different types of products comprised of different features. This predictive power can help save a company millions of dollars in product development, manufacturing, and marketing costs.

  • Marketing Return on Investment Analysis (MROI) involves the analysis of large datasets that include marketing variables such as spending on TV, print, radio, digital, and outdoor advertising in conjunction with sales data, macro-economic variables, seasonal sales patterns, brand awareness measures, and so on. The results of MROI analysis help guide company decision making on how to adjust marketing expenditures to achieve its business goals.

  • Segmentation research entails surveys among the population to determine how different customers cluster into groups that have similar tastes, attitudes, media habits, buying patterns, and so on. The segments are often illustrated via very visual “personas” that typify a segment member. When these segments are socialized within a company, they become very powerful communication tools so that everyone understands who their customers are—from the boardroom to the retail floor.

  • Customer satisfaction research surveys users of products or services on different dimensions of engagement or interaction (e.g., experience with a bank’s website, phone support, and in-person tellers). Regression analysis of the data can help identify the key drivers of customer satisfaction and any problem areas that need to be addressed (a sketch of this kind of analysis follows this list).
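
As an illustration of the key-driver analysis described in the last item, the Python sketch below regresses an overall satisfaction score on ratings of individual touchpoints. The respondents, column names, and resulting weights are hypothetical; a real study would use actual survey responses and a more careful statistical model.

    import numpy as np

    # Hypothetical survey data: each row is one respondent's 1-10 ratings.
    # Columns: website experience, phone support, in-person service, overall satisfaction.
    ratings = np.array([
        [8, 6, 9, 8],
        [5, 4, 7, 5],
        [9, 7, 8, 9],
        [6, 5, 6, 6],
        [7, 8, 9, 8],
        [4, 3, 5, 4],
    ])

    X = ratings[:, :3]                          # touchpoint ratings (candidate drivers)
    X = np.column_stack([np.ones(len(X)), X])   # add an intercept term
    y = ratings[:, 3]                           # overall satisfaction

    # Ordinary least squares: each coefficient estimates a driver's weight.
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    for name, b in zip(["intercept", "website", "phone", "in-person"], coef):
        print(f"{name}: {b:.2f}")

The larger a driver’s coefficient, the more that touchpoint is associated with overall satisfaction, which is how problem areas are prioritized.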

Current trends in market research point to the “power” of big data and analytics. As more and more information becomes available about customers, including their online behaviors, exposure to advertising, shopping patterns, and purchase history, advanced analytics can increasingly predict well in advance what customers are going to do before they do it. All of this information can be used to maximize sales as well as customer loyalty and retention.

Making Data Meaningful

Data by themselves are not useful; it is what we do with them that matters. It is the collection, merging, and analysis of data that brings the insights and information to solve problems, improve operations, and deliver care that is tailored to the patient for better results. One ideal way to use big data is to collect clinical, genomic, claims, outcomes, and social data to create predictive analytics. The benefit of big data is taking information and converting it into a quantifiable format that can be used in analyses to produce insights and predictions.

The capability to analyze big data has improved in the past several years with advances in technology and tools to collect and conduct analyses on real-time data. For instance, unstructured data such as physician notes can be quickly searched using tools like Python (an open-source programming language), R (a free statistical programming language), Hadoop, and natural language processing libraries.
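
As a minimal illustration of searching unstructured notes, the Python sketch below scans free-text physician notes for a term and a common abbreviation. The notes, the pattern, and the patient IDs are made up; a production system would use a full natural language processing pipeline rather than simple pattern matching.

    import re

    # Hypothetical free-text physician notes keyed by patient ID.
    notes = {
        "patient-001": "Pt reports shortness of breath on exertion. Hx of CHF.",
        "patient-002": "No acute distress. Follow up in 6 months.",
        "patient-003": "Complains of SOB and ankle swelling; consider diuretic.",
    }

    # Match "shortness of breath" or the common abbreviation "SOB".
    pattern = re.compile(r"shortness of breath|\bSOB\b", re.IGNORECASE)

    for patient_id, text in notes.items():
        if pattern.search(text):
            print(f"{patient_id}: possible dyspnea mention -> {text}")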

Healthcare has more data available for analyses digitally than ever before. In addition to formal practitioner medical records, healthcare consumers are creating their own data through fitness devices, home health devices such as blood pressure monitors and scales, and watches. The exabytes of data filling health systems pose new challenges for healthcare leaders:

  • How can the data be turned into information that can be acted upon?

  • When is it needed?

  • What’s the best way to use big data?

  • Who in the organization is responsible for coming up with a plan for data analysis?

  • Are new resources and skills necessary to make the data useful?

  • What are the priorities for information to improve the organization?

  • How much will it cost and what is the return on investment?

Clearly, these questions take big data out of the realm of being the sole responsibility of IT departments, and into the realm of clinical care leaders and administrators.

Five Reasons to Harness Hyperdata

The television show “Criminal Minds” has a data analyst, Penelope Garcia, who queries databases to create a list of suspects based on pieces of information using multiple sources. With each additional piece of information from her colleagues, she narrows down the list of suspects until she reaches one person who fits all, or most of, the known facts. Penelope hunts for the person by cross-analyzing many data sources. Once the suspect is identified, she can read details about the person from criminal records, news stories, high school yearbooks, military records, property ownership, family history, medical records, and many more data troves across the nation in addition to public data and social media. This analysis can be done quickly with the right input sources.

This would be ideal in the healthcare world. But, in reality, most healthcare systems don’t have access to all the sources of data that they would like. In fact, many don’t even know what database sources they want to use in their analyses. As more companies collect and sell data to third-party users both inside and outside healthcare, this will inevitably change.

The five biggest reasons to harness hyperdata are the following:

#1. Internet of Things

The analyst firm Gartner projects that by 2020 there will be more than 25 billion connected devices in the Internet of Things (IoT). In terms of healthcare, any device that generates data about a person’s health and sends that data into the cloud will be part of the IoT; wearables are the most common example. Many people now wear fitness devices, use apps that track their progress, and use medical devices that send data into the cloud.

Data from IoT will be available to integrate into analyses. The question will be figuring out what data are relevant for each analytical path.

#2. Consumer Engagement

The consumer is now at the center of the healthcare marketplace. The volume of health information is increasing and can help consumers make informed decisions about care, and patients are becoming increasingly empowered to actively participate in their own health. As transparency increases, more information is available to consumers for decision making. New companies are merging the quality and cost data so that it is easier to use, and creating apps to support consumers engaging in their own care.

#3. Reduce the Cost of Healthcare

Big data can be used to predict who needs healthcare and when. Knowing what treatments work can help better target care for the most expensive patients, reduce unnecessary re-admissions, avoid adverse events, and more. Existing conditions can be diagnosed earlier, which results in better patient outcomes. And that ultimately means a higher-quality and less expensive healthcare system. Data sources will need to include social and retail data to find lifestyle indicators. The objective is to understand as much as possible about a consumer as early as possible, picking up signs of serious illness early enough that treatment is simpler, more effective, and less expensive than it would be after a later diagnosis when the disease has advanced.

#4. Potential for Precision Medicine

There are many ways that data can be a catalyst for precision medicine. Linking medical treatment data and outcomes to an individual’s characteristics has tremendous potential for breakthroughs. Drugs can be targeted using unique patient characteristics and genomic data to determine the most effective and safe treatment. In addition, combining data sources such as tumor registries with specific de-identified patient information can be useful in feeding models to predict disease treatments.

#5. Advancing Population Health

One definition of population health is the health outcomes of a group of individuals including the distribution of outcomes within the group.2 Massive databases and information generated through social media are used to support public health surveillance, including detecting disease outbreaks, new infections, and patterns of patient harm. Actionable data are necessary to get the benefits from big data.

One widely used example of big data happened in 2009, when Google predicted the spread of a new flu virus, H1N1, in the United States down to specific regions and states. With no vaccine available and fear that this could become a worldwide pandemic, public health officials’ strategy was to slow the spread of H1N1. In order to do that, they needed to know where the flu had already hit. The CDC’s process of collecting flu reports takes two weeks. This was too slow for public health agencies to prepare for the flu hitting their area. Google could tell public health agencies exactly where the flu was in real time by looking at Internet searches. Their volume of data, three billion search queries a day, gives them hyperdata to immediately analyze. The story3 is that 450 million different models were created to compare their predicted cases against the CDC’s 2007 and 2008 actual flu cases. Google found 45 search terms that, when fed into the model, had a strong correlation with the location of flu cases. Google’s model could tell where the flu had spread in real time and provide the information to public health officials in time to be useful and warn ERs of potentially high numbers of flu patients.
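
The flu work screened candidate search terms by how well their query volume tracked historical case counts. The Python sketch below shows the general idea, ranking a few hypothetical terms by their correlation with weekly case counts; the numbers are invented and the real models were far more elaborate than a single correlation.

    import numpy as np

    # Hypothetical weekly flu case counts for one region.
    cdc_cases = np.array([120, 150, 210, 340, 500, 620, 580, 430])

    # Hypothetical weekly query volumes for candidate search terms.
    search_terms = {
        "flu symptoms":      np.array([900, 1100, 1600, 2500, 3800, 4600, 4300, 3200]),
        "fever and chills":  np.array([400, 450, 700, 1000, 1500, 1900, 1750, 1300]),
        "basketball scores": np.array([2000, 2100, 1900, 2050, 2000, 1950, 2100, 2000]),
    }

    # Rank terms by their correlation with reported cases; high-r terms are kept.
    ranked = sorted(
        ((np.corrcoef(volume, cdc_cases)[0, 1], term) for term, volume in search_terms.items()),
        reverse=True,
    )
    for r, term in ranked:
        print(f"{term}: r = {r:.2f}")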

David’s Data Versus Goliath’s Data

The right data can lead to breakthroughs in healthcare, whether that’s through medical research or drug trials.

The scientific method has guided research for decades using statistical techniques for small and large sample sizes. It is the standard by which research is evaluated and judged to be a good study or one that lacks rigor and therefore has questionable results. It begins with a theory that explains the world in some way. From that theory, a hypothesis is stated and then tested against reality. Statistical analyses are applied to the results to determine how reliable they are, and whether they show a true difference or one due to chance.

Samples of the population are used because it is not practical, or affordable, to find all individuals who meet the objective of the study, recruit them, and match controls to test subjects. Big datasets, at their best, contain all or nearly all the data—n=all. Samples, sampling techniques, and sampling errors are not involved when n=all. Sampling is a shortcut used when all the data are not available. Big data provides data at the macro level while losing accuracy at the micro level. As Mayer-Schönberger and Cukier articulate in their book4 about the subject, big data is messy, whereas samples of data are curated, precise, and limited. This is especially true of healthcare, where the many variables in people, including their lifestyle, diet, and health history, in addition to habits, financial means, and purchase history, make the data messy.

To understand the nuances, it’s important to understand the differences between big data and small data. Small data, or data samples, are used to find causality in controlled experiments. Demonstrating cause and effect is extremely difficult in healthcare. Experiments provide insight into cause and effect by demonstrating what outcome occurs when a particular factor is manipulated. Experiments vary greatly in goal and scale, but always rely on repeatable procedure and logical analysis of the results.

A sample is carefully selected to test a hypothesis. The sample is divided into groups. There must be a control group, one to which no changes are applied. In the other group, the variables being tested are changed—usually one at a time—while holding all other variables constant. Changes are measured and the results are compared with the control group. This method is used to prove causality, meaning that one factor causes the change in another variable. Ideally, all variables in an experiment are controlled (accounted for by the control measurements). In such an experiment, if all controls work as expected, it is possible to conclude that the experiment works as intended and that the results are due to the effect of the tested variable.
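
The comparison of a treatment group against a control group is usually summarized with a significance test. The Python sketch below applies a two-sample t-test to hypothetical outcome measurements to ask whether the observed difference between group means is likely due to chance; the data and the 5 percent threshold are illustrative assumptions only.

    from scipy import stats

    # Hypothetical outcome measurements (e.g., reduction in blood pressure).
    control   = [2.1, 1.8, 0.5, 1.2, 2.4, 1.0, 1.6, 0.9]
    treatment = [4.8, 5.2, 3.9, 6.1, 4.4, 5.7, 4.9, 5.3]

    # Two-sample t-test: is the difference between group means due to chance?
    t_stat, p_value = stats.ttest_ind(treatment, control)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    if p_value < 0.05:
        print("Difference unlikely to be due to chance at the 5% level.")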

In medical research that involves humans as the subjects, there is a concern about bias in the sample selected or in the conditions (environmental, lifestyle, physiology) of the individuals. Age, gender, and smoking habits are general characteristics that are controlled (characteristics matched across groups) in medical research so they do not have an impact on the results. Researchers, depending upon each study’s objectives, control additional characteristics.

In most research, the investigator attempts to eliminate sample bias by getting a random sample of subjects. For example, people who live in rural areas may not have access to high-speed Internet connections, whereas urban residents have high-speed access 24 hours a day. A study looking at usage of online health visits would underrepresent that usage if the sample included only rural residents. These kinds of biases are what researchers try to avoid when selecting their sample. The sample’s goal is to be representative of the target population.

Drug trials are probably the most commonly known controlled test experiments among the public. In fact, there have been cases where a drug being tested has such an overwhelming positive effect on patients that the clinical trials were ended early because it was considered to be unethical to allow the control group to not receive the benefits.

What’s the difference between analyses of small data and big data? Unlike small data, big data analyses are not controlled experiments. All the data, not a sample, are available to be analyzed, so the analysis is an observational study or natural experiment. Natural experiments are empirical studies in which an individual’s exposure to the experimental or control conditions is determined by nature rather than by deliberate assignment by the investigators.

Examining big data is different from examining small data because the data are not clean or scrubbed, but associations between different variables are more visible and are treated as correlations that require further detailed examination. The same association may also show up in small data analyses but be discarded because the sample size made the association weak. Or the association may not have appeared at all in small data because only one variable is altered at a time in order to find cause and effect.

In data where n=all, what is observed is what happened. The associations that show up are real and not caused by testing or test conditions. Spurious correlations—when the correlation between two variables is not due to any direct relation between them, but to an unknown variable or coincidence—must be eliminated in any credible investigation.

Correlations can explain the what, but not the why, as we see next.

The Power to Predict

Predictive analytics are used in business to foresee events before they happen. Predictive models take into account the various characteristics of an individual in order to derive a single score, and models are built using machine learning to crunch the data.

These predictions can be applied to diagnostic equipment to find signs of wear and tear before critical instruments break down (preventing shortages), to manage population health by preventing debilitating and costly conditions among cohorts, to predict infections from methods of suturing, to determine the likelihood of disease, to help a physician with a diagnosis, and even to predict future wellness.

Until now, doctors have been the analytic machines that predict what a patient will do or the outcome of treatment. Physicians are smart, well trained, and do their best to stay up-to-date with the latest research, but it is impossible to commit to memory all the knowledge they need for every situation. In addition, they probably don’t have all of the information they need at their fingertips, nor do they have time to compare treatment outcomes for all the diseases they encounter. To treat a patient, the information needs to be analyzed in conjunction with the patient’s own medical profile. This type of in-depth research and statistical analysis is not possible in a busy physician’s daily workload.

Predictive analytics uses technology and statistical methods to analyze massive amounts of information to predict outcomes for individual patients. That information can include data from past treatment outcomes as well as the latest medical research published in peer-reviewed journals and databases.

There are two major ways that predictive analytics differs from traditional statistics:

  • Predictions are made for individuals, and not for groups.

  • They do not rely upon a normal bell-shaped (Gaussian) curve.

Predictive models learn from their predictions when data are fed back into the model, a process called machine learning. The models are constantly being updated. Prediction modeling uses techniques such as artificial intelligence to create a prediction algorithm from data on past individuals. The model is then deployed so that a new individual can instantly get a prediction for whatever the need is, such as an accurate diagnosis or drug treatment. The difference between the “physician as the machine” and a model is that the predictive analytic system can take in massive amounts of data on a continuous basis to refine each prediction.
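
A minimal sketch of this train-then-score loop follows, assuming scikit-learn’s logistic regression and a handful of made-up patient attributes and outcomes. The features, labels, and model choice are illustrative assumptions, not the method of any particular system; real clinical models require far more data, validation, and governance.

    from sklearn.linear_model import LogisticRegression

    # Hypothetical historical data: [age, BMI, smoker (0/1)] and whether the
    # patient was readmitted within 30 days (the outcome to predict).
    X_train = [
        [45, 24.0, 0],
        [62, 31.5, 1],
        [70, 28.0, 1],
        [38, 22.5, 0],
        [55, 35.0, 1],
        [29, 21.0, 0],
    ]
    y_train = [0, 1, 1, 0, 1, 0]

    model = LogisticRegression()
    model.fit(X_train, y_train)

    # Score a new individual instantly: predicted probability of readmission.
    new_patient = [[58, 30.0, 1]]
    probability = model.predict_proba(new_patient)[0, 1]
    print(f"Predicted readmission probability: {probability:.2f}")

    # As new outcomes arrive, they are appended to the training data and the
    # model is refit, which is the feedback loop described above.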

Predictive models are used to score an individual based on his attributes. The model computes the score from an algorithm that can be built in many ways. One method is to assign a weight to each characteristic and add the weights up. In a model of life expectancy, for instance, being male might increase the score by 5.2, being over 65 might increase it by 10.0, and so on, to obtain a predictive score. Another method is to use rules. This might look like:

IF the individual

Is under 40

AND does not get flu shots

AND exercises 4 days per week

AND reads nonfiction

AND has a college degree

THEN he has a probability of 49.5 percent of buying a catastrophic health plan.

Models can be made more complex using formulas that improve the predictive scores.
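
A minimal Python sketch of the two scoring approaches just described follows, using the hypothetical weights and the hypothetical rule from the examples above; the numbers carry no clinical or actuarial meaning.

    def weighted_score(person):
        """Additive scoring: each attribute contributes a fixed weight (illustrative values)."""
        score = 0.0
        if person["sex"] == "male":
            score += 5.2
        if person["age"] > 65:
            score += 10.0
        return score

    def rule_score(person):
        """Rule-based scoring mirroring the IF/AND/THEN example above."""
        if (person["age"] < 40
                and not person["gets_flu_shots"]
                and person["exercise_days_per_week"] == 4
                and person["reads_nonfiction"]
                and person["has_college_degree"]):
            return 0.495   # probability of buying a catastrophic health plan
        return None        # the rule does not apply to this person

    person = {"sex": "male", "age": 33, "gets_flu_shots": False,
              "exercise_days_per_week": 4, "reads_nonfiction": True,
              "has_college_degree": True}
    print(weighted_score(person), rule_score(person))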

Predictive Analytics Basics

  • Does not tell you what will happen in the future; it can only predict the probability of an event in the future.

  • Correlation does not imply causation. A predictive relationship between two characteristics does not mean one causes the other—not directly or indirectly.

  • The objective is to predict what will happen, not why.

  • The value is to drive decisions from many individual predictions.

  • Takes the data you have, crunches it, and predicts the data that you didn’t have for an individual.

  • Predicts multiple futures with associated probabilities.

  • Ethics are involved when making certain predictions of individual behavior. For example, if an individual is predicted to commit a crime, he is arrested before the crime is committed. He has no chance to prove that he was not going to commit the crime.

Establishing useful strategies to use the data to gain insights that can be implemented in real time to impact patient care, enhance provider decision making, and reduce costs is key.

The ability to link EHR data to electronic health insurance data creates a robust inventory for powering predictive analytics. The data from patient-generated devices and correlation analysis can be used to identify patterns and habits of patients who are likely to live until 100 years as well as those who are likely to fall ill to a chronic disease.

Streaming data from monitors can also be used to identify patterns of when something is going wrong, so that a warning or alert is relayed to providers and an intervention is made to avert a worsening situation. The monitoring data collected over time can reveal patterns that could easily be missed by people, simply because no single person is present 24 hours a day, seven days a week.

One early example of how predictive analysis helps save lives is the work of Dr. Carolyn McGregor, Canada Research Chair in Health Informatics at the University of Ontario Institute of Technology. Dr. McGregor examined big data to identify premature babies (preemies) who are at an extremely high risk of developing infections. Among the preemies that develop infections, 18 percent die. By the time these mini-patients exhibit signs of infection, their medical status is already bad. Typically, it is an experienced nurse who uses a combination of subtle clinical signs, knowledge, and instinct to spot a preemie that is not in trouble yet, but is also “not quite right.”

Preemies are hooked up to monitors in the neonatal intensive care units, which provide a flood of constant readings, including heart rate and respiration rate. Dr. McGregor and her research team set out to see if there were any patterns in neonatal vital signs that could be used to create algorithms to predict when a preemie is at risk of infection. Detecting changes in the heart rate of these tiny patients, Dr. McGregor’s team found that a full day before the preemies started exhibiting trouble, their heart rate became unusually steady—something that is odd since it rarely happens to people even at rest. The changes in heart rate are so miniscule that they cannot be detected by a human watching or by listening to a heart rate monitor. With the new model, the data from neonatal monitors is collected and analyzed in real time for a pattern of infection. When an alert is detected, hospital staff is immediately notified to intervene so they can prevent an infection.
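
Dr. McGregor’s team’s actual algorithms are not reproduced here; as a rough illustration of how streaming vitals might be screened for an unusually steady heart rate, the Python sketch below computes the spread of the most recent readings and raises an alert when variability drops below a threshold. The window size, threshold, and data stream are all hypothetical.

    from collections import deque
    from statistics import stdev

    WINDOW = 10          # number of recent readings to consider (hypothetical)
    THRESHOLD = 1.0      # variability in bpm considered "too steady" (hypothetical)

    def monitor(heart_rates):
        """Yield an alert whenever recent readings become unusually steady."""
        window = deque(maxlen=WINDOW)
        for i, hr in enumerate(heart_rates):
            window.append(hr)
            if len(window) == WINDOW and stdev(window) < THRESHOLD:
                yield f"reading {i}: heart rate unusually steady ({stdev(window):.2f} bpm spread)"

    # Hypothetical stream: normal variability, then an abnormally flat stretch.
    stream = [142, 147, 139, 151, 145, 148, 141, 150, 144, 146,
              145, 145, 146, 145, 145, 144, 145, 145, 146, 145]
    for alert in monitor(stream):
        print(alert)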

Dr. McGregor’s method does not demonstrate causality—or why the infection happens. It correlates the unusually calm heartbeat with a high risk of infection within the next 24 hours. In other words, it demonstrates what happens when two data values are strongly correlated. The correlation does not imply that the calm heartbeat causes an infection.

Correlations Do Not Equal Causation

Correlations are the key to gleaning insights more easily and quickly from data. A correlation quantifies the statistical relationship between two data values. A strong correlation means that when one of the data values changes, the other value is highly likely to change in the direction (positive or negative) indicated by the relationship. A weak correlation means that when one data value changes, the other changes very little or moves in a direction other than what is expected. For example, a correlation run on how many times an individual eats in restaurants a week with how many colds he suffers in a year may not be useful in finding out if he is overweight.
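For readers who want to see the measure itself, the Python sketch below computes a Pearson correlation coefficient from its definition for two hypothetical data series; values near +1 or -1 indicate a strong relationship, values near 0 a weak one. The data are invented solely to show the calculation.

    import math

    def pearson_r(xs, ys):
        """Pearson correlation: covariance of x and y divided by the product of their spreads."""
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        var_x = sum((x - mean_x) ** 2 for x in xs)
        var_y = sum((y - mean_y) ** 2 for y in ys)
        return cov / math.sqrt(var_x * var_y)

    # Hypothetical data: weekly exercise hours and resting heart rate for ten people.
    exercise = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    resting_hr = [82, 80, 79, 76, 75, 72, 71, 69, 68, 66]

    print(f"r = {pearson_r(exercise, resting_hr):.2f}")  # close to -1: strong negative correlation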

Correlations measure the strength of a relationship between variables, but correlations do not prove causation. One variable does not make or cause the other variable to change even though they both change in tandem. To prove that one element causes a change in another, controlled experiments are used where the suspected cause is carefully applied or suppressed. Statistical techniques are applied to determine the probability that the results occurred by chance. If the effects correspond to whether the cause was applied or not, then there is a higher likelihood that the results show a causal link. These types of studies are very expensive to conduct and highly controlled in order to demonstrate causality with a high probability. Causation is very difficult to prove. For example, whether inhaling cigarette smoke causes lung cancer was studied for decades even though the link was recognized in the 1940s and 1950s.5

Correlations show that two things are potentially connected and provide insights to investigate causation, but they do not establish a causal relationship alone.

Moving from Universals to Individuals

Predictive analytics brings the ability to move from universals—finding a single way to explain and treat the average of all people—to understanding individual variances. In physics, similar events occur throughout the universe, yet the world is explained by the probability of an atom being in a certain place, not by the atom being fixed in place when the observer looks at it.

At the core of predictive analytics is the ability to search for patterns in the data because the model must be able to predict for situations that it has not yet encountered. It provides the drive to make per person decisions from empirical data. In contrast, forecasting is different because it makes aggregate predictions to answer questions at a macroscopic level. While forecasting can estimate the number of robotic vacuum cleaners that will be purchased during the summer, predictive analytics tells you what type of person will buy the robot cleaners.

While payers have had patient health analytics software for a few years, patient scoring and health analysis will make its way into the hands of the clinicians and patients in the next three to five years. Software tools for visualizing a patient’s complete range of health metrics, along with clinically validated algorithms for scoring a holistic view of patient health, will make their debut. These will provide clinicians with at-a-glance analytics of a patient’s overall health, allowing doctors to spot patterns and red flags by comparing a person’s health data against targeted health ranges, based on factors such as age and gender.
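
A minimal sketch of the kind of at-a-glance red-flag check described above follows, comparing a patient’s readings against target ranges keyed to age band and gender. The ranges, metrics, and patient values are hypothetical and not clinically validated.

    # Hypothetical target ranges keyed by (gender, age band): metric -> (low, high).
    TARGET_RANGES = {
        ("female", "40-59"): {"systolic_bp": (90, 130), "resting_hr": (60, 100), "bmi": (18.5, 25.0)},
        ("male", "40-59"):   {"systolic_bp": (90, 135), "resting_hr": (60, 100), "bmi": (18.5, 25.0)},
    }

    def red_flags(patient):
        """Return the metrics that fall outside the patient's target range."""
        ranges = TARGET_RANGES[(patient["gender"], patient["age_band"])]
        flags = []
        for metric, (low, high) in ranges.items():
            value = patient["metrics"][metric]
            if not low <= value <= high:
                flags.append((metric, value, (low, high)))
        return flags

    patient = {"gender": "male", "age_band": "40-59",
               "metrics": {"systolic_bp": 148, "resting_hr": 88, "bmi": 31.2}}
    for metric, value, (low, high) in red_flags(patient):
        print(f"{metric}: {value} outside target {low}-{high}")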

Timely insights enable key decisions and new ideas to better deliver quality care.

Algorithms and models could be created to:

  • Predict adverse events from population level data.

  • Predict best treatment plans based on similar patients, past procedures, and outcomes to improve quality of care and cost management.

  • Predict medication safety and effectiveness.

  • Predict relative risk of a particular treatment approach based on overall benefits and potential harm.

  • Detect a drug’s lack of efficacy early, thereby reducing the cost of research.

  • Predict efficient care models.

  • Get quick feedback from sensors and remote monitoring devices to prevent patient harm and adverse effects of treatment approaches.

  • Optimize outcomes in the treatment process based on historical clinical data from previous patients, including when complications or major events are likely to occur. Continuous feeding of data into the model would automatically refine the predictions on a real-time basis.

  • Identify patients who are most likely to acquire specific diseases based on their lifestyle and family history.

  • Monitor treatment costs versus outcomes for reimbursement models supporting outcomes, not volume of care.

  • Identify trends and outliers for use in prevention and wellness programs.

  • Track health and manage care transitions.

  • Provide new insights into treatments and outcomes.

  • Integrate data across the continuum of care.

  • Connect people with real-time patient and institutional information.

  • Control costs and better manage resources.

  • Promote wellness by letting patients access their own analytics.

  • Measure outcomes of population-based interventions.

The success of predictive analytics depends on the actions taken as a result of predictions. Clinicians are key to acting on appropriate intervention measures to make a difference.

Data Ownership

The standard method of sharing information with other entities has been to inform the patient and obtain written consent. Now that there are patient-generated data, unstructured data from social media, multiple EHRs per person, sensor data, and remote monitoring streams, it is not clear who owns what data. There are also third-party data that may be purchased, as well as government data, and many different parties can combine them all for various uses. With reused data, multiple parties, including the subject of the data, the data collector, the purchasers of the data, the analyst, or the public, can claim ownership.

Security

Data security is a potential barrier to the free-flowing use of big data in healthcare. The rules for sharing personal health information between covered entities were laid out in HIPAA. In the near future, the sources of big data will reach beyond the covered entities defined in HIPAA, and the types of data will evolve across industries, challenging the process of de-identification. It already requires special attention to keep a small, narrow population of people with rare diseases anonymous even when the standard personally identifiable information—name, address, birth date—is removed. They can be identified by facts from other databases, especially if there is something unique to the disease that makes purchases distinctive. There are many websites serving as virtual support groups for people with specific conditions. Merging these with other healthcare data could make it possible to unintentionally identify individuals (see the Netflix story in Chapter 8).

Together—Hyperdata and Analytics

The combination of hyperdata and computing power has ushered in a new wave of analytics. Marrying health data with social and other nonhealth data gives healthcare the ability to perform predictive analytics. The power to predict comes from volumes of data fed into algorithms and computers that analyze in minutes what would take weeks for humans to do. Results are fed back into the system for machine learning, updating it and improving the next prediction. This has enabled precision medicine to take hold.

Transformation Tips: Big Data and Predictive Analytics

Big data can produce timely insights, augment decision making, and identify and implement targeted interventions for optimal patient outcomes. Clinicians can impact care delivery by quickly getting answers that help identify the best treatment for a patient as well as treatment approaches that may be harmful. Big data enables treatment plans used by other clinicians on similar patients to be shared.

  • Be aware that predictions do not last forever. They need to be routinely adjusted when new information is available, and fed back into the model to produce improved predictions.

  • Ask the right questions and thoughtfully consider the findings to rule out spurious results.

  • Make your data work, then act on it. Without intervention, no predictor—no matter how good—is useful.

A New Path—Precision Medicine

Precision medicine is an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle for each person. It delivers the right treatment at the right time, and seeks to understand pertinent patient variation from an individual’s unique environments and biochemistries.

Until now, treatments have been “one-size-fits-all” and based on the “average” patient. This results in excellent outcomes for some patients but not for others receiving the same treatment. Precision medicine is a new approach made possible by the convergence of consumers involved in improving their health, decreased genomic analysis costs, sophisticated data science, the adoption of EHRs, and a connected ecosystem with mobile technology.

While some advances in precision medicine have been made, the practice is not currently in use for most diseases. However, it is quickly gaining the interest of doctors as a new approach to treating patients with diseases that lack a proven means of prevention or effective treatment. The goal is to improve care and speed up the development of new treatments.

I want the country that eliminated polio and mapped the human genome to lead a new era of medicine—one that delivers... the right treatments at the right time, every time, to the right person.

President Barack Obama State of the Union, January 2015

The Precision Medicine Initiative

The Precision Medicine Initiative6 (PMI) was announced by President Barack Obama in 2015 and received $215 million in the 2016 budget. It will create a new model of research to accelerate biomedical discoveries and provide clinicians with new knowledge, tools, and therapies. The research will provide better insights into the biological, environmental, and behavioral influences on disease to make a difference for the millions of individuals who suffer from it. The advance of technology and the unraveling of the human genomic sequence have put precision medicine in the spotlight, and funding brings it closer to reality.

Objectives of the Initiative

  • Research more and better treatments for cancer: NCI will accelerate the design and testing of effective, tailored treatments for cancer by expanding genetically based clinical cancer trials, exploring fundamental aspects of cancer biology, and establishing a national “cancer knowledge network” that will generate and share new knowledge to fuel scientific discovery and guide treatment decisions.

  • Create a voluntary national research cohort: National Institutes of Health (NIH), in collaboration with other agencies and stakeholders, will launch a national research cohort of one million or more Americans who volunteer to participate in research. Participants will be involved in the design of the initiative and will have the opportunity to contribute diverse sources of data—including medical records; profiles of the patient’s genes, metabolites (chemical makeup), and microorganisms in and on the body; environmental and lifestyle data; patient-generated information; and personal device and sensor data. Privacy will be rigorously protected. The ONC will develop interoperability standards and requirements to ensure secure data exchange with patients’ consent, to empower patients and clinicians and advance individual, community, and population health.

  • Protect privacy: To ensure from the start that this initiative adheres to rigorous privacy protections, the White House will launch a multi-stakeholder process with HHS and other Federal agencies to solicit input from patient groups, bioethicists, privacy and civil liberties advocates, technologists, and other experts in order to identify and address any legal and technical issues related to the privacy and security of data in the context of precision medicine.

  • Regulatory modernization: The initiative will include reviewing the current regulatory landscape to determine whether changes are needed to support the development of this new research and care model, including its critical privacy and participant protection framework. As part of this effort, the FDA will develop a new approach for evaluating Next Generation Sequencing technologies—tests that rapidly sequence large segments of a person’s DNA, or even their entire genome. The new approach will facilitate the generation of knowledge about which genetic changes are important to patient care and foster innovation in genetic sequencing technology, while ensuring that the tests are accurate and reliable.

  • Public–private partnerships: The Obama Administration will forge strong partnerships with existing research cohorts, patient groups, and the private sector to develop the infrastructure that will be needed to expand cancer genomics, and to launch a voluntary million-person cohort. The administration will carefully consider and develop an approach to precision medicine, including appropriate regulatory frameworks, that ensures consumers have access to their own health data–and to the applications and services that can safely and accurately analyze it—so that, in addition to treating disease, we can empower individuals and families to invest in and manage their health.

The Future of Precision Medicine

Advances in precision medicine have already led to new treatments that are tailored to a patient’s specific genetic makeup or the genetic profile of a person’s tumor. This is transforming cancer treatment where patients with breast, lung, and colorectal cancer, and those with melanomas and leukemias routinely undergo molecular testing in their treatment course. This enables their physicians to select treatments that improve chances of survival while reducing the risk of adverse side effects.

The more detailed the diagnosis, the more precisely treatments can be tailored to the combination of the individual and to the individual’s cancer. The number of targeted therapies for cancer is increasing, and physicians have the complicated task of matching therapeutics to the combination of their patient’s genomics as well as the tumor.

When researching a patient’s condition, a doctor may not find a patient with a close fit locally or even within his state, but he could find a narrow subpopulation that is spread across the country and globe.

The challenge is enabling physicians to share information on rare genomic cases and their outcomes across institutions and geographies. Eventually, health information exchange organizations (HIOs) may provide the hub for de-identified patient information sharing, but exchanges are not yet capturing data in enough of the country to consider using them for precision medicine.

In time, the one million member cohort being studied by the NIH will serve as the database for running predictive analytics on all diseases and treatments. This program will seek to extend precision medicine’s success to common diseases such as diabetes, heart disease, Alzheimer’s, obesity, and mental illnesses such as depression, bipolar disorder, and schizophrenia, as well as rare diseases. The cohort will also focus on ways to increase a person’s chances of living healthy for life.

Precision medicine serves the population in multiple ways:

  • Detailed data from many individuals allows doctors to search for other patients with the same exact disease and their treatments and outcomes. The doctor can select the best option for his patient.

  • De-identified data will be available to researchers and data pooled from other studies will speed up medical research.

The PMI establishes an assertive attitude of setting health data free by giving people back their data, sharing data, and pushing the open data world.

It is easier to recruit people to join research studies when they need only download an app and get qualified. Less than one year after Apple® announced its ResearchKit in April 2015 for researchers and app developers, one of the first developers, Sage Bionetworks, is sharing data collected over six months to study people with Parkinson’s disease. About 75 percent of participants (9,500 of 12,000) elected to share their data with researchers.

ResearchKit is a software framework designed for medical and health research that helps doctors, scientists, and other researchers gather data more frequently and more accurately from participants using mobile devices. It allows developers to contribute modules to the open source framework, which has customizable modules that address the most common elements in research studies—consent, surveys, and active tasks.

New technologies and programs such as these are making it easier for researchers and individuals to collaborate in projects that lead to a better understanding of diseases and interventions. Studies now attract tens of thousands of participants versus the few hundred in the past. ResearchKit turned the iPhone into a tool for medical research that can be used by participants around the world.

Transformation Tips: Participating in Research

  • Be aware of studies that affect your patients. Post them on your website and include them in all your communications, including telephone hold messages, so patients can participate.

  • Publicize the one-million-person cohort study run by the NIH among your patients. The NIH is seeking a diversity of participants.

References

Friedman, D.J., and B. Starfield. 2003. “Models of Population Health: Their Value for U.S. Public Health Practice, Policy, and Research.” American Journal of Public Health 93, no. 3, pp. 366–69.

Laney, D. 2001. “3D Data Management: Controlling Data Volume, Velocity, and Variety.” www.bibsonomy.org/bibtex/263868097d6e1998de3d88fcbb7670ca6/sb3000

Mayer-Schönberger, V., and K. Cukier. 2014. Big Data: A Revolution That Will Transform How We Live, Work, and Think. Mariner Books, p. 3.

Proctor, R.N. 2011. “The History of the Discovery of the Cigarette–Lung Cancer Link: Evidentiary Traditions, Corporate Denial, Global Toll.” Tobacco Control. http://tobaccocontrol.bmj.com/content/21/2/87.full

1 Laney (2001).

2 There is no single widely used definition of population health. Friedman and Starfield (2003).

3 Mayer-Schönberger and Cukier (2014).

4 Mayer-Schönberger and Cukier (2014).

5 Proctor (2011).

6 Excerpted from: Fact Sheet, President Obama’s Precision Medicine Initiative, January 30, 2015.
