Predictive Analytics—The Future of Health Care?


OPINION- Many industries and organizations mine their data and employ predictive analytics to provide better services and to enhance their bottom lines. Such analytics may be as simple as predicting that a restaurant will sell more breakfast sandwiches on a Monday than on a Sunday, based on its customers’ purchase history in the locale in which it operates.

The analytics might be as complicated as determining which later flight a passenger delayed at an airport would prefer, given the passenger’s flying history, the actual length of the delay, the alternative connections, and the projected occupancy of the other planes. Will the passenger want a later flight on the same airline or on another airline, decide to leave tomorrow, or not travel at all?

One of the basic elements necessary for good predictive analytics is quality, clean data. It is often said that data analytics is only as good as the data that an entity seeks to analyze and the form in which it exists.

Our healthcare organizations have considerable amounts of data, and this data holds promise for many types of predictive analytics. For example, suppose that in a city the size of Seattle (about 650,000 people), 15 individuals presented with strange flu-like symptoms at 15 different providers and health systems in a particular week, and no provider or health system knew of the other 14 patients. Not much might be thought of the situation, and little predictive analytics might be performed. If another 30 patients presented in the second week, some to the same providers and some to new ones, a problem with this type of flu would be recognized much sooner if all providers reported to one entity immediately or, better yet, if all had interoperable medical records from which a public health department received this information instantaneously. That is, if there were no delay in reporting, the information would be available sooner for predictive analytics, which might be able to determine where the flu started in Seattle and which areas of the city are likely to see more cases.
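The benefit of pooled reporting in the Seattle example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report list, the provider names, and the alert threshold of 10 cases are all hypothetical, and a real public health system would rely on far more careful statistics.

```python
from collections import Counter

# Hypothetical reports: one (week, provider_id) tuple per patient presenting
# with the unusual flu-like symptoms. The counts mirror the example above:
# 15 patients at 15 providers in week 1, then 30 patients in week 2 spread
# over some of the same providers and some new ones.
reports = [(1, f"provider_{i}") for i in range(15)]
reports += [(2, f"provider_{i % 20}") for i in range(30)]

ALERT_THRESHOLD = 10  # assumed cutoff, chosen for illustration only


def max_cases_seen_by_any_provider(reports, week):
    """Largest case count that any single provider sees in the given week."""
    per_provider = Counter(p for w, p in reports if w == week)
    return max(per_provider.values(), default=0)


def pooled_cases(reports, week):
    """Total cases across all providers in the given week."""
    return sum(1 for w, _ in reports if w == week)


def weeks_flagged(reports, threshold):
    """Weeks whose pooled case count meets or exceeds the threshold."""
    totals = Counter(w for w, _ in reports)
    return sorted(w for w, n in totals.items() if n >= threshold)


if __name__ == "__main__":
    for week in (1, 2):
        print(week,
              max_cases_seen_by_any_provider(reports, week),
              pooled_cases(reports, week))
    print(weeks_flagged(reports, ALERT_THRESHOLD))
```

In this toy data, no single provider ever sees more than two cases in a week, so none of them would reach the assumed threshold alone, while the pooled counts reach it in the first week.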

Clinical research typically takes a long time. One has to design a study, recruit subjects, conduct the research over a period of time, often years, analyze the data, pen an article, have it peer reviewed, and then publish the findings of the research study. The whole process can take many years.

Today, much clinical research is done with existing data. For example, if you wanted to determine the effectiveness of a particular form of pharmacological treatment for a particular disease, you might look at people who have had that disease in the past, what drugs they were prescribed, in what doses, when, and how long they took the drugs, and how they responded. From this data, you might be able to “predict” the effectiveness of the drug in treating a particular patient. It sounds too simple, and of course, it is.

On October 27, 2014, I had the pleasure of being able to attend the UK’s program on ePrescribing for National Health Service Hospitals. One of the presenters, Will Dixon of the Farr Institute and the University of Manchester, hosted a session on Big Data and Data Analytics, one of my favorite topics. He noted the importance of having quality, clean, and consistent data. “Assuming” that all the data is there, he posited many questions about that data that can make its usefulness in predictive analytics somewhat problematic. Taking and expanding on his observations, one might identify at least the following issues with data in this context:

1. When was a person actually diagnosed with the disease?

2. When did the person actually have the disease?

3. If the patient was prescribed a drug, what were the dose and its frequency?

4. When did the patient start taking the drug?

5. What were the patient’s gender, weight, and other characteristics?

6. Did the patient really take the drugs as prescribed? Did the patient take all of them, part of them, or none of them?

7. Was the drug regimen interrupted or changed, and if so, when and how?

8. When did the patient’s symptoms of the disease start to subside, and how do we know that?

9. If the patient was in a hospital and discharged, when was he or she next observed, if at all, and what was the result of that observation?

These are only some of the questions that one might ask about the data. Obviously, it is important not only to have the data, but also to have it recorded consistently so that researchers can slice and dice it and others can compare similar patients with comparable drug regimens. If you cannot make those comparisons across all of the data, it might be necessary to analyze different data sets separately, and possibly many of them, before the analysis is useful and the predictive analytics are truly predictive. One individual may respond to a lower dose of a drug than another. A higher dose may result in greater toxicity for one person but not for another. The drug may not be effective for a given person at all.
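To make the point about consistent recording concrete, here is a minimal Python sketch. It assumes a hypothetical record layout whose field names loosely follow the questions listed above. It reports which answers are missing for each patient and groups only the fully recorded patients by drug, dose, and frequency. The patients, drug name, and dates are invented for illustration.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Optional


@dataclass
class TreatmentRecord:
    # All field names are hypothetical; None means "never recorded".
    patient_id: str
    diagnosis_date: Optional[date] = None
    drug: Optional[str] = None
    dose_mg: Optional[float] = None
    doses_per_day: Optional[int] = None
    start_date: Optional[date] = None
    took_as_prescribed: Optional[bool] = None
    symptoms_subsided_date: Optional[date] = None


def missing_fields(record):
    """Names of fields that were never recorded (value is None)."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


def comparable_groups(records):
    """Group fully recorded patients by (drug, dose, frequency)."""
    groups = {}
    for r in records:
        if missing_fields(r):
            continue  # incomplete records cannot be compared reliably
        key = (r.drug, r.dose_mg, r.doses_per_day)
        groups.setdefault(key, []).append(r.patient_id)
    return groups


# Invented sample data for illustration only.
records = [
    TreatmentRecord("A", date(2014, 1, 6), "drug_x", 50.0, 2,
                    date(2014, 1, 7), True, date(2014, 1, 20)),
    TreatmentRecord("B", date(2014, 2, 3), "drug_x", 50.0, 2,
                    date(2014, 2, 3), False, date(2014, 2, 25)),
    # Start date, adherence, and outcome were never recorded for patient C.
    TreatmentRecord("C", date(2014, 3, 1), "drug_x", 50.0, 2),
    TreatmentRecord("D", date(2014, 3, 9), "drug_x", 100.0, 1,
                    date(2014, 3, 10), True, date(2014, 3, 18)),
]

if __name__ == "__main__":
    for r in records:
        print(r.patient_id, missing_fields(r))
    print(comparable_groups(records))
```

Patient C drops out of every comparison because three answers are missing, which is the practical cost of inconsistent recording. Whether patient B, who did not take the drug as prescribed, belongs in the same group as patient A is exactly the kind of judgment the questions above are meant to surface.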

Would it not be good if your clinician had this information? You might want this information in a patient-centered cross-collaborative care environment or in any environment. It would not involve enlisting people for a specific type of clinical trial, as mentioned above, but might be accomplished by mining existing data. But will the data be there in a form from which there can be useful predictive analytics?

Predictive analytics also help insurance companies figure out how to shift more costs to the consumer. It's startling that Oregonians' data for the All Payer All Claims databases go to Milliman Inc. (a private actuarial firm which insurance companies consult) without our consent. How do we know if the claims data is accurate? At last week's All Payer All Claims Technical Advisory Group meeting, insurance companies admitted that there will be inputting errors. In this regard, these APAC databases are like medical records. They should be fully accessible for our review. Indeed, Section 1201 (9) of HB 2009 says: "The collection, storage and release of health care data and other information under this section is subject to the requirements of the federal Health Insurance Portability and Accountability Act." p. 535

Thank you so much for your comment. Yes, indeed, predictive analytics can be used in many ways. You correctly note that accurate data is key, and as I noted, it should be comparable. In the 1980s many medical groups were provided with older data from health plans, on which they based their decisions about what rates to accept for capitated contracts. Of course, in many instances, this turned out to be a disaster. I agree transparency is important, but unfortunately, in my opinion, we have not seen much transparency in healthcare. You might consider penning a Commentary for the Lund Report on the very important issues which you raise.