In Chapter 1, Introduction to Healthcare Analytics, we discussed the three subcomponents of analytics: descriptive analytics, predictive analytics, and prescriptive analytics. Predictive and prescriptive analytics form the heart of healthcare's mission to improve care, cost, and outcomes. That is because if we can predict that an adverse event is likely in the future, we can divert our scarce resources toward preventing the adverse event from occurring.
What are some of the adverse events we can predict (and then prevent) in healthcare?
- Deaths: Obviously, any death that is preventable or foreseeable should be avoided. Once a death is predicted to occur, preventative actions may include directing more nurses toward that patient, hiring more consultants for the case, or speaking to the family about options earlier rather than later.
- Adverse clinical events: These are events that are not synonymous with death, but that greatly increase the chances of morbidity and mortality. Morbidity refers to complications, while mortality refers to death. Examples of adverse clinical events include heart attacks, heart failure exacerbations, COPD exacerbations, pneumonia, and falls. Patients in whom adverse events are likely could be candidates for more nursing care or for prophylactic therapies.
- Readmissions: Readmissions don't present an obvious danger to patients; however, they are costly, so preventable readmissions should be avoided. Furthermore, readmission reduction is highly incentivized by the Centers for Medicare and Medicaid Services, as we saw in Chapter 6, Measuring Healthcare Quality. Preventative actions include assigning social workers and case managers to high-risk patients to ensure that they are following up with outpatient providers and filling needed prescriptions.
- High utilization: Identifying patients who are likely to incur high medical spending in the future could reduce costs, because providers can then assign more care members to those patients' teams and ensure frequent outpatient check-ins and follow-ups.
Now that we've answered the "What?" question, the next question is, "How?" In other words, how do we make predictions on which care providers can act?
- First, we need data: The provider should send you their historical patient data. The data can be claims data, clinical transcripts, a dump of EHR records, or some combination of these. Whatever the type of data, it should ultimately be convertible into a tabular format, in which each row represents a patient/encounter and each column represents a particular feature of that patient/encounter.
- Using one portion of the data (the training set), we train a predictive model: In Chapter 3, Machine Learning Foundations, we learned about what exactly we are doing when we train predictive models, and how the general modeling pipeline works.
- Using a separate, held-out portion of the data (the test set), we test our model's performance: Assessing performance on data that the model did not see during training is important for setting the provider's expectations as to how accurate the model is.
- We then deploy the model into a production environment and provide live predictions for patients on a routine basis: At this stage, there should be a periodic flow of data from the provider to the analytics firm. The firm then responds with regularly scheduled predictions on those patients.
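The four steps above can be sketched in a few lines of Python. This is only an illustrative outline, not the pipeline built later in the chapter: the data is synthetic, the two features (`age`, `prior_admissions`) and the outcome (`readmitted`) are hypothetical, and a small hand-written logistic regression stands in for whatever algorithm would be used in practice. It uses only the standard library so that it runs as is.

```python
import math
import random

random.seed(0)  # make the synthetic data reproducible

# Step 1: data in tabular form, one row per encounter, one key per feature.
# These rows are synthetic and exist only for illustration.
def make_encounter():
    age = random.randint(20, 90)
    prior_admissions = random.randint(0, 5)
    # Hypothetical rule: risk rises with age and with prior admissions.
    logit = 0.04 * (age - 55) + 0.6 * (prior_admissions - 2)
    readmitted = 1 if random.random() < 1 / (1 + math.exp(-logit)) else 0
    return {"age": age, "prior_admissions": prior_admissions,
            "readmitted": readmitted}

encounters = [make_encounter() for _ in range(500)]

# Steps 2 and 3 use disjoint subsets: 400 rows to train, 100 rows to test.
random.shuffle(encounters)
train_rows, test_rows = encounters[:400], encounters[400:]

def features(row):
    # Bias term plus features rescaled to roughly [-1, 1].
    return [1.0, (row["age"] - 55) / 35, (row["prior_admissions"] - 2.5) / 2.5]

def predict_risk(weights, row):
    # Logistic function applied to a weighted sum of the features.
    z = sum(w * x for w, x in zip(weights, features(row)))
    return 1 / (1 + math.exp(-z))

# Step 2: train with plain stochastic gradient descent on the log loss.
def train(rows, epochs=200, learning_rate=0.1):
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for row in rows:
            error = predict_risk(weights, row) - row["readmitted"]
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features(row))]
    return weights

weights = train(train_rows)

# Step 3: measure performance on rows the model never saw during training.
def accuracy(weights, rows):
    correct = sum((predict_risk(weights, row) >= 0.5) == bool(row["readmitted"])
                  for row in rows)
    return correct / len(rows)

test_accuracy = accuracy(weights, test_rows)
print(f"Held-out accuracy: {test_accuracy:.2f}")

# Step 4: in production, new encounters arrive without an outcome label
# and the model returns a predicted risk for each one.
new_patient = {"age": 82, "prior_admissions": 4}
new_risk = predict_risk(weights, new_patient)
print(f"Predicted risk for new patient: {new_risk:.2f}")
```

Accuracy is used here only because it is the simplest metric to compute by hand; a real project would likely report several metrics and would use a tested library implementation rather than a hand-written model.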
In the remainder of the chapter, we will go through the "How?" of building a predictive model for healthcare. First, we will describe our mock modeling task. Then, we will describe and obtain the publicly available dataset. After that, we will preprocess the dataset and train predictive models using different machine learning algorithms. Finally, we will assess the performance of our models. While we will not be using our models to make actual predictions on live data, we will describe the steps necessary for doing so.