There is a tremendous amount of energy focused on the potential for predictive analytics to transform healthcare. Many companies – both large and small – are entering the space, selling applications that either use a unique technology (such as deep learning or cognitive computing) or draw on novel data from molecular assays or mobile devices.
These solutions seem capable of addressing endless questions and utilize tremendously powerful technical tools.
However, the biggest challenge facing developers of healthcare-specific predictive models is not technical. Instead, it’s their need to gain a deep understanding of health systems’ rapidly changing business models and internal political landscapes. In order to ask pertinent questions and properly present answers, developers must have a solid understanding of the current challenges impacting health system operations.
The analytic tools. From the perspective of a data scientist, the term predictive analytics is not much more than a bag of tools for taking inputs and converting those inputs into guesses about the future. With the wide availability of statistical and machine-learning software such as R, SAS and Python, the implementation of these tools is relatively easy. Users and developers now have access to a wide array of tools (see the sidebar for examples) that can be used in a broad set of applications. What’s critical, however, is having the expertise to use them well [1,2] and to avoid the numerous opportunities for failure.
The information. Building predictive models using an overabundance of data from a single source is something of a tragedy. Health systems have access to data generated during the course of care, but have little insight into what happens while patients are at home, at work or in unaffiliated healthcare facilities.
Payers have access to information about every interaction between a patient and a healthcare provider, but their data is often several months old. In addition, claims data typically contains biases because it relies on systems designed by providers to convince payers to reimburse them the maximum appropriate amount for care provided. Social elements, such as ease of mobility and the availability of help at home, can also be critical to a correct treatment decision and positive outcomes, but their relevance to health outcomes is sometimes unclear.
To understand personal health drivers and develop accurate predictive models, data must be pulled from a wide variety of sources. Building inputs from all of these sources can lead to more accurate predictions.
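As a rough illustration of pulling inputs from several sources, the minimal Python sketch below assembles one feature row per patient from three hypothetical tables (clinical, claims and social). The patient IDs, field names and values are all invented for illustration; the point is only that a patient missing from one source is flagged rather than silently dropped.

```python
# Hypothetical per-patient records from three separate sources.
# All IDs, field names and values are invented for illustration.
ehr = {"p1": {"a1c": 8.2}, "p2": {"a1c": 6.1}}
claims = {"p1": {"er_visits_12mo": 3}}
social = {"p2": {"lives_alone": True}}

SOURCES = {"ehr": ehr, "claims": claims, "social": social}


def build_features(patient_id, sources):
    """Combine every source into one row, flagging which sources had data."""
    row = {"patient_id": patient_id}
    for name, table in sources.items():
        record = table.get(patient_id)
        # Record coverage explicitly so a model can tell "no data" from "no events".
        row[f"has_{name}"] = record is not None
        if record:
            row.update(record)
    return row


# Use every patient seen in any source, not only those present in all of them.
all_ids = sorted(set().union(*SOURCES.values()))
rows = [build_features(pid, SOURCES) for pid in all_ids]

for row in rows:
    print(row)
```

Keeping an explicit coverage flag per source is one possible design choice; it lets later modeling steps distinguish a patient with no emergency visits from a patient for whom no claims data was available.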
The questions. A tremendous array of questions can be asked about the future of a patient or patient population. Some relevant questions for risk-bearing health systems include:
• Which patients are at risk for unplanned admissions?
• Who is most likely to be readmitted?
• Who will be our most expensive patients?
• Are there patients who are high risk, but are not appropriately diagnosed? Are there patients who have been misdiagnosed as high risk?
Underlying these questions – and critical to designing and implementing interventions – is an understanding of which patients can be impacted. The statistical tools for making those estimations may sometimes fall outside the realm of predictive analytics. Nonetheless, for successful interventions that are tailored to individual patient needs, it’s critical to understand the answers to these questions when developing predictive models.
Adoption. It’s exciting to consider the performance characteristics of cutting-edge data science tools such as deep learning. Unfortunately, using these tools to make healthcare decisions can be challenging. Software vendors are generally unwilling to accept medical liability when these tools fail; instead they market solutions as “decision support,” allowing providers to have the final say. In that context, interpretability – that is, understanding why a recommendation is being made – can be critical, and it affects whether a provider is comfortable acting on recommendations. Interpretability is a weakness of some newer statistical tools such as deep learning. The successful delivery of interpretability, along with recommendations, greatly affects how effectively the predictive model influences care decisions.
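To make the idea of interpretability concrete, here is a minimal Python sketch of a simple, transparent model: a logistic risk score whose prediction can be broken down into per-factor contributions. The coefficients and factor names are hypothetical, chosen for illustration rather than fitted to any real data, and this is one simple way to present "why", not a description of any particular vendor's method.

```python
import math

# Hypothetical coefficients, for illustration only (not fitted to real data).
INTERCEPT = -2.0
WEIGHTS = {
    "prior_admissions": 0.6,
    "age_over_75": 0.4,
    "help_at_home": -0.5,
}


def explain(patient):
    """Return the predicted risk and each factor's contribution to the log-odds."""
    contributions = {
        name: weight * patient.get(name, 0.0) for name, weight in WEIGHTS.items()
    }
    log_odds = INTERCEPT + sum(contributions.values())
    risk = 1.0 / (1.0 + math.exp(-log_odds))
    # Largest absolute contribution first, so the main drivers lead the list.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return risk, ranked


risk, reasons = explain({"prior_admissions": 3, "age_over_75": 1, "help_at_home": 1})
print(f"Estimated risk: {risk:.2f}")
for name, value in reasons:
    direction = "raises" if value > 0 else "lowers"
    print(f"  {name} {direction} the log-odds by {abs(value):.2f}")
```

With a linear model of this kind, each recommendation can be delivered alongside the factors that drove it, which is harder to do for less transparent tools.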
As data scientists, we derive pleasure from developing and implementing complicated statistical tools that provide small boosts in the accuracy of our predictive models. Unfortunately (for us), that is only a small part of the predictive analytics package. We must also understand the important questions to ask; get the right data; properly account for errors and mistakes; deliver the information in useful ways; obtain the buy-in of providers; and, most importantly, ensure that our models drive decisions that are beneficial to both patients and the health system.