One of the core issues that clinicians and others have had with EHRs (electronic health records) is that so much of the medical information vital to improving patient care is “trapped” within EHR physician notes. As such, patient care organizations are increasingly turning to natural language processing (NLP), a technology that allows providers to gather and analyze unstructured data, such as free-text notes.
One such health system that has worked to leverage NLP to extract and transform data collected during routine clinical care is the St. Louis-based Mercy, one of the country’s largest health systems, which includes more than 40 acute care and specialty hospitals, and 800 physician practices and outpatient facilities. In Mercy’s submission for the 2018 Healthcare Informatics Innovator Awards Program, organizational leaders outlined the key details behind the health system’s project, called “Using NLP on EHR notes in Heart Failure Patients.” The submission ended up receiving semifinalist status in this year’s program.
The overarching goal of this initiative, says Kerry Bommarito, manager of data science at Mercy, was to use NLP to extract key cardiology measures from physician and other clinical notes, and then incorporate the results into a dataset with discrete data fields. The dataset would then be used to obtain actionable information and contribute to the evaluation of outcomes of medical devices in heart failure patients, a population of approximately 100,000 patients in the Mercy system going back to 2011.
Bommarito explains that three core measures that are commonly stored in clinical notes and not available in discrete fields are ejection fraction measurement; patient symptoms, including dyspnea (breathing difficulty), fatigue and dizziness; and the New York Heart Association (NYHA) heart failure classification, which places patients in one of four categories based on how limited they are during physical activity.
“To be able to best classify how severe the CHF [congestive heart failure] was, we really needed to get these measures out of the physician notes,” Bommarito attests, adding that since heart failure is a chronic and progressive syndrome, changes in these three measures are important indicators of heart failure decompensation.
Joseph Drozda, M.D., cardiologist, director of outcomes research at Mercy, says that “perhaps 60 percent of the data you would really like is available to you as discrete data in the EHR. The remaining 40 percent is contained in text and clinical notes, and in order to get the meaningful data out, you have to [use] something like NLP to capture it.”
Indeed, Dr. Drozda believes the issue stems from most EHRs being originally developed as billing systems that were designed to capture data necessary for populating a claim form and submitting a bill. “They weren’t really designed for clinical care; that came afterwards,” he says. “And in a lot of ways, in the early stages, the EHR systems were putting the paper records in electronic format without any regard to trying to use the data on the back end. It was pretty much all text. I think it was a basic design flaw in most EHR systems right from the beginning, though we are starting to overcome it.”
Nonetheless, Drozda notes that this challenge is still difficult to overcome, because in order to capture something like ejection fraction (a measurement of how well the heart is pumping out blood, which helps to diagnose and track heart failure), a clinician has to look for a dropdown menu or find someplace to enter a value, which takes extra time. “So there are two things working against you: the basic underlying technology challenges and the workflow challenges that clinicians face in entering discrete data. We have to be very careful in how much discrete data [we make] clinicians enter, as they are already concerned with deaths by 1,000 clicks,” he says.
As such, for Mercy’s heart failure patient population, project leaders brought in all of the notes that they had at the time, totaling about 34 million going back seven years. NLP queries were developed by a team of Mercy data scientists to search for relevant linguistic patterns, and then the queries were evaluated for both precision and recall. When the queries were determined to have a high accuracy, the results were integrated into a comprehensive data schema that contains real-world clinical data for each heart failure patient from before the diagnosis of CHF to their current state, Bommarito explains.
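Mercy did not publish its queries, but the general idea of searching notes for linguistic patterns and turning the matches into discrete fields can be sketched in a few lines of Python. Everything in the sketch below is an assumption made for illustration: the regular expressions, the `extract_measures` function and the output field names are hypothetical and are not Mercy's actual queries or schema. Symptom extraction is deliberately left out, since phrases such as "denies dizziness" would require negation handling that simple pattern matching does not provide.

```python
import re

# Hypothetical patterns for illustration only; not Mercy's actual queries.
# Matches phrases such as "LVEF 35-40%" or "ejection fraction of 55%".
EF_PATTERN = re.compile(
    r"\b(?:LVEF|EF|ejection fraction)\b[^0-9%]{0,30}?"
    r"(\d{1,2})(?:\s*(?:-|to)\s*(\d{1,2}))?\s*%",
    re.IGNORECASE,
)

# Matches phrases such as "NYHA class III" or "NYHA functional class 2".
NYHA_PATTERN = re.compile(
    r"\bNYHA\b[^.]{0,30}?\bclass\s*(IV|III|II|I|[1-4])\b",
    re.IGNORECASE,
)

ROMAN_TO_INT = {"I": 1, "II": 2, "III": 3, "IV": 4,
                "1": 1, "2": 2, "3": 3, "4": 4}


def extract_measures(note: str) -> dict:
    """Return discrete fields pulled from one free-text note.

    Fields that cannot be found in the note are left as None.
    """
    result = {"ef_low": None, "ef_high": None, "nyha_class": None}

    ef = EF_PATTERN.search(note)
    if ef:
        low = int(ef.group(1))
        # A single value such as "55%" is stored as the range 55 to 55.
        high = int(ef.group(2)) if ef.group(2) else low
        result["ef_low"], result["ef_high"] = low, high

    nyha = NYHA_PATTERN.search(note)
    if nyha:
        result["nyha_class"] = ROMAN_TO_INT[nyha.group(1).upper()]

    return result


note = "Echo today: LVEF 35-40%. NYHA class III symptoms with dyspnea on exertion."
print(extract_measures(note))
```

In practice, patterns like these would need to be checked against manually reviewed notes before their output could be trusted, which is the role the precision and recall evaluation plays in the project described here.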
The final queries were validated and had a high accuracy, with an F-measure (a measure of a test’s accuracy that combines precision and recall) score above 0.90, where 1 is the highest F-measure value possible. “This F-measure score shows that Mercy’s queries were highly precise (positive predictive value) and had high recall (also known as the true positive rate or sensitivity),” says Bommarito. What’s more, “These results show that natural language processing is a reliable and accurate method to extract relevant data from clinical notes in a CHF population,” she concludes.
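For readers unfamiliar with the metric, the sketch below shows how precision, recall and an F-measure are typically computed when a query's output is compared against manual chart review. It assumes the balanced F1 form, the harmonic mean of precision and recall; the article does not say which variant Mercy used. The function name and the toy labels are made up for illustration and are not Mercy's data.

```python
def precision_recall_f1(predicted, gold):
    """Compare query output against manually reviewed labels.

    predicted and gold are equal-length sequences of booleans, one pair
    per note: did the query flag the measure, and was it really there?
    """
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)      # true positives
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)  # false positives
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0  # positive predictive value
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # sensitivity
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy labels for six notes (illustrative only).
predicted = [True, True, True, False, True, False]
gold = [True, True, False, True, True, False]
precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, it stays high only when both precision and recall are high, which is why a score above 0.90 implies strong performance on both.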