
Penn Medicine’s Investment in Data Infrastructure Paying Off

July 15, 2015
by David Raths
‘Penn Signals’ applies machine learning to adverse patient events

Philadelphia-based Penn Medicine has invested many years and millions of dollars in electronic health record infrastructure to capture data at the point of care. More than seven years ago it also started building a large clinical data warehouse that now holds records on 3 million patients going back 10 years.

That investment is now reaping benefits as Penn data scientists use the data to fine-tune algorithms and predictive logic that could identify adverse events such as the onset of sepsis. On July 14, I had a chance to interview Brian Wells, Penn Medicine’s associate vice president of health technology and academic computing, and Michael Draugelis, chief data scientist, about “Penn Signals,” a platform that provides the tools needed to build, test and deploy predictive applications powered by Penn’s EHR data stream.

“We can run an algorithm against real-time data coming out of the EMR and do predictions almost at the point of care,” said Wells. “That is the foundation we have laid. Because of that investment in having the data organized, aggregated, and mapped in a way that is very usable, it enables our team to do some things that are pretty amazing.”

In fact, Draugelis, who came to Penn Medicine after a career as chief data scientist at Lockheed Martin, said he was attracted to Penn Medicine because of the great job they have done curating the data. (Penn Medicine is a $4.3 billion organization with more than 2,000 physicians providing services to the Hospital of the University of Pennsylvania, Penn Presbyterian Medical Center, Pennsylvania Hospital, Chester County Hospital and a health network that serves the city of Philadelphia, the surrounding five-county area and parts of southern New Jersey.)

Penn also bought into the vision of a small data science team that could build a framework called Penn Signals that taps into years of retrospective data, as well as real-time data, to develop these algorithms in a way that allows insights to be sent into operational channels for clinicians, Draugelis said.

“We embed the data scientist with the clinical team to figure out what the experts are doing right now and that is going to point us in the right direction,” he explained. For instance, with the sepsis algorithm, Penn Medicine already had a system called early warning system 1.0, and the clinical team developed a decision tree with seven clinical variables that did quite well, he said. “We started there, but then we pulled thousands of variables together — all the vitals, labs and medications, etc., and Penn Signals puts this into a real-time matrix that we can apply these algorithms to in order to make predictions,” he explained. The result is a snapshot, not just of one moment, but of a trend in time. “How these hundreds of variables interact over a period of time creates a tapestry and that is what we are looking to recognize,” he explained.
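The rolling, multi-variable snapshot Draugelis describes — hundreds of variables tracked over a window of time rather than at a single moment — can be sketched roughly as follows. This is a minimal illustration only: the variable names, window size, and trend rule are invented for the example and are not Penn Signals internals.

```python
from collections import deque

WINDOW = 4  # hypothetical: keep the 4 most recent readings per variable

class FeatureMatrix:
    """Rolling matrix of recent readings for each clinical variable."""

    def __init__(self, variables):
        self.history = {v: deque(maxlen=WINDOW) for v in variables}

    def observe(self, variable, value):
        """Record a new reading as it streams out of the EMR."""
        self.history[variable].append(value)

    def snapshot(self):
        # One row per variable, padded with None, so a model sees the
        # trend over time and not just the latest value.
        return {v: list(h) + [None] * (WINDOW - len(h))
                for v, h in self.history.items()}

def rising_trend(readings):
    """Toy stand-in for a learned model: is this variable worsening?"""
    vals = [r for r in readings if r is not None]
    return len(vals) >= 2 and vals[-1] > vals[0]

# Simulate a stream of vitals and labs for one patient.
m = FeatureMatrix(["heart_rate", "temperature", "lactate"])
for hr in (88, 95, 104, 118):
    m.observe("heart_rate", hr)
m.observe("lactate", 1.1)
m.observe("lactate", 2.6)

snap = m.snapshot()
print(rising_trend(snap["heart_rate"]))  # flags the worsening trend, not one bad value
```

In a real deployment the trend rule would be replaced by an algorithm trained on years of retrospective data, but the shape of the input — a per-patient matrix of variables over time — is the point of the sketch.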

“What we find is that, of course, the variables the clinicians looked at are important in forecasting these events, but we find tens of others that are important,” Draugelis said. “Two things happen: we get a more powerful forecasting algorithm, and we have deployed that. Secondly, we create points of research, where we can say there are other variables that are showing strong forecasting power for the onset of severe sepsis. Why is that? That is a very important focus at a university hospital, where it sets up points for more clinical research to understand those things.”

The fact that the data scientists are embedded with the clinical team instead of just providing some black-box solution is critical, he stressed. “Some of the variables are red herrings,” Draugelis said. “We might be measuring the care processes and not the illness. We don’t want to use those. As much as we as data scientists try to protect from that, clinicians can segregate those out pretty quickly. With these data models it is easy to do the sub-optimal thing and go off on a dirt road that is dangerous,” he added. “To get the correct solution for the patient, it is really important to be connected to the care team along the way and have this cycle of iteration with their pathway.”

Penn Medicine has a pipeline of algorithms in the works. The first was sepsis; another is around heart failure and risk stratification. The heart failure team is working to define the Penn pathway for heart failure, to determine how they can catch signals very early, before an official diagnosis, that a person could benefit from advanced care (or not).

Draugelis said a key focus of his team has been to reduce pain points in developing the algorithms and make them accessible for health systems to deploy, so they can focus less on the technology. “Any time you produce these new pieces of information that never existed before, you need to do a redesign of your [care] pathway, and that is where the hard work is,” he said. “We did this completely with open source technology and our plan is to share it as much as we can with other institutions so that they can take advantage of these things.”

Other Things in Progress at Penn Medicine

Besides the clinical data warehouse, Penn Medicine also has created a research data warehouse dubbed “PennOmics,” which provides a centralized location for fully de-identified clinical data, replacing research data silos around campus. I asked Brian Wells for a quick update on PennOmics and other developments at Penn Medicine.