Geisinger Health System, the integrated health system based in Danville, Pennsylvania, has long been known for operational innovation. Among its recent initiatives is a push by senior leaders to develop and implement an enterprise-wide unified data architecture (UDA), something that remains a future aspiration for most patient care organizations nationwide, yet is happening now at Geisinger.
For developing a unified data architecture, the editors of Healthcare Informatics have named Geisinger Health System a semi-finalist in the 2017 Healthcare Informatics Innovator Awards Program.
At Geisinger, senior vice president and CIO John Kravitz; Bipin Karunakaran, the vice president in charge of data management; and Joseph Scopelliti, IT director, data management, have been helping to lead their colleagues in leveraging data and analytics. As large numbers of professionals at Geisinger put data to work for a growing range of purposes, it has become increasingly clear that a broad-based, unified data architecture will be needed to serve the rising demand for data and analytics. Kravitz, Karunakaran, Scopelliti, and other healthcare IT leaders at Geisinger have thus concluded, within the past two years, that the organization needed to rework its data infrastructure to support its groundbreaking work in population health management, care management, clinical transformation, and other key areas.
As Scopelliti wrote in his team’s Innovator Awards submission, “The project was to create a Unified Data Architecture (UDA), which integrates all of the analytic platforms at Geisinger Health System. The key component of the UDA would be the creation of the Big Data (Hadoop) platform. This platform was the first phase of the project. In a one-year timeframe, the team established a big-data platform, based on Hadoop and other open-source components. In this first year, we have developed code for a source ingestion pipeline (which pulls in source data, performs the necessary transformations, and loads the data into various views, each of which has specific benefits to the data analysts). We have pulled in all of the source data currently populating the data warehouse (EDW), plus additional sources not in the EDW. Additionally, we've done work with the non-discrete data (using the NLP capabilities of Hadoop), and now can analyze the thyroid and pulmonary clinic notes. Further, we've decided that all new development should be done on the big data platform (instead of the EDW) wherever possible; case in point being the work we did on Hadoop for BPCI (Bundled Payments for Care Improvement).”
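For readers unfamiliar with the pattern, the three stages Scopelliti describes (pull in source data, transform it, load it into analyst-facing views) can be sketched in a few lines of Python. This is only an illustrative sketch, not Geisinger's code: the sample extract, the column names, and the two views are hypothetical, and a production pipeline on Hadoop would run these steps on distributed tooling rather than in a single process.

```python
import csv
import io
from collections import defaultdict

# Hypothetical source extract; column names and values are illustrative only.
SOURCE_CSV = """patient_id,encounter_date,charge
p1,2017-01-05, 120.50
p2,2017-01-06,80
p1,2017-02-10,40.25
"""

def extract(text):
    """Pull rows out of a delimited source extract."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Strip stray whitespace and convert charges to numbers."""
    return [
        {
            "patient_id": r["patient_id"].strip(),
            "encounter_date": r["encounter_date"].strip(),
            "charge": float(r["charge"]),
        }
        for r in rows
    ]

def load_views(rows):
    """Build two illustrative views over the same cleaned records."""
    by_patient = defaultdict(list)
    for r in rows:
        by_patient[r["patient_id"]].append(r)
    totals = {pid: sum(r["charge"] for r in recs) for pid, recs in by_patient.items()}
    return {"encounters": rows, "charges_by_patient": totals}

views = load_views(transform(extract(SOURCE_CSV)))
```

Each view here serves a different kind of question: the encounter-level view keeps every record, while the per-patient view pre-aggregates charges.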
Scopelliti added in his team’s submission that “Geisinger has taken a bold step with this project, even the first phase (building out the big data platform), as we plan to deviate from industry standard and the common opinion that Big Data should augment the EDW, not replace it. We are on our way to proving that we CAN replace the EDW. By running analytics from our Hadoop infrastructure, we have all of the benefits of distributed computing, plus the additional benefits of late binding and the ability to deal with non-discrete data, such as we find in clinic notes. I have included a presentation we recently did at the Healthcare Data and Analytics Association conference, which gives more background on the work we did, and benefits achieved.”
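The "late binding" Scopelliti mentions generally means storing data in its raw form and applying field names and types only when a question is asked, rather than forcing every source into a fixed warehouse schema up front. A minimal Python sketch of that idea follows; the records, field names, and `read_with_schema` helper are hypothetical and are not drawn from Geisinger's platform.

```python
import json

# Hypothetical raw records stored as-is; the fields differ from record to record.
RAW_RECORDS = [
    '{"patient_id": "p1", "note": "TSH elevated", "clinic": "thyroid"}',
    '{"patient_id": "p2", "note": "FEV1 reduced", "clinic": "pulmonary", "fev1_pct": "62"}',
]

def read_with_schema(raw_lines, schema):
    """Bind field names and types at read time; absent fields become None."""
    for line in raw_lines:
        rec = json.loads(line)
        yield {
            name: (cast(rec[name]) if name in rec else None)
            for name, cast in schema.items()
        }

# The schema is chosen by the analyst at query time, not at load time.
pulmonary_schema = {"patient_id": str, "fev1_pct": int}
rows = list(read_with_schema(RAW_RECORDS, pulmonary_schema))
```

Because the raw records are never reshaped on the way in, a different analyst could read the same data with a different schema without any reloading.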
Scopelliti spoke recently to HCI Editor-in-Chief Mark Hagland regarding Geisinger’s unified data architecture initiative. Below are excerpts from that interview.
How did your unified data architecture initiative begin?
Geisinger has a long history of analytics. There are a couple of organizations in the country like Intermountain and Geisinger, that have been doing this for a long time. And honestly, the start of it was 1995-1996, when we implemented our Epic EHR. And ten years later, leadership said, we’ve got all this clinical data, we need a data warehouse. So in 2008, we went live with the first iteration of our EDW—we called it CDIS—the Clinical Decision Intelligence System. The beauty of this—there are a couple of things. Number one, we pulled in not only EHR data, but financial data, claims data, because we have a health plan, and other types of data as well.
In the past, if data analysts wanted to do some research or analysis, they would have to request research from the data team. Now, all of a sudden, with the data warehouse, they could do this themselves, and data analytics exploded, in a good way. And IBM came in and helped us with this. And we ran with that until 2012. And then we decided that we needed a different data warehouse, so we moved to a Teradata data warehouse with stronger computing capability. And we’re still running that. We have thousands of reports and dashboards that are running on CDIS.
So last year, it was decided by executive leadership that we needed to move beyond the CDIS data warehouse, to a unified data architecture. And how I see the UDA is that it’s an integration of all of our key data platforms. So for example, we’re doing some work with Cerner on population health, via their HealtheIntent platform. And Epic is going to be coming out with this EDW of their own (they keep changing its name). The point is that we’re tying all of our key analytics platforms together. But one major component of this UDA is this new data platform based on Hadoop.