Last October, the 20-plus-hospital University of Pittsburgh Medical Center (UPMC) health system launched a massive big-data initiative, one that will cost the organization more than $100 million over the next five years. With healthcare and healthcare IT leaders in that organization collaborating to embrace and leverage their massive store of data (senior vice president and CIO Dan Drawbaugh recently noted internally that UPMC has five petabytes of data enterprise-wide), data governance has come to the fore as an issue of urgent importance.
As a result, earlier this spring, the organization launched a formal Data Governance Program, with a formal Data Governance Council, and a broad objective to support data policy-making across the massive, far-flung organization. Recently, Terri Mikol, director, data governance, at UPMC, spoke with HCI Editor-in-Chief Mark Hagland and Assistant Editor Rajiv Leventhal regarding the organization’s data governance initiative, and its implications for data governance across healthcare. Below are excerpts from that interview.
From your perspective as one of the key leaders of the data governance initiative at UPMC, what do you see as the core challenges and needs in this area, and what is your vision and the vision of your team around this?
We’ve internally created five statements that really frame why we’re doing data governance and how, as well. We began our program to protect the investment we had made in data analytics; we were going to build a large data warehouse, and we needed a plan. The three elements of data governance are metadata, data integrity, and master data. So the crux of our program is, without these three components, we will create yet another pile of data. In healthcare, we have lots of piles of data, but we need to bring them to life and turn them into assets.
And how do you define those three elements?
Let’s start with master data. In our organization, we actually maintain a provider/physician master [directory] in multiple places, and the data we keep about our physicians doesn’t match. So something as simple as, what is a physician’s specialty? It’s a problem. The terms don’t match across the system. So master data is primarily data that is maintained in multiple places but is not in sync, so you have to do mappings, consolidations, and harmonizations; the physician master is our best example in healthcare. We’re also working on master data for our patients: things like who is their PCP, and what is a consistent name for a patient?
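To make the mapping and harmonization step concrete, here is a minimal Python sketch of reconciling physician specialty labels that differ across source systems. It is illustrative only and is not UPMC's implementation: the system names, physician IDs, specialty labels, and the `SPECIALTY_MAP` table are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from local specialty labels to one canonical term.
SPECIALTY_MAP = {
    "cardiology": "Cardiology",
    "cardiovascular disease": "Cardiology",
    "cards": "Cardiology",
    "ortho": "Orthopedic Surgery",
    "orthopedic surgery": "Orthopedic Surgery",
}


def canonical_specialty(raw):
    """Return the canonical specialty, or None if the label is unmapped."""
    return SPECIALTY_MAP.get(raw.strip().lower())


def harmonize(records):
    """Consolidate specialty labels per physician across source systems.

    records: iterable of (source_system, physician_id, raw_specialty).
    Returns a dict keyed by physician_id with the canonical specialties
    found, any labels that could not be mapped, and a conflict flag.
    """
    merged = defaultdict(lambda: {"specialties": set(), "unmapped": []})
    for source, physician_id, raw in records:
        canon = canonical_specialty(raw)
        if canon is None:
            # Keep unmapped labels so a data steward can review them.
            merged[physician_id]["unmapped"].append((source, raw))
        else:
            merged[physician_id]["specialties"].add(canon)
    return {
        pid: {**entry, "conflict": len(entry["specialties"]) > 1}
        for pid, entry in merged.items()
    }


# Hypothetical records from three systems that each keep a physician master.
records = [
    ("credentialing", "P001", "Cardiovascular Disease"),
    ("scheduling", "P001", "Cards"),
    ("billing", "P002", "Ortho"),
    ("scheduling", "P002", "Cardiology"),
    ("billing", "P003", "Sports Med"),
]
result = harmonize(records)
```

In this sketch, a physician whose labels all map to one canonical term is in sync, a physician with more than one canonical term is flagged as a conflict, and unmapped labels are set aside for review rather than guessed at.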
And how do you define metadata and data integrity?
Metadata has to do with information about our data—where is it in our organization? Where does it originate, where is it collected, and where do we move it, and how do we use it? And what are the definitions of core metrics and data we use, and what do we send out of the organization to share? The tasks are endless. We estimate that we run about 1,200 different applications at UPMC, so we’re building an official inventory of those, to figure out how they’re being used. So in addition to the application inventory, which helps us determine where all of our data is collected, we will then move to where our data is being interfaced and shared. This is important in data governance, because every time data is moved, there’s the potential for it becoming corrupted. We want to minimize movement and keep data sound.
In addition, we’re beginning to identify all the teams that do reporting at UPMC, and we’re going to have them publish lists of the core reports they provide; so we won’t publish ALL the reports that are done in a list, but this will help us know where everyone should go for report information. We’ll never be done with metadata, but that’s what we’re working on now.
In terms of data integrity, we’ll always have problems, because we have 1,200 applications running. So we’ll be working to expose and make them transparent. We can’t fix them all. But we can start to make smart choices about what we fix. Most of the data integrity issues start at the collection point, so a lot of our applications are rather legacy and don’t have the best edits in place; changing some of these systems can be very costly, so we have to make smart choices.
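As an illustration of what "edits" at the collection point can look like, the following Python sketch validates a registration record before it is accepted. The field names, allowed codes, and rules are assumptions made for the example and are not drawn from any UPMC system.

```python
from datetime import date

# Hypothetical set of allowed codes for the "sex" field.
ALLOWED_SEX_CODES = {"F", "M", "U"}


def check_registration(record, today):
    """Return a list of integrity problems found in one registration record.

    An empty list means the record passed every edit check.
    """
    problems = []

    # Required text field: treat None and whitespace-only as missing.
    if not (record.get("last_name") or "").strip():
        problems.append("last_name is missing")

    # Date field: must be present and not in the future.
    birth_date = record.get("birth_date")
    if birth_date is None:
        problems.append("birth_date is missing")
    elif birth_date > today:
        problems.append("birth_date is in the future")

    # Coded field: must use one of the allowed codes.
    if record.get("sex") not in ALLOWED_SEX_CODES:
        problems.append("sex is not an allowed code")

    return problems


today = date(2014, 6, 1)  # fixed date so the example is reproducible
good = {"last_name": "Smith", "birth_date": date(1970, 5, 17), "sex": "F"}
bad = {"last_name": "  ", "birth_date": date(2030, 1, 1), "sex": "X"}

good_problems = check_registration(good, today)
bad_problems = check_registration(bad, today)
```

Returning a list of problems instead of rejecting the record outright fits the approach described above: issues are exposed and made transparent first, and the decision about which ones to fix is made separately.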
What is the people-governance process around all this?
So, I have three bullet points around people. First, we’re building for the first time a shared stewardship around our data. Historically, data has been seen as IT’s responsibility, but we are now sharing the stewardship with the business, and the bulk of the ownership involves people outside IT, both business and clinician leaders. So our hope is to build data analytics integrity here. Second, we hope to change our culture at UPMC over the next five years, by growing the number of people who attain data analytics capability.
And, third, 18 months into our program, we have over 200 people with named responsibilities and decision rights throughout UPMC. These are new roles, and they are all part-time. We have backed them with a policy. And we’ve enhanced job performance criteria, for all these individuals, so they have formal descriptions that indicate that. I’ll start with the 26-member Data Governance Council. There are lots of clinicians on the council; most have some role involving working with IT; we also have representation from strategic planning, finance, and HR, and a few IT folks as well. And the Council is the ultimate decision-maker; they approve all of our roles and policies. And their biggest role is to be data evangelists, to educate people on data governance. And people still struggle with terms like metadata and master data; it’s a long process to get people comfortable. So we’re trying to use very tangible concepts when we talk to people. And that takes time.
The next role is information owners. None reside in IT. An information owner—we have an information owner of lab data and an information leader for pharmacy, and one over registration data, which is really big. So these people sign off on all the data definitions, on how we’re going to use the data, and their role will continue to increase, as we give them more information about where their data resides, and what we’re doing with it. And some of these individuals are thrilled, because they finally have decision-making capabilities, but most are uncomfortable, because we’re still working on metadata. And below information owners are data stewards. So for example, the information owner over registration data, will name data stewards over inpatient registration, over outpatient registration, over ED registration, over home healthcare, for example.
We don’t expect the information owners to have expertise in all those areas, but we want to grow their knowledge. This will be a huge time-saver in virtually every area, because we will be able to go to one person and engage. Right now, we have to go to 20 people, and no one person has decision rights. So the data stewards’ primary role is to write data definitions; they’re also the primary resource on data integrity issues, and they’ll do research to help us determine whether we need system or process changes to fix problems.
And we have two other types of stewards unique to UPMC; and this may be one of the reasons other data governance programs falter. So, we have application stewards, and they live in IT and each represent one of the 1,200 applications. And their primary role is to explain what their application does, how it collects and uses data; and they’re also responsible for the quality of the outbound data, so that it is better than it is today. So it’s a very IT-focused role. And the final role is the analytics steward; and these individuals each represent a team that creates reports to share with the organization. And for the first time, we’ve pulled the top 40 teams, and they met together and came up with their own agenda for year one.
And they’ve decided to create an inventory of core reports, and standardization of headers and footers, and use of standardized definitions. And what’s interesting is that they’re working together across the organization for the first time. And they’re finding redundancy, where multiple teams are sharing multiple reports that overlap. And we’re striving towards self-service, and these analytics stewards are going to drive that process, so that self-service can be valuable and not harmful. Isn’t that exciting? I don’t think you’ll find that to be unique here.
It seems also that your data governance initiative fits into broader trends in healthcare. Right now, processes and policies are being made more conscious in healthcare, right?
Well, what I’ll say is this: we want to have a great experience for the patient, and we don’t want to waste money. And some of the most creative people will find ways to get the job done. So you’ll see an amazing amount of creativity in healthcare to get things done, because there’s never been an enterprise-wide approach in organizations. And, for example, we get data requests all the time in the organization; and everybody has their own myths about how certain types of data are sensitive, or how we shouldn’t do a certain thing. But in a lot of cases, people were going to different people and getting different answers. So we’re naming what is sensitive or not, and whom you go to for approval, such as around referral data. So it isn’t so much that things were done the way they always were, but it was the best we could do.
So until you create a system of processes, you can’t improve things, right?
What is the hardest thing about doing all of this?
That people become very attached to processes. Specifically in IT, we have people who have built their careers simply based on knowing processes, since everybody has to go to them for answers. So we’re putting all this knowledge in tools, so data can be readily available. And we don’t want the risk of people leaving the organization, and data knowledge going with them. So to be optimal, we have to get all of our data knowledge and processes into tools, so everyone can use them. And the other big challenge is time.
How important is it for a healthcare organization to define those roles and responsibilities? Is that an essential component of data governance strategy?
Yes, it is, because many, many people are involved, and many lack information or don’t have an enterprise-wide perspective. So it’s very important to select the right people for this.
What advice would you have for healthcare IT leaders across the country about all of this?
Two things. First, if you’re still in a situation where you’re still selling the concept, then the organization isn’t ready yet. They have to feel the pain first. And our organization is ready, because we’ve now begun investing in big data, and frankly, if we don’t do this, we’ll just have another big data pile. The second big thing I would recommend is, don’t quit; the bulk of the data governance programs fizzle because people quit. I just went through this with the Data Council yesterday. It was an “OK” meeting, and I went to the chair, and asked, is this still valuable? And he said to me, the fact that we’re in month 18, the fact that they’re still coming and still talking, shows we’re being successful. So you have to find people who are stubborn, because you’re never done with this; it’s going to be the way we work now, and you have to just keep coming back.
And we have gotten some key people in key roles, such as members of the Data Governance Council. I did our BI management for years; and when you’re put in charge of business intelligence and you have absolutely no data governance, it’s a really hard job. And people don’t survive, they get fired, because of all the data governance issues. So my role, and I report to Lisa, is to make sure this all works. A lot of people say that data governance shouldn’t be in IT, but all this work resides in IT right now, and it will take years to push it out of IT; and we do manage the tools of IT here.
You kind of touched on some progress made and lessons learned already. Do you have future plans you might like to share?
Sure, two things. First, we do have a maturity model that we got from IBM, very detailed, and it guides us through our maturity process, so we refer to that to help us begin to build the program. We’ve begun work in all the categories, and our roadmap shows us taking that enterprise-wide. And a lot of our work focuses on data analytics, in the three key areas I mentioned. Over the next three years, you will see us expand the membership of these virtual teams to go beyond IT and into the business, so I actually see these teams becoming departments eventually. So eventually, we’ll be doing these things for all systems and projects. We’re changing the system development life-cycle process.