
How Healthcare Organizations Can Turn Big Data Into Smart Data

January 10, 2014
by Rajiv Leventhal
Industry expert: “Big data is closer than it appears”
Shane Pilcher

Only a very small percentage of healthcare organizations today seem to be leading the way in healthcare data analytics, while the vast majority are very early in the business intelligence (BI)/analytics process, or haven’t even started. As a result, organizations seem to see big data as something that’s off in the very distant future; for most of them, anything outside of five years is almost nonexistent, says Shane Pilcher, vice president at the Bethel Park, Pa.-based Stoltenberg Consulting.

It is important to remember that big data is more than just a sea of information; it is an opportunity to find insights in new and emerging types of data and content. So what are hospitals and healthcare organizations forgetting on their paths to eventual success with big data? According to Pilcher, the answer is “smart data.” In the interview below with HCI Assistant Editor Rajiv Leventhal, Pilcher talks about the difference between big data and smart data, strategies for collecting the right data, and advice for physicians on getting on board with the movement.

When you say “smart data,” what do you mean? How does smart data differ from big data?

The data that organizations are collecting today for eventual use in big data are going into a black hole (usually the data warehouse) somewhere. They are happy that they’re collecting it and preparing for when big data finally does come around to their organization, but if they aren’t careful and don’t monitor what they’re recording, the quality and quantity of the data will not be sufficient when it comes time to use it five years from now. These organizations might think that they have five years of historical data to start their analytics, but in reality, the data is often not of the quality or quantity, or even the type, that is needed. That’s the smart data: the step that focuses on the type of data that they have, the volume of data, and also the validity of that data. You have to make sure that what you’re collecting is what you’re expecting.

Do healthcare organizations recognize this need?

Big data is a common theme with CIOs at healthcare organizations everywhere; they know it’s coming. However, there are CEOs at their hospitals who hear about “big data” at conferences and have no idea what it is, yet they will still come back and tell their CIOs that they “have to be doing big data.” And thus, it’s left in the lap of the CIOs. But the CIOs have Stage 2 of meaningful use and ICD-10 coming [for many providers, Stage 2 is here already], so they are not in the best place to be dealing with big data. So for the most part, except for about 5 percent of organizations out there, they tend to move it to the sideline. It’s like looking at the side-view mirror on your car and not seeing the message, “images are closer than they appear.” They see big data reflected, but it’s a lot closer than they think. For the places that have limited resources and time, this is something that is being pushed to the side until they can get to it down the road.

How can organizations better ensure they are collecting the right quantity and quality of data?

First, you need to start developing your strategy now. The standard data models and approaches other industries are using don’t necessarily translate to healthcare IT. The amount of data, the data structure, and the data model are off the charts compared to even something as large as automotive manufacturing; the complexity isn’t even comparable. You have to develop as you go. The biggest thing I can suggest, as this industry is developing and our tools are growing, is to develop peer networks with other healthcare leaders who are already further down the road than you. About 5 percent of healthcare organizations are right now in “stage two” of the data maturity model, where they could start looking at predictive and prescriptive approaches to data. Those that are on the forefront of data analysis and intelligence are going to be critical to the rest of the industry following along. So learn from and use your peers.

And again, the quality of the data is critical. Organizations often think that because they initiated the data collection, it’s implemented, and it’s working, they can turn to the next project, and that when they’re ready, the data will be there in the warehouse. But then when it gets closer to the time to use the data, they don’t have the quantity that they thought they had. If you are collecting the wrong information or it’s incorrect, when you do your analysis, you will get wrong results and not even know it. The resulting decisions could be devastating, because inaccurate data led to a wrong analysis.

So you also need to assess the data on a regular basis and ensure that what you think you’re collecting is actually what you’re getting. Then you can depend on the accuracy of that data when it’s time to start analyzing. Being able to analyze unstructured data for trends is very difficult, borderline impossible. Yet about 80 percent of hospitals expect to use unstructured data in their data warehouse. Turning that data into structured data, or finding a tool that can do that for you with accuracy, becomes a huge push. If organizations are not prepared for that, they will be racing against time at the last minute.
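As one illustration of what a recurring assessment could look like, here is a minimal Python sketch that counts incomplete and implausible records in a batch. Everything specific in it is an assumption made for the example, not something described in the interview: the field names (`patient_id`, `encounter_date`, `systolic_bp`), the plausibility range, and the `audit_records` function itself are hypothetical.

```python
from datetime import date

# Hypothetical field names and limits, chosen only for illustration.
REQUIRED_FIELDS = ("patient_id", "encounter_date", "systolic_bp")
PLAUSIBLE_SYSTOLIC = (50, 300)  # assumed sanity range in mmHg, not a clinical rule


def audit_records(records, as_of):
    """Count records that are incomplete or fail simple plausibility checks.

    `records` is an iterable of dicts; `encounter_date` values are assumed
    to be datetime.date objects. `as_of` is the date the audit is run.
    """
    report = {"total": 0, "missing": 0, "invalid": 0}
    low, high = PLAUSIBLE_SYSTOLIC
    for rec in records:
        report["total"] += 1
        # A record is incomplete if any required field is absent or empty.
        if any(rec.get(field) in (None, "") for field in REQUIRED_FIELDS):
            report["missing"] += 1
            continue
        bp = rec["systolic_bp"]
        bad_bp = not isinstance(bp, (int, float)) or not (low <= bp <= high)
        bad_date = rec["encounter_date"] > as_of  # dated in the future
        if bad_bp or bad_date:
            report["invalid"] += 1
    return report


sample = [
    {"patient_id": "p1", "encounter_date": date(2013, 12, 1), "systolic_bp": 120},
    {"patient_id": "p2", "encounter_date": date(2013, 12, 2), "systolic_bp": None},
    {"patient_id": "p3", "encounter_date": date(2013, 12, 3), "systolic_bp": 1200},
]
print(audit_records(sample, as_of=date(2014, 1, 10)))
```

Run on a schedule, a report like this would show whether the missing or invalid counts are creeping up long before the data is needed for analysis.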

You need to trust the accuracy of your data. You know that your electronic health record (EHR) is collecting certain data and dumping it into the data warehouse. But is anything happening during that transfer that is changing the data in any way? Is it remaining accurate? Was it accurate to begin with? I wouldn’t say there is an issue of incorrect data in EHRs, but people can’t say with 100 percent certainty, “Yes, it’s ready to be analyzed.”
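One way to check whether a transfer has altered anything is to compare a fingerprint of each record on both sides. The sketch below is only an illustration under stated assumptions: it supposes that an EHR extract and the matching warehouse rows are both available as lists of dicts sharing a `patient_id` key, and the `fingerprint` and `reconcile` helpers are hypothetical names, not part of any EHR or warehouse product.

```python
import hashlib
import json


def fingerprint(record):
    """Hash a record's contents in a key-order-independent way."""
    canonical = json.dumps(record, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reconcile(source_rows, warehouse_rows, key="patient_id"):
    """Report rows that were lost, added, or changed between two copies."""
    src = {row[key]: fingerprint(row) for row in source_rows}
    dst = {row[key]: fingerprint(row) for row in warehouse_rows}
    return {
        "missing_in_warehouse": sorted(src.keys() - dst.keys()),
        "unexpected_in_warehouse": sorted(dst.keys() - src.keys()),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }


# Toy data: row 2 was altered in transit, row 3 was dropped, row 4 appeared.
ehr_extract = [
    {"patient_id": 1, "diagnosis": "I10"},
    {"patient_id": 2, "diagnosis": "E11.9"},
    {"patient_id": 3, "diagnosis": "J45"},
]
warehouse = [
    {"patient_id": 1, "diagnosis": "I10"},
    {"patient_id": 2, "diagnosis": "E119"},
    {"patient_id": 4, "diagnosis": "K21"},
]
result = reconcile(ehr_extract, warehouse)
print(result)
```

A check like this only shows that the two copies agree; it says nothing about whether the data was accurate in the EHR to begin with, which needs a separate validity review.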

What are some other challenges organizations are facing with big data?




Rajiv, at LexisNexis Risk Solutions we are actively engaged in using the open source HPCC Systems data-intensive compute platform along with the massive LexisNexis Public Data Social Graph to tackle everything from fraud, waste, and abuse, drug-seeking behavior, and provider collusion to disease management and community healthcare interventions. We have invested in analytics that help map the social context of events through trusted relationships to create a better understanding of the big picture that surrounds each healthcare event, patient, provider, business, asset, and more. For an interesting case study visit:

Interesting article. I agree that there is a great distinction between big data and smart data. While many healthcare organizations view time, money and staffing as obstacles to collecting the structured data they need for accurate analysis, much of today’s advanced diagnostic equipment can collect data and automatically connect it with the larger data structure (like the patient’s medical history). By installing medical equipment with these data collection capabilities, organizations can collect quality data without depleting their resources.
Kurt Forsthoefel, Midmark Corporation