ASCO’s CancerLinQ is Harnessing Big Data to Build a Learning Health System

November 22, 2016
by Heather Landi
In October, the CancerLinQ platform hit a milestone with more than 1 million patient records in the system, allowing oncologists to access more data

Two years ago, the American Society of Clinical Oncology (ASCO) announced the launch of an initiative to build a big data platform, called CancerLinQ, as a database that provides oncologists with growing amounts of real-world cancer information. The CancerLinQ platform was designed to connect and analyze cancer data from electronic records to provide data to cancer providers in order to assist them with making more informed decisions about patient care. The CancerLinQ platform was co-developed with SAP utilizing the SAP Connected Health platform that runs on SAP HANA, a flexible, in-memory data management and application platform.

In October, CancerLinQ announced a significant milestone with more than 1 million patient records now in the system. Additionally, there are now 71 oncology practices in 39 states and the District of Columbia participating in CancerLinQ, representing more than 1,500 oncologists. The aim of this big data initiative is to enable cancer providers to improve the quality and value of care by analyzing millions of cancer patient medical records, uncovering patterns and trends, and measuring their care against that of their peers and recommended guidelines.

Recently, Healthcare Informatics Assistant Editor Heather Landi spoke with Robert Miller, M.D., medical director at CancerLinQ, about the progress, to date, to build the research data network. Miller, who is a board-certified medical oncologist and informaticist, also shared his perspectives about how CancerLinQ is helping to break down data silos, how big data initiatives can help move the needle toward better cancer care, and the continuing challenges that oncologists and informaticists face in this area.

What are some of the challenges that oncologists are facing that CancerLinQ was specifically developed to address?

The American Society of Clinical Oncology (ASCO) board of directors identified the fact that only a small percentage of adult cancer patients actually participate in clinical trials for their care [about 3 percent]. So what that means is that for 97 percent of cancer patients, they are receiving the best care as determined by their local oncologist, but that knowledge of what happens to them as a result of the everyday care experience is largely lost, because the data is trapped in electronic health records (EHRs), or in some remaining cases paper records. So, the problem that ASCO was hoping to impact was to try to make all these data interoperable, to allow learning to occur from the care experiences of every cancer patient. So, they came up with this idea of creating a database whereby EHR data would feed into a single, aggregated database, and it would have to be de-identified for privacy protections, and that database could then be accessed by the broader cancer community, or what they call a learning health system. ASCO wanted to build a learning health system for the field of oncology and that’s what we are doing.


Robert Miller, M.D.

As of today, CancerLinQ has 71 participating practices in the U.S., and the way the program works is that our technology teams and informaticists connect the EHRs at the back end to the CancerLinQ database through a direct software connection and then using either pull technology or push technology, the protected health information data is moved from the EHRs into the CancerLinQ database where it’s aggregated. There is an initial data dump at the time of connection, and then there’s nightly incremental updates, so the data is refreshed every single day. Several weeks ago, we had crossed the point where we had 1 million patient records in the data lake. These are not all processed records yet, but a million records that have been brought on board from a percentage of the 71 practices that are participating.
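
The load pattern described above, an initial full copy followed by nightly incremental refreshes, can be illustrated with a minimal Python sketch. This is not CancerLinQ's actual software: the record fields (`id`, `modified`, `stage`), the function names and the sample records are all hypothetical, chosen only to show the idea of copying everything once and then upserting only what changed since the last sync.

```python
from datetime import datetime

def initial_load(ehr_records):
    """Copy every record from the source export into a new store (the one-time dump)."""
    return {rec["id"]: dict(rec) for rec in ehr_records}

def incremental_update(store, ehr_records, last_sync):
    """Insert or overwrite only the records modified after the previous sync time."""
    changed = 0
    for rec in ehr_records:
        if rec["modified"] > last_sync:
            store[rec["id"]] = dict(rec)
            changed += 1
    return changed

# Made-up records for illustration only (hypothetical fields and values).
day1 = [
    {"id": "p1", "modified": datetime(2016, 10, 1), "stage": "II"},
    {"id": "p2", "modified": datetime(2016, 10, 1), "stage": "I"},
]
store = initial_load(day1)
last_sync = datetime(2016, 10, 1, 23, 59)

# The next night: p1 is unchanged, p2 was edited, p3 is new.
day2 = [
    day1[0],
    {"id": "p2", "modified": datetime(2016, 10, 2), "stage": "III"},
    {"id": "p3", "modified": datetime(2016, 10, 2), "stage": "IV"},
]
changed = incremental_update(store, day2, last_sync)
print(changed, len(store))  # 2 3
```

Tracking a last-sync time keeps the nightly job proportional to the day's changes rather than to the full record count, which is one common reason to pair a one-time dump with incremental updates.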

Who is able to access these patient records in the database?

Right now, as of today, access is restricted to the subscribing practices that are themselves contributing data to CancerLinQ. However, we are in the process of finalizing, probably in the next few weeks, the third-party access policy. From the very beginning of this whole initiative, ASCO has anticipated that this database would have great interest and value to the larger cancer community, and so, once those policies are in place, and it will probably be operational by the first quarter of next year, then really anyone, whether that’s an academician, government agency or a commercial interest, anyone with a legitimate interest in CancerLinQ data would have the ability to apply for access. There will be an approval process and the request will have to be consistent with ASCO’s mission and for the good of the cancer community, but we anticipate a broad swath of interested parties will start to use CancerLinQ in the not-too-distant future.

What does CancerLinQ enable cancer providers to do?

It’s still fairly early days as we’re bringing practices on and the number of practices that are fully live and operational is still relatively small, as it’s not all 71 just yet. However, there are two main parts of the service that every subscriber gets. One is they have access to electronic clinical quality measures, so these are clinical quality measures that are based on other quality measures that ASCO has already created for other quality programs that really reflect the quality of the care that providers are delivering in their offices. CancerLinQ is in these multi-faceted dashboards and provides [the oncologist] with their own quality performance on a real-time basis—which is something that, heretofore, had been retrospective, and now this is real-time and prospective. So they have access to that and that has an immediate potential impact on improving quality of care. And the second is a tool within CancerLinQ that’s based on SAP technology, CancerLinQ Insights (CLQI), and this is a data exploration tool where any subscriber can drill down into the aggregated de-identified data set looking for new patterns or increasingly smaller cohorts of patients to really answer pressing clinical questions. So, an example of that might be, a rare cancer type, something that there’s really not good guidelines on and which clinical trials generally don’t address very effectively. An individual subscriber can drill down into the data to see how the last 100 patients with this unusual cancer type were treated in the database and that can impact care by being able to offer some suggestions. It’s not the same thing as guidelines or a clinical trial, but it’s still data that would otherwise not be available to most people.
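
The kind of drill-down Miller describes, looking at how the most recent patients with a rare cancer type were treated, can be sketched in a few lines of Python. This is an illustrative sketch only and not the CancerLinQ Insights tool or its interface: the field names (`cancer_type`, `diagnosis_year`, `treatment`), the function `treatment_summary` and the sample records are assumptions made up for the example.

```python
from collections import Counter

def treatment_summary(records, cancer_type, last_n=100):
    """Tally treatments among the most recent `last_n` patients with one cancer type."""
    cohort = [r for r in records if r["cancer_type"] == cancer_type]
    cohort.sort(key=lambda r: r["diagnosis_year"])  # oldest first
    recent = cohort[-last_n:]                        # keep the newest last_n
    return Counter(r["treatment"] for r in recent)

# Made-up, de-identified-style records (hypothetical values).
records = [
    {"cancer_type": "rare_type_x", "diagnosis_year": 2014, "treatment": "surgery"},
    {"cancer_type": "rare_type_x", "diagnosis_year": 2015, "treatment": "chemotherapy"},
    {"cancer_type": "rare_type_x", "diagnosis_year": 2016, "treatment": "surgery"},
    {"cancer_type": "breast", "diagnosis_year": 2016, "treatment": "radiation"},
]

summary = treatment_summary(records, "rare_type_x", last_n=2)
print(dict(summary))
```

As the interview notes, a tally like this is not a guideline or a clinical trial result; it only shows what was done for similar patients in the aggregated data.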

Do you think big data initiatives like CancerLinQ are helping to move the needle toward better cancer care?

This is a way of enabling physicians to use electronic tools, to use the power of computers, particularly supercomputers that can handle big data, at the frontlines. What I mean by that is our current electronic tools, the EHRs that oncologists and other doctors use, there’s no mystery that there is a lot of dissatisfaction with them, that they don’t perform as well as many specialists would like them to. While CancerLinQ is not an EHR in and of itself, within the system, this is something that oncologists are designing for other oncologists so there are dashboards, tools and programs that give an oncology-specific view of the patient timeline and present oncology-specific quality measures, things that really matter to oncologists, and this is something that is just not available, for the large part, in many of the electronic systems and many environments that most cancer doctors are familiar with now.

How does CancerLinQ support Vice President Biden’s Cancer Moonshot effort?

CancerLinQ has been very visible with the Vice President, we have personally met with Mr. Biden on several occasions, and spoken with him about his vision and how we can help his vision become a reality. He’s talked largely about research and compressing progress, achieving 10 years of progress in five years. Where I think CancerLinQ can be effective is, number one, data sharing. The norm is that doctors care for cancer patients in their own clinics and own cancer centers, and the data is sitting in their EHRs and it’s rarely accessed and certainly not shared across institutions. Through the CancerLinQ technology, we now have the ability to have this much larger aggregated dataset that everyone can benefit from, and with the third-party access opening up soon, cancer care will improve in that regard.

There is a lot of discussion about the need to break down data silos. Does the CancerLinQ platform create a new paradigm for sharing data?

I think the primary way CancerLinQ is helping to break down data silos is the main construct of the system which is that we’re bringing in data from practices throughout the U.S., and I’ll add that, probably next year, to international practices as well, and we’re not restricted to any one particular EHR or any one type of practice. We have something like 15 different EHRs that are currently connected or will be connected soon to CancerLinQ. Our practices represent the full spectrum of oncology care throughout the U.S., including small, single specialty practices, larger regional integrated delivery networks, such as Intermountain Healthcare in Utah, and also all the way up to academic medical centers, University Hospitals in Cleveland and Rush University and several others, and at the other end, there are some safety net hospitals which are in the process of signing with us. I think one of the complaints about data silos is that any one site represents the types of patients and practice in that site, but with CancerLinQ, all of the data from these different types of sites can be blended into one dataset that everyone can access.

When we started this, we didn’t know for sure if oncologists were going to prioritize this or whether they would be comfortable with this idea of sharing their patients’ data. We wondered whether there would be grave reservations, from a privacy and security standpoint, or at the other end of the spectrum, there are concerns that some have raised that the larger centers may want to hang onto their data, to monetize it or use it for other purposes, but what we found was virtually none of that. The response of the oncology community has been overwhelming. We have several hundred practices more in the pipeline. This construct that we’ve come up with seems to resonate.

What role does the CancerLinQ initiative play in moving toward personalized medicine?

Right now, CancerLinQ is largely bringing in what we call phenotypic data, so this is the data from EHRs, for the large part, and it doesn’t include a lot of genetic information or personalized molecular information just yet. But we will, in the near future, start to bring in some solutions to capture genomic data, and when we bring in this genomic data—which tells us what happened to this patient’s cancer, the patient’s experience with the treatment, what were the outcomes, what side effects did they experience as a result of a particular drug—we can then link our data with genomic data from other sources, and that’s largely tumor sequencing data. So, more and more cancer patients are undergoing next generation sequencing of their tumors, so there are these rich genomic profiles being created. That data is just exploding right now, but people don’t know what to do with it just yet, and the reason is, there hasn’t been enough effort yet to match the specific genomic alterations of the tumor to what happens to the patient. So, what CancerLinQ can do is provide the answer to what happens to the patient, what is their cancer like, and what was their treatment, and how does that break down based on what specific markers are present in their tumor? I think this is a very important role and I think this area will only accelerate in the next 12 and 24 months.

What are some of the biggest challenges that CancerLinQ is facing right now?

There’s challenges that we’re facing, and there’s challenges that cancer care is facing. I think what oncologists are facing is that this is a very complex time to practice on many levels: the number of therapeutic options is exploding, the need to understand genomics and other molecular medicine is huge in oncology, and that is something that’s a real challenge for many who see multiple types of cancer patients. In addition, because of the tidal wave of changes in reimbursement, practitioners have to demonstrate that they are practicing quality care by participating in registries and other requirements and this is a challenge for them.

I think what we face from the CancerLinQ side is really just trying to integrate all these data sources and make sure that the data quality is as good as it possibly can be. The data that comes from real world practice is never as clean as the somewhat artificial datasets that are generated in the context of the clinical trial, where the conditions are very controlled. So what we’ve taken on is basically trying to fix all that on the backend. So as we grow and diversify, that challenge is never going to go away. We have the backing of SAP as a technology partner to innovate with us and to build new solutions for these types of data challenges. That’s my day job right there. Trying to figure out some of these things.

You mentioned expanding to international providers in the next year. What are some other future plans for CancerLinQ?

Currently, we’re somewhere in the ballpark of having about one in seven or one in eight U.S. oncologists participate in CancerLinQ through their practice. I think by the end of next year, it could very well be one in three, and probably by 2019, it may be one in two. So growth is one of the things we see happening and certainly we’re on pace for that right now. The second is the diversification of the data. Right now we bring EHR data in from practices, but that’s not the only source of data—there is claims data, cancer registry data and genomic datasets and large laboratory datasets and, very importantly, there’s patient-generated data. The patient reports of their outcomes is a critical aspect of what’s really happening to cancer patients and there are apps, webpages, mobile devices and other tools that can be used to gather those data points from patients and all that can be brought into CancerLinQ. So we intend to diversify our data input to some of these other sources, which again, will enrich the overall database that we’re building.


Definitive Healthcare Acquires HIMSS Analytics’ Data Services

January 16, 2019
by Rajiv Leventhal, Managing Editor

Definitive Healthcare, a data analytics and business intelligence company, has acquired the data services business and assets of HIMSS Analytics, the organizations announced today.

The purchase includes the Logic, Predict, Analyze and custom research products from HIMSS Analytics, which is commonly known as the data and research arm of the Healthcare Information and Management Systems Society.

According to Definitive officials, the acquisition builds on the company’s “articulated growth strategy to deliver the most reliable and consistent view of healthcare data and analytics available in the market.”

Definitive Healthcare will immediately begin integrating the datasets and platform functionality into a single source of truth, their executives attest. The new offering will aim to include improved coverage of IT purchasing intelligence with access to years of proposals and executed contracts, enabling transparency and efficiency in the development of commercial strategies.

Broadly, Definitive Healthcare is a provider of data and intelligence on hospitals, physicians, and other healthcare providers. Its product suite provides comprehensive data on 8,800 hospitals, 150,000 physician groups, 1 million physicians, 10,000 ambulatory surgery centers, 14,000 imaging centers, 86,000 long-term care facilities, and 1,400 ACOs and HIEs, according to officials.

Together, Definitive Healthcare and HIMSS Analytics have more than 20 years of experience in data collection through exclusive methodologies.

“HIMSS Analytics has developed an extraordinarily powerful dataset including technology install data and purchasing contracts among other leading intelligence that, when combined with Definitive Healthcare’s proprietary healthcare provider data, will create a truly best-in-class solution for our client base,” Jason Krantz, founder and CEO of Definitive Healthcare, said in a statement.

Machine Learning Survey: Many Organizations Several Years Away from Adoption, Citing Cost

January 10, 2019
by Heather Landi, Associate Editor

Radiologists and imaging leaders see an important role for machine learning in radiology going forward; however, most organizations are still two to three years away from adopting the technology, and a sizeable minority have no plans to adopt machine learning, according to a recent survey.

A recent study* by Reaction Data sought to examine the hype around artificial intelligence and machine learning, specifically in the area of radiology and imaging, to uncover where AI might be more useful and applicable and in what areas medical imaging professionals are looking to utilize machine learning.

Reaction Data, a market research firm, got feedback from imaging professionals, including directors of radiology, radiologists, chiefs of radiology, imaging techs, PACS administrators and managers of radiology, from 152 healthcare organizations to gauge the industry on machine learning. About 60 percent of respondents were from academic medical centers or community hospitals, while 15 percent were from integrated delivery networks and 12 percent were from imaging centers. The remaining respondents worked at critical access hospitals, specialty clinics, cancer hospitals or children’s hospitals.

Among the survey respondents, there was significant variation in the number of annual radiology studies performed—17 percent performed 100-250 thousand studies each year; 16 percent performed 1 to 2 million studies; 15 percent performed 5 to 25 thousand studies; 13 percent performed 250 to 500 thousand; 10 percent performed more than 2 million studies a year.

More than three quarters of imaging and radiology leaders (77 percent) view machine learning as being important in medical imaging, up from 65 percent in a 2017 survey. Only 11 percent view the technology as not important. However, only 59 percent say they understand machine learning, although that percentage is up from 52 percent in 2017. Twenty percent say they don’t understand the technology, and 20 percent have a partial understanding.

Looking at adoption, only 22 percent of respondents say they are currently using machine learning—either just adopted it or have been using it for some time. Eleven percent say they plan to adopt the technology in the next year.

Half of respondents (51 percent) say their organizations are one to two years away (28 percent) or even more than three years away (23 percent) from adoption. Sixteen percent say their organizations will most likely never utilize machine learning.

Reaction Data collected commentary from survey respondents as part of the survey and some respondents indicated that funding was an issue with regard to the lack of plans to adopt the technology. When asked why they don’t ever plan to utilize machine learning, one respondent, a chief of cardiology, said, “Our institution is a late adopter.” Another respondent, an imaging tech, responded: “No talk of machine learning in my facility. To be honest, I had to Google the definition a moment ago.”

Survey responses also indicated that imaging leaders want machine learning tools to be integrated into PACS (picture archiving and communication systems) software, and that cost is an issue.

“We'd like it to be integrated into PACS software so it's free, but we understand there is a cost for everything. We wouldn't want to pay more than $1 per study,” one PACS Administrator responded, according to the survey.

A radiologist who responded to the survey said, “The market has not matured yet since we are in the research phase of development and cost is unknown. I expect the initial cost to be on the high side.”

According to the survey, when asked how much they would be willing to pay for machine learning, one imaging director responded: “As little as possible...but I'm on the hospital administration side. Most radiologists are contracted and want us to buy all the toys. They take about 60 percent of the patient revenue and invest nothing into the hospital/ambulatory systems side.”

And, one director of radiology responded: “Included in PACS contract would be best... very hard to get money for this.”

The survey also indicates that, among organizations that are using machine learning in imaging, there is a shift in how organizations are applying machine learning in imaging. In the 2017 survey, the most common application for machine learning was breast imaging, cited by 36 percent of respondents, and only 12 percent cited lung imaging.

In the 2018 survey, only 22 percent of respondents said they were using machine learning for breast imaging, while there was an increase in other applications. The next most-used applications cited by respondents who have adopted and use machine learning were lung imaging (22 percent), cardiovascular imaging (13 percent), chest X-rays (11 percent), bone imaging (7 percent), liver imaging (7 percent), neural imaging (5 percent) and pulmonary imaging (4 percent).

When asked what kind of scans they plan to apply machine learning to once the technology is adopted, one radiologist cited quality control for radiography, CT (computed tomography) and MR (magnetic resonance) imaging.

The survey also examines the vendors being used, among respondents who have adopted machine learning, and the survey findings indicate some differences compared to the 2017 survey results. No one vendor dominates this space, as 19 percent use GE Healthcare and about 16 percent use Hologic, which is down compared to 25 percent of respondents who cited Hologic as their vendor in last year’s survey.

Looking at other vendors being used, 14 percent use Philips, 7 percent use Arterys and 3 percent use Nvidia, while Zebra Medical Vision and iCAD were each cited by 5 percent of medical imaging professionals. The percentage of imaging leaders citing Google as their machine learning vendor dropped from 13 percent in 2017 to 3 percent in this latest survey. Interestingly, the number of respondents reporting the use of homegrown machine learning solutions increased to 14 percent from 9 percent in 2017.


*Findings were compiled from Reaction Data’s Research Cloud. For additional information, please contact Erik Westerlind at


Drexel University Moves Forward on Leveraging NLP to Improve Clinical and Research Processes

January 8, 2019
by Mark Hagland, Editor-in-Chief
At Drexel University, Walter Niemczura is helping to lead an ongoing initiative to improve research processes and clinical outcomes through the leveraging of NLP technology

Increasingly, the leaders of patient care organizations are using natural language processing (NLP) technologies to leverage unstructured data, in order to improve patient outcomes and reduce costs. Healthcare IT and clinician leaders are still relatively early in the long journey towards full and robust success in this area; but they are moving forward in healthcare organizations nationwide.

One area in which learnings are accelerating is in medical research—both basic and applied. Numerous medical colleges are moving forward in this area, with strong results. Drexel University in Philadelphia is among that group. There, Walter Niemczura, director of application development, has been helping to lead an initiative that is supporting research and patient care efforts, at the Drexel University College of Medicine, one of the nation’s oldest medical colleges (it was founded in 1848), and across the university. Niemczura and his colleagues have been partnering with the Cambridge, England-based Linguamatics, in order to engage in text mining that can support improved research and patient care delivery.

Recently, Niemczura spoke with Healthcare Informatics Editor-in-Chief Mark Hagland, regarding his team’s current efforts and activities in that area. Below are excerpts from that interview.

Is your initiative moving forward primarily on the clinical side or the research side, at your organization?

We’re making advances that are being utilized across the organization. The College of Medicine used to be a wholly owned subsidiary of Drexel University. About four years ago, we merged with the university, and two years ago we lost our CIO to the College of Medicine. And now the IT group reports to the CIO of the whole university. I had started here 12 years ago, in the College of Medicine.


And some of the applications of this technology are clinical and some are non-clinical, correct?

Yes, that’s correct. Our data repository is used for clinical and non-clinical research. Clinical: College of Medicine, College of Nursing, School of Public Health. And we’re working with the School of Biomedical Engineering. And College of Arts and Sciences, mostly with the Psychology Department. But we’re using Linguamatics only on the clinical side, with our ambulatory care practices.

Overall, what are you doing?

If you look at our EHR [electronic health record], there are discrete fields that might have diagnosis codes, procedure codes and the like. Let’s break away from that. Let’s say our HIV Clinic, they might put down HIV as a diagnosis, but in the notes, might mention hepatitis B, but they’re not putting that down as a co-diagnosis; it’s up to the provider how they document. So here’s a good example: HIV and hepatitis C have frequent comorbidity. So our organization asked a group of residents to go in and look at 5,700 patient charts, with patients with HIV and hepatitis C. Anybody in IT could say, we have 677 patients with both. But doctors know there’s more to the story. So it turns out another 443 had HIV in the code and hep C mentioned in the notes. Another 14 had hep C in the code, and HIV in the notes.

So using Linguamatics, it’s not 5,700 charts that you need to look at, but 1,150. By using Linguamatics, we narrowed it down to 1,150 patients—those who had both codes. But then we found roughly 460 who had the comorbidity mentioned partly in the notes. Before Linguamatics, all residents had to look at all 5,700 charts, in cases like this one.
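
The figures Niemczura cites can be checked with a few lines of arithmetic. The short Python sketch below uses only the numbers quoted in the interview; the variable names are made up for illustration. The exact sums (457 and 1,134) are consistent with the rounded figures he gives ("roughly 460" and "1,150").

```python
# Numbers quoted in the interview; variable names are illustrative only.
total_charts = 5700
both_coded = 677                # HIV and hep C both in the discrete code fields
hiv_coded_hepc_in_notes = 443   # HIV coded, hep C found only in the notes
hepc_coded_hiv_in_notes = 14    # hep C coded, HIV found only in the notes

found_via_notes = hiv_coded_hepc_in_notes + hepc_coded_hiv_in_notes
charts_to_review = both_coded + found_via_notes
charts_skipped = total_charts - charts_to_review
share_skipped = charts_skipped / total_charts

print(found_via_notes)          # 457
print(charts_to_review)         # 1134
print(charts_skipped)           # 4566
print(f"{share_skipped:.1%}")   # 80.1%
```

On these numbers, about four out of five charts never need to be opened, which is the time saving described in the exchange that follows.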

So this was a huge time-saver?

Yes, it absolutely was a huge time-saver. When you’re looking at hundreds of thousands or millions of patient records, the value might be not the ones you have to look at, but the ones you don’t have to look at. And we’re looking at operationalizing this into day-to-day operations. While we’re billing, we can pull files from that day and say, here’s a common co-morbidity—HIV and hep C, with hep C mentioned in those notes—and is there a missed opportunity to get the discrete fields correct?

Essentially, then, you’re making things far more accurate in a far more efficient way?

Yes, this involves looking at patient trials on the research side, while on the clinical side, we can have better quality of care, and more updated billing, based on more accurate data management.

When did this initiative begin?

Well, we’ve been working with Linguamatics for six or seven years. Initially, our work was around discrete fields. The other type of note we look at has to do with text. We had our rheumatology department, and they wanted to find out which patients had had particular tests done—they’re looking for terms in notes… When a radiologist does a report on your x-ray, it’s not like a test for diabetes, where a blood sugar number comes out; x-rays are read and interpreted. The radiologists gave us key words to search for, sclerosis, erosions, bone edema. There are about 30 words. They’re looking for patients who have particular x-rays or MRIs done, so that instead of looking for everyone who had these x-rays done, roughly 400 had these terms. We reduced the number who were undergoing particular tests. The rheumatology department was looking for patients for patient recruitment who had x-rays done, and had these kinds of findings.

So the rheumatology people needed to identify certain types of patients, and you needed to help them do that?

Yes, that’s correct. Now, you might say, we could do word search in Microsoft Word; but the word “erosion” by itself might not help. You have to structure your query to be more accurate, and exclude certain appearances of words. And Linguamatics is very good at that. I use their ontology, and it helps us understand the appearance of words within structure. I used to be in telecommunications. When all the voice-over IP came along, there was confusion. You hear “buy this stock,” when the message was, “don’t buy this stock.”
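
Why a plain word search falls short can be shown with a toy Python example. This is not how Linguamatics works and does not use its ontology; it is a deliberately simple sketch, with a hypothetical function name and a hand-picked list of negation cues, that only counts a term when no negation cue comes before it in the same sentence. The search terms are the three named in the interview.

```python
import re

# Hand-picked cues for illustration; a real system would use far richer rules.
NEGATION_CUES = (" no ", " not ", " without ", " negative for ", " denies ")

def mentions_finding(note, terms):
    """Return the terms that appear in some sentence with no negation cue before them."""
    hits = set()
    for sentence in re.split(r"[.;\n]+", note.lower()):
        for term in terms:
            pos = sentence.find(term)
            if pos == -1:
                continue
            before = " " + sentence[:pos]  # pad so a cue at sentence start matches
            if not any(cue in before for cue in NEGATION_CUES):
                hits.add(term)
    return sorted(hits)

terms = ["erosion", "sclerosis", "bone edema"]
note_a = "Marginal erosions at the second MCP joint. No bone edema."
note_b = "No erosions or sclerosis seen."

print(mentions_finding(note_a, terms))  # ['erosion']
print(mentions_finding(note_b, terms))  # []
```

A plain search for "erosion" would flag both notes. Even this sketch misses negations that come after the term ("erosions are not seen"), which hints at why a structured query tool is needed for the job.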

So this makes identifying certain elements in text far more efficient, then, correct?

Yes—the big buzzword is unstructured data.

Have there been any particular challenges in doing this work?

One is that this involves an iterative process. For someone in IT, we’re used to writing queries and getting them right the first time. This is a different mindset. You start out with one query and want to get results back. You find ways to mature your query; at each pass, you get better and better at it; it’s an iterative process.

What have your biggest learnings been in all this, so far?

There’s so much promise—there’s a lot of data in the notes. And I use it now for all my preparatory research. And Drexel is part of a consortium here called Partnership In Educational Research—PIER.

What would you say to CIOs, CMIOs, CTOs, and other healthcare IT leaders, about this work?

My recommendation would be to dedicate resources to this effort. We use this not only for queries, but to interface with other systems. And we’re writing applications around this. You can get a data set out and start putting it into your work process. It shouldn’t be considered an ad hoc effort by some of your current people.


