The Odyssey: Using Really Big Data to Better Understand Treatment Pathways

June 15, 2016
Combining data on 250 million patients highlights treatment heterogeneity

I have written before about how “big data” studies involving the electronic medical records of thousands of patients over many years are starting to reveal new insights about the impact of drug combinations that may be unattainable through traditional clinical trials.

As part of the concept of a learning health system, observational research on large EHR data sets promises to complement experimental research by providing large, diverse populations that would be infeasible for an experiment. A recent example is a study that combined data on 250 million patients and found heterogeneity in treatment pathways for several conditions.

The Observational Health Data Sciences and Informatics (OHDSI, pronounced Odyssey) collaboration created an international data network with 11 data sources from four countries, including EHR and administrative claims data on 250 million patients. Participants include the Stanford Translational Research Integrated Database Environment (2 million patients) and Columbia University Medical Center (4 million patients). 

I read about the large-scale study on the Scope blog of the Stanford School of Medicine. “These kinds of numbers are going to become increasingly common,” bioinformatician Nigam Shah told the Scope. “Most networks like this are already in the 100-million range,” added Shah, one of the researchers on the project.

Patient privacy was maintained by using a distributed model, with results aggregated centrally, the researchers explain in a paper published in the Proceedings of the National Academy of Sciences.
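To make the idea of a distributed model with central aggregation concrete, here is a minimal Python sketch. It is an illustration only, not the network's actual software: the function names, the two example sites, and the choice to share pathway counts are all assumptions made for this example. Each site tallies its own patients locally, and only the tallies are combined centrally.

```python
from collections import Counter

def local_pathway_counts(site_pathways):
    """Hypothetical per-site step: count patients per treatment pathway.

    Only these aggregate counts would leave the site; the patient-level
    rows stay behind.
    """
    return Counter(tuple(pathway) for pathway in site_pathways)

def aggregate(site_counts):
    """Hypothetical central step: sum the per-site counts."""
    total = Counter()
    for counts in site_counts:
        total.update(counts)
    return total

# Made-up data for two imaginary sites (one list of medications per patient).
site_a = [["metformin"], ["metformin", "glipizide"]]
site_b = [["metformin"], ["glipizide"]]

combined = aggregate([local_pathway_counts(site_a), local_pathway_counts(site_b)])
```

In this sketch the central coordinator never sees an individual patient's record, only how many patients at each site followed each pathway.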

The study, led by George Hripcsak, M.D., M.S., a professor and chair of Columbia University’s Department of Biomedical Informatics, followed treatment pathways for type 2 diabetes mellitus, hypertension, and depression. Although the researchers said there is movement toward more consistent therapy over time across diseases and across locations, they added that “significant heterogeneity remains among sources, pointing to challenges in generalizing clinical trial results.”

Diabetes favored a single first-line medication, metformin, to a much greater extent than hypertension or depression, the study found. About 10 percent of diabetes and depression patients and almost 25 percent of hypertension patients followed a treatment pathway that was unique within the cohort. Aside from factors such as sample size and underlying population (academic medical center vs. general population), EHR data and administrative claims data revealed similar results.

The researchers used OHDSI’s large, diverse population to characterize treatment pathways—defined as the ordered sequence of medications that a patient is prescribed—to provide new insights into clinical practice, with the aim of revealing patterns and variation in treatment among data sources and diseases.
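As a rough illustration of that definition, the Python sketch below turns a patient's prescription records into an ordered sequence of medications. The record layout, the function name, and the rule of collapsing repeat prescriptions of the same drug are assumptions for this example; the study's exact rules are not described here.

```python
from datetime import date

def treatment_pathway(prescriptions):
    """Return medications in the order they were first prescribed.

    `prescriptions` is an iterable of (date, medication) pairs. Repeat
    prescriptions of a drug already in the pathway are ignored (an
    assumption made for this sketch).
    """
    pathway = []
    for _, medication in sorted(prescriptions, key=lambda record: record[0]):
        if medication not in pathway:
            pathway.append(medication)
    return tuple(pathway)

# Made-up records for one imaginary patient, deliberately out of order.
records = [
    (date(2012, 3, 1), "glipizide"),
    (date(2011, 1, 5), "metformin"),
    (date(2011, 6, 1), "metformin"),
]
pathway = treatment_pathway(records)
```

Returning a tuple makes the pathway hashable, so pathways can be counted or compared directly across patients.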

For diabetes, metformin was the most commonly prescribed medication; it was prescribed 75 percent of the time as the first medication and remained the only medication 29 percent of the time, thus confirming general adoption of the first-line recommendation of the American Association of Clinical Endocrinologists diabetes treatment algorithm.

For hypertension, hydrochlorothiazide slightly predominates as a starting medication, but lisinopril predominates more strongly as a sole therapy; hydrochlorothiazide is only rarely a sole therapy (it is frequently paired with another active ingredient in combination medications).

Depression shows a more even spread of medications.

But the researchers stressed that 10 percent of diabetes patients, 24 percent of hypertension patients, and 11 percent of depression patients followed a treatment pathway that was shared with no one else in any of the data sources. So for almost one quarter of hypertension patients, the response to the question, “In an underlying population of 250 million, based on my 3-year treatment pathway, what patients are like me?” would be “No one.”
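The "no one like me" figure can be thought of as the share of patients whose pathway occurs exactly once in the pooled data. The following Python sketch shows that arithmetic on made-up pathways; the function name and the sample data are hypothetical, and this is not the researchers' code.

```python
from collections import Counter

def share_with_unique_pathway(pathways):
    """Fraction of patients whose pathway is followed by nobody else."""
    if not pathways:
        return 0.0
    counts = Counter(pathways)
    unique_patients = sum(1 for pathway in pathways if counts[pathway] == 1)
    return unique_patients / len(pathways)

# Made-up cohort: one pathway (a tuple of medications) per patient.
cohort = [
    ("lisinopril",),
    ("lisinopril",),
    ("hydrochlorothiazide", "lisinopril"),
    ("amlodipine",),
]
unique_share = share_with_unique_pathway(cohort)
```

In this toy cohort, two of the four patients follow a pathway nobody else shares, so the share is 0.5.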

OHDSI’s common data model specifies how to encode and store clinical data at a fine-grained level, ensuring that the same query can be applied consistently to databases around the world. OHDSI has chosen data integration standards that dovetail with those of the U.S. government and the international community, and it also supplies tools and mapping tables for converting data from other standards. More than 50 databases, with a total of 682 million patient records, have been created using the common data model.

The OHDSI researchers say the study successfully addressed patient privacy and research regulatory constraints, adopted a consistent data model, and distributed queries across a broad population. All the involved data sources have adopted a common industry standard for longitudinally recorded visits, diagnoses, procedures, medications, and (where available) laboratory tests, and any combination of the data can be used to answer future questions across medicine.