Healthcare Informatics Magazine

For Lung Cancer Diagnosis, Machine Learning can Prove More Accurate than Pathologists, Research Finds

August 16, 2016
by Rajiv Leventhal

Computers can prove more accurate than pathologists at assessing slides of lung cancer tissue, according to a new study by researchers at the Stanford University School of Medicine.

The researchers found that a machine learning approach to identifying critical disease-related features accurately differentiated between two types of lung cancers and predicted patient survival times better than the standard approach of pathologists classifying tumors by grade and stage.

“Pathology as it is practiced now is very subjective,” said Michael Snyder, Ph.D., professor and chair of genetics. “Two highly skilled pathologists assessing the same slide will agree only about 60 percent of the time. This approach replaces this subjectivity with sophisticated, quantitative measurements that we feel are likely to improve patient outcomes.”

The research was published Aug. 16 in Nature Communications. Snyder, who directs the Stanford Center for Genomics and Personalized Medicine, shares senior authorship of the study with Daniel Rubin, M.D., assistant professor of radiology and of medicine. Graduate student Kun-Hsing Yu, M.D., is the lead author of the study.

As explained by the Stanford Medicine News Center, for decades, pathologists have assessed the severity, or “grade,” of cancer by using a light microscope to examine thin cross-sections of tumor tissue mounted on glass slides. The more abnormal the tumor tissue appeared—in terms of cell size and shape, among other indicators—the higher the grade. A stage is also assigned based on whether and where the cancer has spread throughout the body.

Often a cancer’s grade and stage can be used to predict how the patient will fare. They also can help clinicians decide how, and how aggressively, to treat the disease. This classification system doesn’t always work well for lung cancer, however. In particular, the lung cancer subtypes of adenocarcinoma and squamous cell carcinoma can be difficult to tell apart when examining tissue culture slides. Furthermore, the stage and grade of a patient’s cancer don’t always correlate with their prognosis, which can vary widely. Fifty percent of stage-1 adenocarcinoma patients, for example, die within five years of their diagnosis, while about 15 percent survive more than 10 years.

To address this, the researchers used 2,186 images, obtained from patients with either adenocarcinoma or squamous cell carcinoma, from a national database called the Cancer Genome Atlas. The database also contained information about the grade and stage assigned to each cancer and how long each patient lived after diagnosis.

The researchers then used the images to “train” a computer software program to identify many more cancer-specific characteristics than can be detected by the human eye—nearly 10,000 individual traits, versus the several hundred usually assessed by pathologists. These characteristics included not just cell size and shape, but also the shape and texture of the cells’ nuclei and the spatial relations among neighboring tumor cells.

“We began the study without any preconceived ideas, and we let the software determine which characteristics are important,” said Snyder. “In hindsight, everything makes sense. And the computers can assess even tiny differences across thousands of samples many times more accurately and rapidly than a human.”

The researchers homed in on a subset of cellular characteristics identified by the software that could best be used to differentiate tumor cells from the surrounding noncancerous tissue, identify the cancer subtype, and predict how long each patient would survive after diagnosis. They then validated the ability of the software to accurately distinguish short-term survivors from those who lived significantly longer on another dataset of 294 lung cancer patients from the Stanford Tissue Microarray Database.
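For readers who want a concrete picture of that workflow, the general pattern (measure many features, keep the most discriminating subset, then check the result on a separate cohort) can be sketched in a few lines of Python. This is only an illustrative sketch and not the Stanford team's software: the data are synthetic random numbers, the feature indices in `INFORMATIVE` are invented, and the mean-gap ranking and nearest-centroid classifier are simple stand-ins chosen for brevity, since the article does not specify which algorithms the study used.

```python
import random
import statistics


def make_cohort(n_patients, n_features, informative, rng):
    """Build a synthetic cohort; labels 0 and 1 stand in for two subtypes."""
    rows, labels = [], []
    for _ in range(n_patients):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # only these features carry any signal
        rows.append(row)
        labels.append(label)
    return rows, labels


def class_means(rows, labels, j):
    """Mean of feature j within each of the two classes."""
    m0 = statistics.fmean(r[j] for r, y in zip(rows, labels) if y == 0)
    m1 = statistics.fmean(r[j] for r, y in zip(rows, labels) if y == 1)
    return m0, m1


def select_features(rows, labels, k):
    """Rank features by the gap between class means and keep the top k."""
    gaps = []
    for j in range(len(rows[0])):
        m0, m1 = class_means(rows, labels, j)
        gaps.append((abs(m1 - m0), j))
    return sorted(j for _, j in sorted(gaps, reverse=True)[:k])


def fit_centroids(rows, labels, selected):
    """Store per-class means for the selected features only."""
    return {j: class_means(rows, labels, j) for j in selected}


def predict(row, centroids):
    """Assign the class whose centroid is closer (squared distance)."""
    d0 = sum((row[j] - m0) ** 2 for j, (m0, _) in centroids.items())
    d1 = sum((row[j] - m1) ** 2 for j, (_, m1) in centroids.items())
    return 0 if d0 <= d1 else 1


def accuracy(rows, labels, centroids):
    hits = sum(predict(r, centroids) == y for r, y in zip(rows, labels))
    return hits / len(rows)


rng = random.Random(0)
INFORMATIVE = [3, 17, 42]  # hypothetical signal-carrying features

# Training cohort and a separate validation cohort (both synthetic).
train_x, train_y = make_cohort(300, 200, INFORMATIVE, rng)
valid_x, valid_y = make_cohort(100, 200, INFORMATIVE, rng)

# Feature selection and fitting use the training cohort only.
selected = select_features(train_x, train_y, k=3)
model = fit_centroids(train_x, train_y, selected)

# The validation cohort is touched once, at the very end.
validation_accuracy = accuracy(valid_x, valid_y, model)
print(selected, round(validation_accuracy, 2))
```

The point the sketch tries to make is the separation of cohorts: features are chosen and the model is fitted on one set of patients, and accuracy is reported on another, which mirrors the role the article describes for the 294-patient Stanford Tissue Microarray Database dataset.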

Snyder anticipates that the machine-learning system described in this study will be able to complement the emerging fields of cancer genomics, transcriptomics and proteomics. Cancer researchers in these fields study the DNA mutations and the gene and protein expression patterns that lead to disease.

Although the current study focused on lung cancer, the researchers believe that a similar approach could be used for many other types of cancer. “Ultimately this technique will give us insight into the molecular mechanisms of cancer by connecting important pathological features with outcome data,” said Snyder.
