While much has been made of the potential for diagnostic software to make accurate clinical diagnoses, a head-to-head comparison with human doctors found that physicians made the correct diagnosis more than twice as often.
According to a research letter published in JAMA Internal Medicine, a research team from Harvard Medical School, Brigham and Women’s Hospital and The Human Diagnosis Project conducted a head-to-head comparison of physicians with the symptom-checker apps and websites that patients use for self-diagnosis. The research builds on a previous evaluation that tested the diagnostic accuracy of 23 symptom checkers on 45 clinical vignettes. For this study, the research team used the digital platform Human Dx to measure how physicians performed on those same 45 vignettes and compared the results with those of the symptom checkers.
The research letter notes that the Institute of Medicine recently highlighted that physician diagnostic error is common and that information technology may be part of the solution. “Given the advancements in computer science, computers may be able to independently make accurate clinical diagnoses. While studies have compared computer versus physician performance for reading electrocardiograms, the diagnostic accuracy of computers versus physicians remains unknown. To fill this gap in knowledge, we compared the diagnostic accuracy of physicians with computer algorithms called symptom checkers,” the research authors wrote.
In the study, 234 physicians were asked to evaluate 45 clinical cases involving both common and uncommon conditions with varying degrees of severity. For each scenario, physicians had to identify the most likely diagnosis along with two additional possible diagnoses. Each clinical vignette was solved by at least 20 physicians. Of the 234 physicians who solved at least one vignette, 90 percent were trained in internal medicine and 52 percent were fellows or residents.
Given that physicians provided free text responses, two physicians hand-reviewed the submitted diagnoses and independently decided whether the participant listed the correct diagnosis first or in the top three diagnoses.
The researchers reported that physicians listed the correct diagnosis first 72 percent of the time, while the online tools listed the correct diagnosis just 34 percent of the time. Physicians outperformed the symptom-checker apps and websites by a margin of more than 2 to 1.
Both the physicians and the computer programs were able to include more than one ailment in their differential diagnosis, so the researchers also compared how often the correct diagnosis was among the top three responses. Physicians included the correct diagnosis among their top three possibilities 84 percent of the time, while the digital symptom checkers did so only 51 percent of the time, the researchers reported.
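The size of the gap on each measure can be checked with a little arithmetic. The short Python sketch below is an illustration only and is not part of the study: it uses the four percentages reported above, and the `compare` helper and the dictionary layout are names made up for this example.

```python
# Accuracy figures as reported in the article (percent of cases).
# The structure and names here are illustrative assumptions, not study code.
reported = {
    "correct diagnosis listed first": {"physicians": 72, "symptom checkers": 34},
    "correct diagnosis in top three": {"physicians": 84, "symptom checkers": 51},
}


def compare(physicians: float, checkers: float) -> tuple[float, float]:
    """Return (ratio, percentage-point gap) between two accuracy rates."""
    return physicians / checkers, physicians - checkers


for measure, rates in reported.items():
    ratio, gap = compare(rates["physicians"], rates["symptom checkers"])
    print(f"{measure}: {ratio:.2f} times as often, {gap} points higher")
```

On the first-listed diagnosis the ratio works out to roughly 2.1, which matches the "more than 2 to 1" margin. On the top-three measure the ratio is closer to 1.6, so the gap narrows when the tools are credited for any of their top three suggestions.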
The difference between physician and computer performance was most dramatic in more severe and less common conditions. It was smaller for less acute and more common illnesses.
"While the computer programs were clearly inferior to physicians in terms of diagnostic accuracy, it will be critical to study future generations of computer programs that may be more accurate," said senior investigator Ateev Mehrotra, an associate professor of health care policy at Harvard Medical School.
Despite outperforming the machines, physicians still made errors in about 15 percent of cases. Researchers say developing computer-based algorithms to be used in conjunction with human decision-making may help further reduce diagnostic errors.
"Clinical diagnosis is currently as much art as it is science, but there is great promise for technology to help augment clinical diagnoses," Mehrotra said. "That is the true value proposition of these tools."