Study shows that some AI tools aren't worth much
A study led by a multi-institutional team of researchers from the University of North Carolina (UNC) School of Medicine, Duke University, Ally Bank, the University of Oxford, Columbia University and the University of Miami illustrates the need for rigorous clinical validation of artificial intelligence (AI) medical devices. Published in Nature Medicine, their research reveals that nearly half of the AI tools authorized by the U.S. Food and Drug Administration (FDA) lack reported clinical validation data, raising concerns about their effectiveness and safety.
Sammy Chouffani El Fassi, an M.D. candidate at the UNC School of Medicine and research scholar at Duke Heart Center, along with Gail E. Henderson, Ph.D., a professor in the UNC Department of Social Medicine, led an analysis examining over 500 AI medical devices approved by the FDA and found that 226 of these devices — approximately 43% — did not have publicly available clinical validation data.
“Although AI device manufacturers often highlight FDA authorization as a mark of credibility, clearance doesn’t necessarily mean that these devices have been thoroughly evaluated for clinical effectiveness using real patient data,” Chouffani El Fassi said in a statement. “Our findings underscore the need for the FDA and the industry to enhance the credibility of AI devices by conducting and publicly sharing clinical validation studies.”
Since 2016, the annual number of AI medical device authorizations by the FDA has increased from two to 69, reflecting the rapid commercialization of AI technologies in health care. Most of these devices are designed to assist physicians in diagnosing abnormalities in radiological imaging, analyzing pathological slides, dosing medications and predicting disease progression.
The challenges of clinical validation
AI technologies rely on complex algorithms trained on vast data sets to perform tasks that traditionally require human expertise. However, ensuring that these technologies can accurately process and analyze new, unseen data is critical.
The researchers found that the latest draft guidance from the FDA, published in September 2023, lacks clarity in distinguishing among different types of clinical validation studies. This ambiguity, they argue, could lead to inconsistencies in the evaluation and approval process.
Of the devices analyzed, 144 were retrospectively validated, 148 were prospectively validated and only 22 underwent validation through randomized controlled trials, the gold standard in clinical research. Notably, some devices used “phantom images” — computer-generated rather than real patient data — which do not meet the criteria for clinical validation.
In response to these findings, the researchers have called for clearer standards in clinical validation. They emphasize the need for the FDA to differentiate among retrospective studies, prospective studies, and randomized controlled trials, given the varying levels of scientific evidence these methods provide.
“We’ve shared our findings with FDA directors overseeing medical device regulation, and we hope our work will inform their regulatory decisions,” Chouffani El Fassi said. “We also aim to inspire researchers and institutions worldwide to conduct more rigorous clinical validation studies to ensure the safety and effectiveness of AI in health care.”
Todd Shryock is managing editor of Medical Economics.