PCPs welcome AI as a time-saver, but new research shows that many failed to catch serious mistakes in AI-generated patient portal messages.
Between increased administrative burdens and an influx of patient messages, many primary care physicians (PCPs) are turning to generative artificial intelligence (AI) tools to help draft patient portal messages. A new study suggests those tools can help — but they may also introduce dangerous errors that physicians fail to catch.
The study, published in Digital Medicine, found that most physicians missed critical mistakes in AI-generated message drafts in a simulated exercise. Some of the errors had serious safety implications.
“All but one physician ‘sent’ at least one fictitious response to a patient that contained an error,” the authors wrote.
Researchers from MedStar Health, Georgetown University and the Naval Research Laboratory recruited 20 practicing PCPs in the Baltimore-Washington area. Participants were asked to review and edit 18 AI-generated responses to common patient portal inquiries. Four of the drafts contained significant errors.
These errors included:
Each of the four errors was missed by at least 65% of participants, and between 35% and 45% of participants sent the problematic messages without any edits. Only one physician caught and corrected all four errors.
Despite these oversights, most PCPs responded positively to the AI assistance. According to post-task surveys:
“I found the AI drafts to be helpful in responding to these [patient portal messages],” one survey statement read. Nineteen of the 20 participants agreed.
But the researchers argue that this trust — however well-intentioned — may lead to over-reliance.
“As the occurrence of automation complacency increases with reliability, the fact that most AI-generated drafts did not contain an error or patient safety risk may have resulted in vigilance decrement,” the authors noted.
The study points to several possible reasons why seasoned physicians may have missed obvious red flags:
“These results highlight how physicians, as well as other stakeholders, understandably have a growing appetite for AI-driven technologies that can address workload burden,” the researchers wrote. “[They] may in fact be so overextended that any technology promising a reprieve is eagerly adopted despite the risks.”
More than 100 health systems are already using generative AI to support clinician inboxes, including tools integrated into popular EHR platforms like Epic. The researchers behind this study argue that rollout may be outpacing safety testing.
While the AI drafts didn’t always contain outright hallucinations, some gave incomplete or misleading advice. In one case, a child showed textbook signs of diabetic ketoacidosis (DKA), yet the AI offered a generic response about stomach viruses. Only five of the 20 participating physicians flagged the omission.