FDA’s new AI tool “Elsa” faces accuracy concerns despite commissioner’s high hopes


Key Takeaways

  • Elsa, the FDA's AI assistant, faces criticism for producing unreliable and fabricated information, despite public endorsements from leadership.
  • Internal reports describe Elsa as "clunky" and prone to "hallucinations," contrasting with the agency's positive public messaging.

FDA's AI tool, Elsa, faces scrutiny for producing unreliable data, raising concerns about its rapid implementation and impact on regulatory accuracy.

FDA's AI tool showing flaws early on: ©Tada Images - stock.adobe.com

The U.S. Food and Drug Administration’s sweeping rollout of its internal AI assistant, Elsa, has triggered concern among employees and outside experts, despite enthusiastic public endorsements from Commissioner Marty Makary. While the tool is being hailed by FDA leadership as a major step toward faster, more efficient regulatory reviews, internal reports suggest Elsa may be producing unreliable and at times fabricated information, raising questions about whether the agency is moving too fast into the AI era.

CNN reported that the AI fabricates studies and misrepresents research, leaving it useful only for basic tasks such as summarizing meetings or drafting email templates. This isn't the first report of issues with Elsa. In a detailed report from STAT News, FDA reviewers described Elsa as “clunky,” with one staff member noting the tool was capable of producing “hallucinated” studies and regulatory citations. “It confidently hallucinates,” another reviewer told BioSpace. “Anything you don’t have time to double-check is unreliable.” The internal skepticism contrasts sharply with the public messaging from FDA leadership, who have promoted Elsa as a productivity breakthrough.

Commissioner Makary announced Elsa’s launch on June 2, 2025, praising it as the beginning of a transformation at the agency. Powered by Anthropic’s Claude model and deployed in a secure GovCloud, Elsa was introduced to streamline administrative tasks such as summarizing adverse event reports, comparing labels, reviewing clinical trial protocols, and even writing code to help manage internal databases. “What took [a reviewer] two to three days now takes six minutes,” Makary said at the time.

Makary’s broader vision includes deploying AI across all FDA centers—a mandate issued in early May—with full implementation by June 30. In interviews and public appearances, he emphasized the agency’s need to modernize and described Elsa as a tool to eliminate inefficiencies in the FDA’s traditional, paper-based review process. “This is the dawn of the AI era at the FDA,” he declared.

But as Elsa rolled out agency-wide, its actual performance fell short of expectations for some users. According to Food & Wine, some outputs included incorrect safety data and irrelevant labeling guidance, requiring manual correction. One reviewer said that instead of saving time, Elsa often required rework to fix confidently incorrect answers — errors the tool presented without any indication of uncertainty.

External observers have also begun to raise concerns. In a client alert, the law firm Hogan Lovells questioned what kind of oversight governs the tool’s outputs, whether benchmarks exist for evaluating its performance, and how “human-in-the-loop” processes are being enforced. “It’s not clear how the agency defines success for Elsa, and whether guardrails are in place to prevent AI-generated errors from influencing regulatory decisions,” the firm wrote.

Makary’s ambitions for AI go far beyond administrative tools. In a June opinion piece in JAMA, Makary and co-author Vinay Prasad, MD, outlined a vision for “rapid or instant reviews” of drug applications, faster food safety alerts, and the rebuilding of public trust through transparency and innovation. Makary compared the goals of Elsa and other agency AI tools to the pace of Operation Warp Speed, which helped deliver COVID-19 vaccines in record time.

The agency has also expanded its regulatory oversight of AI-enabled medical devices. By mid-2024, the FDA had authorized over 1,000 such tools. Makary, writing in a policy update through Debevoise & Plimpton, said the agency is “rethinking our approach to AI, balancing safety and accuracy while fueling innovation.”

Still, critics warn that those goals may be outpacing the agency’s readiness. Elsa’s outputs — however promising in concept — must still meet the same high scientific and regulatory standards as the human reviews they are meant to accelerate.

Makary has dismissed suggestions that the FDA is moving too quickly, pointing to the need to modernize and deliver treatments faster to patients. He has also introduced the “Commissioner’s National Priority Review Voucher” program to speed up drug reviews for conditions like pediatric cancer and rare diseases, as detailed in a Wall Street Journal article.

But as Elsa’s flaws become more apparent, the question remains: can the FDA’s technological leap deliver on its promises without sacrificing accuracy, trust, and safety? For now, the agency insists it can. But reviewers and watchdogs will be watching Elsa closely.
