FDA’s new AI tool “Elsa” faces accuracy concerns despite commissioner’s high hopes


Key Takeaways

  • Elsa, the FDA's AI assistant, faces criticism for producing unreliable and fabricated information, despite public endorsements from leadership.
  • Internal reports describe Elsa as "clunky" and prone to "hallucinations," contrasting with the agency's positive public messaging.

FDA's AI tool, Elsa, faces scrutiny for producing unreliable data, raising concerns about its rapid implementation and impact on regulatory accuracy.

FDA's AI tool showing flaws early on: ©Tada Images - stock.adobe.com

The U.S. Food and Drug Administration’s sweeping rollout of its internal AI assistant, Elsa, has triggered concern among employees and outside experts, despite enthusiastic public endorsements from Commissioner Marty Makary. While the tool is being hailed by FDA leadership as a major step toward faster, more efficient regulatory reviews, internal reports suggest Elsa may be producing unreliable and at times fabricated information, raising questions about whether the agency is moving too fast into the AI era.

CNN reported that the AI fabricates studies and misrepresents research, leaving it useful only for basic tasks such as summarizing meetings or drafting email templates. This isn't the first report of issues with Elsa. In a detailed report from STAT News, FDA reviewers described Elsa as “clunky,” with one staff member noting the tool was capable of producing “hallucinated” studies and regulatory citations. “It confidently hallucinates,” another reviewer told BioSpace. “Anything you don’t have time to double-check is unreliable.” The internal skepticism contrasts sharply with the public messaging from FDA leadership, who have promoted Elsa as a productivity breakthrough.

Commissioner Makary announced Elsa’s launch on June 2, 2025, praising it as the beginning of a transformation at the agency. Powered by Anthropic’s Claude model and deployed in a secure GovCloud, Elsa was introduced to streamline administrative tasks such as summarizing adverse event reports, comparing labels, reviewing clinical trial protocols, and even writing code to help manage internal databases. “What took [a reviewer] two to three days now takes six minutes,” Makary said at the time.

Makary’s broader vision includes deploying AI across all FDA centers—a mandate issued in early May—with full implementation by June 30. In interviews and public appearances, he emphasized the agency’s need to modernize and described Elsa as a tool to eliminate inefficiencies in the FDA’s traditional, paper-based review process. “This is the dawn of the AI era at the FDA,” he declared.

But as Elsa rolled out agency-wide, its actual performance fell short of expectations for some users. According to Food & Wine, some outputs included incorrect safety data and irrelevant labeling guidance, requiring manual correction. One reviewer said that instead of saving time, Elsa often required rework to fix confidently incorrect answers — errors the tool presented without any indication of uncertainty.

External observers have also begun to raise concerns. In a client alert, the law firm Hogan Lovells questioned what kind of oversight governs the tool’s outputs, whether benchmarks exist for evaluating its performance, and how “human-in-the-loop” processes are being enforced. “It’s not clear how the agency defines success for Elsa, and whether guardrails are in place to prevent AI-generated errors from influencing regulatory decisions,” the firm wrote.

Makary’s ambitions for AI go far beyond administrative tools. In a June opinion piece in JAMA, Makary and co-author Vinay Prasad, MD, outlined a vision for “rapid or instant reviews” of drug applications, faster food safety alerts, and the rebuilding of public trust through transparency and innovation. Makary compared the goals of Elsa and other agency AI tools to the pace of Operation Warp Speed, which helped deliver COVID-19 vaccines in record time.

The agency has also expanded its regulatory oversight of AI-enabled medical devices. By mid-2024, the FDA had authorized over 1,000 such tools. Makary, writing in a policy update through Debevoise & Plimpton, said the agency is “rethinking our approach to AI, balancing safety and accuracy while fueling innovation.”

Still, critics warn that those goals may be outpacing the agency’s readiness. Elsa’s outputs — however promising in concept — must still meet the same high scientific and regulatory standards as the human reviews they are meant to accelerate.

Makary has dismissed suggestions that the FDA is moving too quickly, pointing to the need to modernize and deliver treatments faster to patients. He has also introduced the “Commissioner’s National Priority Review Voucher” program to speed up drug reviews for conditions like pediatric cancer and rare diseases, as detailed in a Wall Street Journal article.

But as Elsa’s flaws become more apparent, the question remains: can the FDA’s technological leap deliver on its promises without sacrificing accuracy, trust, and safety? For now, the agency insists it can. But reviewers and watchdogs will be watching Elsa closely.
