
Can artificial intelligence contribute to improved clinical reasoning?
Key Takeaways
- GPT-4 did not significantly improve physicians' diagnostic reasoning compared to conventional resources like UpToDate and Google.
- The study involved 50 U.S.-licensed physicians across family, internal, and emergency medicine.
A recent study compared the diagnostic performance of physicians with access to an AI tool against that of physicians limited to conventional resources.
The study enrolled 50 U.S.-licensed physicians across family, internal, and emergency medicine. The median diagnostic reasoning score per case was 76% for the group with access to the AI tool (GPT-4) and 74% for the group limited to conventional resources. The AI group spent an average of 519 seconds per case, compared with 565 seconds for the conventional resources group.
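For readers wondering how a two-point difference in median scores can be judged "not significant," the minimal sketch below compares two sets of per-case scores with a standard nonparametric test. Both the invented score values and the choice of test are illustrative assumptions, not the study's published data or methodology.

```python
# Illustrative only: hypothetical per-case diagnostic reasoning scores
# (percent correct); these are NOT the study's data.
from scipy.stats import mannwhitneyu

ai_group = [76, 82, 70, 79, 74, 81, 68, 77]            # physicians with GPT-4 access
conventional_group = [74, 71, 78, 69, 75, 72, 76, 70]  # conventional resources only

# Nonparametric comparison of the two score distributions; by convention,
# the difference is called statistically significant only if p < 0.05.
stat, p_value = mannwhitneyu(ai_group, conventional_group, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.3f}")
```

With small samples and overlapping score distributions like these, the p-value typically stays well above 0.05, which is the sense in which a modest median difference can fail to reach significance.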
The researchers concluded that access to GPT-4 did not significantly improve physicians' diagnostic reasoning. Notably, the LLM on its own outperformed both the physicians using conventional diagnostic resources and the physicians assisted by the program. These findings point to the need for further research and training to realize the full potential of physician-AI collaboration in clinical practice. To that end, the four institutions behind the study announced a collaboration on a bi-coastal AI evaluation network, ARiSE, designed to further evaluate generative AI outputs in healthcare.