AI shows potential to increase accuracy in clinical care and assist doctors with decision-making
A study from researchers at Mass General Brigham found that ChatGPT was about 72% accurate in overall clinical decision-making, ranging from generating possible diagnoses to reaching a final diagnosis and making care-management decisions.
The AI chatbot performed equally well in both primary care and emergency settings across all medical specialties. The findings were published in the Journal of Medical Internet Research.
The researchers say that, though no benchmark exists, they estimate the chatbot's performance to be equivalent to that of someone who has just graduated from medical school. This suggests that AI could potentially serve as an augmenting tool in the practice of medicine, supporting clinical decision-making with accuracy.
The researchers fed successive portions of 36 standardized clinical vignettes into the chatbot. It was first asked to come up with a set of possible, or differential, diagnoses based on the patient's information, which included age, gender, symptoms, and whether the case was an emergency. The AI was then given additional information and asked to make management decisions and a final diagnosis, simulating a full patient encounter.
The study reports that ChatGPT was about 72% accurate overall and performed best at making a final diagnosis, where it was 77% accurate. Its lowest-performing area was making a differential diagnosis, where it was 60% accurate; for clinical management decisions, the accuracy rate was 68%.
According to the report, the researchers found no gender bias in the chatbot's answers, and its overall performance was consistent across primary and emergency care settings.