A new study from Brazil suggests machine learning may ease referral overload and reduce unnecessary specialist visits — but not without caveats.
A machine learning model from Brazil has the potential to help primary care practices more efficiently manage referrals to specialists. As practices face mounting pressure from workforce shortages and rising administrative burdens, the study’s findings may point to a way to combat delays and improve resource use in referral management.
Published June 3 in JAMA Network Open, the study analyzed over 45,000 primary care referrals and found that an artificial intelligence (AI) model performed comparably to, and in some ways better than, human reviewers at determining which patients truly needed to see a specialist.
“This configuration could reduce gatekeeper workload by prescreening referrals,” the authors wrote. “Expected benefits include reduced gatekeeper workload, fewer incorrect authorizations and unnecessary in-person consultations.”
The study was conducted within Brazil's public health system, which uses a two-tiered referral process: primary care physicians submit referrals, which are then reviewed by regulation centers staffed with trained professionals. Though the setting is Brazilian, the implications are increasingly relevant in the U.S.
While the Brazilian structure differs from U.S. systems, the underlying challenges — delays, variability and specialist overuse — are familiar. A growing number of American physicians operate within value-based care arrangements or integrated delivery systems where managing downstream specialist costs is a priority, and so is timely access.
In the study, researchers used data from a large state-run referral system and trained a model using neural networks and word embeddings to analyze the free-text fields submitted by referring physicians. In the U.S., where referrals are often tracked through electronic health records (EHRs) or prior authorization platforms, a similar model could be integrated with referral workflows to flag low-urgency or incomplete requests.
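To make that approach concrete, here is a minimal, hypothetical sketch of the kind of free-text classifier the study describes: word embeddings over the terms in a referral note feeding a small neural network. It is not the authors' published architecture; the sample notes, labels and layer sizes are illustrative assumptions.

```python
# Minimal sketch (not the study's published model): a word-embedding
# text classifier over free-text referral notes, built with Keras.
# Sample notes, labels and layer sizes are illustrative assumptions.
import tensorflow as tf

texts = [
    "uncontrolled type 2 diabetes, a1c 11.2 despite max-dose metformin",
    "routine follow-up requested, no labs or exam findings attached",
]
labels = [1, 0]  # 1 = appropriate to approve, 0 = return for more information

# Turn raw text into integer token ids, then learn an embedding per token.
vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=20_000, output_mode="int", output_sequence_length=128)
vectorizer.adapt(texts)

model = tf.keras.Sequential([
    vectorizer,                                      # raw text -> token ids
    tf.keras.layers.Embedding(20_000, 64),           # token ids -> word embeddings
    tf.keras.layers.GlobalAveragePooling1D(),        # average embeddings per note
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability referral is appropriate
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
model.fit(tf.constant(texts), tf.constant(labels), epochs=5, verbose=0)
```

In a U.S. setting, a model along these lines could sit behind the EHR referral queue and score each request before a human reviewer ever sees it.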
The AI tool was tested across five common referral specialties (endocrinology, gastroenterology, proctology, rheumatology and urology) and reached an overall accuracy of 71.6%. Its area under the receiver operating characteristic curve (AUC-ROC) was 0.765, meaning that when presented with one appropriate and one inappropriate referral, the model ranked the appropriate one higher roughly 76% of the time. For comparison, 0.5 represents random guessing, while 1.0 indicates perfect classification.
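For readers who want to see how that number is produced, the snippet below computes an AUC-ROC from a handful of made-up model scores; the figures are illustrative and have no connection to the study's data.

```python
# Illustrative only: computing AUC-ROC from model scores, with made-up numbers.
from sklearn.metrics import roc_auc_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]                   # 1 = referral truly appropriate
y_score = [0.9, 0.3, 0.7, 0.4, 0.6, 0.2, 0.8, 0.5]  # model's predicted probability

# Equals the fraction of (appropriate, inappropriate) pairs in which the
# appropriate referral gets the higher score: 0.875 for these toy values.
print(roc_auc_score(y_true, y_score))
```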
Most notably, the model's specificity, its ability to correctly identify referrals that should not have been approved, was 80.1%, significantly higher than the 34.0% achieved by human gatekeepers. That suggests it could help reduce unnecessary specialist visits in health systems struggling with access and overuse.
As with most AI tools, there's a trade-off: the model was more cautious in greenlighting referrals. Its sensitivity, how often it approved referrals that were in fact appropriate, was 54.2%, compared with 89.7% for human reviewers. That means the algorithm was more likely to request additional information before approving a referral, potentially slowing care for some patients.
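As a quick illustration of how those two figures are calculated (with made-up labels, not the study's data), sensitivity and specificity fall straight out of a confusion matrix of approved versus returned referrals:

```python
# Toy example of sensitivity and specificity; labels are made up.
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 1, 0]   # 1 = referral actually appropriate
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # 1 = model approves the referral

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # share of appropriate referrals approved (54.2% in the study)
specificity = tn / (tn + fp)   # share of inappropriate referrals caught (80.1% in the study)
print(sensitivity, specificity)  # 0.75 0.75 for these toy labels
```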
But for overburdened practices, this conservative approach may still be worthwhile if it reduces unnecessary visits and frees clinicians to spend more time on complex cases. In the study's context, the model could have spared human reviewers roughly 22,000 referrals a month.
For U.S. practices using centralized referral systems, accountable care organizations (ACOs) or value-based payment models, the ability to automate part of the triage process — while maintaining clinician oversight — could be a major operational win.
Though promising, the study also underscores challenges familiar to U.S. clinicians: bias in documentation, the need for real-time data integration and the importance of continuous model updating. The algorithm relied on clinical language in physician notes, and like any AI, it could reflect or amplify existing gaps in care documentation.
The authors caution that the model is not a replacement for clinical judgment. Instead, it’s a support tool that may be especially useful in high-volume, high-need environments. A faulty algorithm, they note, can make mistakes at scale — even if it outperforms individual physicians in some areas.
The bottom line? The Brazilian model isn’t plug-and-play for U.S. practices, but it shows how AI could help manage referral bottlenecks and optimize access to specialized care. If adapted thoughtfully, it could become part of the toolkit that helps private practices and health systems strike a better balance between access, efficiency and oversight.