Algorithm identifies relevant information entered as free-form text
Doctors and health care policymakers have long known the important role that social determinants of health (SDoH) play in determining health outcomes. But getting large quantities of SDoH information is difficult because it is usually entered in electronic health records (EHRs) as free-form text, rather than structured data.
Now, however, a team of scientific investigators has demonstrated that it’s possible to extract unstructured SDoH data from EHRs relatively easily by using natural language processing—a machine-based method for rapidly identifying certain text in a document. They published their findings recently in a study in JAMIA Open.
The researchers developed three natural language processing (NLP) algorithms for identifying three common social risk factors: housing instability, financial insecurity, and unemployment. They applied the algorithms to clinical notes from 9,900 patients at a safety-net hospital and a nonprofit system whose patients are mostly commercially insured, both in the Indianapolis, Indiana area.
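To give a rough sense of what flagging social risk factors in free-form text can look like, here is a minimal Python sketch. It is not the study's method: the keyword patterns, the `PATTERNS` table, and the `flag_social_risks` function are all hypothetical and chosen only for illustration. It shows one pattern set per risk factor, applied to a single clinical note.

```python
import re

# Hypothetical keyword patterns, one per social risk factor.
# These are illustrative assumptions, not the rules used in the study.
PATTERNS = {
    "housing_instability": re.compile(
        r"\b(homeless\w*|evict\w*|shelter)\b", re.IGNORECASE
    ),
    "financial_insecurity": re.compile(
        r"\b(cannot afford|can't afford|unable to pay|financial (strain|hardship))\b",
        re.IGNORECASE,
    ),
    "unemployment": re.compile(
        r"\b(unemployed|laid off|lost (his|her|their) job|out of work)\b",
        re.IGNORECASE,
    ),
}


def flag_social_risks(note: str) -> set[str]:
    """Return the names of the risk factors whose pattern appears in the note."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(note)}


if __name__ == "__main__":
    example = "Patient was evicted last month and is currently unemployed."
    print(sorted(flag_social_risks(example)))
```

A plain keyword match like this ignores negation ("denies homelessness") and context, which is one reason a real system would need more than pattern lists and would have to be checked against human review of the notes.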
After conducting their own review of random batches of notes examined by the algorithms, the study’s authors concluded that the algorithms showed consistent performance in identifying the three social factors within clinical text.
“More health care organizations are having to deal with social determinants because it is factors like financial resources, housing, and employment status that drive costs that make people unhealthy,” Joshua Vest, PhD, a research scientist at the Regenstrief Institute and faculty member of the Indiana University Fairbanks School of Public Health, said in a news release. “The challenge is effectively measuring and identifying patients with social risks so that [the organizations] can intervene.”
While natural language processing has previously been applied to other conditions, “this is one of the first papers that applies it to social determinants of health,” Vest said.
He added that the team’s goal is “to measure social determinants well enough for researchers to develop risk models and for clinicians and health care systems to be able to use these (social determinants) in routine practice to…provide a better understanding of the overall characteristics and needs of the patient population.”