Data, Data, Everywhere: What’s the Problem?

Health data is exploding and threatens to overwhelm our ability to store, analyze, and use it. Still, major challenges remain.

Doctors and patients want actionable information at the point of care that assists them with shared decision making.

Healthcare systems want data that lets them manage populations better by identifying those at risk for a specific outcome and the risk factors that can be changed.

In addition, all that data is a treasure trove for researchers interested in learning how to make care better, more personalized, and less wasteful.

The challenges of using EMR data for research are several:

1. Too much data: There is more potential data available than a human can absorb and understand — volume and variety create challenges for extracting knowledge from data.

2. Lack of semantic standards: "Serum Na" and "sodium, serum" may or may not refer to the same test. What does each label mean?

3. Lack of syntactic standards: Data arrives as SQL, XML, HL7, and text. Is it complete and accurate?

4. Limited longitudinal data or fragmented data: A person’s data is not linked across healthcare systems, and the available data often covers limited and incomplete time periods.

5. Inconsistent use of structured data fields, in ways that may not be random. Clinicians may fall back on narrative text, dictation, or incomplete documentation when they are rushed or caring for a patient who warrants additional time and attention. When data is missing not at random, secondary use of the data is limited, because the reason the data is missing may bias any inferences drawn from it.

6. Inconsistent use of natural language processing methods to extract narrative and text data and convert it to structured data.

7. Unrecognized duplicate or redundant data.

8. Many important data types, such as imaging data or echocardiography results, are stored only as text documents, some only as PDFs, which limits the ability to use discrete data, such as a measure of ejection fraction, for research or clinical decision support.

9. Whose data is it and how do I access it? Who has the decision rights to change or modify the record?

10. Who will fill the gap between the data and the doctor, given doctors' overloaded schedules and lack of training? Are more coaches the answer?
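
To make two of the challenges above concrete, here is a minimal Python sketch, not a production method. It assumes a small hand-built synonym table for lab names (challenge 2) and a simple regular expression for pulling an ejection fraction out of report text (challenge 8). The names LAB_SYNONYMS, normalize_lab_name, EF_PATTERN, and extract_ejection_fraction, and the concept key "sodium_serum", are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical synonym table: local lab labels mapped to one made-up concept key.
# A real project would map to a standard terminology rather than these keys.
LAB_SYNONYMS = {
    "serum na": "sodium_serum",
    "sodium, serum": "sodium_serum",
    "na, serum": "sodium_serum",
}


def normalize_lab_name(label):
    """Return the canonical concept for a local lab label, or None if unmapped."""
    key = " ".join(label.lower().split())  # lowercase and collapse whitespace
    return LAB_SYNONYMS.get(key)


# Illustrative pattern only: matches phrases such as "EF 55%",
# "ejection fraction of 40 %", or a range like "LVEF: 35-40%".
EF_PATTERN = re.compile(
    r"\b(?:LV)?(?:EF|ejection\s+fraction)\b"     # the term itself
    r"[^\d%]{0,20}?"                             # short filler such as ": " or " of "
    r"(\d{1,2})(?:\s*(?:-|to)\s*(\d{1,2}))?"     # a value, or a low-high range
    r"\s*%",
    re.IGNORECASE,
)


def extract_ejection_fraction(report_text):
    """Return the ejection fraction in percent (range midpoint), or None."""
    match = EF_PATTERN.search(report_text)
    if match is None:
        return None
    low = float(match.group(1))
    high = float(match.group(2)) if match.group(2) else low
    return (low + high) / 2
```

With this sketch, normalize_lab_name("Serum Na") and normalize_lab_name("sodium, serum") both return the same key, and extract_ejection_fraction("LVEF: 35-40%") returns the midpoint of the range. Real reports vary far more than one pattern can capture, which is why consistent extraction methods (challenge 6) matter.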

Here are some solutions.

The task at hand is to organize patient-academic-industry partnerships to create solutions for trusted digital ecosystems that support and facilitate the use of de-identified and secure data to generate knowledge that will improve the delivery of health care value and health care outcomes for all stakeholders.