June 17, 2026

Protected health information and the AI gold rush putting it at risk

Fact checked by: Todd Shryock, AC Baltz

Key Takeaways

A January breach of a telehealth clinician network exposed identifiers and medical data for 716,000 people, highlighting PHI’s value to attackers and reputational/regulatory downside.
HIPAA and BAAs do not permit unrestricted reuse of a provider’s patient data for AI training, nor downstream exposure to other vendors or customers.
Workflow AI wrapped around entrenched channels such as fax can speed lab/radiology result ingestion into EHRs, but must incorporate health care-specific security, auditing, and access controls.
PHI-minimizing model development relies on redacting all 18 HIPAA identifiers, training on anonymized corpora, and using synthetic-data factories to generate statistically robust documents for tuning.
Customer-by-customer segmentation with dedicated models and zero cross-tenant data crossover, combined with explicit mapping of data sources and destinations, is foundational to responsible AI governance.

Vibe-coded AI implementations can present serious risks to health data privacy, protection and, ultimately, to health care function.

The New York Times recently published an article about the entrepreneur behind Medvi, an artificial intelligence (AI)-powered telehealth provider of weight-loss drugs, and the two-man startup’s projected $1.8 billion in sales in 2026. A few days later, AI pioneer Gary Marcus took issue with what the story left out, listing a string of Medvi criticisms he felt were inadequately addressed in the New York Times piece.

The story touches on a bunch of ethical and philosophical arguments around AI use that are worth discussing, but one specific point jumped out at me: A Medvi critic noted that its clinician network experienced a data breach in January 2026 that exposed a huge number of patient records.

Reporting from The HIPAA Journal confirms that the breach took place, that it affected 716,000 people, and that the exfiltrated files included information such as names, addresses, email addresses, dates of birth and medical information. This is a troubling incident, and it raises the question of how we handle protected health information in the age of AI.

As a provider of AI-powered technology solutions for the health care industry, I am very interested and invested in driving innovation with new technology. But protected health information (PHI) has been an enormous cybercrime target since the dawn of digitalization. I worry that “vibe-coded” AI implementations can present serious risks to health data privacy, protection and, ultimately, to health care function.

The current AI rush is really no different than any historical gold rush, and there are tons of entrepreneurs just like the Medvi founder who are using the tools to stake their claim. But working with AI in health care demands understanding the nuances of HIPAA, how AI models and PHI need to be treated in that context, and what’s at stake if you take shortcuts.

Although there is no such thing as 100% cybersecurity for any data, including PHI, the truth is that there are solid standards and practices that reduce risk and defend against misuse and unauthorized access — and those standards and practices should be applied to AI function in health care at a foundational level.

For example, there are a lot of health care technology companies focused on using AI to untangle communications and unstructured information exchange via existing and entrenched channels, such as fax (yes, fax!). The basic function of such capabilities involves wrapping agentic AI around a legacy health channel to speed the flow of, say, lab or radiology results from a clinic into an electronic health records system at a hospital without staff on both ends having to handle a bunch of manual administration throughout the process. Thus, the truly amazing capabilities of AI tackle bothersome and time-consuming bottlenecks in health care workflows. But because these tools involve PHI exchange, they require additional safeguards to operate responsibly and in compliance, and those safeguards should be specifically designed for health care settings.

Like all AI models, agentic AI requires data to perform effectively. To process lab results from input to output, for example, a model has to be trained on what lab results actually look like and what they indicate. But lab results are PHI. And even if a tech company has a business associate agreement (BAA) with a particular health provider, it cannot just feed that provider’s patient PHI into its AI product and use it however it wants, or potentially expose that sensitive data to some other entity it may be doing business with.

The solution is to design health care-specific AI models and prompts that protect PHI by not using any PHI at all for model training or testing. A standard classification model can be trained on anonymized data scrubbed of all 18 HIPAA identifiers, and additional training samples can be generated in a synthetic data factory. Health care providers can also be supplied with their own custom AI models that have absolutely no data crossover with any of the tech company’s other customers. The health care provider supplies samples of anonymized and redacted documents they typically receive or expect to receive, for example, and the tech company builds a model specific to their needs for recognizing and processing the type of information contained in those documents. The synthetic data factory uses these examples to generate a large, statistically relevant library of synthetic health care documents to train and hone the accuracy of the AI services.
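To make the redact-then-synthesize idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any vendor's actual pipeline: the `PATTERNS` table, the `redact` and `synthesize` functions, and the sample document are all hypothetical, and the patterns cover only a handful of the 18 HIPAA identifier categories. Regex matching alone would not be enough to de-identify real records.

```python
import random
import re

# Hypothetical patterns for a FEW identifier types only. A real pipeline
# would need to handle all 18 HIPAA identifier categories and be reviewed.
PATTERNS = {
    "NAME": re.compile(r"(?<=Patient: )[A-Z][a-z]+ [A-Z][a-z]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}

# Obviously fake replacement values used to fill redacted templates.
FAKE_VALUES = {
    "NAME": ["Alex Sample", "Riley Placeholder"],
    "SSN": ["000-00-0000"],
    "PHONE": ["555-010-0001", "555-010-0002"],
    "EMAIL": ["patient1@example.invalid", "patient2@example.invalid"],
    "DATE": ["01/01/1970", "02/02/1985"],
}

# A fictional document; no real patient data.
SAMPLE = (
    "Patient: Jane Doe\n"
    "DOB: 04/12/1980\n"
    "Phone: 555-867-5309\n"
    "Email: jane.doe@example.com\n"
    "Potassium: 4.1 mmol/L"
)


def redact(text: str) -> str:
    """Replace each matched identifier with a bracketed label such as [NAME]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def synthesize(template: str, n: int, seed: int = 0) -> list[str]:
    """Generate n synthetic documents by filling labels with fake values."""
    rng = random.Random(seed)  # seeded so runs are reproducible

    def fill(match: re.Match) -> str:
        # Leave unknown labels untouched rather than guessing a value.
        choices = FAKE_VALUES.get(match.group(1), [match.group(0)])
        return rng.choice(choices)

    return [re.sub(r"\[([A-Z]+)\]", fill, template) for _ in range(n)]


if __name__ == "__main__":
    template = redact(SAMPLE)
    print(template)
    for doc in synthesize(template, 2):
        print("---")
        print(doc)
```

The point of the design is ordering: identifiers are stripped first, and only the redacted template is ever used to generate training material, so the original values never reach the model.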

So by segmenting customer-by-customer and anonymizing and redacting data in a HIPAA-compliant way from the get-go, the AI can be scaled compliantly and responsibly for health care use. It requires a little extra consideration and effort to achieve the technology’s benefits, but that is the assignment.
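Customer-by-customer segmentation can be sketched as a registry that refuses any request where the caller and the target customer differ. The class and method names below (`ModelRegistry`, `TenantModel`, `CrossTenantAccessError`) are hypothetical and exist only for illustration; a production system would enforce the same rule at the storage, network and identity layers as well.

```python
from dataclasses import dataclass, field


class CrossTenantAccessError(PermissionError):
    """Raised when one customer tries to touch another customer's model."""


@dataclass
class TenantModel:
    tenant_id: str
    # Redacted or synthetic documents only; never raw PHI.
    training_docs: list = field(default_factory=list)


class ModelRegistry:
    """Keeps one dedicated model per customer with no shared state."""

    def __init__(self) -> None:
        self._models: dict[str, TenantModel] = {}

    def register(self, tenant_id: str) -> None:
        self._models[tenant_id] = TenantModel(tenant_id)

    def _check(self, caller: str, target: str) -> None:
        if caller != target:
            raise CrossTenantAccessError(
                f"{caller!r} may not access data belonging to {target!r}"
            )
        if target not in self._models:
            raise KeyError(target)

    def add_training_doc(self, caller: str, target: str, doc: str) -> None:
        self._check(caller, target)
        self._models[target].training_docs.append(doc)

    def get_model(self, caller: str, target: str) -> TenantModel:
        self._check(caller, target)
        return self._models[target]
```

With this shape, a request such as `registry.get_model("clinic_a", "clinic_b")` fails loudly instead of silently returning another customer's model.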

This cuts to the heart of the biggest risk people don’t seem to really understand when they’re evaluating AI use in health care. The very first question has to be “What data are being used and where are the data going?” If you — or any of the tools or services you use — just throw ChatGPT or Claude or Gemini at a health care function, you cannot answer that question. So, you’re not actually protecting PHI or protecting your organization.
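One way to force an answer to that first question is to write the data map down and check it mechanically. The sketch below is a hypothetical example: the `DataFlow` record, the `APPROVED_PHI_DESTINATIONS` set and every system name in it are assumptions made for illustration, not a real inventory or a compliance tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    source: str         # where the data comes from
    destination: str    # where the data ends up
    contains_phi: bool  # whether PHI travels along this path


# Hypothetical allowlist: the only places PHI is permitted to go.
APPROVED_PHI_DESTINATIONS = {"hospital_ehr", "tenant_private_model"}


def unapproved_phi_flows(flows: list[DataFlow]) -> list[DataFlow]:
    """Return every flow that sends PHI somewhere not on the allowlist."""
    return [
        flow
        for flow in flows
        if flow.contains_phi
        and flow.destination not in APPROVED_PHI_DESTINATIONS
    ]


if __name__ == "__main__":
    flows = [
        DataFlow("clinic_fax", "hospital_ehr", contains_phi=True),
        DataFlow("clinic_fax", "general_purpose_chatbot", contains_phi=True),
        DataFlow("synthetic_factory", "shared_model", contains_phi=False),
    ]
    for flow in unapproved_phi_flows(flows):
        print(f"Unapproved PHI flow: {flow.source} -> {flow.destination}")
```

If a tool in the workflow cannot be described as a row in a map like this, that gap is itself the finding.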

These are issues I discuss with my team and my customers often, and they touch on another core consideration in the age of AI. Bots don’t live in the real world, but we do. How we use AI in any health care function will ultimately impact real human lives and real health care outcomes, so it is important to work with other real humans to navigate context, compliance and risks together responsibly. The lonely prospector out in the digital wild, wielding AI to whip up a billion-dollar health care marketing engine, may not be considering what data are being used and where the data are going, but he should be. So should we all.

Denis Whelan is the CEO of Documo, an AI-forward health care SaaS company building workflow and interoperability infrastructure for unstructured health care documents. Denis is focused on building the AI bridge between the document-centric world that health care operates in today and the fully interoperable future ahead. Previously, Denis served as CEO of Projector PSA, a leading developer of cloud-based professional services automation software, where he led transformational growth at the company, resulting in an acquisition by BigTime Software and Vista Equity Partners. Denis holds a BA in business from the Isenberg School of Management at the University of Massachusetts, Amherst.
