More Patient Data? Be Careful What You Wish for…AI’s Role in Making Clinical Data Exchange Useful

Abstract

Overview

Any single electronic medical record (EMR) often only contains a fraction of the patient’s total health record. Hence, provider organizations are investing in data exchange. Such EMR interoperability, while having obvious benefits, makes already bloated patient charts that much thicker. Although not a panacea, AI can help aggregate, clean, and prioritize EMR data, thus making the promise of interoperability more attainable than ever. This holds great promise for care continuity.

Continuity of care reduces errors and unnecessary utilization while potentially improving provider productivity and outcomes.¹ The coherent flow of patient data across different providers is a critical component of overall care continuity.² Achieving this means ensuring providers can access all pertinent elements of the patient’s medical history and have the time to review that information. Can artificial intelligence (AI) help accomplish this goal? Should it?

Patient Charts Are Complex, Big, and Getting Bigger…

Electronic medical records (EMRs) contain a lot of patient data. In addition to structured data like problem, medication, and encounter lists and lab values, there are unstructured data like medical notes, image files, and received faxes. Eighty percent of the data in EMRs is unstructured.³ Partly this is because clinical images and videos are massive vis-à-vis parametric and text data. The main culprit, though, is the volume of medical notes. Analysis of 6 years of data stored in the University of Pennsylvania Health System’s EMRs showed that the average patient chart contained 53 notes.⁴ This average masks the true problem. Utilization of health care services, and the corresponding size of the medical record, is concentrated in polychronic, frail patients, whose charts become massive. Such patients’ total medical records commonly exceed 1000 pages. As a provider may see 20 or more patients a day,⁵ reviewing such a tome before each encounter is impossible. Alas, the problem is getting worse. One major academic center found that the length of outpatient progress notes grew 60% between 2009 and 2018.⁶

All These Records, but What is Needed is Often Missing…

Even with these massive charts, data are missing. Given the fragmentation of the US health care system, the totality of a patient’s medical record seldom resides in any single EMR. In 2019, 35% of US Medicare beneficiaries saw 5 or more physicians.⁷ Similarly, over 40% of high utilizers of hospitals visit more than one facility.⁸ Unsurprisingly, 34% of primary care providers reported that they do not “always” or “most of the time” receive useful information from other providers.⁷

Data Interoperability Becomes a National Priority…

Addressing fragmented medical records has been a policy goal for 20 years. In 2006, the newly formed Office of the National Coordinator for Health IT prioritized clinical data exchange.⁹ In 2009, the HITECH Act accelerated the adoption of EMR systems, a prerequisite for care continuity. MACRA (2015) and the 21st Century Cures Act (2016) further drove interoperability.

This effort is bearing fruit. In 2021, 3 out of 4 hospitals participated in at least one form of a health information exchange (HIE) network. Sixty percent of US hospitals can share summary care records.¹⁰ Whereas early research struggled to show a definitive link between HIE use and care improvement, it is now clear that the exchange of clinical data is at least somewhat beneficial.¹¹ This is exciting; however, it is only part of the story.

The Movement of Data is Messy…

Even when health care providers exchange data, the data are often incomplete, hard to use, and difficult to integrate into the receiving provider’s core systems. One reason is that the shared data exist in multiple formats, from structured Consolidated Clinical Document Architecture (C-CDA) documents, such as the continuity of care document, to PDF exports of select pages of the chart. Another is that the data are exchanged in various ways, from APIs to faxes to snail mail. However, the real problem is not technical; it is the heterogeneity of the data itself.

The structured data are the relatively easy part to move. Importing a set of diagnosis, procedure, or national drug codes from one EMR to another is straightforward enough. Further, these data are easy to work with. A health system can ask questions like, “Which patients who had a mechanical heart valve implanted are not on anticoagulants?” Thousands of similar quality/care pathway-type inquiries are run all the time. It is no surprise that moving structured data is the focus of current interoperability efforts. But what about the rest, the 80-plus percent of EMR data that is unstructured? One can move a document from one EMR to another, but using the information at scale once migrated is very challenging. Making matters worse, structured data are often printed or e-faxed, degrading them to unstructured data.
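To illustrate why structured data are easy to work with, the mechanical heart valve question can be sketched as a simple set operation. This is a minimal sketch, not a description of any EMR's query interface: the `Patient` record, the `PROC-MECH-VALVE` code, and the one-drug anticoagulant list are hypothetical placeholders, and a real query would use curated code sets.

```python
from dataclasses import dataclass, field

# Hypothetical code sets for illustration only; a real query would rely on
# curated value sets of procedure and drug codes.
MECHANICAL_VALVE_PROCEDURES = {"PROC-MECH-VALVE"}
ANTICOAGULANTS = {"warfarin"}


@dataclass
class Patient:
    patient_id: str
    procedure_codes: set = field(default_factory=set)
    active_medications: set = field(default_factory=set)


def valve_patients_without_anticoagulant(patients):
    """Return IDs of patients with a valve procedure code and no anticoagulant."""
    return [
        p.patient_id
        for p in patients
        if p.procedure_codes & MECHANICAL_VALVE_PROCEDURES
        and not (p.active_medications & ANTICOAGULANTS)
    ]


patients = [
    Patient("A", {"PROC-MECH-VALVE"}, {"warfarin"}),
    Patient("B", {"PROC-MECH-VALVE"}, {"metformin"}),
    Patient("C", set(), {"metformin"}),
]
care_gaps = valve_patients_without_anticoagulant(patients)
print(care_gaps)
```

The same question asked of scanned or faxed documents has no such one-line answer, which is the gap the rest of this article addresses.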

Why Ignoring the Unstructured Data is a Bad Idea…

If just structured data were adequate to achieve informational care continuity, the “interoperability crisis” would be all but solved. Alas, better care requires structuring and moving much more than this, for at least 4 reasons. First, as Kharrazi et al quantified, structured data often underreport a patient’s health conditions.¹² For example, they found a 1.7-fold increase in patients with dementia when they reviewed unstructured notes rather than structured EMR data alone. The corresponding increases for urinary retention and malnutrition were 3.9-fold and 18.0-fold, respectively.

Second, the unstructured data provide much more detail on a patient’s health. Disease severity, symptoms, and ruled-out issues are rarely captured in a structured format. Parametric data such as pain scores, PHQ-9 results, and functional assessments, which could be represented as structured data, are often captured solely in notes.

Third, the unstructured data act as quality control for the structured data. For example, a patient may have obesity on their problem list, even if a medical note states the patient recently reduced their body mass index from 32 to 26 kg/m².
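The obesity example can be sketched as a cross-check between the problem list and the note text. This is a deliberately naive illustration that uses a regular expression instead of a clinical language model; the function names, the BMI threshold of 30 kg/m2, and the plausibility range of 10 to 80 are assumptions made for the sketch.

```python
import re

# Assumption for this sketch: obesity corresponds to a BMI of 30 kg/m2 or more.
OBESITY_BMI_THRESHOLD = 30.0


def latest_bmi_in_note(note):
    """Return the last plausible BMI value in the first sentence mentioning BMI."""
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        if re.search(r"\bBMI\b|body mass index", sentence, re.IGNORECASE):
            values = [float(v) for v in re.findall(r"\d+(?:\.\d+)?", sentence)]
            # Keep only numbers in a plausible BMI range (drops the "2" in "kg/m2").
            plausible = [v for v in values if 10 <= v <= 80]
            if plausible:
                return plausible[-1]
    return None


def obesity_conflict(problem_list, note):
    """Flag charts where obesity is listed but the note reports a lower BMI."""
    bmi = latest_bmi_in_note(note)
    return "obesity" in problem_list and bmi is not None and bmi < OBESITY_BMI_THRESHOLD


note = "Patient recently reduced their body mass index from 32 to 26 kg/m2. Continue diet."
flag = obesity_conflict({"obesity", "hypertension"}, note)
print(flag)
```

A flag like this is a prompt for clinician review, not an automatic correction of the problem list.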

Finally, the unstructured data contain information that is simply difficult to organize into a database of clinical codes. Examples are disease progression data, the relationship of symptoms to disease, or even the go-forward plan of care.

AI Can Help Address the Data Volume and Usability Challenge…

Making useful both the data typed or dictated directly into EMRs and the data streamed in from HIEs requires at least 3 things. First, all relevant data need to be extracted from the medical notes and placed in a structured format. Second, the data must be cleaned. Third, pertinent data need to be automatically prioritized for the clinician.

Machine learning technology is already being used to error-check structured data, remove duplicates, and even suggest missing data.¹³ Advances in AI mean that structured data can be automatically extracted from medical notes with improved accuracy. For example, a clinical natural language processing engine performed as well as a physician in extracting clinical insights from unstructured medical documentation in a preoperative setting.¹⁴ Transformer technology (e.g., BERT) and next-generation large language models (e.g., ChatGPT) not only find words of interest but also capture the meaning and context of the data elements in the record.
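The duplicate-removal step can be sketched with a simple rule-based approach. This is not the machine learning method cited above, only a minimal illustration of the goal: the medication entries, field names, and `deduplicate` helper are hypothetical, and real cleaning would also need to reconcile brand and generic names, units, and dates.

```python
def normalize(entry):
    """Build a comparison key that ignores case and extra whitespace."""
    return tuple(
        " ".join(str(entry.get(key, "")).lower().split())
        for key in ("name", "dose", "route")
    )


def deduplicate(entries):
    """Merge entries with the same normalized key, remembering every source."""
    merged = {}
    for entry in entries:
        key = normalize(entry)
        if key not in merged:
            merged[key] = {**entry, "sources": [entry["source"]]}
        else:
            merged[key]["sources"].append(entry["source"])
    return list(merged.values())


medications = [
    {"name": "Metformin", "dose": "500 mg", "route": "oral", "source": "local EMR"},
    {"name": "metformin ", "dose": "500  mg", "route": "Oral", "source": "HIE feed"},
    {"name": "Lisinopril", "dose": "10 mg", "route": "oral", "source": "local EMR"},
]
cleaned = deduplicate(medications)
print(len(cleaned), cleaned[0]["sources"])
```

Keeping the list of sources on each merged entry preserves the trail back to where each copy came from.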

Finally, AI holds the promise to not just summarize the patient chart in general but also provide user- and context-specific patient summaries. The information needed on a patient presenting in the ED with chest pain is different from the summary required for an outpatient cardiology visit.
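One way to picture context-specific summaries is as a ranking of the same chart items under different weights. The sketch below is purely illustrative: the context names, category weights, and chart items are invented for the example and are not clinical guidance, and a real system would more likely learn relevance from data than use a hand-written table.

```python
# Hypothetical relevance weights per care context (illustrative values only).
CONTEXT_WEIGHTS = {
    "ed_chest_pain": {"cardiac": 3, "medication": 2, "allergy": 2},
    "outpatient_cardiology": {"cardiac": 3, "lab": 2, "medication": 1},
}


def prioritize(items, context, top_n=3):
    """Return the text of the highest-weighted chart items for a context."""
    weights = CONTEXT_WEIGHTS[context]
    scored = [(weights.get(item["category"], 0), item) for item in items]
    # sorted() is stable, so items with equal weight keep their chart order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [item["text"] for score, item in ranked if score > 0][:top_n]


chart_items = [
    {"category": "cardiac", "text": "Prior stent placement"},
    {"category": "lab", "text": "LDL trend over 2 years"},
    {"category": "allergy", "text": "Aspirin allergy"},
    {"category": "dermatology", "text": "Eczema follow-up"},
]
ed_summary = prioritize(chart_items, "ed_chest_pain")
clinic_summary = prioritize(chart_items, "outpatient_cardiology")
print(ed_summary)
print(clinic_summary)
```

The same four chart items yield different summaries for the two contexts, which is the behavior the paragraph above describes.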

Making AI-Supported EMR Data Cleaning Safe…

We fully support the idea of safely employing AI to make EMR data more accessible and EMR data exchange more practical. However, some risks need to be mitigated. First, AI is not perfect; it will sometimes misinterpret clinical findings. Second, given the growth of EMR interoperability, AI-created chart summaries and extracts will be propagated across the country. Finally, because these summary data are now more usable than ever, they will likely influence clinician decision-making. For example, a medication list and lab values in one EMR could be automatically augmented or modified based on how an AI algorithm running at some other institution interpreted a newly received faxed-in discharge summary. Exciting, yes. Scary, that too.

Nothing is perfect, and humans make errors too. However, we advocate at least 4 safeguards in employing AI to extract EMR data. First, all summary data extracted from EMRs by AI need to be flagged as such. Second, when at all practical, the raw data need to be available to clinicians to interpret for themselves. Third, EMR user interfaces need to support clinician review of any AI modifications to the patient record. Finally, AI technology developers need to publicly describe their methods, testing procedures, and performance. Collectively, such transparency will both make the use of AI-extracted/summarized EMR data safer and build clinician confidence in using it.
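The four safeguards can be made concrete as fields on each AI-extracted data element. The record below is a hypothetical sketch, not a standard or a vendor schema: the class name, field names, and label format are assumptions chosen to mirror the safeguards one by one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedFact:
    value: str
    ai_generated: bool                 # safeguard 1: flag AI-extracted data
    source_document_id: str            # safeguard 2: pointer back to the raw data
    model_description: str             # safeguard 4: which documented method produced it
    reviewed_by: Optional[str] = None  # safeguard 3: record of clinician review

    def needs_review(self):
        """True while an AI-generated fact has not been reviewed by a clinician."""
        return self.ai_generated and self.reviewed_by is None

    def display_label(self):
        """Text a user interface could show so AI-derived data are never unmarked."""
        if not self.ai_generated:
            return self.value
        status = "reviewed" if self.reviewed_by else "unreviewed"
        return f"{self.value} [AI-extracted, {status}]"


fact = ExtractedFact(
    value="Warfarin 5 mg daily",
    ai_generated=True,
    source_document_id="doc-123",                 # hypothetical identifier
    model_description="hypothetical-extractor v0",  # hypothetical model name
)
print(fact.display_label())
```

If such a record travels with the data element through an exchange, the receiving institution can still see that the value was AI-derived and where it came from.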

Having all relevant data aggregated, structured, cleaned, prioritized, and contributing to better care is the true goal of EMRs and interoperability. In The Rime of the Ancient Mariner, Coleridge’s protagonist laments, “Water, water, every where, nor any drop to drink.”¹⁵ This is easily adapted for interoperability as “Data, data everywhere, I have no time to think!” While not a panacea, the appropriate application of AI will invariably help make EMR data and interoperability much more valuable. Let’s just do it safely.

Footnotes

Author Disclosure Statement

Conflict of interest: K.A. is CEO of KAID Health, Inc., a clinical analytics company. The 2 authors are married.

Funding Information

No funding was received for this article.

References

1. Pereira Gray, Sidaway-Lee, White, Thorne, Evans. Continuity of care with doctors—A matter of life and death? A systematic review of continuity of care and mortality. BMJ Open, 2018; 8(6):e021161.

2. Ljungholm, Edin-Liljegren, Ekstedt, et al. What is needed for continuity of care and how can we achieve it?—Perceptions among multiprofessionals on the chronic care trajectory. BMC Health Serv Res, 2022; 22(1):686.

3. Pak. Unstructured data in healthcare. Healthcare tech outlook. Available from: https://ai-healthcare.healthcaretechoutlook.com/cxoinsights/unstructured-data-in-healthcare-nid-506.html [Last accessed: April 12, 2024].

4. Steinkamp, Kantrowitz, Airan-Javia. Prevalence and sources of duplicate information in the electronic medical record. JAMA Netw Open, 2022; 5(9):e2233348.

5. Merritt Hawkins. 2018 survey of America’s physicians: Practice patterns & perspectives. Published September 2018. Available from: https://physiciansfoundation.org/wp-content/uploads/2018/09/physicians-survey-results-final-2018.pdf [Last accessed: April 12, 2024].

6. Rule, Bedrick, Chiang, et al. Length and redundancy of outpatient progress notes across a decade at an academic medical center. JAMA Netw Open, 2021; 4(7):e2115334.

7. Kern, Bynum JPW, Pincus. Care fragmentation, care continuity, and care coordination—How they differ and why it matters. JAMA Intern Med, 2024; 184(3):236–237.

8. Kaltenborn, Paul, Kirsch, et al. Super fragmented: A nationally representative cross-sectional study exploring the fragmentation of inpatient care among super-utilizers. BMC Health Serv Res, 2021; 21(1):338.

9. Halamka. Halamka on enabling nationwide interoperability. Open Health News. Published Feb 17, 2016. Available from: https://www.openhealthnews.com/story/2016-02-17/halamka-enabling-nationwide-interoperability [Last accessed: April 12, 2024].

10. Office of the National Coordinator for Health IT, CMS. Interoperability and methods of exchange among hospitals in 2021. ONC Data Brief No. 64; Jan 2023.

11. Menachemi, Rahurkar, Harle, Vest. The benefits of health information exchange: An updated systematic review. J Am Med Inform Assoc, 2018; 25(9):1259–1265.

12. Kharrazi, Anzaldi, Hernandez, et al. The value of unstructured electronic health record data in geriatric syndrome case identification. J Am Geriatr Soc, 2018; 66(8):1499–1507.

13. Bess. How machine learning is cleaning up medical records. insideBIGDATA. Published April 19, 2023. Available from: https://insidebigdata.com/2023/04/19/how-machine-learning-is-cleaning-up-medical-records/ [Last accessed: April 12, 2024].

14. Suh, Tully, Meineke, Waterman, Gabriel. Identification of preanesthetic history elements by a natural language processing engine. Anesth Analg, 2022; 135(6):1162–1171.

15. Coleridge; Noyes, ed. The rime of the ancient mariner. Globe School Book Co.: New York; 1900.