Introduction
Opioid use disorders (OUD) and opioid-related overdose deaths are a public health crisis in the United States. The prevalence of OUD and mortality from opioid overdose have increased rapidly since the early 1990s, partly because physicians were encouraged to treat pain aggressively, regarding it as the “fifth vital sign.” 1 Prescription opioids are among the most effective treatments for acute pain, though the most commonly cited reason for initial opioid use in patients who develop OUD is chronic pain.2,3 In 2014, approximately 12.5 million Americans had OUD, and prescription opioid overdoses caused approximately 52 American deaths each day, accounting for 18,893 deaths annually. 4 Since 2015, healthcare organizations have introduced initiatives to control opioid prescribing. 5 However, prescription opioid-related overdose fatalities have remained high, with 17,029 deaths in 2017. 6 In addition, patients with OUD are more likely to become polysubstance abusers and to experience psychiatric disorders, disability, and various infectious diseases, including hepatitis C and human immunodeficiency virus disease.7–9 OUD is also associated with high costs for the healthcare system. 2 Effectively identifying individuals with, or at risk for, OUD may help to reduce opioid-related morbidity, mortality, and the economic burden on the US healthcare system.
Healthcare providers have been strongly encouraged to perform risk evaluation for OUD universally when prescribing opioids for pain management. 10 Screening and diagnostic tools have been developed to help clinicians identify patients who are currently misusing or dependent on opioids.11,12 However, screening may not identify all at-risk patients, as some may be reluctant to admit a history of misuse or addiction. In addition, implementing routine screening in practice has proven challenging due to time constraints, competing demands, and other clinical priorities.13–15 Typically, diagnoses recorded in the electronic health record (EHR) use International Classification of Diseases (ICD) codes. Because ICD codes for OUD submitted by either a coder or a clinician are unlikely to affect reimbursement in most settings unless OUD is the primary diagnosis for a visit, it is not surprising that an OUD diagnosis is sometimes not recorded as an ICD code even when present. In addition, ICD codes may misclassify OUD and may not distinguish OUD involving prescription opioids from OUD involving illicit drugs. 16 Instead, patients’ problems, conditions, diagnoses, and treatment plans are often more completely recorded in clinical notes, and that information can be effectively extracted using natural language processing (NLP). NLP is a computerized technique that automatically extracts information of interest from unstructured clinical notes and converts it to a structured format for further analysis. 17
Recent studies showed that NLP could identify 30–50% more patients with clinical evidence of prescription OUD than ICD-coded data.18,19 Even when a clinician detects behaviors indicative of OUD, they may not add the diagnosis codes to the patient’s record, thereby failing to pass on important information to other care providers. As such, NLP-based OUD identification may effectively supplement patient screening, manual medical record review, or ICD-coded data, and can add value by identifying patients whom these other methods do not pick up. In this study, we developed NLP algorithms that identify OUD from the clinical notes of patients prescribed opioids. We then formally evaluated NLP algorithm performance against a gold standard (manual review). The concordance between ICD-coded OUD and NLP-identified OUD was also explored.
Methods
Study setting
The study was conducted at the Medical University of South Carolina (MUSC), an academic medical center with inpatient, outpatient, and emergency facilities serving Charleston, South Carolina, and surrounding areas. MUSC implemented the EpicCare EHR system (Epic Systems Corp., Verona, WI) for outpatient care in 2012 and for inpatient care in 2014. A Research Data Warehouse (RDW) copies the Epic data and serves as the data repository for clinical research. For this study, we used patient medical records from the RDW. A multidisciplinary team with expertise in addiction, pain management, statistics, and medical informatics conducted this study.
Data source
To develop an NLP pipeline to identify OUD from clinical narratives, we used clinical notes from adult non-cancer patients who received chronic opioid therapy (COT) from 2013 to 2018. COT was operationally defined as having received at least a 70-day supply of prescription opioid medication within any 3 months. 20
Opioid medications included prescription analgesics administered transdermally or orally. Once COT patients were identified, all of their medical records were used for analyses, including data recorded outside the COT period. Source data included various clinical narratives, such as progress notes, consultant notes, outpatient visit notes, orders, and discharge summaries. Patient demographic and ICD-coded diagnosis data were also extracted. The study subject ID linked patient source documents and structured demographic and clinical data across each individual’s records for analysis. Of 13,654 eligible patients, 10,218 (75%) were randomly selected as a training set to develop an NLP lexicon and NLP algorithms. The remaining 3436 patients (25%) were used as a test dataset to evaluate NLP algorithm performance (Figure 1. Patient selection).
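As a concrete illustration of the COT criterion, the following minimal Python sketch flags a patient as receiving COT if any 90-day window contains at least a 70-day supply of opioids. The dataframe column names and the reading of “any 3 months” as 90 days are our assumptions, not details specified in the study.

```python
from datetime import timedelta

import pandas as pd

def is_cot(prescriptions: pd.DataFrame,
           window_days: int = 90,
           min_supply_days: int = 70) -> bool:
    """Return True if any `window_days` window contains at least
    `min_supply_days` of opioid supply. Expects columns 'start_date'
    (datetime64) and 'days_supply' (int) -- hypothetical names,
    not the study's actual schema."""
    rx = prescriptions.sort_values("start_date")
    for start in rx["start_date"]:
        window_end = start + timedelta(days=window_days)
        # Sum the days of supply for all fills starting in this window.
        in_window = rx[(rx["start_date"] >= start) &
                       (rx["start_date"] < window_end)]
        if in_window["days_supply"].sum() >= min_supply_days:
            return True
    return False
```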
NLP software
We used NLP software (Linguamatics I2E version 5.4, Cambridge, United Kingdom) to index, parse, and query each clinical note. The I2E NLP software applies standard medical terminology concept-based indexing techniques and performs named entity recognition on text documents. Queries then retrieve information meeting a user-defined set of criteria through a user-friendly interface for defining syntactic and semantic representations. 21 Our previous work with this NLP software demonstrated highly accurate identification of under-coded healthcare issues from clinical notes. In prior work using the same NLP software, we abstracted numerator data for the Group Practice Reporting Option (GPRO) quality measure for fall risk assessment. The NLP algorithm identified 62 (of 144) patients for whom a fall risk screen was documented only in clinical notes and thus was not ICD coded. A manual review confirmed 59 patients as true positives and 77 patients as true negatives. That NLP approach scored 0.92 for precision, 0.95 for recall, and 0.93 for F-measure. 22 Another recent study using I2E identified 1.6% of prostate cancer patients suffering from social isolation, which was documented only in clinical notes. 23 That NLP pipeline demonstrated 90% precision, 97% recall, and 93% F-measure. This experience supported the development of accurate NLP algorithms for identifying OUD from clinical notes in this study.
Development of the lexicon for opioid use disorder
To develop an accurate and complete lexicon for OUD identification, we utilized multiple resources, including domain experts’ knowledge, the ICD-9/10 coding systems, the NLP software’s built-in standard terminology SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms) for clinical findings, and RxNorm for medications.24,25 Two psychiatrists (Brady and Barth), who have extensive clinical experience in addiction science and pain management, generated an initial list of commonly used terms. These terms represented: (1) opioid medications (e.g., opioid, opiate, narcotic, fentanyl, duragesic, analgesic, heroin, Oxycontin, etc.); (2) use disorder (e.g., abuse, addiction, misuse, dependency, poison, overdose, overuse, sedation, etc.), where the combination of “opioid medication” and “use disorder” represents OUD; (3) FDA-approved medications for OUD in all formulations (e.g., buprenorphine sublingual tablet, Suboxone sublingual film/tablet, methadone, Zubsolv, Subutex, Vivitrol, etc.); (4) a Current Opioid Misuse Measure (COMM) score greater than 8; and (5) mention(s) of regional addiction treatment facilities (e.g., Barrier Island, MUSC, Center for Drug and Alcohol Programs [CDAP]). We also included descriptive tokens of ICD codes relevant to OUD (e.g., opioid dependence, poisoning by other opioids, opioid abuse). 26 For each term, informaticians (Zhu and Lenert) utilized the NLP software to generate an enhanced term list with spelling variants, acronyms, and abbreviations. The built-in ontologies for opioid medications (RxNorm) and OUD (SNOMED CT clinical findings) were also included in the initial lexicon. Domain experts (Brady and Barth), who were blinded to this process, evaluated the list by manually reviewing 1000 randomly sampled candidate phrases with OUD mentions identified by the NLP software. When disagreement occurred, the reviewers discussed the terms and came to a mutual agreement in order to form an enhanced and refined lexicon for clinical evidence of OUD. We then iteratively refined the NLP queries and lexicon according to their feedback.
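To make the two-part lexicon logic concrete, here is a minimal Python sketch in which an OUD mention requires an opioid term and a use-disorder term co-occurring in the same sentence. The term lists are abbreviated examples, not the study’s full lexicon, and simple regular expressions stand in for I2E’s concept-based indexing.

```python
import re

# Abbreviated example term lists; the study's lexicon was far larger and
# included spelling variants, acronyms, and RxNorm/SNOMED CT concepts.
OPIOID_TERMS = ["opioid", "opiate", "narcotic", "fentanyl",
                "heroin", "oxycontin", "duragesic"]
DISORDER_TERMS = ["abuse", "addiction", "misuse", "dependence",
                  "dependency", "overdose", "overuse"]

opioid_re = re.compile(r"\b(" + "|".join(OPIOID_TERMS) + r")\b", re.IGNORECASE)
disorder_re = re.compile(r"\b(" + "|".join(DISORDER_TERMS) + r")\b", re.IGNORECASE)

def sentence_mentions_oud(sentence: str) -> bool:
    """An OUD mention requires both an opioid term and a
    use-disorder term in the same sentence."""
    return bool(opioid_re.search(sentence)) and bool(disorder_re.search(sentence))

# Example: sentence_mentions_oud("History of opioid dependence.") -> True
```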
Develop and evaluate NLP algorithms for OUD
We developed a set of NLP queries to identify OUD mentions from clinical notes based on domain knowledge. A positive OUD case was identified by the following criteria: (a) a named entity of OUD, as defined by the OUD lexicon, was discovered; and (b) OUD mentions with negations were excluded. These NLP queries were designed to capture semantic information, syntactic patterns, and clinical negations in order to translate a documented OUD case into the following structured data elements: (1) patient medical record number (MRN), (2) OUD mentions, (3) author type, (4) note ID, (5) date of the document, and (6) type of clinical note. Query development was an iterative process based on two research assistants’ manual review of NLP-identified OUD cases in the training dataset. The final NLP queries were established when the sensitivity and specificity of the NLP algorithms could not be further improved.
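The sketch below, building on the matcher above, illustrates the negation-exclusion step and the structured output row. The negation cues, the naive sentence splitting, and the field names are illustrative assumptions; they are not the I2E query logic or the study’s custom negation list.

```python
# Illustrative negation cues; the study used I2E's built-in clinical
# negations plus custom exclusions developed by the domain experts.
NEGATION_CUES = ["no evidence of", "denies", "negative for",
                 "no history of", "rule out"]

def extract_oud_rows(note_text: str, mrn: str, author_type: str,
                     note_id: str, note_date: str, note_type: str) -> list:
    """Scan a note sentence by sentence (naive split on periods) and emit
    one structured row per non-negated OUD mention, mirroring the six
    output elements listed above."""
    rows = []
    for raw in note_text.split("."):
        sentence = raw.strip()
        lowered = sentence.lower()
        if not sentence or not sentence_mentions_oud(lowered):
            continue
        if any(cue in lowered for cue in NEGATION_CUES):
            continue  # negated mention: excluded from positive cases
        rows.append({"mrn": mrn, "oud_mention": sentence,
                     "author_type": author_type, "note_id": note_id,
                     "note_date": note_date, "note_type": note_type})
    return rows
```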
Performance measure
We used a gold standard, manual chart review (in this instance, by one postdoctoral researcher in addiction science and one research assistant with extensive chart review experience), to evaluate NLP performance on the test dataset. Because patients in the test dataset usually had multiple clinical notes, the domain experts reviewed their latest progress notes. The inter-rater agreement rate was calculated. We calculated three standard performance measures for the NLP algorithm: precision, recall, and F-measure. Precision (exactness) is the proportion of true positives among all algorithm-identified cases; in contrast, recall (completeness) is the proportion of true positives that are retrieved by the algorithms. F-measure is the harmonic mean of precision and recall (F-measure = 2 × precision × recall/(precision + recall)); the F-measure reaches its best value at one and its worst at zero. 26 Finally, for all false positives and false negatives generated by the NLP algorithms, the reasons for incorrect classification were manually determined and summarized.
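These measures follow directly from the confusion counts; the following minimal sketch (with made-up counts, not the study’s results) shows the computation:

```python
def nlp_performance(tp: int, fp: int, fn: int) -> dict:
    """Precision, recall, and F-measure from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f_measure": f_measure}

# Arbitrary example: nlp_performance(tp=90, fp=10, fn=5)
# -> precision 0.90, recall ~0.947, F-measure ~0.923
```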
We also conducted a secondary analysis comparing NLP-identified OUD with ICD-coded data.
Results
Opioid use disorder lexicon
(Table 1. Number and prevalence of different note types in the training dataset. aNote type information was not defined in the dataset.)
(Table 2. Lexicon of opioid use disorder and term frequency [common terms].)
NLP algorithm performance
The multiple NLP queries, combining seven modules, produced a structured output table for further analyses. The output table also provided a link to each original document, which the NLP developer and reviewers could use for validation during the development and evaluation phases (Figure 2). Among 96,773 notes from 3436 patients in the test dataset, the NLP query identified 2088 notes with OUD mention(s) from 158 patients (4.6%). The percentage of OUD patients in the test dataset was consistent with the training dataset (4.7%), indicating that both datasets were representative. Based on the prevalence of NLP-identified OUD patients in the sample and an assumed classification accuracy of the NLP algorithm (area under the ROC curve) of 0.9 (Type I error = 0.05, power = 80%), two reviewers first manually evaluated 4 randomly selected clinical notes that the NLP identified as positive for OUD and 74 notes that the NLP identified as negative for OUD. Both reviewers found no false positives or false negatives in those notes. Then, 1000 randomly selected clinical notes that the NLP identified as positive for OUD were evaluated. Among these 1000 notes, one reviewer identified 14 false positives, and the other identified 13 false positives. Their inter-rater agreement was 98.6%. At the document level, our NLP algorithm for OUD identification had a precision of 98.5%, recall of 100%, and an F-measure of 99.2% (Figure 3). Examples of false positives are listed in Table 3. The following major reasons accounted for false positives: (1) the NLP approach did not completely exclude false OUD mentions in instructions about opioid side effects in discharge summaries; (2) the screening workup had no score immediately following the COMM mention; (3) sublingual buprenorphine was used off-label for pain management; or (4) a physician documented concern about diversion of a patient’s opioid pain medications to a person with OUD.
(Figure 2. NLP output example. Note: Patient_ID and Note_ID were masked.)
(Figure 3. NLP algorithm performance evaluated by the gold standard.)
(Table 3. Examples of false positives.)

Comparison between NLP-identified OUD and ICD-coded OUD
We further explored the concordance between NLP-identified OUD and ICD codes representing OUD. The NLP pipeline included OUD treatments as an indication of OUD; because ICD codes represent a diagnostic context, we excluded NLP cases identified solely by OUD treatment from this comparison. Using the lexicon of OUD diagnoses alone, the NLP approach identified 462 patients with positive OUD mention(s) in their clinical notes in the training dataset. In the coded data (ICD-9: 305.5, 304.0, 304.2, 304.7, 965.0; ICD-10: F11.1, F11.2, T40 (0-5)X (1-3), T40.60 (1-4), etc.), 368 patients (3.6%) had an OUD diagnosis. The overlap between ICD-coded and NLP-identified OUD included 308 patients. Among patients with NLP-identified OUD, 154 (33.3%) had no ICD-coded OUD; of patients with ICD-coded OUD, 60 (16.3%) had no NLP-identified OUD evidence in clinical notes (Figure 4). Cohen’s kappa for the two approaches was 0.63. The concordance between the NLP approach and ICD coding was also modest within OUD subgroups. For example, in the training set, NLP identified “opioid dependence” mention(s) in the clinical notes of about 40% of patients, whereas only 95 patients had recorded ICD code(s) for “opioid dependence.” Among the OUD diagnostic codes, opioid dependence and opioid abuse were the most prevalent, similar to the distribution in the NLP OUD lexicon (Table 4). For the 60 patients who had ICD code(s) for OUD but no corresponding information in their clinical notes, a well-trained medical research team member (Kopscik) reviewed 330 prominent clinical notes (an average of five notes per patient; progress notes, H&P, ED provider notes, or discharge summaries). These notes were dated within 8 days before and after an ICD-coded OUD event; this 16-day interval ensured that each ICD-coded OUD event had at least one major clinical note available for examination. The investigation suggested that most of these patients were on opioid medication for pain management; for example, “dilaudid 2 mg IV prn - opioid dependence,” “Opioid dependence/Chronic pain.” These patients might have been physically dependent on opioids used for pain control; however, they were coded as opioid dependence or opioid abuse (Table 5). Because the study cohort comprised patients on COT, we deliberately developed NLP negations to exclude opioid use as pain treatment from OUD; therefore, our NLP algorithm did not identify these 60 patients as OUD.
(Figure 4. ICD-coded or NLP-identified OUD patients in the training dataset.)
(Table 4. ICD codes and frequency among ICD-coded OUD.)
(Table 5. ICD-coded OUD patients without NLP-extracted OUD information.)
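For reference, Cohen’s kappa for two binary “raters” (here, NLP versus ICD identification) can be computed from the four cells of their 2×2 agreement table, as in the minimal sketch below. The function is illustrative; the exact cell counts and cohort denominator underlying the reported kappa of 0.63 are not fully specified in the text.

```python
def cohens_kappa(both: int, nlp_only: int, icd_only: int, neither: int) -> float:
    """Cohen's kappa for agreement between two binary classifications
    (e.g., NLP vs. ICD coding) from a 2x2 agreement table."""
    n = both + nlp_only + icd_only + neither
    p_observed = (both + neither) / n
    p_nlp = (both + nlp_only) / n    # marginal rate of NLP-positive
    p_icd = (both + icd_only) / n    # marginal rate of ICD-positive
    # Chance agreement: both positive by chance plus both negative by chance.
    p_expected = p_nlp * p_icd + (1 - p_nlp) * (1 - p_icd)
    return (p_observed - p_expected) / (1 - p_expected)
```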
Discussion
This study demonstrates that NLP can identify OUD from clinical notes with high precision and recall when evaluated against manual chart review (considered the gold standard). Developing accurate NLP approaches by incorporating clinical knowledge into computerized algorithms can be a daunting task, and developing a lexicon that completely and accurately reflects information about OUD in clinical notes is an essential step. There is no consensus on OUD terminology, and considerable heterogeneity of terminology use (both accurate and inaccurate) can be seen in clinical documentation. 28 We maximally utilized available resources, including descriptive terms from ICD codes, the standard terminologies built into the NLP software (SNOMED CT for diagnoses and RxNorm for medications), and terms from the scant, yet informative, literature on this topic. Most importantly, we relied heavily on domain experts’ clinical knowledge of how providers document OUD-relevant information in their practice. Both the ICD-9/ICD-10 codes and SNOMED CT provide a broad array of terms with full coverage across medical specialties; however, providers do not restrict their documentation of clinical findings to these standard terms, so using these terms alone for case identification may miss important clinical information about OUD. For example, “opioid use disorder” is not a term in either the ICD codes or SNOMED CT; that term was suggested by our domain experts because it is the official term used in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5). 29 In our dataset, “opioid use disorder” was not uncommon; notably, it appeared in 28% of the clinical notes with NLP-identified OUD mentions. Other terms suggested by our domain experts also significantly improved OUD identification in our study, including “opiate use disorder,” “narcotic dependence,” “narcotic abuse,” and “opiate addiction.” In contrast, the diagnostic coding systems (ICD and SNOMED CT) suggest using “opioid abuse,” “opioid dependence,” or “poisoning by opioid” to document OUD incidents; however, “poisoning by opioid” and “opioid poisoning” were not found in any clinical notes, and “opioid abuse” had a very modest frequency. We observed that “opioid dependence” was the dominant (40%) OUD term in clinical notes; the frequency of ICD codes demonstrated the same finding. Even so, these three standard terms together identified less than half of the OUD cases in our analysis. Besides terms for diagnosed OUD, the domain experts also suggested using FDA-approved treatments for OUD as an indication of OUD, which identified an additional 4% of OUD cases and further reduced false negatives for our NLP algorithm. To minimize false positives, we utilized the NLP software’s built-in clinical negations. In addition, our domain experts manually examined sentences flagged by OUD terms in order to develop customized negations that more specifically excluded false mentions of OUD, such as “for pain,” “education,” “prevent,” “concern,” and terms indicating “COMM score <9.” These negations excluded 25% of candidate OUD cases (reducing the rate from 6.3% to 4.7%) and thus reduced false positives. Consequently, our NLP algorithm achieved high performance (98.5% precision and 100% recall), which resulted primarily from our domain experts’ knowledge and their careful evaluation of NLP-generated results during the development process.
The lexicon generated from the current study combines standard concepts and domain expert knowledge, thus offering a more complete and accurate data extraction method.
A critical secondary finding was that agreement between NLP identification from clinical notes and diagnosis codes in the EHR was slightly above moderate (Kappa = 0.63), with discordance in both directions. NLP identified nearly 40% more OUD cases than ICD diagnosis codes alone; we also observed that about 25% of ICD-coded OUD cases had no NLP-identified evidence in clinical notes, suggesting that these two methods of identification are complementary. To the best of our knowledge, this is one of only a few studies that have compared NLP-based identification versus ICD-coded data for OUD. Carrell et al. (2015) developed and studied an NLP system that performed dictionary look-up from clinical notes using a customized dictionary of prescription opioid problem use (addiction, abuse, misuse, or overuse) for patients receiving COT within the Group Health system. The researchers found that traditional diagnostic codes identified 10.1% of patients in their COT cohort who experienced at least one form of problem opioid use; their NLP system identified an additional 3.1% of patients from clinical notes. Carrell et al. also reported moderate Cohen’s kappa agreement between ICD-9 codes and NLP identification (Kappa = 0.51). 18 In another study, Palmer et al. (2015) studied the prevalence of problem opioid use identified by NLP versus ICD-9 diagnosis codes for the same Group Health COT cohort. Those researchers found that over 33% of NLP-identified cases of problem opioid use had no ICD-9 diagnosis of opioid abuse or dependence; moreover, about 40% of patients with ICD-9 codes for OUD had no NLP-identified evidence of OUD. These two identification methods likewise demonstrated slightly above moderate agreement (Kappa = 0.61). 20 Our study confirmed the finding from these previous studies that neither ICD diagnosis codes nor the NLP approach alone can fully reflect the true prevalence of OUD in EHR data.
ICD codes have been predominantly used to record diagnoses in the EHR; however, using ICD codes alone is incomplete and inaccurate for case identification, even for primary diseases (accuracy: 0.06-0.71).30,31 Notably, ICD codes for OUD lack both sensitivity and specificity, in both previous studies and the current investigation. A recent study reported that more than 40% of patients with assigned ICD codes for OUD did not meet the DSM-5 OUD diagnostic criteria as evaluated by domain experts’ chart review. 32 The DSM criteria are usually regarded by providers as the gold standard for diagnosing OUD in clinical practice. 33 While DSM-5 uses the term “opioid use disorder,” replacing the terms “opioid abuse” and “opioid dependence” from older versions, ICD coding still uses these outdated terms to record OUD diagnoses and lumps all opioid use, including pain treatment, under this large umbrella. 34 Notably, a study reported that “opioid dependence,” the primary ICD code for OUD, could not distinguish psychological dependence on opioids (OUD) from physiological dependence on opioid therapy for pain management. 35 Our study results demonstrated a similar finding: most of the 60 patients who were ICD coded as OUD but not classified as OUD by NLP were on opioid treatment for pain. A potential solution is to update the ICD coding of OUD to reflect the DSM-5 definition of OUD more closely and to provide clearer coding guidelines to providers, for example, using the newer ICD code Z79.891 (long-term [current] use of opiate analgesic) rather than F11.20 (opioid dependence, uncomplicated) to document opioid treatment for pain. 35 Beyond ICD codes, other clinical data, such as clinical notes, can be used to supplement the ICD coding system in identifying OUD patients. ICD codes are commonly used for acute diagnoses; chronic diseases, such as OUD, may more commonly be documented in problem lists, which are usually codified with SNOMED CT. Information about OUD may also be documented in clinical notes as the chief complaint, problem list, history of present illness, clinical findings, or treatment plan. Because clinical notes are readily available in electronic format for clinical practice and research, NLP can automatically unlock such information and add value by identifying OUD patients whom ICD codes do not pick up. NLP methods have advanced significantly in the past 5 years with the development of many software packages supporting analytics in clinical domains; extraction of explicitly stated concepts can reach very high accuracy (greater than 90% in many settings, including prior work on opioid abuse).36,37 Our methods are reusable with other NLP software as a reference dictionary and clinical logic for both rule-based and data-driven approaches, reflecting their potential for broader dissemination with the necessary customizations. Importantly, some studies have utilized either coded data or an NLP strategy alone to identify OUD patients from the EHR.38,39 Our findings, however, demonstrate that combining NLP-extracted OUD evidence from clinical notes with evidence from ICD diagnostic codes is necessary to provide a comprehensive picture of OUD for clinical practice, research, and surveillance systems.
Limitations
There are several limitations associated with this study. First, our study focused only on patients who had received COT; we did not study illicit opioid users, patients with a history of short-term opioid use (e.g., surgical or dental patients), or patients with cancer. Providers may document clinical information differently for these opioid users. The purpose of this study was to develop a foundational lexicon and NLP algorithm to identify OUD from clinical notes; studying data from patients on COT allowed us to examine OUD information in a longitudinal and focused way and to develop NLP approaches more efficiently. Future investigators of opioid-user subpopulations may customize and extend our approach, adding the components needed for rapid algorithm development; for example, street names of opioids could be added for illicit opioid users. Second, this study was designed around MUSC patient care. NLP tasks are usually institution-specific, and generalizability is a common concern. During the development process, we sought to maximize the generalizability of the OUD lexicon and algorithms by using the DSM-5 classification system, the ICD-9/10 coding systems, and the I2E built-in standard terminologies SNOMED CT and RxNorm to extract OUD indicators from clinical notes. Our dataset consisted of patients’ longitudinal clinical notes across diverse note types (Table 1); thus, our lexicon and algorithms maximally reflected the patterns of OUD documentation in different note types at MUSC. The OUD lexicon and NLP pipelines generated from this study are reusable with customization: other institutions may engage providers to review the OUD lexicon and confirm, add, or delete terms based on their local practice. In addition, performance evaluation (e.g., Figure 3) and error analyses of false positives and false negatives (e.g., Table 3) using local datasets are needed before implementation. 40 Finally, the current study utilized a rule-based approach. In future work, we will leverage the knowledge gained from identifying OUD in clinical notes to develop machine learning and deep learning approaches and to compare the performance of different NLP strategies.
Conclusions
NLP can accurately identify patients with OUD from clinical note data. Using both the ICD diagnostic code and NLP-extracted OUD evidence from clinical notes can improve our ability to identify OUD patients from the EHR and positively inform treatment interventions.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study is supported by the U.S. National Institute on Drug Abuse (K12DA031794-06A1; K23DA039328-01A1; UG1DA013727), the U.S. National Library of Medicine (R21LM012945), and the U.S. National Center for Advancing Translational Sciences (5UL1TR000062-05).
Ethical approval
This study performed the secondary analysis using data from EHR and was approved by the MUSC Institutional Review Board.
