Abstract

Introduction

A steadily rising opioid pandemic has left the US suffering significant social, economic, and health crises. Machine learning (ML) domains have been utilized to predict prolonged postoperative opioid (PPO) use. This systematic review aims to compile all up-to-date studies addressing such algorithms’ use in clinical practice.

Methods

We searched PubMed/MEDLINE, EMBASE, CINAHL, and Web of Science using the keywords “machine learning,” “opioid,” and “prediction.” The results were limited to human studies with full-text availability in English. We included all peer-reviewed journal articles that addressed an ML model to predict PPO use by adult patients.

Results

Fifteen studies were included, with sample sizes ranging from 381 to 112898; most were orthopedic-surgery-related. Most authors defined prolonged opioid misuse as use extending beyond 90 days postoperatively. The number of input variables ranged from 9 to 23, and the variables were primarily preoperative. Most studies developed and tested at least two algorithms and then enhanced the best-performing model for retrospective use on electronic medical records. The best-performing models were decision-tree-based boosting algorithms in 5 studies, with AUC ranging from .66 to .81 and Brier scores ranging from .073 to .13, followed by logistic regression classifiers in 5 studies. The top contributing variable was preoperative opioid use, followed by depression and antidepressant use, age, and use of instrumentation.

Conclusions

ML algorithms have demonstrated promising potential as a decision-supportive tool in predicting prolonged opioid use in post-surgical patients. Further validation studies would allow for their confident incorporation into daily clinical practice.

Keywords

opioid misuse surgery postoperative care machine learning artificial intelligence

Introduction

Over the past two decades, the opioid pandemic has been steadily on the rise.¹ In 2008, US citizens, constituting less than 5% of the world’s population, consumed about 80% of the global opioid supply.² Despite declining consumption over the following decade, the US still leads the world in opioid use and its complications.³ A CDC report in 2021 estimated that total deaths in the US from opioid overdoses alone had increased from 56064 to 75673 over the preceding year.⁴ Additionally, the devastating social implications of opioid misuse and the related significant morbidity and mortality^5-7 cost the US economy around $78.5 billion in 2013,^8,9 which increased to $1.02 trillion in 2017.¹⁰ Although illegally manufactured opioid derivatives share the blame for this issue, their impact is not well tracked or studied.¹¹ The primary culprit, however, seems to be the increase in pharmaceutically regulated opioid prescription (OP).¹² What started as a sincere effort to manage pain humanely and adequately has turned into a national emergency, and not at the hands of drug dealers.

Understanding pain and determining the best management protocol through opioids or other analgesia is a debatable topic that requires extensive research.^13,14 Nevertheless, tackling this issue by analyzing current practices has recognized several factors fueling the increased OP. For example, Kalakoti et al found preoperative opioid dependence to be the most decisive risk factor for postoperative opioid dependence.¹⁵ Orthopedic surgeons rank third among physicians and account for 7.7% of all OP in the US^16,17 despite surgery ranking behind cancer pain, backache, and many other rheumatological and musculoskeletal conditions that are top causes linked to an office visit with an OP.^18-20 Certainly, the simultaneous presence of such factors increases the risk of prolonged postoperative opioid (PPO) use, predisposing long-term dependence behavior and significant adverse consequences.^21-24

Artificial Intelligence (AI) domains, particularly Machine Learning (ML), have been increasingly implemented in health care over the past decade with promising results.^25,26 For example, several studies have utilized ML models to predict PPO.^27-41 If such an estimation proves accurate, elimination of unindicated OP and interventions to adopt a safer and more conservative OP pattern could be implemented before exposure to a significant risk factor like surgery. Furthermore, such information will help categorize conditions and clinical settings and stratify patients into low and high-risk for opioid misuse. Indeed, doing so will rationalize the use of opioids and offer a more personalized patient-specific treatment approach. Therefore, we have conducted this systematic review by compiling the current evidence to date to answer a specific research question: can ML algorithms accurately predict PPO use?

Methods

Search Strategy

This systematic review was conducted in compliance with the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA).⁴² We performed an all-time search on May 10th, 2022, utilizing four electronic medical databases: PubMed/MEDLINE, EMBASE, CINAHL, and Web of Science. The following keywords: “Machine Learning,” “Opioid,” and “Prediction” were used to generate the search string: (“Machine Learning” AND opioid AND prediction). In addition, boolean operators, truncation, and MeSH terms were applied where appropriate. The search results were limited to human studies in the English language with available full texts through our institutional access.

Eligibility Criteria

We included all (1) peer-reviewed journal articles that addressed (2) an ML algorithm model to (3) predict (4) long-term (5) opioid use by (6) adult patients in the (7) post-operative setting, (8) while reporting the results for both training and validation cohorts. Both qualitative and quantitative studies were included. We excluded unpublished data, data under review, viewpoints, dissertations, theses, conference proceedings, short surveys, letters to editors, and book chapters. In addition, we excluded studies that predicted pain occurrence instead of an OP or those that predicted short-term opioid use (less than 2 weeks postoperatively).

Study Selection and Collection process

Two authors independently performed the search and imported the results to the EndNote v.20 reference manager, where duplicates were removed. Any conflicts or selection issues were resolved by consulting a third author. No studies were added from the references of the 15 included papers. Figure 1 summarizes the search process.

Figure 1.

A PRISMA flowchart diagram summarizing the database search process.

Quality Assessment

In addition, we assessed the potential risk of bias using the Methodological Index for Non-Randomized Studies (MINORS).⁴³ Inter-rater reliability was ensured by having three authors independently assess each study using the same standardized sheet. In cases of disagreement, the majority vote decided that particular scoring point (see the supplementary materials).

Results

Searching all four databases yielded 316 results, which were screened for inclusion by title first, then by abstract, and finally by full-text reading, generating a final list of fifteen studies. All studies were retrospective in design, with Electronic Medical Records (EMR) searched to obtain the relevant data, and were published between 2019 and 2022. Table 1 summarizes the data extracted from each study.

Table 1.

Shows a Summary of Important Data Extracted From the Studies Included in This Systematic Review.

Author and date	Definition of Opioid Misuse	Sample Size	Opioid Exposure Preoperatively	Operative Procedure	Number and Category of Input Variables	ML Algorithm	Results of Best Performing Model	Most Predictive Variables
Karhade et al (2019)³²	Uninterrupted filing of prescription opioids extending to at least 90-180 days after surgery	2737	74% naïve	Anterior cervical discectomy and fusion	59, preoperative	5 models (ENPLR, RF, SGB, NN, SVM)	SGB AUC .81 Brier score .075	Preoperative opioid duration, antidepressant use, and tobacco use
Karhade et al, (2019)³³	Prescription filled after surgery to at least 90 to 180 days after the index procedure	5413	67% naïve	Lumbar spine surgery	58, preoperative	5 models (ENPLR, RF, SGB, NN, SVM)	ENPLR AUC .81 Brier Score .064	First instrumentation, duration of preoperative opioid prescription, and comorbidity of depression
Karhade et al, (2019)³⁴	Continuous opioid prescriptions after surgery to at least 90 days after surgery	5507	81% naïve	Total hip arthroplasty	55, preoperative	5 models (ENPLR, RF, SGB, NN, SVM)	ENPLR AUC .81 Brier Score .051	Preoperative opioid use, age, comorbidities such as anxiety/depression
Anderson et al, (2020)²⁷	Any opioid prescription filled >90-365 days after ACL reconstruction	10919	73.5% naïve	Arthroscopic assisted ACL reconstruction	44, preoperative	4 models (BBN, RF, GBM, LR)	GBM AUC .77 Brier score .10	Total number of ordering sites, total preOp MME, total days deployed
Karhade et al, (2020)³¹	Sustained prescription opioid use exceeding 90 days after surgery	8435	100% naïve	Lumbar spine surgery (decompression and/or fusion)	59, preoperative	5 models (RF, SGB, NN, SVM, ENPLR)	ENPLR AUC .70 Brier Score .039	Use of instrumented spinal fusion, preoperative benzodiazepine use, preoperative antidepressant use
Katakam et al, (2020)³⁵	Prolonged opioid use is categorized: As within 30 days postoperative, 30-90 days postoperative, and 90-180 days postoperative	12542	79% naïve	Primary total knee arthroplasty	49, preoperative	5 models (SGB, RF, SVM, NN, ENPLR)	SGB AUC .76 Brier score .073	Age, history of preoperative opioid use, marital status
Nair et al, (2020)³⁹	Predicts opioid requirement as very low, low, medium, or high postoperatively; not the duration of sustained use	13700	77% naïve	Ambulatory surgery	24 preoperative + 25 intraoperative	5 models (RF, MNLR, EGB, NB, NN)	RF Accuracy of 72% preoperative and 72% at the end of surgery	Type of procedure: General and plastic surgery, preoperative opioid use, and procedure duration
Zhang et al, (2020)⁴¹	Long-term opioid use was defined as filling ≥180 days of opioids within one year after surgery	19317	Mixed, unknown	Elective spine surgery (thoracic or lumbosacral decompression with or without fusion)	24 preoperative + 30 days postoperative opioid prescription patterns	7 models (FLR, SLR, LASSO, SVM, two tree-based models [RF and SGB], and tCNN)	LR based models AUC .835-.847 Sensitivity 74.9-76.5%	High preoperative opioid use, the number of days, and number of dosages increases with active opioid prescription between postoperative days 15 to 30
Hur et al, (2021)³⁰	Refill, defined as filling an opioid prescription within 30 days after discharge from surgery. New persistent use after surgery has been defined as additional fills in the 91-180 days after surgery.	112898	100% naïve	13 major and minor surgeries	50 preoperative + surgery type only	Linear models (SVM) against non-linear (ensemble of decision trees)	Non-linear AUC .68 for refills AUC .66 for New persistent use	Undergoing major surgery, opioid prescriptions within 30 days before surgery, and abdominal pain helped predict refills; back/joint/head pain were the most important features in predicting new persistent use
Kunze et al, (2021)³⁷	Prolonged postoperative opioid use, defined as patients who requested one or more opioid prescription refills postoperatively	775	100% naïve	Primary hip arthroscopy	17 preoperative	5 models (SGB, RF, SVM, NN, ENPLR)	SGB AUC .75 Brier score .13	Preoperative Harris hip score, age, and BMI
Ward et al, (2021)⁴⁵	≥1 opioid prescription fill within 90-180 days after surgery	186493	Not mentioned	Any surgery with general anesthesia	136 preoperative variables	5 models (LR, LASSO, RF, GBM, EGB)	GBM AUC .711 overall, .823 for spinal fusion surgery	Days’ supply of opioids and oral MME of opioids in the year before surgery
Gabriel et al, (2022)²⁸	Continued opioid use after a 3-month postoperative cut-off, up to 6 months	1042	81% naïve	Hip or knee arthroplasty	24 preoperative, intraoperative, and postoperative variables	6 models (LR, RF, SFNN, BRF, BBC, SVM) + SMOTE	BBC AUC .94 and .96 with SMOTE Brier score .80 and .84 with SMOTE	Postoperative day one opioid use, BMI, age, intraoperative ketamine use, severe osteoarthritis of the surgical joint, substance use, CHF, COPD, depression
Grazal et al, (2022)²⁹	Prolonged opioid use, which is at least filling one opioid prescription >90 days after surgery	6760	43% naïve	Arthroscopic hip surgery	5 preoperative and postoperative	6 models (NB, GBM, EGB, RF, ENPLR, ANN)	ANN AUC .71 Brier score .21	Age, preoperative opioid use, and postoperative opioid use
Klemt et al, (2022)³⁶	extended postoperative opioid use (>90 days)	8873	79% naïve	Primary total knee arthroplasty	19 preoperative and operative	5 models (ANN, SGB, RF, KNN, ENPLR)	NN AUC .87 Brier score .036	Preoperative opioid duration, drug abuse, depression
Lu et al, (2022)³⁸	Defined as opioid consumption at least 150 days following surgery	381	93.2% naïve	Elective knee arthroscopy	11 preoperative, intraoperative, and postoperative variables	5 models (SVM, RF, XGBoost, AdaBoost, ensemble model)	Ensemble AUC .74 Brier Score .12	Preoperative opioid use, use of instrumentation, and comorbidity of depression
Yen et al, (2022)⁴⁰	A prolonged opioid prescription is defined as a continuing prescription to at least 90 to 180 days after the first surgery	1316	69.4% naïve	Lumbar disc herniation surgery	19 preoperative variables	SORG-ML	AUROC .76 AUPRC .33 Brier score .30	Preoperative opioid use, depression

LR = Logistic Regression, FLR = Full Logistic Regression, SLR = Stepwise Logistic Regression, MNLR = Multinomial Logistic Regression, ENPLR = Elastic Net Penalized Logistic Regression, LASSO = Least Absolute Shrinkage and Selection Operator, SVM = Support Vector Machine, BBC = Balanced Bagging Classifier, RF = Random Forest, BRF = Balanced Random Forest, SGB = Stochastic Gradient Boosting, EGB = Extreme Gradient Boosting, NN = Neural Network, ANN = Artificial Neural Network, SFNN = Simple Feed-Forward Neural Network, tCNN = time-varying Convolutional Neural Network, SMOTE = Synthetic Minority Oversampling Technique, BBN = Bayesian Belief Network, NB = Naïve Bayes, KNN = K-Nearest Neighbors, MME = Morphine Milligram Equivalents.

Synthesis Of Evidence

Defining Opioid Misuse

The definition of long-term opioid misuse varied among the studies. For example, Nair et al predicted the opioid requirement on a very low, low, medium, and high scale, not the duration of use.³⁹ All other authors looked at the prolonged period of use postoperatively as a marker of misuse. For example, Kunze et al defined “any” patient-requested refills in the postoperative period as prolonged opioid use.³⁷ Katakam et al categorized OP postoperatively into within 30 days, 30-90 days, and beyond 90 days.³⁵ The other 12 studies used the time point of 90 days postoperatively as a milestone for opioid misuse: whether sustained use since the operation, refills, or new OP extending beyond that hallmark (^{27-34,36,38,41}).

Sampling and Datasets

Sample sizes ranged from 381³⁸ to 112898³⁰ adult patients who underwent surgery. Fourteen studies looked at patients at either an institutional level ^{28,31-36,38-40} or a more inclusive database set, for example, M2 Military Health System Data Repository,^27,29 Optum's Clinformatics DataMart database,³⁰ MarketScan Databases (Truven Health),⁴¹ and National Insurance Claims.⁴⁴ Only Kunze et al looked at the patients of a single fellowship-trained surgeon.³⁷

Opioid Exposure vs Naivety

Opioid naïve patients are those who have never been exposed to opioids before the surgery. Karhade et al, 2020,³¹ Hur et al,³⁰ and Kunze et al³⁷ included only opioid-naïve patients in their studies. Grazal et al included more opioid-exposed (57%) than opioid-naïve patients.²⁹ The remaining 11 authors had a mixture of both groups, with the opioid-naïve patients being at least 67% of the sample size.^{27,28,32-36,38-41}

Type of Surgery

Most of the included studies investigated the risk of PPO use in patients undergoing orthopedic surgeries. First, Karhade et al,^31-33 Zhang et al,⁴¹ and Yen et al⁴⁰ looked at patients undergoing spine surgeries, whether cervical,³² thoracic,⁴¹ or lumbar.^31,33,40,41 Second, Anderson et al,²⁷ Katakam et al,³⁵ Gabriel et al,²⁸ Klemt et al,³⁶ and Lu et al³⁸ studied knee-related procedures, namely, arthroscopic-assisted ACL reconstruction,²⁷ primary total knee arthroplasty,^28,35,36 or elective knee arthroscopy.³⁸ The third group of studies looked at either hip arthroplasty^28,34 or arthroscopy.^29,37 Finally, two studies examined patients undergoing various types of surgeries, whether major or minor.^30,39

Input Variables

The number of input variables used by the ML algorithm ranged from 9²⁷ to 23.³⁹ The nature of these variables was mainly preoperative in all studies; hence, the model prediction was processed before surgery. Alternatively, eight studies added intraoperative or post-operative variables to the predictive model,^{28-30,36,38-41} thereby processing the prediction either at the end of surgery immediately or in the immediate postoperative period, within 15 to 30 days of the procedure.

Training Sets

Two main training methods were observed across the 14 studies that developed their algorithm, excluding Yen et al,⁴⁵ who utilized a readily available model to validate its clinical utility on a different population.

Only Lu et al⁴⁶ trained and validated their model via .632 bootstrapping with 1000 resampled datasets. Bootstrapping is a method that simulates new data samples by resampling the original observations with replacement, so that observations never run out. It estimates the accuracy of a sample statistic of the author’s choice by calculating its estimate, confidence interval, and standard error. Its advantage lies in providing a more accurate standard error estimate, as it does not assume a particular underlying distribution. However, if a small data set is used, as with Lu et al, how well the generated samples represent the population, and therefore the quality of the training, is questionable.
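The .632 bootstrap described above can be illustrated with a minimal sketch. It is not the code used by Lu et al; the function names (`bootstrap_632`, `fit_majority`, `misclassification`), the toy data, and the majority-class "model" are all hypothetical and chosen only to keep the example self-contained. The estimate combines the optimistic apparent error (training and testing on the same data) with the average out-of-bag error, weighted .368 and .632, respectively.

```python
import random

def bootstrap_632(data, fit, error, n_boot=1000, seed=0):
    """Return the .632 bootstrap estimate of prediction error.

    data  : list of (features, label) records
    fit   : callable(train_records) -> model
    error : callable(model, records) -> mean error in [0, 1]
    """
    rng = random.Random(seed)
    n = len(data)
    # Apparent error: train and evaluate on the full data set (optimistic).
    apparent = error(fit(data), data)
    oob_errors = []
    for _ in range(n_boot):
        # Draw n indices WITH replacement; some records repeat, others are left out.
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        oob = [data[i] for i in range(n) if i not in chosen]
        if not oob:  # guard against an empty out-of-bag set
            continue
        model = fit([data[i] for i in idx])
        oob_errors.append(error(model, oob))
    oob_mean = sum(oob_errors) / len(oob_errors)
    return 0.368 * apparent + 0.632 * oob_mean

# Hypothetical stand-in for a real classifier: always predict the majority class.
def fit_majority(records):
    ones = sum(label for _, label in records)
    return 1 if 2 * ones > len(records) else 0

def misclassification(model, records):
    return sum(label != model for _, label in records) / len(records)

# Toy data: 40 records, one quarter with the outcome.
toy = [((i,), 1 if i % 4 == 0 else 0) for i in range(40)]
estimate = bootstrap_632(toy, fit_majority, misclassification, n_boot=200)
print(round(estimate, 3))
```

With a data set as small as the toy one, the out-of-bag sets contain only a handful of records each, which is the representativeness concern raised above.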

The remaining thirteen authors adopted the more commonly used cross-validation approach, either k-fold or leave-one-out. Karhade et al,^31-34 Katakam et al,⁴⁷ Kunze et al,³⁷ and Zhang et al⁴¹ divided the total patient population into training (80%) and testing (20%, also known as hold-out) sets using a stratified 80:20 split. The subset of variables used for final modeling was selected by recursive feature selection with a random forest algorithm. Next, 10-fold cross-validation of the training set, repeated three times, was used to develop the algorithms in each of their studies. However, those authors did not specify any other criteria.
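As an illustration of the splitting scheme described above, the following sketch performs a stratified 80:20 split followed by 10-fold cross-validation repeated three times on the training portion. It is a simplified assumption of how such a procedure can be written, not the authors' implementation; the function names and the synthetic labels (10% outcome rate) are hypothetical, and the recursive feature selection step is omitted.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping the outcome rate similar in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for members in by_class.values():
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)

def repeated_stratified_kfold(indices, labels, k=10, repeats=3, seed=0):
    """Yield (fit_idx, validation_idx) pairs: k folds, repeated `repeats` times."""
    rng = random.Random(seed)
    for _ in range(repeats):
        by_class = defaultdict(list)
        for i in indices:
            by_class[labels[i]].append(i)
        folds = [[] for _ in range(k)]
        for members in by_class.values():
            rng.shuffle(members)
            # Deal each class round-robin so every fold gets a similar outcome rate.
            for pos, i in enumerate(members):
                folds[pos % k].append(i)
        for f in range(k):
            held = set(folds[f])
            yield [i for i in indices if i not in held], sorted(folds[f])

# Synthetic outcome labels: 100 patients with prolonged use, 900 without.
labels = [1] * 100 + [0] * 900
train_idx, test_idx = stratified_split(labels)
splits = list(repeated_stratified_kfold(train_idx, labels))
print(len(train_idx), len(test_idx), len(splits))
```

The hold-out indices in `test_idx` are never passed to the cross-validation generator, so model development and tuning see only the training portion.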

On the other hand, Gabriel et al,²⁸ Klemt et al,⁴⁸ and Nair et al³⁹ specifically mentioned randomizing the master data set before splitting, a critical detail that enhances the validity and accuracy of the model from an engineering perspective. Grazal et al²⁹ split the data 80:20 but balanced the split by the outcome variable, keeping the percentage of prolonged post-operative opioid use the same across both sets. Hur et al³⁰ performed 5-fold instead of 10-fold cross-validation on the training set. They also chose the held-out set to consist mainly of patients who underwent surgery most recently, so that the model was trained on older data and then tested on newer data.
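The temporal hold-out idea attributed to Hur et al can be sketched as follows. This is only an assumption of how such a split might look; the record fields (`patient_id`, `surgery_date`), the synthetic dates, and the 20% hold-out fraction are hypothetical and are not taken from the study.

```python
from datetime import date

def temporal_holdout(records, test_fraction=0.2):
    """Split records so that the most recent surgeries form the hold-out set."""
    ordered = sorted(records, key=lambda r: r["surgery_date"])
    cut = len(ordered) - round(len(ordered) * test_fraction)
    return ordered[:cut], ordered[cut:]

# Synthetic records with hypothetical field names.
records = [
    {"patient_id": i, "surgery_date": date(2015 + i % 5, 1 + i % 12, 1)}
    for i in range(50)
]
train, test = temporal_holdout(records)
print(len(train), len(test))
```

Unlike a shuffled split, this arrangement mimics deployment, where a model built on past patients is applied to future ones.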

Anderson et al²⁷ shuffled and split the data into 80% training and 20% hold-out sets, balanced by the outcome variable at 90 days. Next, the training set was divided into training (75%) and validation (25%) subsets. Each model was built on the training data set, tuned with the validation set as applicable, and tested on the separate hold-out data set. Feature selection varied between models; the Boruta algorithm, based on a 100-tree random forest, was used to extract the relevant variables. (It systematically eliminates irrelevant variables by comparing their calculated importance with randomly calculated importance out of 10 possible features.)

Finally, the performance of all trained models was assessed through discrimination (c-statistic, AUC), calibration (plot, slope, intercept), overall performance (Brier score), and decision curve analysis for clinical utility analysis. Model interpretability and explanation were provided at the global and local levels before the models were deployed to run on the testing sets.
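Of the metrics listed above, decision curve analysis is perhaps the least familiar. A minimal sketch of its central quantity, net benefit, is given below under the standard definition (true-positive rate minus false-positive rate weighted by the odds of the risk threshold). The outcome labels and predicted risks are made up for illustration, and the function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # Odds of the threshold: harm of a false positive relative to a true positive.
    weight = threshold / (1 - threshold)
    return tp / n - (fp / n) * weight

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Made-up outcomes (1 = prolonged opioid use) and predicted risks.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.7, 0.2, 0.1, 0.1, 0.3, 0.2, 0.1, 0.05]
for pt in (0.1, 0.3, 0.5):
    print(pt, round(net_benefit(y_true, y_prob, pt), 3),
          round(net_benefit_treat_all(y_true, pt), 3))
```

A decision curve plots these values across a range of thresholds; a model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy (net benefit of 0).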

Nature of Machine learning Models

Yen et al used only one publicly accessible model, SORG-MLA, to test its clinical applicability in predicting PPO use.⁴⁰ The rest of the studies developed multiple ML models, ranging from 2³⁰ to 7,⁴¹ with the mode being 5 (n = 9). Supervised ML models included K-nearest neighbor³⁶ and logistic regression (LR)-based models such as elastic-net penalized LR,^29,31-37 multinomial LR,³⁹ full LR,⁴¹ stepwise LR,⁴¹ Least Absolute Shrinkage and Selection Operator (LASSO),⁴¹ and LR with an L2 penalty and with an L1 LASSO penalty.²⁸ Decision tree-based models included random forest classifier,^27-39,41 stochastic gradient boosting,^31-37,41 gradient boosting machine,^27,29 extreme gradient boosting,^29,39 XGBoost,^30,38 and AdaBoost.³⁸ Other models included naïve Bayes,²⁹ balanced bagging classifier,²⁸ and support vector machine.^{28,30-35,37,38,41} Finally, neural-network-based algorithms comprised neural network,^31-35,37,39 artificial neural network,^29,36 simple feed-forward neural network,²⁸ and time-varying convolutional neural network.⁴¹

Assessment of Machine learning Models

The algorithms varied in nature, and their respective outputs were compared against each other. The best-performing algorithm was a regression- or tree-based supervised model in thirteen studies^{27,28,30-35,37-41} and a neural network in only two studies.^29,36 Of the thirteen, eight were decision-tree-based models/classifiers^{27,28,30,32,35,37-39} and five were logistic regression-based.^{31,33,34,40,41} Five of the eight decision-tree-based models were boosting algorithms.^{27,30,32,35,37}

Models were assessed via two primary metrics: the area under the receiver operating characteristic curve (AUC) and the Brier score. An AUC closer to 1 indicates better discrimination, whereas an AUC of .5 corresponds to chance. The opposite goes for the Brier score, in which a score of 0 indicates a perfect model, and a score of 1 represents the worst model prediction. Gabriel et al reported the highest AUC, .96, for a balanced-bagging classifier model enhanced with the Synthetic Minority Oversampling Technique (SMOTE).²⁸ Klemt et al reported the best Brier score, .036, for a neural network model.³⁶ The 5 best-performing boosting algorithms had an AUC ranging from .66³⁰ to .81³² and a Brier score ranging from .073³⁵ to .13.³⁷ Only Nair et al assessed the performance using accuracy, where a random forest algorithm scored 72%, both before and at the end of surgery.³⁹
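To make the two metrics concrete, the sketch below computes them from first principles on a made-up set of five patients. The AUC is calculated as the proportion of (positive, negative) patient pairs in which the positive patient received the higher predicted risk, with ties counted as one half; the Brier score is the mean squared difference between predicted risk and observed outcome. The function names and data are illustrative only.

```python
def auc(y_true, y_prob):
    """Probability that a random positive is ranked above a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared error between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Made-up outcomes (1 = prolonged opioid use) and predicted risks.
y_true = [1, 1, 0, 0, 0]
y_prob = [0.8, 0.4, 0.5, 0.2, 0.1]
print(round(auc(y_true, y_prob), 3), round(brier_score(y_true, y_prob), 3))
```

One caveat when comparing Brier scores across studies: a model that predicts the outcome prevalence for every patient scores prevalence × (1 − prevalence), so a low Brier score is easier to obtain when the outcome is rare.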

Most Predictive Variables

Several studies reported the top three contributing factors to their model prediction. These variables weighed most heavily on the model's output and were the most strongly associated with PPO misuse. Preoperative opioid use was the most reported predictive variable,^{27,29,30,32-36,39-41,44,49} followed by depression and antidepressant use.^{31-34,36,38,49} Other predictive variables included age, use of instrumentation, type and duration of surgery, OP pattern in the immediate postoperative period (day 1, first 2 weeks, and days 15-30), and tobacco and drug use.^27-41,44,49

Discussion

An Opioid Crisis

One distinctive feature of the opioid epidemic is its emergence from professional clinical settings: the very institutions that fight substance abuse are the main culprit.¹² Indeed, pharmaceutical campaigns’ misinformation on undertreated pain, under the umbrella of ethical and humane considerations, is a contributing factor and has prompted massive regulations to control the practice.^11,16 The complexity of the dilemma lies in its intertwinement with the delicate issue of appropriate pain management. Overtreatment of pain with potent analgesics ensures the absence of suffering at the expense of the long-term consequences of opioid dependence, which calls their pressing indication into question.^4-7 The root of this issue starts with an OP.

The Opioid Prescription and Pain Dilemma

Pain is an ambiguous phenomenon with various components, activation mechanisms, and signaling pathways.⁵⁰ Owing to the variability in its subjective experience and the absence of an objective detection method, self-report remains the current gold standard for assessing pain.^51,52 This evaluation gives rise to the challenging issue of balancing the potential for analgesic overprescription or abuse against respect for a highly variable subjective feeling and the avoidance of suffering. Accordingly, choosing the best analgesic medication, opioid or non-opioid, is a controversial research subject, with substantial evidence backing the different arguments put forward to guide its procedure-specific and multidisciplinary nature.^53-57 Perhaps studies predicting the type of analgesic protocol to be prescribed postoperatively may help address whether opioids are needed to begin with. Timing the administration of analgesia is another vital issue that impacts the pain response and, consequently, the pain experience and the dose of analgesia required. The concept of pre-emptive analgesia, which starts before surgery, aims to minimize the pain response and hypersensitivity that would result from surgery, before any nerve stimulation or activation ensues.^58-60 Despite the lack of conclusive evidence on this matter, several animal studies, clinical studies, and meta-analyses have demonstrated many potential benefits, such as decreased dosage and potency of analgesia required.^60-65

All these aspects of pain management question the pressing need for high doses of potent analgesics such as opioids. Nonetheless, despite the many regulations governing and controlling OP, opioids remain among the most sought-after strong pain-relieving medications by surgical patients, second to chronic pain and cancer patients.^18,66 Indeed, the longer opioids are used, the more tolerance is built and the more likely dependence behavior is to develop.⁶⁷ As such, a new OP to a naïve patient is the critical moment to target. This preventive approach is the rationale behind the studies included in this review. They aim to detect high-risk individuals who are more likely to develop any form of opioid overuse after surgery. Treating physicians’ awareness of such a precarious clinical setting will help guide their decisions, along with those of the rest of the health care team, towards an individualized pain management protocol. In addition, early knowledge of such risks would be of great value during preoperative patient counseling, particularly regarding the choice and dose of analgesia as well as the timing of its administration.

Furthermore, identifying the exact factors that are majorly contributing to PPO use would improve future regulation of the issue. For instance, preoperative opioid use was found to be the most critical factor in predicting PPO use.^15,68 Any opioid or opioid-derived products used the year before surgery seems to increase PPO use significantly, either as a continued sustained use, higher dosage needs, or increased OP refills. Similarly, O'Connell et al. reported that preoperative depression increased cumulative opioid use and decreased the likelihood of post-operative opioid cessation following lumbar fusion procedures.⁶⁹ Accurately identifying such predictive factors was done through the use of ML algorithms.

Machine Learning Models

The past decade has witnessed an increased reliance on AI and ML technologies in medicine and health care.²⁵ They are more than simple automation tools as they consider the dynamic nature of specific variables and “learn” from such real-time fluctuations and past mistakes to constantly improve future performance. Moreover, their unique coding gives them a superior edge when performing complex tasks, like continued monitoring with real-time feedback processing or analysis of multifaceted problems. For these reasons, various ML algorithms were developed to predict PPO use; see Figure 2.

Figure 2.

Shows the process of ML model prediction at three different time points: (a) Preoperative prediction^{27,30-35,37,40} through the following steps: (1) Relevant preoperative variables are collected, such as patient demographics, medical conditions and comorbidities, and regularly taken medications. (2) Data are split into two sets, with most of the data used to train the algorithms. (3) Different ML models are developed. (4) The performance of the different algorithms is compared, and the best-performing models are then enhanced by considering the weight of the top contributing factors to the predicted outcome. (5) Actual test data are entered into the final model to be processed. (b) Prediction immediately at the end of surgery, through steps 1-5 plus a sixth step of adding operative variables to the ML model before processing, for example, duration of surgery, bleeding amount, tissue manipulation, and intraoperative drug administration.^36,39 (c) Postoperative prediction through steps 1-6 plus a seventh step of adding postoperative variables to the ML model before processing, for example, dosage and frequency of opioids on the first day after surgery and the increase in the dosage over the immediate 2-4 weeks postoperatively.^28,29,38,41

Prediction at any of the three time points outlined provides an estimate of PPO use. However, an early prediction at point A^{27,30-35,37,40} has the advantage that the findings can be shared with the involved health care personnel and patients during preoperative counseling. A postoperative forecast at point C^28,29,38,41 may draw on the most comprehensive data, encompassing possibly significant operative and postoperative factors, and may therefore be more accurate. Unfortunately, the output would come at a late point, after the patient had already been exposed to opioids. Prediction at point B^36,39 may be more balanced, acquiring the necessary variables while producing output before postoperative analgesia is initiated. Future research is needed to determine the optimal predictive time point and to analyze the complex interaction between variables and identify the most contributing ones, further enhancing the applicability of this technology in clinical practice. Likewise, researchers are encouraged to develop a standardized assessment metric, which is lacking owing to the novel nature of these models.

A few notable challenges are worth mentioning. Firstly, these predictions were based on OP obtained from EMRs. An OP does not necessarily imply precise adherence to the physician’s directions, and seeking opioids from a different source could not be ruled out. Furthermore, every study devised a different approach to deal with missing data and documentation bias, emphasizing the significance of complete and comprehensive medical documentation. Another issue is that pain in the postoperative setting is multicausal, and other non-opioid analgesic protocols, such as NSAIDs or nerve blocks, were not studied. Additionally, many authors excluded possible causes of chronic pain as confounding factors. For example, Katakam et al³⁵ and Kunze et al³⁷ did not include revision surgeries for TKA and hip arthroscopy, respectively. Karhade et al excluded patients with preoperative conditions that may cause chronic pain or are likely to complicate the surgical outcome and require extended opioid use, for example, trauma, tumor, and infections.^31-34 Another significant issue is that such algorithms cannot be implemented with a one-size-fits-all approach, nor can their outcomes be generalized without justification. The variability in ML model performances and most predictive factors highlights the individuality of an algorithm to its respective setting and population. For example, Ward et al investigated similar outcomes in adolescents undergoing surgery.⁴⁴ Undoubtedly, their pain experience and response to opioids would be very different from those of the elderly patients undergoing more aggressive orthopedic surgeries studied by Lu et al.³⁸ Many authors uploaded the final ML model to an online platform as an open-access source for public use. Sharing such vital data is highly encouraged during these early times of AI and ML research to accelerate the utilization of such novel tools.

Conclusion

Machine learning algorithms have demonstrated promising potential to predict PPO use accurately. Their introduction into the clinical setting is an excellent step toward improved patient-centered care. With more validation studies, prospective cohorts, and application to larger patient populations in different settings, their accuracy would be better established and their implementation in clinical practice more reliable. They can be refined into a convenient decision-supportive tool for surgeons, anesthetists, and other physicians involved in patient care. Efficient preoperative counseling and rationalized postoperative consumption of opioids will have an immediate impact on patients directly and, in turn, on physicians, health care institutions, and society.

Supplemental Material

Supplemental Material for Machine Learning Algorithms Predict Long-Term Postoperative Opioid Misuse: A Systematic Review by Omar S. Emam, Abdullah S. Eldaly, Francisco R. Avila, Ricardo A. Torres-Guzman, Karla C. Maita, John P. Garcia, Sally Anne Brown, Clifton R Haider, and Antonio J. Forte in The American Surgeon™.

Footnotes

Acknowledgments

Figures 1 and 2 were created using BioRender.com.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported in part by the Mayo Clinic Clinical Research Operations Group (CROG) and Mayo Clinic Center for Regenerative Medicine (CRM).

Supplemental Material

Supplemental material for this article is available online.

References

1. Celentano. The Worldwide Opioid Pandemic: Epidemiologic Perspectives. Epidemiol Rev. 2020;42(1):1-3.

2. Manchikanti, Singh. Therapeutic opioids: A ten-year perspective on the complexities and complications of the escalating use, abuse, and nonmedical use of opioids. Pain Physician. 2008;11(2 Suppl):S63-88.

3. Jayawardana, Forman, Johnston-Webber, Campbell, Berterame, de Joncheere, et al. Global consumption of prescription opioid analgesics between 2009-2019: A country-level observational study. EClinicalMedicine. 2021;42:101198.

4. Ahmad FBRL, Sutton. Provisional drug overdose death counts. National Center for Health Statistics 2021; 2023.

5. Kolodny, Courtwright, Hwang, Kreiner, Eadie, Clark, et al. The prescription opioid and heroin crisis: A public health approach to an epidemic of addiction. Annu Rev Public Health. 2015;36:559-574.

6. Darnall, Stacey, Chou. Medical and psychological risks and consequences of long-term opioid therapy in women. Pain Med. 2012;13(9):1181.

7. Edlund, Martin, Russo, DeVries, Braden, Sullivan. The role of opioid prescription in incident opioid abuse and dependence among individuals with chronic noncancer pain: the role of opioid prescription. Clin J Pain. 2014;30(7):557-564.

8. Florence, Zhou, Luo. The economic burden of prescription opioid overdose, abuse, and dependence in the United States, 2013. Med Care. 2016;54(10):901.

9. Reider. Opioid Epidemic. Am J Sports Med. 2019;47(5):1039-1042.

10. Florence, Luo, Rice. The economic burden of opioid use disorder and fatal opioid overdose in the United States, 2017. Drug Alcohol Depend. 2021;218:108350.

11. Lyden, Binswanger. The United States opioid epidemic. Semin Perinatol. 2019;43(3):123-131.

12. Gostin, Hodge Jr, Noe. Reframing the Opioid Epidemic as a National Emergency. JAMA. 2017;318(16):1539-1540.

13. Fields. The doctor's dilemma: opiate analgesics and chronic pain. Neuron. 2011;69(4):591.

14. Chou, Wagner, Ahmed, Blazina, Brodt, Buckley, et al. AHRQ Comparative Effectiveness Reviews. Treatments for acute pain: a systematic review. Rockville (MD): Agency for Healthcare Research and Quality (US); 2020.

15. Kalakoti, Hendrickson, Bedard, Pugely. Opioid utilization following lumbar arthrodesis: Trends and factors associated with long-term use. Spine (Phila Pa 1976). 2018;43(17):1208-1216.

16. Manchikanti, Helm 2nd, Fellows, Janata, Pampati, Grider, et al. Opioid epidemic in the United States. Pain Physician. 2012;15(3 Suppl):ES9-38.

17. Volkow, McLellan, Cotto, Karithanom, Weiss. Characteristics of opioid prescriptions in 2009. JAMA. 2011;305(13):1299-1301.

18. Sherry, Sabety, Maestas. Documented Pain Diagnoses in Adults Prescribed Opioids: Results From the National Ambulatory Medical Care Survey, 2006-2015. Ann Intern Med. 2018;169(12):892.

19. Chaudhary, Schoenfeld, Harlow, Ranjit, Scully, Chowdhury, et al. Incidence and Predictors of Opioid Prescription at Discharge After Traumatic Injury. JAMA Surg. 2017;152(10):930.

20. Sun, Darnall, Baker, Mackey. Incidence of and risk factors for chronic opioid use among opioid-naive patients in the postoperative period. JAMA Intern Med. 2016;176(9):1286-1293.

21. Jiang, Orton, Feng, Hossain, Malhotra, Zager, et al. Chronic opioid usage in surgical patients in a large academic center. Ann Surg. 2017;265(4):722.

22. Brummett, Waljee, Goesling, Moser, Lin, Englesbe, et al. New persistent opioid use after minor and major surgical procedures in US Adults. JAMA Surg. 2017;152(6):e170504.

23. Gomes, Tadrous, Mamdani, Paterson, Juurlink. The Burden of Opioid-Related Mortality in the United States. JAMA Network Open. 2018;1(2):e180217.

24. Lin T-C, Ger L-P, Pergolizzi, Raffa, Wang J-O, S-T. Long-term use of opioids in 210 officially registered patients with chronic noncancer pain in Taiwan: A cross-sectional study. Journal of the Formosan Medical Association. 2017;116(4):257-265.

25. Amisha, Malik, Pathania, Rathaur. Overview of artificial intelligence in medicine. J Family Med Prim Care. 2019;8(7):2328-2331.

26. Rajpurkar, Chen, Banerjee, Topol. AI in health and medicine. Nat Med. 2022;28(1):31.

27. Anderson, Grazal, Balazs, Potter, Dickens, Forsberg. Can predictive modeling tools identify patients at high risk of prolonged opioid use after ACL reconstruction? Clin Orthop Relat Res. 2020;478(7):0-1618.

28. Gabriel, Harjai, Prasad, Simpson, Chu, Fisch, et al. Machine learning approach to predicting persistent opioid use following lower extremity joint arthroplasty. Reg Anesth Pain Med. 2022;47(5):313.

29. Grazal, Anderson, Booth, Geiger, Forsberg, Balazs. A machine-learning algorithm to predict the likelihood of prolonged opioid use following arthroscopic hip surgery. Arthroscopy. 2022;38(3):839-847.

30. Hur, Tang, Gunaseelan, Brummett, Englesbe, et al. Predicting postoperative opioid use with machine learning and insurance claims in opioid-naïve patients. Am J Surg. 2021;222(3):659-665.

31. Karhade, Cha, Fogel, Hershman, Tobert, Schoenfeld, et al. Predicting prolonged opioid prescriptions in opioid-naïve lumbar spine surgery patients. Spine J. 2020;20(6):888-895.

32. Karhade, Ogink, Thio, Broekman MLD, Cha, Hershman, et al. Machine learning for prediction of sustained opioid prescription after anterior cervical discectomy and fusion. Spine J. 2019;19(6):976-983.

33. Karhade, Ogink, Thio, Cha, Gormley, Hershman, et al. Development of machine learning algorithms for prediction of prolonged opioid prescription after surgery for lumbar disc herniation. Spine J. 2019;19(11):1764-1771.

34. Karhade, Schwab, Bedair. Development of machine learning algorithms for prediction of sustained postoperative opioid prescriptions after total hip arthroplasty. J Arthroplasty. 2019;34(10):2272.

35. Katakam, Karhade, Schwab, Chen, Bedair. Development and validation of machine learning algorithms for postoperative opioid prescriptions after TKA. J Orthop. 2020;22:95.

36. Klemt, Harvey, Robinson, Esposito, Yeo, Kwon. Machine learning algorithms predict extended postoperative opioid use in primary total knee arthroplasty. Knee Surg Sports Traumatol Arthrosc. 2022;5.

37. Kunze, Polce, Alter, Nho. Machine learning algorithms predict prolonged opioid use in opioid-naïve primary hip arthroscopy patients. J Am Acad Orthop Surg Glob Res Rev. 2021;5(5):e21.00093.

38. Lu, Forlenza, Wilbur, Lavoie-Gagne, Yanke, et al. Machine-learning model successfully predicts patients at risk for prolonged postoperative opioid use following elective knee arthroscopy. Knee Surg Sports Traumatol Arthrosc. 2022;30(3):762-772.

39. Nair, Velagapudi, Lang, Behara, Venigandla, Velagapudi, et al. Machine learning approach to predict postoperative opioid requirements in ambulatory surgery patients. PLoS One. 2020;15(7):e0236833.

40. Yen, Ogink, Huang, Groot, Chen, et al. A machine learning algorithm for predicting prolonged postoperative opioid prescription after lumbar disc herniation surgery. An external validation study using 1,316 patients from a Taiwanese cohort. Spine J. 2022;22(7):1119-1130.

41. Zhang, Fatemi, Medress, Azad, Veeravagu, Desai, et al. A predictive-modeling based screening tool for prolonged opioid use after surgical management of low back and lower extremity pain. Spine J. 2020;20(8):1184-1195.

42. Page, McKenzie, Bossuyt, Boutron, Hoffmann, Mulrow, et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. PLOS Medicine. 2021;18(3):e1003583.

43. Slim, Nini, Forestier, Kwiatkowski, Panis, Chipponi. Methodological index for non-randomized studies (MINORS): development and validation of a new instrument. ANZ J Surg. 2003;73(9):712.

44. Ward, Jani, De Souza, Scheinker, Bambos, Anderson. Prediction of Prolonged Opioid Use After Surgery in Adolescents: Insights From Machine Learning. Anesth Analg. 2021;133(2):304-313.

45. Yen, Ogink, Huang, Groot, Chen.

46. Lu, Forlenza, Wilbur, Lavoie-Gagne, Yanke, et al. Machine-learning model successfully predicts patients at risk for prolonged postoperative opioid use following elective knee arthroscopy. Knee Surg Sports Traumatol Arthrosc. 2022;30(3):762-772.

47. Katakam, Karhade, Schwab, Chen, Bedair. Development and validation of machine learning algorithms for postoperative opioid prescriptions after TKA. J Orthop. 2020;22:95.

48. Klemt, Harvey, Robinson, Esposito, Yeo, Kwon. Machine learning algorithms predict extended postoperative opioid use in primary total knee arthroplasty. Knee Surg Sports Traumatol Arthrosc. 2022;30(8):2573-2581.

49. Afshar, Sharma, Bhalla, Thompson, Dligach, Boley, et al. External validation of an opioid misuse machine learning classifier in hospitalized adult patients. Addict Sci Clin Pract. 2021;16(1):19.

50. Melzack, Casey. Sensory, Motivational, and Central Control Determinants of Pain. In: The Skin Senses; 1968:423-439.

51. Schiavenato, Craig. Pain assessment as a social transaction: beyond the "gold standard". Clin J Pain. 2010;26(8):667-676.

52. Aydede. Defending the IASP definition of pain. Monist. 2017;100:439-464.

53. Shim. Multimodal analgesia or balanced analgesia: the better choice? Korean J Anesthesiol. 2020;73(5):361.

54. Momeni, Crucitti, De Kock. Patient-controlled analgesia in the management of postoperative pain. Drugs. 2006;66(18):2321-2337.

55. Waelkens, Alsabbagh, Sauter, Joshi, Beloeil. Pain management after complex spine surgery: A systematic review and procedure-specific postoperative pain management recommendations. Eur J Anaesthesiol. 2021;38(9):985-994.

56. Roofthooft, Joshi, Rawal, Van de Velde. PROSPECT guideline for elective caesarean section: updated systematic review and procedure-specific postoperative pain management recommendations. Anaesthesia. 2021;76(5):665-680.

57. Jacobs, Lemoine, Joshi, Van de Velde, Bonnet. PROSPECT guideline for oncological breast surgery: a systematic review and procedure-specific postoperative pain management recommendations. Anaesthesia. 2020;75(5):664-673.

58. Rosero, Joshi. Preemptive, preventive, multimodal analgesia: what do they really mean? Plast Reconstr Surg. 2014;134(4 Suppl 2):85S-93S.

59. Ong, Lirk, Seymour, Jenkins. The efficacy of preemptive analgesia for acute postoperative pain management: A meta-analysis. Anesth Analg. 2005;100(3):757-773.

60. Kissin, Weiskopf Richard. Preemptive Analgesia. Anesthesiology. 2000;93(4):1138-1143.

61. Woolf. Evidence for a central component of post-injury pain hypersensitivity. Nature. 1983;306(5944):686.

62. Kissin. Preemptive analgesia. Why its effect is not always obvious. Anesthesiology. 1996;84(5):1015.

63. Pasqualucci. Experimental and clinical studies about the preemptive analgesia with local anesthetics. Possible reasons of the failure. Minerva Anestesiol. 1998;64(10):445-457.

64. Niv, Lang, Devor. The effect of preemptive analgesia on subacute postoperative pain. Minerva Anestesiol. 1999;65(4):127-140; discussion 40-1.

65. Coşkun, Dinçer, Turan, Özgültekin. Postoperative Analgesic Efficacy of Preemptive and Postoperative Lornoxicam or Tramadol in Lumbar Disc Surgery. Turk J Anaesthesiol Reanim. 2019;47(5):375-381.

66. Ballantyne. Opioid analgesia: Perspectives on right use and utility. Pain Physician. 2007;10(3):479-491.

67. Morgan, Christie. Analysis of opioid efficacy, tolerance, addiction and dependence from cell culture to human. Br J Pharmacol. 2011;164(4):1322-1334.

68. Jain, Brock, Phillips, Weaver, Khan. Chronic preoperative opioid use is a risk factor for increased complications, resource use, and costs after cervical fusion. Spine J. 2018;18(11):1989-1998.

69. O'Connell, Azad, Mittal, Vail, Johnson, Desai, et al. Preoperative depression, lumbar fusion, and opioid use: an assessment of postoperative prescription, quality, and economic outcomes. Neurosurg Focus. 2018;44(1):E5.
