Application of XGBoost in the prediction of acute postoperative pain after major noncardiac surgery in older patients

Abstract

Background:

Prevention and treatment of acute postoperative pain (APP) are key factors in the recovery of surgical patients. This study used the eXtreme Gradient Boosting (XGBoost) machine learning algorithm to predict acute postoperative pain after major noncardiac surgery in older patients.

Methods:

This was a secondary analysis of data from a randomized controlled trial of 1720 older patients undergoing general anesthesia. The data were split into training and test (validation) sets according to the timeline. The Boruta function was used to screen for relevant feature variables. The XGBoost model was built on the training set using 10-fold cross-validation and hyperparameter optimization. The tuned model was used to plot the feature importance ranking, the partial dependence profile (PDP), and the break-down profile (BDP). It was also used to calculate the confusion matrices and their derived metrics for the training and validation sets, and to plot the receiver operating characteristic (ROC) curve, precision-recall curve (PRC), calibration curve, and clinical decision curve (CDC) on the validation set.

Results:

The feature variables for acute postoperative pain selected by the Boruta function were Charlson score, Mini-Mental State Examination (MMSE) score, duration of surgery, preoperative depression score, smoking status, duration of anesthesia, intraoperative mean heart rate, lidocaine dosage, age, intraoperative morphine dosage, grouping, preoperative anxiety score, ropivacaine dosage, intraoperative colloid volume, APACHE-II score, postoperative ICU admission, surgical site, and postoperative tracheal intubation. Training set and validation set performance for acute postoperative pain was as follows: accuracy (ACC), 0.921 and 0.871; AUC-ROC, 0.964 and 0.920; AUC-PRC, 0.983 and 0.959; Brier score, 0.067 and 0.098; Matthews Correlation Coefficient (MCC), 0.847 and 0.746.

Conclusions:

A high-performance algorithm was developed and validated to predict acute postoperative pain; managing the important feature variables may be helpful for postoperative analgesia.

Keywords

Acute postoperative pain, Boruta, XGBoost, older patients

Background

Prevention and treatment of acute postoperative pain (APP) are key factors in the recovery of surgical patients and in reducing hospitalization and health care costs.¹ Despite advances in medications and technology, 30% to 75% of surgical patients experience moderate to severe pain,² which is associated with a higher risk of morbidity, higher cost of care,³ and an increased risk of chronic pain.⁴ A review for the American Pain Society Clinical Practice Guidelines⁵ revealed that, to date, only low-quality or insufficient evidence has been available to guide clinical work, especially in postoperative pain management, and that disparities continue to exist between healthcare organizations as well as between countries.

However, APP is a clinical problem that encompasses multiple factors,^6–8 and clinical practitioners need to address every factor that may cause postoperative pain. Effective postoperative pain management is a key component of perioperative care and, in conjunction with factors such as early mobilization and nutrition, can directly reduce the incidence of postoperative complications and the length of hospital stay.⁹ APP management also needs to be tailored to patients undergoing different surgeries to promote early mobilization and rapid postoperative recovery and to minimize the long-term use of opioid analgesics.^10–12

Previously, Fang et al. presented a large-scale clinical dataset of pain in children and developed a machine learning method that effectively assessed pain in children.¹³ Omar et al. conducted a meta-analysis of prolonged postoperative opioid use and found that machine learning algorithms can serve as a decision-support tool in the context of opioid use.¹⁴ The potential for machine learning applications in medicine is considerable. Machine learning algorithms have been reported to predict outcomes more accurately than traditional prognostic scores and statistics.¹⁵ With the advancement of computer technology, new machine learning techniques have emerged as a promising method for predicting outcomes in various areas of medical research, including anesthesiology.^16–20 XGBoost is an optimized distributed gradient boosting library designed for efficiency, flexibility, and portability. It performs well in terms of predictive accuracy, speed, model robustness, and scalability. Its core algorithm is based on Gradient Boosting Decision Trees (GBDT): decision trees are trained iteratively, and each new tree is fitted to reduce the loss left by the current ensemble. XGBoost is widely used in data science and machine learning across many research fields. Previously, Gayeon et al.²¹ used the XGBoost model to assess pain metrics in patients before skin incision, after incision, and intraoperatively, and the results demonstrated good sensitivity and specificity. Chen et al.²² also used the XGBoost model to predict postoperative pain after cesarean section in order to enhance pain management in women after cesarean delivery.
However, few studies to date have addressed the prediction of acute postoperative pain in elderly patients. This study therefore used XGBoost, combined with clinical knowledge, to assess the risk factors for acute postoperative pain in elderly patients during the perioperative period.

Methods

Data collection

The data in this study are derived from the trial “Delirium in Older Patients after Combined Epidural-General Anesthesia or General Anesthesia for Major Surgery: A Randomized Trial”; the present work is a secondary analysis of the previously established trial database. The study protocol was approved by the Institutional Review Board of Peking University (Approval No. 00001052-11048) and the ethics committees of five participating centers. It is registered with the Chinese Clinical Trial Registry (www.chictr.org.cn; Identifier: chictr-TRC-90000543) and ClinicalTrials.gov (Identifier: NCT01661907). All patients included in this retrospective study signed an informed consent form.

Inclusion criteria were: (1) age between 60 and 90 years; (2) elective noncardiac thoracic or abdominal surgery of at least 2 h duration; and (3) use of a patient-controlled analgesia pump after the procedure.

Exclusion criteria were severe neurological disease, acute myocardial infarction or stroke within 3 months, severe cardiac insufficiency, severe hepatic insufficiency, renal failure, or contraindications to epidural anesthesia.

Postoperative pain assessment

Pain at rest and during coughing was assessed with the Visual Analog Scale (VAS) or the Numeric Rating Scale (NRS) twice daily (8–10 AM and 6–8 PM) on postoperative days 1 to 3. The VAS was used primarily, and the NRS was used for patients with visual impairments. Previous research has shown that both the VAS and the NRS demonstrate good consistency and sensitivity in assessing postoperative pain, and that the NRS is applicable to visually impaired patients.²³ All patients received a single dose of 50 mg morphine postoperatively. For patients dissatisfied with postoperative epidural analgesia, the pain pump settings were adjusted (for example, by increasing the background dose or single bolus, or by shortening the dosing interval) or other analgesics (morphine 50 mg) were added. Similarly, for patients receiving intravenous patient-controlled analgesia, adjustments and additional opioid analgesics were provided as needed.

Observation indicators

Predictive variables

The study includes 59 potentially useful features as predictive variables, which are:

Demographics: Age, gender, years of education, Body Mass Index (BMI), American Society of Anesthesiologists (ASA) score, grouping (general anesthesia + Patient Controlled Intravenous Analgesia, or combined epidural-general anesthesia + Peridural Continuous Epidural Analgesia). Cognitive and Psychological Assessments: MMSE score, depression score, anxiety score, Charlson comorbidity score. Preoperative Laboratory Tests: Hematocrit (HCT), albumin (ALB), blood glucose, serum sodium, serum potassium, creatinine (CREA), blood urea nitrogen (BUN), BUN/CREA ratio. Preoperative Comorbidities: Stroke, transient ischemic attack (TIA), chronic obstructive pulmonary disease (COPD), chronic bronchitis, asthma, coronary artery disease, hypertension, arrhythmia, diabetes, thyroid disease, liver failure, renal failure, hyperlipidemia.

Cardiovascular Health: New York Heart Association (NYHA) classification. Lifestyle Factors: Smoking history, alcohol consumption. Intraoperative Anesthesia: Use of nitrous oxide, sevoflurane, midazolam, atropine, antiemetics, NSAIDs, lidocaine, ropivacaine, total morphine dose. Intraoperative Fluids and Monitoring: Crystalloid and colloid fluids, red blood cells, plasma, blood loss, urine output, mean arterial pressure (MAP), mean heart rate (MHR), surgery duration, anesthesia duration, surgery site (abdomen or chest), use of laparoscopy, intraoperative hypotension, APACHE-II score, postoperative ICU admission, and postoperative intubation.

Outcome variables

Acute Postoperative Pain Group (APP): patients meeting any of the following criteria: a median resting pain score of 4 or higher on the Visual Analog Scale (VAS) across the six assessments over the first 3 days after surgery; a median movement pain score of 4 or higher over the same period; or total morphine consumption greater than 50 mg. Non-Acute Postoperative Pain Group (NAPP): patients with a median resting pain score of less than 4 on the VAS across the six assessments over the first 3 days after surgery, a median movement pain score of less than 4, and total morphine consumption of 50 mg. The outcome variable is a binary classification between APP and NAPP.
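The outcome definition can be read as a simple labelling rule. The Python sketch below is an illustration only: it assumes the three APP criteria are combined with a logical OR (NAPP requires all three to be absent), and the function and argument names are hypothetical rather than taken from the study code.

```python
from statistics import median


def label_outcome(rest_vas, move_vas, morphine_mg):
    """Return "APP" or "NAPP" for one patient.

    rest_vas, move_vas: the six VAS scores recorded over postoperative days 1-3.
    morphine_mg: total postoperative morphine consumption in mg.
    Assumption: a patient is labelled APP if ANY of the three criteria is met.
    """
    if median(rest_vas) >= 4 or median(move_vas) >= 4 or morphine_mg > 50:
        return "APP"
    return "NAPP"


# Hypothetical patients, for illustration only.
print(label_outcome([1, 0, 1, 0, 0, 1], [2, 3, 2, 2, 1, 2], 50))  # NAPP
print(label_outcome([1, 0, 1, 0, 0, 1], [4, 5, 4, 3, 4, 4], 50))  # APP
```

The second patient is labelled APP on the movement criterion alone, which is the case the OR assumption is meant to capture.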

Statistical analysis and sample size

RStudio (version 2023.06.0+421) was used for statistical analysis. Normality of continuous variables was tested using the Shapiro-Wilk test. Normally distributed continuous variables are presented as mean ± standard deviation, while non-normally distributed variables are expressed as median (interquartile range). Categorical data are presented as frequency (percentage).

The dataset was split into training and test sets based on time. Missing data were imputed using the missForest package. Factors associated with postoperative pain were identified with the Boruta feature selection method on the training set. Ten-fold cross-validation, a commonly used resampling method, was applied on the training set: the data are randomly partitioned into 10 subsets, and each subset is used once for validation while the remaining nine are used for training. The mlr3verse package was used to implement 10-fold cross-validation and hyperparameter tuning to build the XGBoost model on the training set. A random grid search was used to jointly tune the hyperparameters eta (lower = 0.01, upper = 1), max_depth (lower = 1, upper = 30), and nrounds (lower = 1, upper = 30), with 10-fold cross-validated weighted accuracy as the optimization target. When max_depth is large (it was 5 in this study), a higher gamma or min_child_weight is required to prevent overfitting. The optimized model was interpreted by plotting the feature importance ranking, partial dependence plots, and decomposition (break-down) plots of the predictions. The confusion matrix and related metrics were calculated for both the training and validation sets, and the receiver operating characteristic (ROC) curve, precision-recall curve, calibration curve, and decision curve analysis (DCA) were plotted for the validation set.
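The tuning procedure described above can be sketched in a few lines. The study itself used the mlr3verse package in R; the Python code below is not the authors' code but a minimal, library-free illustration of 10-fold cross-validation combined with random search over the stated ranges. All names (k_fold_indices, cross_validate, random_search, score_fn) are hypothetical, and score_fn stands in for "fit XGBoost on the training indices and return the accuracy on the validation indices".

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(score_fn, n_samples, params, k=10):
    """Average score_fn(train_idx, valid_idx, params) over the k folds."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, valid_idx in enumerate(folds):
        # Every fold except fold i is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(score_fn(train_idx, valid_idx, params))
    return sum(scores) / k


def random_search(score_fn, n_samples, n_trials=20, seed=1):
    """Sample hyperparameters from the stated ranges; keep the best CV score."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {
            "eta": rng.uniform(0.01, 1.0),
            "max_depth": rng.randint(1, 30),
            "nrounds": rng.randint(1, 30),
        }
        score = cross_validate(score_fn, n_samples, params)
        if best is None or score > best[0]:
            best = (score, params)
    return best


# Toy scoring function (an assumption for demonstration): prefers max_depth near 5.
def toy_score(train_idx, valid_idx, params):
    return -abs(params["max_depth"] - 5)


print(random_search(toy_score, n_samples=1207, n_trials=5))
```

In practice score_fn would wrap a real learner; keeping it as a parameter makes the resampling and search logic independent of the model.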

For the binary classification model, the pmsampsize function in R was used to calculate the sample size. With an assumed c-statistic of 0.9, 30 predictor variables, a shrinkage factor of 0.9, and an incidence of acute postoperative pain of 0.4, the required sample size was 601, which is smaller than the 1207 samples available in the training set for this study.

Results

Baseline clinical data and flowchart

The total dataset comprised clinical data from 1,720 patients. The training set (n = 1207; APP = 756, NAPP = 451; 2015.01–2017.02) and the test set (n = 513; APP = 306, NAPP = 207; 2017.03–2018.06) were divided according to the timeline. The incidence of acute postoperative pain was 62.63% (756/1207) in the training set and 59.65% (306/513) in the test set. There were no statistically significant differences in clinical data between the training and test sets (p > 0.05), as shown in Table 1. The process of data collection, standardization, splitting, model development, validation, and interpretation is illustrated in Figure 1.
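A timeline-based split and the incidence figures quoted above can be checked with a short Python sketch. The record layout and the surgery_date field are hypothetical; the cutoff of 1 March 2017 is inferred from the stated periods (training through 2017.02, test from 2017.03).

```python
from datetime import date


def temporal_split(records, cutoff):
    """Records before the cutoff date go to training, the rest to the test set."""
    train = [r for r in records if r["surgery_date"] < cutoff]
    test = [r for r in records if r["surgery_date"] >= cutoff]
    return train, test


def incidence_percent(n_events, n_total):
    """Incidence as a percentage rounded to two decimals."""
    return round(100 * n_events / n_total, 2)


# Hypothetical records, for illustration only.
records = [
    {"id": 1, "surgery_date": date(2015, 1, 20)},
    {"id": 2, "surgery_date": date(2017, 2, 28)},
    {"id": 3, "surgery_date": date(2017, 3, 1)},
]
train, test = temporal_split(records, cutoff=date(2017, 3, 1))
print(len(train), len(test))          # 2 1

# Counts reported in the text.
print(incidence_percent(756, 1207))   # 62.63
print(incidence_percent(306, 513))    # 59.65
```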

Table 1.

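Baseline characteristics. Continuous variables are presented as median [interquartile range]; categorical variables are presented as n (%).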

Factor	Level	Train (n = 1207)		Test (n = 513)
		NAPP (n = 451)	APP (n = 756)	NAPP (n = 207)	APP (n = 306)
Age (years)		69.00 [65.00, 75.00]	68.00 [64.00, 74.00]	69.00 [65.00, 75.00]	69.00 [64.25, 74.00]
Education (years)		9.00 [6.00, 13.50]	9.00 [6.00, 12.00]	9.00 [6.00, 12.00]	9.00 [5.25, 13.00]
BMI (kg/m²)		23.62 [21.50, 26.15]	23.56 [21.30, 25.83]	23.44 [21.82, 25.39]	23.49 [21.30, 25.39]
MMSE		29.00 [28.00, 30.00]	29.00 [27.00, 30.00]	29.00 [27.00, 30.00]	29.00 [27.00, 30.00]
Anxiety		0.00 [0.00, 2.00]	0.00 [0.00, 2.00]	0.00 [0.00, 2.00]	0.00 [0.00, 2.00]
Depression		0.00 [0.00, 2.00]	1.00 [0.00, 3.00]	0.00 [0.00, 2.00]	0.00 [0.00, 2.00]
CHARLSON		50.00 [50.00, 51.00]	120.00 [120.00, 121.00]	50.00 [50.00, 51.00]	120.00 [114.86, 121.00]
Hct (%)		38.70 [35.40, 42.05]	38.50 [35.10, 41.60]	37.90 [35.15, 40.85]	39.00 [35.75, 42.30]
ALB (g/L)		40.70 [37.40, 43.10]	40.50 [37.38, 43.10]	40.10 [37.00, 43.00]	40.70 [37.42, 43.27]
GLU (mmol/L)		5.37 [4.85, 6.12]	5.40 [4.88, 6.15]	5.29 [4.71, 6.07]	5.42 [5.01, 6.17]
Na (mmol/L)		141.90 [140.00, 143.00]	141.70 [140.00, 143.00]	142.00 [139.85, 143.45]	141.90 [140.00, 143.00]
K (mmol/L)		4.00 [3.75, 4.26]	4.03 [3.74, 4.30]	4.04 [3.76, 4.30]	4.02 [3.76, 4.31]
CREA (μmol/L)		89.00 [77.00, 100.00]	86.00 [75.00, 99.00]	87.00 [77.00, 96.50]	84.00 [75.00, 95.00]
BUN (mmol/L)		5.69 [4.63, 6.90]	5.63 [4.60, 6.79]	5.56 [4.54, 6.69]	5.50 [4.65, 6.62]
BUN/ CREA		15.62 [13.10, 19.33]	16.20 [13.34, 19.72]	15.96 [13.35, 19.48]	16.34 [13.64, 19.24]
Nitrous oxide (%)		1.00 [0.00, 2.00]	1.00 [0.00, 2.00]	1.00 [0.00, 2.00]	1.00 [0.00, 1.50]
Sevoflurane (%)		0.50 [0.00, 1.00]	0.60 [0.00, 1.00]	0.40 [0.00, 1.00]	0.60 [0.00, 1.00]
Midazolam (mg)		1.50 [1.00, 2.00]	1.60 [1.00, 2.00]	1.50 [1.20, 2.00]	1.50 [1.00, 2.00]
Lidocaine (mg)		0.00 [0.00, 0.00]	60.00 [0.00, 80.00]	0.00 [0.00, 0.00]	60.00 [0.00, 80.00]
Ropivacaine (mg)		0.00 [0.00, 0.00]	75.00 [0.00, 112.12]	0.00 [0.00, 0.00]	75.00 [0.00, 110.00]
Crystalloid (ml)		1850.00 [1600.00, 2475.00]	1950.00 [1600.00, 2600.00]	1800.00 [1500.00, 2475.00]	1850.00 [1512.50, 2600.00]
Colloidal (ml)		500.00 [500.00, 1000.00]	500.00 [500.00, 1000.00]	500.00 [0.00, 1000.00]	500.00 [500.00, 1000.00]
RBC (ml)		0.00 [0.00, 0.00]	0.00 [0.00, 0.00]	0.00 [0.00, 0.00]	0.00 [0.00, 0.00]
Plasma (ml)		0.00 [0.00, 0.00]	0.00 [0.00, 0.00]	0.00 [0.00, 0.00]	0.00 [0.00, 0.00]
Bleeding (ml)		350.00 [100.00, 600.00]	400.00 [150.00, 700.00]	300.00 [100.00, 600.00]	400.00 [157.50, 700.00]
Urine (ml)		100.00 [50.00, 300.00]	135.00 [50.00, 300.00]	100.00 [50.00, 300.00]	100.00 [50.00, 300.00]
MAP (mmHg)		82.82 [77.47, 87.75]	79.22 [74.14, 84.56]	83.17 [76.60, 87.90]	79.57 [75.00, 84.22]
MHR (times/min)		65.30 [59.09, 71.80]	67.55 [61.66, 74.43]	65.21 [59.88, 71.94]	68.19 [61.53, 75.55]
Anesthesia-duration (min)		285.56 [217.64, 350.76]	292.03 [221.90, 369.66]	280.56 [217.86, 353.32]	281.42 [227.67, 359.44]
Operation duration (min)		227.00 [164.50, 293.50]	233.50 [170.00, 312.00]	225.00 [161.00, 295.00]	224.50 [169.25, 303.00]
Perioperative morphine (mg)		211.00 [160.00, 280.00]	173.70 [145.00, 233.00]	210.00 [149.50, 286.50]	166.50 [145.00, 230.00]
APACHE-II		0.00 [0.00, 0.00]	0.00 [0.00, 0.00]	0.00 [0.00, 0.00]	0.00 [0.00, 0.00]
VAS-Rest- Median		0.00 [0.00, 1.00]	0.50 [0.00, 2.00]	0.00 [0.00, 1.00]	0.50 [0.00, 1.88]
VAS-Move- Median		2.00 [1.00, 3.00]	2.50 [1.50, 4.00]	2.00 [0.00, 2.50]	2.50 [1.00, 4.00]
Postoperative morphine(mg)		50.00 [50.00, 50.00]	120.00 [78.75, 120.00]	50.00 [50.00, 50.00]	120.00 [75.00, 120.00]
Gender	Male	296 (65.6)	489 (64.7)	135 (65.2)	203 (66.3)
Gender	Female	155 (34.4)	267 (35.3)	72 (34.8)	103 (33.7)
ASA	I	31 (6.9)	57 (7.5)	10 (4.8)	25 (8.2)
ASA	II	385 (85.4)	649 (85.8)	181 (87.4)	257 (84.0)
ASA	III	35 (7.8)	50 (6.6)	16 (7.7)	24 (7.8)
Group	GA	423 (93.8)	179 (23.7)	192 (92.8)	69 (22.5)
Group	GA+EA	28 (6.2)	577 (76.3)	15 (7.2)	237 (77.5)
Stroke	Yes	25 (5.5)	37 (4.9)	13 (6.3)	10 (3.3)
Stroke	No	426 (94.5)	719 (95.1)	194 (93.7)	296 (96.7)
TIA	Yes	4 (0.9)	11 (1.5)	2 (1.0)	6 (2.0)
TIA	No	447 (99.1)	745 (98.5)	205 (99.0)	300 (98.0)
COPD	Yes	7 (1.6)	14 (1.9)	3 (1.4)	8 (2.6)
COPD	No	444 (98.4)	742 (98.1)	204 (98.6)	298 (97.4)
Chronic bronchitis	Yes	7 (1.6)	14 (1.9)	0 (0.0)	11 (3.6)
Chronic bronchitis	No	444 (98.4)	742 (98.1)	207 (100.0)	295 (96.4)
Asthma	Yes	9 (2.0)	10 (1.3)	1 (0.5)	7 (2.3)
Asthma	No	442 (98.0)	746 (98.7)	206 (99.5)	299 (97.7)
Smoke	Yes	93 (20.6)	204 (27.0)	43 (20.8)	76 (24.8)
Smoke	No	358 (79.4)	552 (73.0)	164 (79.2)	230 (75.2)
CHD	Yes	39 (8.6)	79 (10.4)	21 (10.1)	27 (8.8)
CHD	No	412 (91.4)	677 (89.6)	186 (89.9)	279 (91.2)
HT	Yes	197 (43.7)	293 (38.8)	96 (46.4)	125 (40.8)
HT	No	254 (56.3)	463 (61.2)	111 (53.6)	181 (59.2)
Arrhythmia	Yes	18 (4.0)	23 (3.0)	13 (6.3)	9 (2.9)
Arrhythmia	No	433 (96.0)	733 (97.0)	194 (93.7)	297 (97.1)
NYHA	I	347 (76.9)	585 (77.4)	141 (68.1)	226 (73.9)
NYHA	II	104 (23.1)	171 (22.6)	66 (31.9)	80 (26.1)
DM	Yes	78 (17.3)	142 (18.8)	43 (20.8)	51 (16.7)
DM	No	373 (82.7)	614 (81.2)	164 (79.2)	255 (83.3)
Thyroid diseases	Yes	12 (2.7)	19 (2.5)	7 (3.4)	7 (2.3)
Thyroid diseases	No	439 (97.3)	737 (97.5)	200 (96.6)	299 (97.7)
Liver dysfunction	Yes	3 (0.7)	4 (0.5)	3 (1.4)	5 (1.6)
Liver dysfunction	No	448 (99.3)	752 (99.5)	204 (98.6)	301 (98.4)
HL	Yes	8 (1.8)	23 (3.0)	6 (2.9)	9 (2.9)
HL	No	443 (98.2)	733 (97.0)	201 (97.1)	297 (97.1)
Renal dysfunction	Yes	2 (0.4)	4 (0.5)	1 (0.5)	0 (0.0)
Renal dysfunction	No	449 (99.6)	752 (99.5)	206 (99.5)	306 (100.0)
Drink	Yes	108 (23.9)	199 (26.3)	49 (23.7)	70 (22.9)
Drink	No	343 (76.1)	557 (73.7)	158 (76.3)	236 (77.1)
Atropine	Yes	341 (75.6)	554 (73.3)	157 (75.8)	232 (75.8)
Atropine	No	110 (24.4)	202 (26.7)	50 (24.2)	74 (24.2)
Anti-nausea	Yes	405 (89.8)	654 (86.5)	187 (90.3)	282 (92.2)
Anti-nausea	No	46 (10.2)	102 (13.5)	20 (9.7)	24 (7.8)
NSAIDs	Yes	123 (27.3)	133 (17.6)	58 (28.0)	57 (18.6)
NSAIDs	No	328 (72.7)	623 (82.4)	149 (72.0)	249 (81.4)
Surgical-site	Abdomen	368 (81.6)	561 (74.2)	160 (77.3)	227 (74.2)
Surgical-site	Chest	83 (18.4)	195 (25.8)	47 (22.7)	79 (25.8)
Endoscope	Yes	162 (35.9)	219 (29.0)	76 (36.7)	95 (31.0)
Endoscope	No	289 (64.1)	537 (71.0)	131 (63.3)	211 (69.0)
ICU	Yes	85 (18.8)	153 (20.2)	40 (19.3)	61 (19.9)
ICU	No	366 (81.2)	603 (79.8)	167 (80.7)	245 (80.1)
Intubation	Yes	46 (10.2)	86 (11.4)	23 (11.1)	27 (8.8)
Intubation	No	405 (89.8)	670 (88.6)	184 (88.9)	279 (91.2)
Intra-operative hypotension	Yes	145 (32.2)	351 (46.4)	72 (34.8)	141 (46.1)
Intra-operative hypotension	No	306 (67.8)	405 (53.6)	135 (65.2)	165 (53.9)
VAS-Rest-Median-status	<4	451 (100.0)	713 (94.3)	207 (100.0)	285 (93.1)
VAS-Rest-Median-status	≥4	0 (0.0)	43 (5.7)	0 (0.0)	21 (6.9)
VAS-Move-Median-status	<4	451 (100.0)	494 (65.3)	207 (100.0)	198 (64.7)
VAS-Move-Median-status	≥4	0 (0.0)	262 (34.7)	0 (0.0)	108 (35.3)
Postoperative morphine-status	≤50 mg	451 (100.0)	123 (16.3)	207 (100.0)	58 (19.0)
Postoperative morphine-status	>50 mg	0 (0.0)	633 (83.7)	0 (0.0)	248 (81.0)

Figure 1.

Flowchart.

Feature selection

Using the Boruta function, the key variables associated with acute postoperative pain were identified. These features include Charlson comorbidity score, MMSE, surgery duration, preoperative depression score, smoking status, anesthesia duration, intraoperative average heart rate, lidocaine dose, age, intraoperative morphine dose, grouping, preoperative anxiety score, ropivacaine dose, intraoperative colloid volume, APACHE-II score, postoperative ICU admission, surgical site, and postoperative intubation status (Figure 2).

Figure 2.

Boruta screening diagram. This figure shows the results after feature selection using Boruta's algorithm, with each feature variable on the horizontal axis and its importance score on the vertical axis. The green boxes indicate the selected important feature variables, the red boxes indicate unimportant features, and the blue boxes are random controls.
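The shadow-feature idea behind Boruta can be illustrated with a deliberately simplified Python sketch. Boruta itself uses random forest importance and a statistical test over many runs; here, as a stated substitution, importance is the absolute Pearson correlation with the outcome, and a feature scores a "hit" whenever it beats the best shuffled (shadow) copy. The data are synthetic and all names are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def shadow_hits(features, target, n_iter=20, seed=0):
    """Count how often each feature beats the best shuffled (shadow) copy."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_iter):
        shadow_best = 0.0
        for values in features.values():
            shadow = values[:]
            rng.shuffle(shadow)  # shuffling breaks any link with the target
            shadow_best = max(shadow_best, abs(pearson(shadow, target)))
        for name, values in features.items():
            if abs(pearson(values, target)) > shadow_best:
                hits[name] += 1
    return hits


# Synthetic data: "signal" determines the outcome, "noise" is unrelated to it.
rng = random.Random(42)
signal = [rng.gauss(0, 1) for _ in range(300)]
noise = [rng.gauss(0, 1) for _ in range(300)]
target = [1 if s > 0 else 0 for s in signal]
hits = shadow_hits({"signal": signal, "noise": noise}, target)
print(hits)
```

A feature that beats the shadows in essentially every iteration corresponds to a confirmed (green) box in a plot of this kind, while one that rarely does corresponds to a rejected (red) box.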

Hyperparameter Tuning

The optimal parameters for XGBoost were determined as follows: nrounds = 5, eta = 0.508793, and max_depth = 5. The hyperparameter tuning process is shown in Figure 3.

Figure 3.

Hyperparameter tuning graph.

Confusion matrix parameters for the training and validation sets

The confusion matrix parameters for acute postoperative pain in the training set and in the validation set (the temporally held-out test set) were as follows. Accuracy (ACC): 0.921 (training set) and 0.871 (validation set). AUC-ROC: 0.964 (training set) and 0.920 (validation set). AUC-PRC: 0.983 (training set) and 0.959 (validation set). Brier score: 0.067 (training set) and 0.098 (validation set). Matthews Correlation Coefficient (MCC): 0.847 (training set) and 0.746 (validation set).
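For readers less familiar with these metrics, the Python sketch below shows how accuracy, MCC, and the Brier score are computed from labels and predicted probabilities using their standard definitions. The eight example patients are invented for illustration and are unrelated to the study data.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Invented example: 1 = APP, 0 = NAPP.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.2, 0.6, 0.1, 0.7, 0.3]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

counts = confusion_counts(y_true, y_pred)
print(counts)                                 # (3, 3, 1, 1)
print(accuracy(*counts))                      # 0.75
print(mcc(*counts))                           # 0.5
print(round(brier_score(y_true, y_prob), 3))  # 0.125
```

Accuracy and MCC depend on the chosen probability threshold, whereas the Brier score is computed directly from the predicted probabilities.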

ROC, PRC, calibration curve, and DCA for the test set

The ability of the model to discriminate between categories is shown by the receiver operating characteristic (ROC) curve (Figure 4(a)), with an AUC-ROC of 0.9198. The precision-recall curve (PRC) reflects the trade-off between precision and recall (Figure 4(b)), with an AUC-PRC of 0.9585. The calibration curve shows that the predicted probabilities agree well with the observed outcomes (Figure 4(c)). The DCA plot shows that the XGBoost model delivers a high net benefit across threshold probabilities of 20%–70%, so the model is suitable for supporting decisions on whether to intervene within this range (Figure 4(d)).
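The net benefit plotted in a decision curve follows the standard definition NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The Python sketch below applies this formula to invented data; the function names are hypothetical and the example is not derived from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening when predicted risk >= threshold."""
    n = len(y_true)
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for t, p in pairs if p >= threshold and t == 1)
    fp = sum(1 for t, p in pairs if p >= threshold and t == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: intervene in every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Invented example: 1 = APP, 0 = NAPP.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.2, 0.6, 0.1, 0.7, 0.3]

print(net_benefit(y_true, y_prob, 0.5))      # 0.25
print(net_benefit_treat_all(y_true, 0.5))    # 0.0
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero, the latter being the net benefit of treating no one.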

Figure 4.

ROC, PRC, calibration curve and DCA for the test set: Figure (a) shows the ROC curve, demonstrating the relationship between sensitivity and specificity of the model. Figure (b) shows the PRC curve, demonstrating the trade-off between precision and recall of the model. Figure (c) is the calibration curve showing the fit between the model's predicted probability and the actual incidence. Figure (d) shows the Decision Curve Analysis (DCA), which evaluates the net gain of the model for different threshold probabilities.

Feature importance and partial dependence plots

Feature importance rankings for acute postoperative pain were created using the XGBoost model (Figure 5), along with partial dependence plots (Figure 6). The importance rankings provide a clear visualization of each feature’s contribution to the prediction of acute postoperative pain. The partial dependence plots illustrate the relationship between individual features and the risk of acute postoperative pain, showing how the likelihood of pain changes with different feature values.
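A univariate partial dependence curve is obtained by forcing one feature to each value on a grid for every patient and averaging the model's predictions. The Python sketch below shows that computation for an arbitrary prediction function; toy_model and the features x1 and x2 are hypothetical placeholders, not the fitted XGBoost model or the study variables.

```python
def partial_dependence(predict, data, feature, grid):
    """Average prediction when `feature` is set to each grid value in all rows."""
    curve = []
    for value in grid:
        preds = [predict({**row, feature: value}) for row in data]
        curve.append((value, sum(preds) / len(preds)))
    return curve


# Hypothetical prediction function and data, for illustration only.
def toy_model(row):
    return 0.2 + 0.1 * row["x1"] + 0.05 * row["x2"]


data = [{"x1": 0, "x2": 1}, {"x1": 1, "x2": 3}]
curve = partial_dependence(toy_model, data, "x1", grid=[0, 1, 2])
for value, mean_pred in curve:
    print(value, round(mean_pred, 3))
```

For this additive toy model the curve rises by 0.1 per unit of x1; for a tree ensemble the same procedure yields the stepwise curves typical of such plots.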

Figure 5.

Feature variable importance ranking chart.

Figure 6.

Univariate partial dependence plot.

Decomposition prediction plot

The decomposition prediction (break-down) plot shows how each feature contributed to the prediction for a single sample. For this sample, the XGBoost model predicted an acute postoperative pain probability of 0.994, which was higher than the decision threshold of 0.487, so the patient was classified as APP; this prediction matched the observed outcome (Figure 7). The red and blue bars represent the positive and negative contributions of each variable to the prediction, and the final prediction value equals the baseline (average) prediction plus the sum of all feature contributions.
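The additive structure of such a decomposition can be made concrete with a minimal Python sketch of a break-down computation with a fixed variable order: features are set one at a time to the patient's values in every row of a reference dataset, and each contribution is the resulting change in the mean prediction. The model, data, and feature names are hypothetical, and the ordering heuristics used by dedicated packages are omitted.

```python
def break_down(predict, data, instance, order):
    """Return (baseline, contributions) for one instance.

    baseline is the mean prediction over `data`; each contribution is the
    change in mean prediction after fixing that feature to the instance value.
    """
    rows = [dict(row) for row in data]  # copy so the input is not modified
    baseline = sum(predict(r) for r in rows) / len(rows)
    contributions = {}
    previous = baseline
    for name in order:
        for r in rows:
            r[name] = instance[name]
        current = sum(predict(r) for r in rows) / len(rows)
        contributions[name] = current - previous
        previous = current
    return baseline, contributions


# Hypothetical prediction function and data, for illustration only.
def toy_model(row):
    return 0.2 + 0.1 * row["x1"] + 0.05 * row["x2"]


data = [{"x1": 0, "x2": 1}, {"x1": 1, "x2": 3}, {"x1": 2, "x2": 2}]
patient = {"x1": 2, "x2": 4}
baseline, contributions = break_down(toy_model, data, patient, ["x1", "x2"])
total = baseline + sum(contributions.values())
print(round(baseline, 3), {k: round(v, 3) for k, v in contributions.items()})
print(round(total, 3), round(toy_model(patient), 3))
```

Once every feature has been fixed, all rows equal the patient's profile, so the baseline plus the contributions reproduces the model's prediction for that patient.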

Figure 7.

Decomposition prediction plot.

Discussion

In this study, we collected baseline population characteristics, laboratory data, and intraoperative and postoperative data to develop a model, and used the Boruta function to select 18 feature variables as risk factors for acute postoperative pain in elderly patients undergoing noncardiac surgery. These key variables may help clinical practitioners to provide personalized surgical plans and anesthetic preparations in the perioperative period and so reduce the probability of acute postoperative pain. On the test set, the ROC and calibration curves obtained with these variables indicated good discrimination and calibration.

Ranking these variables in the XGBoost model revealed that the Charlson score was a major contributor to acute postoperative pain in both the feature importance ranking and the decomposition prediction plot. The Charlson Comorbidity Index is based on a comprehensive assessment of 19 comorbid conditions. In one study, patients in the highest quartile of Charlson scores (5–11) used more postoperative opioid analgesics, and higher Charlson scores have also been reported to be associated with postoperative pain.²⁴ Brummett et al. confirmed that Charlson scores were associated with continued opioid use for 90 days after minor and major surgery.¹² These reports are consistent with our findings. It is worth noting that we excluded patients with severe cardiac insufficiency, severe hepatic insufficiency, or renal failure at the beginning of the study, which may have had some impact on the Charlson scores.

The MMSE score is an important predictor of postoperative delirium, and previous studies have reported that postoperative delirium is associated with patients' pain,^25,26 which is consistent with our findings. Liu et al.²⁷ reported that for every 1-point increase in pain score, the risk of developing postoperative delirium increased 2.421-fold. The Charlson score has also been independently associated with postoperative delirium.^28,29 In our study, both scores were significant predictors of postoperative pain, which suggests that further studies could weight the two scales to predict more accurately, before surgery, the incidence and severity of postoperative pain.

This study uses the mlr3 ecosystem and its extension packages, which provide a unified interface to learning algorithms, data preprocessing steps, and performance evaluation methods. The ecosystem is designed to simplify the training, prediction, and evaluation of machine learning models while maintaining a high degree of flexibility and scalability. In machine learning, feature selection is a crucial step that aims to select the most representative and informative features from the original dataset in order to improve model performance and reduce computational cost. The Boruta algorithm is a random forest-based feature selection method with several advantages. First, it selects features automatically, without manual parameter tuning or a pre-specified feature subset, which reduces the need for manual intervention. Second, it aims to identify all features that are relevant to the target variable, which may contribute to the predictive model and support a more comprehensive understanding of the information in the feature set. Third, compared with traditional feature selection methods, the Boruta algorithm can handle large-scale datasets by using random forests with bootstrap resampling while maintaining the ability to generalize, which helps to limit overfitting.

The use of cross-validation and hyperparameter tuning during model building can improve the stability and predictive ability of the model. Cross-validation provides a more reliable assessment of model performance, reduces overfitting, and makes effective use of the data, while hyperparameter tuning can improve model performance, accelerate convergence, and enhance generalization ability. For external validation, the confusion matrix of the classification model and its derived metrics were computed on a temporally separated dataset to assess the robustness and extrapolation of the model. Visualization and interpretation are important tools for understanding and evaluating machine learning models. Visualization provides an intuitive understanding of the structure, performance, and feature importance of the model; interpretation provides insight into its internal working mechanism and supports model optimization and decision making. In this study, the performance of the predictive model was summarized with confusion matrix parameters, and its feature effects were visualized with feature importance rankings, partial dependence plots, and decomposition prediction plots, so that the internal behavior of the model is explained both mathematically and visually.

Limitations

Several limitations of this study may have influenced the results. First, the study aimed only to establish a risk prediction model for postoperative pain in elderly patients undergoing noncardiac thoracic and abdominal surgery, so the model may not generalize to other populations. Second, external validation was performed with a temporal split of a single dataset, and an independent dataset is needed to test the extrapolation of the model in the future. Finally, we did not compare multimodal analgesia and different analgesic regimens in this paper because these data were not recorded, which may be a limitation of our study.

Conclusion

This study used machine learning algorithms to predict the probability of occurrence of acute postoperative pain in 1720 elderly patients undergoing non-cardiac thoracic and abdominal surgery under general anesthesia, identified important characteristic variables, and developed a predictive model for the occurrence of acute postoperative pain with acceptable generalization ability.

Footnotes

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Huihui Miao

References

1. Liu, Fan, Zhang, Wang, Chen, Sun, Zhang. Transcutaneous electrical acupoint stimulation reduces postoperative patients' length of stay and hospitalization costs: a systematic review and meta-analysis. Int J Surg 2024; 110(8): 5124–5135.

2. Zaslansky, Rothaug, Chapman, Backström, Brill, Fletcher, Fodor, Gordon, Konrad, Layman Young, Puig, Rawal, Short, Staender, Strassels, Stubhaug, Taylor, Tölle, Volk, Meissner. PAIN OUT: the making of an international acute pain registry. Eur J Pain 2015; 19(4): 490–502.

3. Rawal. Current issues in postoperative pain management. Eur J Anaesthesiol 2016; 33(3): 160–171.

4. Gewandter, Dworkin, Turk, McDermott, Baron, Gastonguay, Gilron, Katz, Mehta, Raja, Senn, Taylor, Treede, Versavel, Wasan, White, Ziegler. Research design considerations for chronic pain prevention clinical trials: IMMPACT recommendations. Pain 2015; 156(7): 1184–1197.

5. Gordon, de Leon-Casasola, Sluka, Brennan, Chou. Research gaps in practice guidelines for acute postoperative pain management in adults: findings from a review of the evidence for an American Pain Society clinical practice guideline. J Pain 2016; 17(2): 158–166.

6. Mędrzycka-Dąbrowska, Dąbrowski, Basiński, Małecka-Dubiela. Identification and comparison of barriers to assessing and combating acute and postoperative pain in elderly patients in surgical wards of Polish hospitals: a multicenter study. Adv Clin Exp Med 2016; 25(1): 135–144.

7. Joshi, Beck, Emerson, Halaszynski, Jahr, Lipman, Minkowitz, Sentovich, Sinatra, Summers, Watkins-Pitchford. Defining new directions for more effective management of surgical pain in the United States: highlights of the inaugural Surgical Pain Congress™. Am Surg 2014; 80(3): 219–228.

8. Pogatzki-Zahn, Kutschar, Nestler, Osterbrink. A prospective multicentre study to improve postoperative pain: identification of potentialities and problems. PLoS One 2015; 10(11): e0143508.

9. Kehlet, Wilmore. Evidence-based surgical care and the evolution of fast-track surgery. Ann Surg 2008; 248(2): 189–198.

10. Fregoso, Wang, Tseng, Wang. Transition from acute to chronic pain: evaluating risk for chronic postsurgical pain. Pain Physician 2019; 22(5): 479–488.

11. Levene, Weinstein, Cohen, Andreae, Chou, Darnall, Gordon, Langford, McGreevy, Ring, Suresh, Hooten. Local anesthetics and regional anesthesia versus conventional analgesia for preventing persistent postoperative pain in adults and children: a Cochrane systematic review and meta-analysis update. J Clin Anesth 2019; 55: 116–127.

12. Brummett, Waljee, Goesling, Moser, Lin, Englesbe, Bohnert ASB, Kheterpal, Nallamothu. New persistent opioid use after minor and major surgical procedures in US adults. JAMA Surg 2017; 152(6): e170504.

13. Fang, Liu, Zhang. Deep learning-guided postoperative pain assessment in children. Pain 2023; 164(9): 2029–2035.

14.

Emam

Eldaly

Avila

Al-Mazrou

Alsulaiman

Althagafi

Attia

Badawy

El-Hadidi

Kassem

Mostafa

. Machine learning algorithms predict long-term postoperative opioid misuse: a systematic review. Am Surg 2024; 90(1): 140–151.

15.

Schönnagel

Caffard

Vu-Han

Strube

Navarro

Navarro-Ramirez

Tessitore

. Predicting postoperative outcomes in lumbar spinal fusion: development of a machine learning model. Spine J 2024; 24(2): 239–249.

16.

Dong

Feng

Thapa-Chhetry

Nguyen

Zhang

Sun

Luo

Feng

. Machine learning model for early prediction of acute kidney injury (AKI) in pediatric critical care. Crit Care 2021; 25(1): 288.

17.

Heo

Yoon

Park

Kim

Nam

Heo

. Machine learning-based model for prediction of outcomes in acute stroke. Stroke 2019; 50(5): 1263–1265.

18.

Raita

Goto

Faridi

Brown

DFM

Camargo

Jr Hasegawa

. Emergency department triage prediction of clinical outcomes using machine learning models. Crit Care 2019; 23(1): 64.

19.

Karhade

Shah

Bono

Nelson

Harris

Schoenfeld

. Development of machine learning algorithms for prediction of mortality in spinal epidural abscess. Spine J 2019; 19(12): 1950–1959.

20.

Dong

Zhu

Yang

Zhou

Sun

. Evaluation of the predictors for unfavorable clinical outcomes of degenerative lumbar spondylolisthesis after lumbar interbody fusion using machine learning. Front Public Health 2022; 10: 835938.

21.

Ryu

Choi

Seok

Kim

Lee

Kim

Park

. Machine learning based quantitative pain assessment for the perioperative period. NPJ Digit Med 2025; 8(1): 53.

22.

Tan

Koh

Jin

Ong

Lim

Lee

Chan

Sng

. Machine learning approach to predict postoperative pain after spinal morphine administration during caesarean delivery. Heliyon 2024; 10(23): e40602.

23.

Jensen

Chen

Brugger

. Interpretation of visual analog scale ratings and change scores: a reanalysis of two clinical trials of postoperative pain. J Pain 2003; 4(7): 407–414.

24.

Wittekindt

Schneider

Meissner

Guntinas-Lichius

. Postoperative pain assessment after septorhinoplasty. Eur Arch Otorhinolaryngol 2012; 269(6): 1613–1621.

25.

Denny

Such

. Exploration of relationships between postoperative pain and subsyndromal delirium in older adults. Nurs Res 2018; 67(6): 421–429.

26.

Ding

Gao

Chen

Zhou

Zhu

Zhang

. Preoperative acute pain is associated with postoperative delirium. Pain Med 2021; 22(1): 15–21.

27.

Liu

Zhang

Liu

Rong

. The age-adjusted Charlson comorbidity index predicts postoperative delirium in the elderly following thoracic and abdominal surgery: a prospective observational cohort study. Front Aging Neurosci 2022; 14: 979119.

28.

Pérez-Ros

Martínez-Arnau

Baixauli-Alacreu

Caballero-Pérez

García-Gollarte

Tarazona-Santabalbina

. Delirium predisposing and triggering factors in nursing home residents: a cohort trial-nested case-control study. J Alzheimers Dis 2019; 70(4): 1113–1122.

29.

Ramos

Vergara

Shackleford

Anzueto

Cerón

González

Ponce

. Risk for postoperative delirium related to comorbidities in older adult cardiac patients: an integrative review. J Clin Nurs 2023; 32(9–10): 2128–2139.