Abstract

Surgery cancellations waste scarce operative resources and hinder patients’ access to operative services. In this study, the Wilcoxon and chi-square tests were used for predictor selection, and three machine learning models – random forest, support vector machine, and XGBoost – were used for the identification of surgeries with high risks of cancellation. The optimal performances of the identification models were as follows: sensitivity, 0.615; specificity, 0.957; positive predictive value, 0.454; negative predictive value, 0.904; accuracy, 0.647; and area under the receiver operating characteristic curve, 0.682. Of the three models, the random forest model achieved the best performance. Thus, the effective identification of surgeries with high risks of cancellation is feasible with stable performance. Models and sampling methods significantly affect the performance of identification. This study is a new application of machine learning for the identification of surgeries with high risks of cancellation and facilitation of surgery resource management.

Keywords

elective surgery, hospital information system, identification, machine learning, surgery cancellation

Introduction

Surgery cancellation is a universal problem that results in many wasted healthcare resources and thus has a considerable negative impact on the efficiency of healthcare resource management. It forces scarce operative resources to remain idle and hinders patients’ access to operative services. In addition, the costs of cancellations are high.^1–4 According to our review, the global cancellation rate (CR) generally ranges from 4.65 to 30.3 per cent,^1,5–21 a high proportion that increases the wastage of resources. The CR of different types of surgery varies even in the same hospital.^7,12,18,19 In previous studies, the highest CR has been reported for general surgery;^7,12 however, other studies present a contrary situation, in which general surgery does not have the highest CR.¹⁸ Such inconsistency between the CRs reported in different studies impedes the development of effective surgery resource management methods. Given this prevailing problem, research has been conducted to investigate the main risk factors of surgery cancellation to reduce CRs. Previous studies have validated the relationship between certain risk factors (e.g. increasing medical admission and preoperative clinic visits) and surgery cancellation.^5,22–24 In addition, studies have been conducted to explore a feasible method of lowering the general surgery CR in certain contexts.^25–30

The previously mentioned studies have focused on hospital-level management. However, specific patient-level preventive actions, which refer to specific and effective services according to patients’ situations, must also be investigated. Allocating a specific service to each patient may not be feasible because of limited healthcare resources. Hence, identifying surgeries with high risks of cancellation is the superior solution. If such surgeries are marked, preventive actions can be performed to avoid cancellation. Since the CR ranges from 4.65 to 30.3 per cent, identifying even only a fraction of these cancelled surgeries in advance will significantly enhance the efficiency of surgery resource allocation. If precise identification is achieved, surgery resources can be prevented from becoming idle and the latent surgery CR can be decreased. This is the motivation for this study.

A hospital information system (HIS) records information about patients’ healthcare processes and thus includes abundant data on aspects such as admission and surgery schedule information. Numerous healthcare management-related studies on HISs have been conducted, including on the identification of patients with high risks of hospital-acquired infection,³¹ time length estimation,³² identification of critical factors in patient falls,³³ prediction for the risk of death,³⁴ assessment of fractures after falls at the hospital,³⁵ and other important fields.^36–38 In our view, applying HIS data to identifying surgeries with high and low risks of cancellation is feasible.³⁹

Machine learning (ML) is a powerful and effective tool for healthcare management. Izad Shenas et al.⁴⁰ used ML to build predictive models for identifying patients in the top five percentile of cost among the general population; the results of their study can be used to improve the delivery of health services. Liu et al.⁴¹ considered ensemble-of-trees methods as an alternative for risk adjustment in evaluating a hospital’s performance, and the results showed that ML is superior to logistic regression for investigating risk adjustment. Furthermore, there were similar applications in the fields of healthcare cost prediction,⁴² readmission and hospitalization,^43–46 and healthcare insurance.⁴⁷

This study aims to identify surgeries with high risks of cancellation based on ML techniques and an HIS to facilitate surgery management. Routine risk factors and newly proposed predictors are considered; these are sourced from the HIS. This study experimentally validates the performance of a combination of ML techniques and HIS data in identifying surgeries with high risks of cancellation. Based on the results of this study, a surgery manager can identify surgeries with high risks of cancellation and therefore alert the healthcare system and adopt preventive actions to achieve a lower CR.

Data and methods

Data source and description

This study was based on data sourced from West China Hospital (WCH), which is the largest hospital in southwest China. It focused on elective urologic surgeries from 1 January 2013 to 31 December 2014. Overall, the data contained 5125 cases, of which 810 were cancelled (positive) and 4315 were not, providing a CR of 15.80 per cent. All surgeries were scheduled one day in advance, and all cancellations were institutional resource- and capacity-related cancellations. All the considered predictors are listed in Table 1. The statistics of several predictors are shown in Appendix 1 (Table 6 and Figure 1).

Table 1.

Predictors considered in this study.

Category	Number of predictors	Predictor(s)
Demographic	3	Name, age, and sex
Admission information	5	Admission date, visit number, identification number of patient, register number, and discharge date
Drug allergy history and blood type	3	Drug allergy, names of the drugs, and blood type
Surgery operation information	4	Surgery name, surgery type, reoperation among this admission, reoperation according to the plan, and the purpose of this surgery
Surgery schedule information	5	Order number of surgery, surgery date, surgery time, OR, surgeon, and the number of surgeries in the OR in the day
Administrative information	10	Operation staff, department, ward, bed number, last updated date, last updated time, the staff who last updated the information, and surgery expenditure
Surgery process record	24	Actual date when surgery began, actual time when surgery began, actual date when surgery ended, actual time when surgery ended, actual date when patient left OR, actual time when patient left OR, actual date when anaesthesia was started, actual time when anaesthesia was ended, actual date when predictive medicine was given, actual time when predictive medicine was given, body temperature, blood transfusion among surgery, autologous blood, allogeneic blood, plasma, thrombocyte, pathological examination, state of consciousness, general skin conditions, special skin conditions, drainage situation, surgery item delivery, anaesthesia degree, surgical incision category, and anaesthesia type

OR: operating room.

Yan et al.⁴⁸ analysed the data of WCH and reported that patient age, surgeon, and type of surgery had important impacts on surgery cancellation. Moreover, according to our survey in WCH, there were several predictors related to surgery cancellation in addition to routine risk factors. These newly proposed predictors are selected and provided below.

Cancellation record

Generally, surgeries with an existing cancellation record were less likely to be cancelled owing to the exposure of their latent causes for cancellation, and a remedial solution was likely to be implemented.

First surgery of a surgeon

Several surgeries were cancelled because the surgeon was occupied in a previous surgery. Hence, if a surgeon does not have a surgery immediately before another surgery, the probability of cancellation may be diminished, assuming that other risk factors are fixed.

First surgery in an operating room (OR)

Several surgeries were cancelled because the surgery scheduled before them in the same OR exceeded its planned schedule. In this case, if there is no surgery scheduled immediately before another surgery, the occurrence of cancellation may be diminished, assuming that other risk factors are fixed.

Holiday

This predictor is related to hospital preparation, as we assumed that when there is a holiday, medical resources, particularly staff resources, are scarcer. Hence, holidays may be more likely to lead to surgery cancellation.

Number of days in admission

Because all surgeries in our data are elective surgeries, the number of days in admission is strongly related to surgery preparation. More preparation is required if the duration is longer.
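The five newly proposed predictors can be derived from schedule records along the following lines. This is a minimal sketch, not the study's extraction code: the `ScheduledSurgery` fields and the `derive_predictors` helper are hypothetical names, and "first surgery" is approximated here as the earliest case of a surgeon or an OR on a given day, which is an assumption.

```python
from dataclasses import dataclass
from datetime import date, time


@dataclass
class ScheduledSurgery:
    # Hypothetical record layout; field names are illustrative, not the HIS schema.
    case_id: str
    surgeon: str
    operating_room: str
    surgery_date: date
    start_time: time
    admission_date: date
    previously_cancelled: bool


def derive_predictors(schedule, holidays):
    """Return {case_id: predictor dict} for the five newly proposed predictors."""
    seen_surgeons, seen_rooms = set(), set()
    predictors = {}
    # Walk the schedule chronologically so that "first" is well defined.
    for s in sorted(schedule, key=lambda s: (s.surgery_date, s.start_time)):
        surgeon_day = (s.surgeon, s.surgery_date)
        room_day = (s.operating_room, s.surgery_date)
        predictors[s.case_id] = {
            "cancellation_record": s.previously_cancelled,
            # Assumption: "first" means earliest case of that surgeon/OR on that day.
            "first_surgery_of_surgeon": surgeon_day not in seen_surgeons,
            "first_surgery_in_or": room_day not in seen_rooms,
            "holiday": s.surgery_date in holidays,
            "days_in_admission": (s.surgery_date - s.admission_date).days,
        }
        seen_surgeons.add(surgeon_day)
        seen_rooms.add(room_day)
    return predictors
```

Sorting once and tracking the surgeon-day and OR-day pairs already seen keeps the derivation linear in the number of cases after the sort.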

Predictor selection

We used the Wilcoxon and chi-square tests to assess the association of routine continuous and discrete predictors, respectively, with surgery cancellation. We set the significance level at 0.05 (i.e. a confidence level of 0.95) and selected the predictors with p-values less than 0.05. As shown in Table 2, among all continuous predictors, the number of days in admission and the sequence in the waiting list show significant impacts on surgery cancellation. Regarding discrete predictors (including binary predictors), only the type of surgery, surgeon, OR, and the total number of surgeries in the OR show significant impacts on surgery cancellation. According to Yan et al.,⁴⁸ patient age is a significant factor of CR; however, it did not prove significant in our data. This may be because we focused on elective urologic surgery, while Yan et al.⁴⁸ focused on general surgery.

Table 2.

Statistical tests for routine predictor selection (significant only).

Type	Predictor	Statistics			p-value
Continuous	Number of days in admission	1.86 × 10⁶			0.005
Continuous	Sequence in the waiting list	2.15 × 10⁶			<0.001
		Chi-square	Df	Chi-square − Df	
Binary	First surgery in an OR	3.837	1	2.837	0.050
Discrete	Type of surgery	171.514	28	143.515	<0.001
Discrete	Surgeon	66.149	13	53.149	<0.001
Discrete	OR	26.702	6	20.702	<0.001
Discrete	Total number of surgeries in the OR	42.584	10	32.584	<0.001

Df: degree of freedom; OR: operating room.

The Wilcoxon test was conducted for continuous predictors and the chi-square test was performed for discrete and binary predictors.
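The two screening tests can be sketched with the standard library alone. The block below is illustrative rather than the authors' code: `chi_square_statistic`, `chi_square_p_value_df1`, and `rank_sum_test` are hypothetical helper names, the chi-square p-value is given only for one degree of freedom (the binary-predictor case), and the rank-sum p-value uses a normal approximation without a tie correction in the variance, which is an assumption.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_square_p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


def rank_sum_test(x, y):
    """Wilcoxon rank-sum (Mann-Whitney) U for sample x and a two-sided p-value.

    Ties receive midranks; the p-value is a normal approximation (assumption).
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_x, n_y = len(x), len(y)
    u = rank_sum_x - n_x * (n_x + 1) / 2
    mean_u = n_x * n_y / 2
    sd_u = math.sqrt(n_x * n_y * (n_x + n_y + 1) / 12)
    z = (u - mean_u) / sd_u
    return u, math.erfc(abs(z) / math.sqrt(2))
```

As a consistency check, `chi_square_p_value_df1(3.837)` evaluates to roughly 0.050, in line with the binary row of Table 2.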

Applied models and experimental setup

The ML techniques used in this study were the random forest (RF), support vector machine (SVM), and XGBoost algorithms. The RF classifier, which was first proposed by Breiman,⁴⁹ combines a number of trees for training and prediction, and it is widely used in healthcare management with satisfactory results.^28,43,46 An SVM is a supervised learning method with an associated learning algorithm that analyses data for classification and regression.⁵⁰ In this study, two versions of the SVM were employed, SVM-linear and SVM-radial, which use linear and radial kernel functions, respectively. Boosting is an ensemble approach in ML used primarily to reduce the bias and variance of learners; a family of boosting algorithms has been proposed to combine a series of weak learners into a strong learner.⁵¹ XGBoost is a widely used and powerful implementation of boosting that supports different base learners: XGBoost-linear is XGBoost with linear base learners, and XGBoost-tree is XGBoost with tree base learners.⁵² Because of the marked imbalance in the positive–negative ratio (approximately 2:11), over- and under-sampling were employed to achieve better performance, alongside the original dataset, in which no changes were made. Such sampling methods have performed well in several fields, such as customer churn prediction⁵³ and customer classification⁵⁴ with imbalanced class distributions (for more details, please refer to Chawla et al.⁵⁵). All cases were divided into two sets, the train and test sets, with a ratio of 8:2. The train set was used to fit the ML models, and the test set was employed to validate the performance of the ML models.
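The 8:2 split and the three sampling settings can be illustrated as follows. This is a sketch under stated assumptions: the text does not specify the exact resampling procedure, so plain random over- and under-sampling is used here (not SMOTE), resampling is applied to the train set only, and the names `split_train_test` and `resample` are hypothetical.

```python
import random


def split_train_test(cases, train_fraction=0.8, seed=0):
    """Shuffle the cases and split them into train and test sets (8:2 by default)."""
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def resample(train, method, seed=0):
    """Rebalance (features, label) pairs; label 1 = cancelled, 0 = not cancelled.

    Assumption: simple random resampling; the study's exact method may differ.
    """
    rng = random.Random(seed)
    positives = [c for c in train if c[1] == 1]
    negatives = [c for c in train if c[1] == 0]
    minority, majority = sorted((positives, negatives), key=len)
    if method == "original":
        return list(train)
    if method == "over":
        # Duplicate minority cases (with replacement) until the classes are balanced.
        extra = rng.choices(minority, k=len(majority) - len(minority))
        return majority + minority + extra
    if method == "under":
        # Keep a random subset of the majority class equal in size to the minority.
        return minority + rng.sample(majority, len(minority))
    raise ValueError(f"unknown sampling method: {method}")
```

Leaving the test set untouched keeps the reported test metrics representative of the original class ratio, which is the reason for resampling only after the split in this sketch.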

In this study, we designed 15 schemes by combining the three sampling methods with the five ML models mentioned above. Each scheme was run independently 10 times to identify the best-performing scheme and to assess the stability of its performance.
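The experimental grid can be enumerated directly: three sampling methods crossed with five models give 15 schemes, and 10 independent runs each give 150 fitted models. The sketch below also summarises a metric over the runs of one scheme as a mean with a 95 per cent interval; the use of a Student t interval (critical value 2.262 for 9 degrees of freedom) is an assumption, since the text does not state how the intervals were computed, and `mean_with_ci` is a hypothetical helper.

```python
import math
from itertools import product
from statistics import fmean, stdev

SAMPLING_METHODS = ("original", "over-sampling", "under-sampling")
MODELS = ("RF", "XGBoost-linear", "XGBoost-tree", "SVM-linear", "SVM-radial")
N_RUNS = 10
T_CRIT_DF9 = 2.262  # two-sided 95% Student t critical value, 9 degrees of freedom

# Every (sampling method, model) pair is one scheme; each scheme is repeated N_RUNS times.
schemes = list(product(SAMPLING_METHODS, MODELS))
runs = [(sampling, model, run) for sampling, model in schemes for run in range(N_RUNS)]


def mean_with_ci(values, t_crit=T_CRIT_DF9):
    """Mean and t-based interval of a metric over repeated runs (assumed CI method)."""
    m = fmean(values)
    half_width = t_crit * stdev(values) / math.sqrt(len(values))
    return m, m - half_width, m + half_width
```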

Evaluation criteria

The performance was measured according to six metrics: sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and area under the receiver operating characteristic curve (AUC). Among them, sensitivity, specificity, PPV, NPV, and accuracy are the basic criteria for algorithm evaluation. For a binary classification system, the performance evaluation is typically illustrated using the receiver operating characteristic (ROC) curve and AUC (which is the area under the ROC curve) to enable comprehensive consideration of the sensitivity and specificity. In this study, AUC was considered the key criterion.
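The six criteria can be computed from a confusion matrix and from predicted scores, as in the sketch below. The helper names `confusion_metrics` and `auc` are hypothetical. The Acc values reported in Table 3 coincide with the mean of sensitivity and specificity (for example, (0.48 + 0.799)/2 ≈ 0.64 for the original RF), so the sketch returns that balanced form alongside plain accuracy; this reading is an inference from the table, not something the text states.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Basic criteria from counts of true/false positives and negatives."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def auc(positive_scores, negative_scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win; this pairwise form is O(n_pos * n_neg).
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))
```

The pairwise definition is equivalent to the area under the ROC curve and makes explicit why AUC summarises sensitivity and specificity over all thresholds rather than at a single cut-off.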

Results

The performances of each scheme in the train and test sets are shown in Tables 7 and 3, respectively. For the scheme performance in the test set, the following conclusions can be drawn: (1) the RF model with over-sampling achieved the best performance according to NPV, accuracy, and AUC, whereas the one with under-sampling achieved the best sensitivity. Furthermore, the best values of specificity and PPV were achieved by the original SVM-radial model; (2) for all the 15 schemes, the mean NPV showed a stable performance, and all the obtained NPVs were approximately 0.9; (3) the 95 per cent confidence intervals of AUCs were very narrow, and the gaps between the upper and lower bounds were all less than 0.04; (4) the AUCs of the RF, XGBoost-linear, and XGBoost-tree models were all greater than 0.6, with the maximum being 0.682, whereas those of the SVM-linear and SVM-radial models ranged from 0.501 to 0.612; (5) regardless of the sampling method, the RF model achieved the highest performance in sensitivity, NPV, accuracy, and AUC, whereas the SVM-radial model achieved the lowest sensitivity and highest specificity and PPV; (6) when we changed the sampling method from original to over- or under-sampling, sensitivity increased and specificity decreased, while the AUCs nearly remained unchanged, for all ML models.

Table 3.

Performance in the test set with the confidence level of 0.95.

		Original						Over-sampling						Under-sampling
		Sensitivity	Specificity	PPV	NPV	Acc	AUC	Sensitivity	Specificity	PPV	NPV	Acc	AUC	Sensitivity	Specificity	PPV	NPV	Acc	AUC
Random forest	Mean	0.48	0.799	0.326	0.892	0.639	0.676	0.601	0.693	0.275	0.904	0.647	0.682	0.615	0.655	0.261	0.904	0.635	0.672
Random forest	CI	0.419 0.542	0.738 0.860	0.284 0.368	0.886 0.898	0.625 0.655	0.660 0.693	0.527 0.674	0.627 0.759	0.254 0.296	0.895 0.913	0.635 0.659	0.668 0.696	0.51 0.721	0.562 0.749	0.236 0.285	0.89 0.918	0.624 0.646	0.656 0.687
XGBoost-linear	Mean	0.404	0.828	0.306	0.881	0.616	0.646	0.351	0.865	0.329	0.877	0.608	0.643	0.606	0.593	0.218	0.889	0.599	0.634
XGBoost-linear	CI	0.376 0.433	0.819 0.837	0.289 0.323	0.876 0.886	0.602 0.63	0.628 0.664	0.324 0.379	0.856 0.875	0.306 0.352	0.872 0.881	0.594 0.623	0.626 0.66	0.579 0.633	0.581 0.604	0.212 0.225	0.883 0.895	0.588 0.611	0.621 0.648
XGBoost-tree	Mean	0.398	0.846	0.327	0.882	0.622	0.644	0.367	0.86	0.329	0.879	0.613	0.643	0.607	0.592	0.218	0.889	0.599	0.632
XGBoost-tree	CI	0.37 0.425	0.831 0.86	0.304 0.351	0.877 0.887	0.607 0.636	0.628 0.66	0.339 0.395	0.853 0.867	0.311 0.347	0.874 0.883	0.599 0.627	0.627 0.658	0.576 0.637	0.576 0.607	0.21 0.227	0.882 0.896	0.585 0.613	0.618 0.646
SVM-linear	Mean	0.368	0.652	0.265	0.847	0.51	0.501	0.569	0.656	0.238	0.89	0.612	0.612	0.557	0.645	0.227	0.886	0.601	0.601
SVM-linear	CI	0.191 0.545	0.481 0.823	0.149 0.381	0.84 0.854	0.497 0.522	0.5 0.503	0.538 0.599	0.631 0.682	0.227 0.249	0.884 0.896	0.599 0.626	0.599 0.626	0.525 0.588	0.624 0.666	0.22 0.235	0.881 0.891	0.59 0.612	0.59 0.612
SVM-radial	Mean	0.191	0.957	0.454	0.863	0.574	0.574	0.196	0.943	0.391	0.862	0.569	0.569	0.312	0.824	0.277	0.868	0.568	0.568
SVM-radial	CI	0.169 0.213	0.954 0.961	0.419 0.489	0.86 0.866	0.563 0.585	0.563 0.585	0.177 0.214	0.938 0.947	0.356 0.426	0.859 0.865	0.559 0.579	0.559 0.579	0.175 0.45	0.693 0.956	0.247 0.306	0.856 0.881	0.557 0.58	0.557 0.58

CI: confidence interval at 0.95 level; PPV: positive predictive value; NPV: negative predictive value; Acc: accuracy; AUC: area under the receiver operating characteristic curve; SVM: support vector machine.

Table 4 shows the comparison of AUCs between the schemes by the one-sided t-test. According to the table, the following conclusions can be drawn: (1) the RF model outperformed other ML models for all sampling methods; (2) the XGBoost models (XGBoost-linear and XGBoost-tree) were superior to the SVM models (SVM-linear and SVM-radial); (3) no difference was observed between the models belonging to the same category (SVM or XGBoost); (4) different sampling methods showed similar performances; however, slight differences still existed in the significance levels, where models with over-sampling achieved the highest significance levels, and the original model was superior to the model with under-sampling.

Table 4.

The p-values of t-test for the areas under the receiver operating characteristic curve (AUCs) in the test set.

		Original					Over-sampling					Under-sampling
		RF	XGBoost-linear	XGBoost-tree	SVM-linear	SVM-radial	RF	XGBoost-linear	XGBoost-tree	SVM-linear	SVM-radial	RF	XGBoost-linear	XGBoost-tree	SVM-linear	SVM-radial
Original	RF	—	**	**	***	***	—	**	**	***	***	—	***	***	***	***
	XGBoost-linear	—	—	—	***	***	—	—	—	**	***	—	—	—	***	***
	XGBoost-tree	—	—	—	***	***	—	—	—	**	***	—	—	—	***	***
	SVM-linear	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
	SVM-radial	—	—	—	***	—	—	—	—	—	—	—	—	—	—	—
Over-sampling	RF	—	**	***	***	***	—	***	***	***	***	—	***	***	***	***
	XGBoost-linear	—	—	—	***	***	—	—	—	**	***	—	—	—	***	***
	XGBoost-tree	—	—	—	***	***	—	—	—	**	***	—	—	—	***	***
	SVM-linear	—	—	—	***	***	—	—	—	—	***	—	—	—	—	***
	SVM-radial	—	—	—	***	—	—	—	—	—	—	—	—	—	—	—
Under-sampling	RF	—	*	**	***	***	—	**	**	***	***	—	***	***	***	***
	XGBoost-linear	—	—	—	***	***	—	—	—	*	***	—	—	—	***	***
	XGBoost-tree	—	—	—	***	***	—	—	—	*	***	—	—	—	***	***
	SVM-linear	—	—	—	***	***	—	—	—	—	***	—	—	—	—	***
	SVM-radial	—	—	—	***	—	—	—	—	—	—	—	—	—	—	—

RF: random forest; SVM: support vector machine.

The t-test was one-sided. When the p-value is sufficiently low, H0 is rejected in favour of the alternative that the scheme in the row is better than the scheme in the column.

Significance codes: 0 *** 0.001 ** 0.01 * 0.05 — 1.
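A pairwise comparison of two schemes' AUCs over their repeated runs can be sketched as below. The text does not state which variant of the t-test was used, so Welch's unequal-variance form is an assumption here, and `welch_t` is a hypothetical helper. It returns the test statistic and degrees of freedom; the one-sided p-value is the upper-tail probability of a Student t distribution at that statistic, which a statistics library would supply.

```python
import math
from statistics import mean, variance


def welch_t(row_aucs, col_aucs):
    """Welch t statistic and degrees of freedom for H1: mean(row) > mean(col)."""
    n_row, n_col = len(row_aucs), len(col_aucs)
    # Squared standard errors of the two sample means.
    se2_row = variance(row_aucs) / n_row
    se2_col = variance(col_aucs) / n_col
    t = (mean(row_aucs) - mean(col_aucs)) / math.sqrt(se2_row + se2_col)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_row + se2_col) ** 2 / (
        se2_row ** 2 / (n_row - 1) + se2_col ** 2 / (n_col - 1)
    )
    return t, df
```

A large positive statistic supports the row scheme over the column scheme, matching the way Table 4 is read.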

The results of this study suggest that the model and the sampling method influence the AUC; however, statistical validation is still needed. To determine whether these two factors have a significant impact on AUCs, an analysis of variance (ANOVA) was performed. Table 5 shows the ANOVA results of AUCs based on models and sampling methods. According to the results, all p-values were less than 0.001, indicating that the model type, sampling method, and their interaction can significantly influence AUCs.

Table 5.

ANOVA for the areas under the receiver operating characteristic curve (AUCs) in the test set.

	Df	Sum Sq.	Mean Sq.	F value	Pr(>F)
Model	4	0.363	0.091	8.81 × 10²⁸	<0.001
Sampling method	2	0.011	0.005	5.25 × 10²⁷	<0.001
Model: sampling method	8	0.087	0.011	1.06 × 10²⁸	<0.001
Residuals	135	0	0

ANOVA: analysis of variance; Df: degree of freedom.
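The structure of Table 5 follows from a balanced two-way ANOVA with interaction: five models and three sampling methods with 10 runs per cell give 4, 2, 8, and 135 degrees of freedom. The sketch below computes the sums of squares and F ratios for such a design; `two_way_anova` is a hypothetical helper and assumes every cell holds the same number of runs. When the residual sum of squares is close to zero, as the rounded values in Table 5 suggest, the F ratios become extremely large.

```python
from itertools import product
from statistics import fmean


def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction.

    cells maps (model, sampling) -> list of replicate AUCs of equal length.
    Returns {term: (df, sum_sq, mean_sq, f_value)}; f_value is None for residuals.
    """
    models = sorted({m for m, _ in cells})
    samplings = sorted({s for _, s in cells})
    reps = len(next(iter(cells.values())))
    grand = fmean(v for vals in cells.values() for v in vals)
    cell_mean = {key: fmean(vals) for key, vals in cells.items()}
    # In a balanced design, marginal means are means of the cell means.
    model_mean = {m: fmean(cell_mean[(m, s)] for s in samplings) for m in models}
    sampling_mean = {s: fmean(cell_mean[(m, s)] for m in models) for s in samplings}

    ss = {
        "model": reps * len(samplings) * sum((model_mean[m] - grand) ** 2 for m in models),
        "sampling": reps * len(models) * sum((sampling_mean[s] - grand) ** 2 for s in samplings),
        "interaction": reps * sum(
            (cell_mean[(m, s)] - model_mean[m] - sampling_mean[s] + grand) ** 2
            for m, s in product(models, samplings)
        ),
        "residuals": sum((v - cell_mean[key]) ** 2 for key, vals in cells.items() for v in vals),
    }
    df = {
        "model": len(models) - 1,
        "sampling": len(samplings) - 1,
        "interaction": (len(models) - 1) * (len(samplings) - 1),
        "residuals": len(models) * len(samplings) * (reps - 1),
    }
    ms_res = ss["residuals"] / df["residuals"]
    table = {}
    for term in ss:
        ms = ss[term] / df[term]
        f_value = ms / ms_res if term != "residuals" and ms_res > 0 else None
        table[term] = (df[term], ss[term], ms, f_value)
    return table
```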

Discussion

This study indicates the feasibility of identifying surgeries with high risks of cancellation in advance. The general AUCs in the test set are above 0.6, with the maximum being 0.682 for the RF with over-sampling. Moreover, the ML models show stable performance with a difference of less than 0.04 between the upper and lower bounds. All ML models achieved a high NPV (approximately 0.9), meaning that 90 per cent of the surgeries labelled as low risks were not cancelled. In this case, almost all negative cases can be determined, thereby narrowing the sets of suspicious surgeries. Considering the marked imbalance between positive and negative cases (2:11), the ML models can still effectively identify high-risk surgeries, even though the PPV was lower.
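The combination of a high NPV and a low PPV follows largely from the class imbalance, which can be seen by applying Bayes' rule to the reported sensitivity and specificity. The sketch below uses the overall CR of 810/5125 as the prevalence, which is an assumption because the test-set prevalence may differ slightly; `predictive_values` is a hypothetical helper.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV implied by sensitivity, specificity, and prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# RF with over-sampling (Table 3 means); prevalence taken from the overall CR (assumption).
prevalence = 810 / 5125
ppv, npv = predictive_values(sensitivity=0.601, specificity=0.693, prevalence=prevalence)
```

With these inputs the implied PPV is about 0.27 and the implied NPV about 0.90, close to the 0.275 and 0.904 reported in Table 3: when only about 16 per cent of cases are positive, even a moderate false-positive rate produces more false alarms than true detections.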

Different sampling methods can effectively adjust the performance of the ML models. According to this study, over- and under-sampling would lead to the increase of sensitivity and the decrease of specificity compared to the original method. As expected, practitioners’ concerns greatly affect their preference on model performance. This study’s findings help the practitioners adjust the ML models according to their needs.

The models and sampling methods significantly affect the performance of identification. Currently, no single fixed scheme achieves the best performance in the identification of surgeries with high risks of cancellation on every dataset, because datasets involve diverse features. By comparing various ML models and sampling methods, suitable identification results can be obtained for a given dataset.

Although independent repeated experiments and over/under-sampling were used to make the work rigorous, limitations still exist. First, surgeries for other diseases in other hospitals and countries must be considered in future studies. Second, this research only focused on institutional resource- and capacity-related cancellations. Further comparison is required on other kinds of cancellations, as well as on the cancellations in other medical institutions. The obtained comparison results might provide more details on surgery cancellations.

Conclusion

This study pioneered the identification of surgeries with high risks of cancellation through ML techniques. The results of the study indicate that the effective identification of surgeries with high risks of cancellation is feasible with stable performance. Moreover, we validated that model types and sampling methods have significant effects on the performance of identification. This study is a new application of ML for identifying surgeries with high risks of cancellation, thereby facilitating surgery scheduling and resource management. Based on this research, a surgery manager can identify surgeries with high risks of cancellation, alert the healthcare system, and adopt preventive actions, leading to a lowered CR. As mentioned earlier, a lowered CR will lead to a higher utilization rate of institutional resources, such as ORs, resulting in improved cost efficiency of the healthcare system. Further studies on simulations to optimize surgery scheduling considering CRs are required.

Footnotes

Appendix 1

Table 7.

Performance in the training set with the confidence level of 0.95.

		Original						Over-sampling						Under-sampling
		Sensitivity	Specificity	PPV	NPV	Accuracy	AUC	Sensitivity	Specificity	PPV	NPV	Accuracy	AUC	Sensitivity	Specificity	PPV	NPV	Accuracy	AUC
Random forest	Mean	0.746	0.884	0.579	0.95	0.815	0.888	0.857	0.748	0.777	0.842	0.803	0.899	0.768	0.825	0.828	0.789	0.796	0.89
Random forest	CI	0.69 0.802	0.835 0.934	0.5 0.658	0.941 0.958	0.806 0.824	0.885 0.892	0.829 0.886	0.692 0.804	0.744 0.81	0.824 0.86	0.786 0.82	0.895 0.903	0.701 0.835	0.749 0.9	0.776 0.879	0.752 0.825	0.786 0.807	0.884 0.896
XGBoost-linear	Mean	0.987	0.963	0.834	0.997	0.975	0.998	0.989	0.964	0.965	0.988	0.976	0.998	0.977	0.979	0.979	0.977	0.978	0.999
XGBoost-linear	CI	0.984 0.989	0.96 0.966	0.824 0.845	0.997 0.998	0.974 0.976	0.998 0.998	0.988 0.99	0.961 0.967	0.962 0.968	0.987 0.989	0.975 0.978	0.998 0.998	0.974 0.98	0.974 0.983	0.974 0.983	0.974 0.98	0.974 0.981	0.999 0.999
XGBoost-tree	Mean	0.974	0.969	0.857	0.995	0.972	0.997	0.988	0.965	0.966	0.987	0.976	0.998	0.977	0.98	0.98	0.977	0.978	0.999
XGBoost-tree	CI	0.969 0.979	0.965 0.974	0.839 0.874	0.994 0.996	0.97 0.973	0.997 0.998	0.986 0.989	0.962 0.968	0.963 0.968	0.986 0.989	0.975 0.978	0.998 0.998	0.974 0.98	0.976 0.984	0.976 0.984	0.974 0.98	0.975 0.981	0.999 0.999
SVM-linear	Mean	0.352	0.649	0.272	0.842	0.5	0.502	0.64	0.671	0.661	0.651	0.655	0.655	0.632	0.693	0.674	0.654	0.663	0.663
SVM-linear	CI	0.183 0.52	0.477 0.822	0.14 0.404	0.839 0.844	0.496 0.505	0.5 0.504	0.619 0.661	0.65 0.692	0.653 0.669	0.644 0.659	0.651 0.66	0.651 0.66	0.618 0.646	0.675 0.711	0.662 0.685	0.646 0.661	0.654 0.671	0.654 0.671
SVM-radial	Mean	0.854	0.997	0.983	0.973	0.926	0.926	0.977	0.954	0.955	0.977	0.966	0.966	0.963	0.975	0.975	0.963	0.969	0.969
SVM-radial	CI	0.849 0.859	0.997 0.997	0.981 0.984	0.972 0.974	0.923 0.928	0.923 0.928	0.975 0.98	0.951 0.956	0.953 0.957	0.975 0.979	0.964 0.967	0.964 0.967	0.957 0.969	0.972 0.978	0.972 0.978	0.958 0.969	0.967 0.971	0.967 0.971

CI: confidence interval at 0.95 level; PPV: positive predictive value; NPV: negative predictive value; AUC: area under the receiver operating characteristic curve; SVM: support vector machine.

Acknowledgements

The authors would like to thank Dr Yingkang Shi for his support and guidance and anonymous reviewers for their advice and comments.

Authors’ note

Li Luo, Fengyi Zhang, Yao Yao and RenRong Gong are also affiliated with West China Hospital, Sichuan University, China.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by the National Natural Science Foundation of China (grant nos 71532007, 71501135, 71471124, 71131006, and 7117219), Major Project of the National Social Science Foundation of China (grant no 18VZL006) and the Excellent Youth Fund of Sichuan University (grant nos skqy201607, skzx2016-rcrw14, and sksyl201709).

ORCID iD

Fengyi Zhang

References

1. Basson, Butler, Verma. Predicting patient nonappearance for surgery as a scheduling strategy to optimize operating room utilization in a veterans’ administration hospital. Anesthesiology 2006; 104: 826–834.

2. Macario, Dexter, Traub RD. Hospital profitability per hour of operating room time can vary among surgeons. Anesth Analg 2001; 93: 669–675.

3. Dexter, Blake, Penning, et al. Calculating a potential increase in hospital margin for elective surgery by changing operating room time allocations or increasing nursing staffing to permit completion of more cases: a case study. Anesth Analg 2002; 94: 138–142.

4. Maimaiti, Rahimi, Aghaie LA. Economic impact of surgery cancellation in a general hospital, Iran. Ethiop J Health Dev 2017; 30: 92–95.

5. Robb, O’Sullivan, Brannigan, et al. Are elective surgical operations cancelled due to increasing medical admissions? Ir J Med Sci 2004; 173: 129–132.

6. Argo, Vick, Graham, et al. Elective surgical case cancellation in the Veterans Health Administration system: identifying areas for improvement. American Journal of Surgery 2009; 198: 600–606.

7. Chalya, Gilyoma, Mabula, et al. Incidence, causes and pattern of cancellation of elective surgical operations in a university teaching hospital in the Lake Zone, Tanzania. Afr Health Sci 2011; 11: 438–443.

8. Chang, Chen, et al. Case review analysis of operating room decisions to cancel surgery. BMC Surg 2014; 14: 47.

9. Chiu, Lee, Chui PT. Cancellation of elective operations on the day of intended surgery in a Hong Kong hospital: point prevalence and reasons. Hong Kong Med J 2012; 18: 5–10.

10. Haana, Sethuraman, Stephens, et al. Case cancellations on the day of surgery: an investigation in an Australian paediatric hospital. ANZ J Surg 2009; 79: 636–640.

11. Leslie, Beiko, van Vlymen, et al. Day of surgery cancellation rates in urology: identification of modifiable factors. Can Urol Assoc J 2013; 7: 167–173.

12. Nanjappa, Kabeer, Smile SR. Elective surgical case cancellation – an audit. Int J Curr Res Rev 2014; 6: 21–23.

13. Schofield, Rubin, Piza, et al. Cancellation of operations on the day of intended surgery at a major Australian referral hospital. Med J Aust 2005; 183: 612–615.

14. Seim, Fagerhaug, Ryen, et al. Causes of cancellations on the day of surgery at two major university hospitals. Surg Innov 2009; 16: 173–180.

15. Macarthur, Bevan JC. Determinants of pediatric day surgery cancellation. J Clin Epidemiol 1995; 48: 485–489.

16. Pollard, Olson. Early outpatient preoperative anesthesia assessment: does it help to reduce operating room cancellations? Anesth Analg 1999; 89: 502–505.

17. Dakum, Ramyil, Misauno, et al. Reasons for cancellations of urologic day care surgery. Niger J Surg Res 2010; 8: 30–33.

18. Kolawole, Bolaji BO. Reasons for cancellation of elective surgery in Ilorin. Niger J Surg Res 2002; 4: 28–33.

19. Laisi, Tohmo, Keranen. Surgery cancelation on the day of surgery in same-day admission in a Finnish hospital. Scand J Surg 2013; 102: 204–208.

20. Garg, Bhalotra, Bhadoria, et al. Reasons for cancellation of cases on the day of surgery – a prospective study. Indian J Anaesth 2009; 53: 35–39.

21. Sultan, Rashid, Abbas SM. Reasons for cancellation of elective cardiac surgery at Prince Sultan Cardiac Centre, Saudi Arabia. J Saudi Heart Assoc 2012; 24: 29–34.

22. Hovlid, Bukve. A qualitative study of contextual factors’ impact on measures to reduce surgery cancellations. BMC Health Serv Res 2014; 14: 215–225.

23. Ferschl, Tung, Sweitzer, et al. Preoperative clinic visits reduce operating room cancellations and delays. Anesthesiology 2005; 103: 855–859.

24. Boudreau, Gibson MJ. Surgical cancellations: a review of elective surgery cancellations in a tertiary care pediatric institution. J Perianesth Nurs 2011; 26: 315–322.

25. Hovlid. A new pathway for elective surgery to reduce cancellation rates. BMC Health Serv Res 2012; 12: 154–162.

26. Schuster, Neumann, et al. The effect of hospital size and surgical service on case cancellation in elective surgery: results from a prospective multicenter study. Anesth Analg 2011; 113: 578–585.

27. Van Klei, Moons, Rutten, et al. The effect of outpatient preoperative evaluation of hospital inpatients on cancellation of surgery and length of hospital stay. Anesth Analg 2002; 94: 644–649.

28. Knox, Myers, Wilson, et al. The impact of pre-operative assessment clinics on elective surgical case cancellations. Surgeon 2009; 7: 76–78.

29. Dexter, Marcon, Epstein, et al. Validation of statistical methods to compare cancellation rates on the day of surgery. Anesth Analg 2005; 101: 465–473.

30. Hovlid, von Plessen, Haug, et al. Patient experiences with interventions to reduce surgery cancellations: a qualitative study. BMC Surg 2013; 13: 30–36.

31. Evans, Burke, Classen, et al. Computerized identification of patients at high risk for hospital-acquired infection. Am J Infect Control 1992; 20: 4–10.

32. Redfern, Langlotz, Abbuhl, et al. The effect of PACS on the time required for technologists to produce radiographic images in the emergency department radiology suite. J Digit Imaging 2002; 15: 153–160.

33. Lee, Liu, Kuo, et al. Application of data mining to the identification of critical factors in patient falls using a web-based reporting system. Int J Med Inform 2011; 80: 141–150.

34. Martins. Use of comorbidity measures to predict the risk of death in Brazilian in-patients. Rev Saude Publica 2010; 44: 448–456.

35. Toyabe. World Health Organization fracture risk assessment tool in the assessment of fractures after falls in hospital. BMC Health Serv Res 2010; 10: 106–117.

36. Gao, et al. Examining individuals’ adoption of healthcare wearable devices: an empirical study from privacy calculus perspective. Int J Med Inform 2016; 88: 8–17.

37. Eden, Totten, Kassakian, et al. Barriers and facilitators to exchanging health information: a systematic review. Int J Med Inform 2016; 88: 44–51.

38. Zhang, et al. Residents’ numeric inputting error in computerized physician order entry prescription. Int J Med Inform 2016; 88: 25–33.

39. Tung, Dexter, Jakubczyk, et al. The limited value of sequencing cases based on their probability of cancellation. Anesth Analg 2010; 111: 749–759.

40. Izad Shenas, Raahemi, Hossein, et al. Identifying high-cost patients using data mining techniques and a small set of non-trivial attributes. Comput Biol Med 2014; 53: 9–18.

41. Liu, Traskin, Lorch, et al. Ensemble of trees approaches to risk adjustment for evaluating a hospital’s performance. Health Care Manage Sci 2014; 18: 58–66.

42. Bertsimas, Bjarnadóttir, Kane, et al. Algorithmic prediction of health-care costs. Operat Res 2008; 56: 1382–1392.

43. Dai, Brisimi, Adams, et al. Prediction of hospitalization due to heart diseases by supervised learning methods. Int J Med Inform 2015; 84: 189–197.

44. Futoma, Morris, Lucas. A comparison of models for predicting early hospital readmissions. J Biomed Inform 2015; 56: 229–238.

45. Zheng, Zhang, Sang, et al. Predictive modeling of hospital readmissions using metaheuristics and data mining. Exp Syst Appl 2015; 42: 7110–7120.

46. Yeh, Tsao CW. Using data mining techniques to predict hospitalization of hemodialysis patients. Decision Support Systems 2011; 50: 439–448.

47. Kose, Gokturk, Kilic. An interactive machine-learning-based electronic fraud and abuse detection system in healthcare insurance. Appl Soft Comput 2015; 36: 283–299.

48. Yan, Shi, Gong RR. Influencing factors analysis of cancellation of inpatient-elective operation. Chin Hosp Manage 2016; 36: 40–41.

49. Breiman. Random forests. Mach Learn 2001; 45: 5–32.

50. Cortes, Vapnik. Support-vector networks. Mach Learn 1995; 20: 273–297.

51. Breiman. Bias, variance, and arcing classifiers. Technical report no. 460. Berkeley, CA: University of California.

52. Chen, Guestrin. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, San Francisco, CA, 13–17 August 2016, pp. 785–794. New York: ACM Press.

53. Xiao, Huang, et al. Feature-selection-based dynamic transfer ensemble model for customer churn prediction. Knowl Inform Syst 2015; 43: 29–51.

54. Xiao, Xie, et al. Dynamic classifier ensemble model for customer classification with imbalanced class distribution. Exp Syst Appl 2012; 39: 3668–3675.

55. Chawla, Bowyer, Hall, et al. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 2002; 16: 321–357.

Machine learning for identification of surgeries with high risks of cancellation
