Abstract
Background
Most cardiac surgery clinical prediction models (CPMs) are developed using pre-operative variables to predict post-operative outcomes. Some CPMs are developed with intra-operative variables, but none are widely used. The objective of this systematic review was to identify CPMs with intra-operative variables that predict short-term outcomes following adult cardiac surgery.
Methods
Ovid MEDLINE and EMBASE databases were searched from inception to December 2022, for studies developing a CPM with at least one intra-operative variable. Data were extracted using a critical appraisal framework and bias assessment tool. Model performance was analysed using discrimination and calibration measures.
Results
A total of 24 models were identified. Frequent predicted outcomes were acute kidney injury (9/24 studies) and peri-operative mortality (6/24 studies). Frequent pre-operative variables were age (18/24 studies) and creatinine/eGFR (18/24 studies). Common intra-operative variables were cardiopulmonary bypass time (16/24 studies) and transfusion (13/24 studies). Model discrimination was acceptable for all internally validated models (AUC 0.69-0.91). Calibration was poor (15/24 studies) or unreported (8/24 studies). Most CPMs were at a high or indeterminate risk of bias (23/24 models). The added value of intra-operative variables was assessed in six studies with statistically significantly improved discrimination demonstrated in two.
Conclusion
Weak reporting and methodological limitations may restrict wider applicability and adoption of existing CPMs that include intra-operative variables. There is some evidence that CPM discrimination is improved with the addition of intra-operative variables. Further work is required to understand the role of intra-operative CPMs in the management of cardiac surgery patients.
Introduction
The ability to reliably predict post-operative mortality and morbidity in patients undergoing cardiac surgery helps support shared decision making and risk stratification. 1 This is especially important to clinicians when attempting to determine the most appropriate treatment option for each patient. The application of clinical prediction models (CPMs) in cardiac surgery has helped to improve patient selection and risk-adjusted outcome analysis and has been a key feature in both institutional benchmarking and quality improvement programmes. 2 CPMs developed to predict post-operative outcomes in patients undergoing cardiac surgery have typically used pre-operative variables only, because they are primarily used as part of the pre-operative decision-making process.
The EuroSCORE II model, 3 recognised as one of the most frequently used cardiac surgery CPMs across the UK and Europe, was designed to pre-operatively predict post-operative mortality but includes a small number of variables which could technically be modified intra-operatively. 4 While inclusion of a variable such as the extent of surgery can largely be anticipated pre-operatively, the inclusion of intra-operative variables which cannot be predicted or calculated prior to surgery would render a model unsuitable for use as part of pre-operative work-up. However, the inclusion of intra-operative variables in CPMs could allow for the updating of a predicted risk estimate initially calculated using pre-operative variables. 5 Information from CPMs that incorporate intra-operative data could be used to facilitate post-operative clinical management and decision making.
The emergence of new electronic health data platforms within the clinical environment that can capture complex intra-operative data means that CPMs that utilise this information can be readily calculated. 5 Intra-operative data are likely to more accurately reflect surgical complexity, provide information on significant unexpected intra-operative events and capture the individual physiological response to the insult of surgical and anaesthetic intervention. 6 Information on these parameters could help to optimise CPMs for the prediction of specific modifiable post-operative outcomes, such as acute kidney injury.
Several CPMs for cardiac surgery that include intra-operative variables have previously been developed but none are widely used in clinical practice. The objective of this review was to systematically identify developed cardiac surgery CPMs that include intra-operative variables and assess their quality and characteristics to understand potential reasons for a lack of clinical adoption.
Methods
Search strategy
This systematic review was undertaken in conjunction with a medical librarian. It was registered with PROSPERO (International Prospective Register of Systematic Reviews, CRD42021277013) and followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines. Separate literature searches of the MEDLINE (accessed via Ovid) and EMBASE databases were undertaken to identify studies published between inception of the databases and April 2022. The searches used adapted medical subject heading (MeSH) terms, keywords and wildcards. Search terms were “cardiac surgery”, “preoperative”, “intraoperative”, “perioperative”, “mortality”, “morbidity” and other prediction model terminology based on Geersing et al. 7 The search strategy can be found in the Supplemental Material.
Selection criteria
Articles were screened by title and abstract. Studies were eligible for inclusion if they involved adult patients with acquired heart disease who had undergone cardiac surgery. CPMs developed specifically for paediatric surgery, adult congenital surgery, cardiopulmonary transplantation or surgery for mechanical circulatory support were excluded.
Eligible post-operative outcomes were short-term mortality or morbidity; CPMs developed for longer-term outcomes were excluded. Abstracts were screened and the full texts of those considered relevant were subsequently evaluated for suitability. Only articles describing the development of a CPM including at least one intra-operative variable were included. Studies performing univariable or multivariable analysis without developing a prediction model were not included. A study was considered to have developed a predictive CPM when the purpose of the multivariable analysis was to identify, through association, the optimal combination of risk factors that best predicts a current diagnosis or future event. Studies describing only external validation of a model were excluded.
Data extraction
All studies identified as potentially suitable following review of titles and abstracts were analysed in full by two investigators (CJ & MT). The reference lists of included studies were also reviewed in full. Any disagreements between the two reviewers over inclusion of studies were resolved through discussion with a third reviewer (SWG). The Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS) framework was adhered to when documenting and evaluating these studies. 8 Information was extracted on data source, study date, participants, outcome of interest, predictors associated with the outcome (including their availability at the time of prediction and their inclusion in the final model), cohort sample size, methods of handling missing data, model development, events per predictor parameter (EPP) and model performance. 8 Risk of bias and study quality were assessed for each prediction model using both the Prediction model Risk Of Bias Assessment Tool (PROBAST) and the Quality in Prognostic Factor Studies (QUIPS) instruments. 9
Model analysis
Model performance was evaluated through measures of discrimination and calibration. Model discrimination, referring to the ability of the model to differentiate between patients who do and do not experience the event, is generally assessed by the Area Under the Curve (AUC). An AUC of 1 represents perfect discrimination and a value of 0.5 indicates that the model is no better than chance. An AUC ≥0.7 is deemed acceptable and an AUC ≥0.8 represents excellent discriminatory ability. 10 Calibration refers to how closely predicted outcomes match observed outcomes. For the cohort as a whole, this can be summarised by the observed to expected (O:E) ratio. Other measures include the Hosmer-Lemeshow (H-L) test, flexible calibration plots, calibration-in-the-large and calibration slope. 10 Finally, the number of variables included in a model must be considered in relation to the number of events in the cohort used to develop the model. Traditionally, a minimum EPP of 10 was recommended, although more sophisticated methods to determine sample size are now also available. 11
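As a minimal illustration of the discrimination and whole-cohort calibration measures described above, the following Python sketch computes the AUC as the proportion of event/non-event pairs in which the event has the higher predicted risk, and the O:E ratio as observed events divided by the sum of predicted risks. The function names and the toy cohort are hypothetical and are not taken from any of the included studies.

```python
from typing import Sequence


def auc(outcomes: Sequence[int], risks: Sequence[float]) -> float:
    """AUC as the proportion of (event, non-event) pairs ranked correctly.

    Ties in predicted risk count as half a correctly ranked pair.
    """
    event_risks = [r for y, r in zip(outcomes, risks) if y == 1]
    non_event_risks = [r for y, r in zip(outcomes, risks) if y == 0]
    if not event_risks or not non_event_risks:
        raise ValueError("AUC needs at least one event and one non-event")
    score = 0.0
    for e in event_risks:
        for n in non_event_risks:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(event_risks) * len(non_event_risks))


def observed_expected_ratio(outcomes: Sequence[int], risks: Sequence[float]) -> float:
    """Observed number of events divided by the sum of predicted risks."""
    return sum(outcomes) / sum(risks)


# Hypothetical toy cohort: 1 = event occurred, 0 = no event.
toy_outcomes = [0, 0, 1, 1]
toy_risks = [0.10, 0.40, 0.35, 0.80]

toy_auc = auc(toy_outcomes, toy_risks)
toy_oe = observed_expected_ratio(toy_outcomes, toy_risks)
print(f"AUC = {toy_auc:.2f}, O:E = {toy_oe:.2f}")
```

In this toy cohort three of the four event/non-event pairs are ranked correctly, and an O:E ratio above 1 indicates that more events were observed than the model predicted.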
Results
Selected studies
In total, the literature search returned 5352 articles (Medline n = 1991, Embase n = 3361). After exclusion of duplicates and non-English language articles, 4009 articles remained, with 30 articles identified for full assessment. Following exclusions and the identification of additional studies from the reference lists of articles reviewed in full, a total of 24 studies remained for analysis.3,12–34 The study selection process is detailed in Figure 1. In total, 21 models were developed using retrospective data, with the other three models developed using prospectively collected data. Models were developed using datasets ranging in size from 168 to 378,572 and included patients undergoing surgery between 2000 and 2022. The median sample size for model development was 2446. Full details of CPM development and performance are provided in Table 1.
Data extraction from Embase and Medline searches.
Model performance for cardiac surgery prediction models with preoperative and intraoperative variables.
Int = Internal Validation, NR = Not Recorded, Disc = Discrimination, AUC = Area under the curve, EPCP = Events Per Candidate Predictors, EPP = Events per Predictor Parameter, Ext = External Validation, D = Development, Cal = Calibration, HL = Hosmer-Lemeshow test, RSS = Random Split Sampling, Ren. Fail = Renal Failure, V = Validation, O/E = Observed to Expected Ratio, BS = Bootstrapping, C = Concordance Index, LR = Logistic Regression, ML = Machine Learning, LIR = Linear Regression, XGB = XGBoost. * Models compared in the same dataset.
Variable selection and inclusion
The number of variables included in the 24 final models ranged from 4 to 40, with a median number of 9 variables per model. EPPs ranged from 0.70 to 975, with an EPP <10 identified in ten studies. Predictors were selected for multivariable modelling using methods including backwards stepwise selection algorithm (n = 5), forwards stepwise selection algorithm (n = 5), forwards and backwards stepwise selection algorithm (n = 5) and LASSO regression for predictor selection (n = 5), with four models not recording their method of predictor selection.
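To make the EPP figures concrete, the sketch below counts predictor parameters and divides the number of events by that count. It assumes the common conventions that a continuous predictor contributes one parameter, a categorical predictor with k levels contributes k - 1 parameters, and the smaller outcome group is used as the numerator. The predictor names and cohort counts are hypothetical.

```python
from typing import Dict, Optional


def count_parameters(predictors: Dict[str, Optional[int]]) -> int:
    """Count model parameters.

    A value of None marks a continuous predictor (1 parameter); an integer
    k marks a categorical predictor with k levels (k - 1 parameters).
    """
    total = 0
    for levels in predictors.values():
        total += 1 if levels is None else levels - 1
    return total


def events_per_parameter(n_events: int, n_non_events: int, n_parameters: int) -> float:
    """EPP using the smaller outcome group as the numerator (an assumed convention)."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    return min(n_events, n_non_events) / n_parameters


# Hypothetical predictor set, not taken from any included study.
example_predictors = {
    "age": None,             # continuous
    "creatinine": None,      # continuous
    "cpb_time_min": None,    # continuous
    "urgency": 4,            # 4 categories -> 3 parameters
    "transfusion": 2,        # yes/no -> 1 parameter
}

n_params = count_parameters(example_predictors)
epp = events_per_parameter(n_events=63, n_non_events=837, n_parameters=n_params)
print(f"{n_params} parameters, EPP = {epp:.1f}")
```

Here five predictors expand to seven parameters, so 63 events give an EPP of 9.0, just below the traditional threshold of 10.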
Summary of pre-operative variables included in the development studies.
* Variables not described in research paper; a predictors only evident in a single model; b same predictors previously described in precursor model.
Summary of intra-operative variables included in the development studies.
a predictors only evident in a single model.
Summary of postoperative endpoints from the models selected and the intraoperative variables common to each individual endpoint.
Outcomes
Amongst the studies identified, three CPMs were developed solely to predict mortality endpoints.3,12,13 Mortality endpoints included in-hospital mortality, 30-day mortality and a composite endpoint comprising both. A further 18 models were developed to predict morbidity endpoints.14–31 The most common morbidity endpoints were acute kidney injury (AKI; 9/18 studies) and post-operative pneumonia (3/18 studies). Additional morbidity endpoints included renal dysfunction (including both AKI and the need for renal replacement therapy), acute renal event, neurological complications, low cardiac output syndrome (LCOS), atrial fibrillation, multi-organ dysfunction (MOD), respiratory complications, re-operation and post-operative length of stay. Three models included both mortality and morbidity endpoints.32–34
CPM performance
Calibration was assessed in 16 of the 24 models and was claimed to be acceptable in 15. However, a number of different measures of calibration were used across the models identified, several of which are now known to be problematic. These include the H-L test (used in 12 studies in this review), which cannot provide any information on either the extent or direction of miscalibration. 1 Superior measures of calibration, including flexible calibration plots, calibration-in-the-large and calibration slope were used in only one study, meaning that acceptable calibration results should be interpreted with a degree of caution.
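For readers less familiar with calibration-in-the-large and calibration slope, the following sketch shows one standard way to estimate them: a logistic regression of the observed outcome on the logit of the predicted risk. The slope is the coefficient on the logit, and calibration-in-the-large is the intercept when the slope is fixed at 1. This is an illustrative pure-Python Newton-Raphson implementation on hypothetical data, not the method used by any of the included studies; predicted risks must lie strictly between 0 and 1.

```python
import math
from typing import List, Sequence, Tuple


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def calibration_in_the_large(y: Sequence[int], p: Sequence[float], iters: int = 50) -> float:
    """Intercept a in logit(P(y=1)) = a + logit(p), i.e. slope fixed at 1."""
    x = [logit(pi) for pi in p]
    a = 0.0
    for _ in range(iters):
        mu = [expit(a + xi) for xi in x]
        step = sum(yi - mi for yi, mi in zip(y, mu)) / sum(mi * (1 - mi) for mi in mu)
        a += step
        if abs(step) < 1e-12:
            break
    return a


def calibration_intercept_slope(y: Sequence[int], p: Sequence[float], iters: int = 50) -> Tuple[float, float]:
    """Fit logit(P(y=1)) = a + b * logit(p) by Newton-Raphson; returns (a, b)."""
    x = [logit(pi) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):
        mu = [expit(a + b * xi) for xi in x]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        w = [mi * (1 - mi) for mi in mu]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b


def grouped(n: int, events: int, risk: float) -> Tuple[List[int], List[float]]:
    """Hypothetical helper: n patients sharing one predicted risk."""
    return [1] * events + [0] * (n - events), [risk] * n


# Hypothetical over-fitted model: predicts 10% and 90%, observed 20% and 80%.
y_lo, p_lo = grouped(10, 2, 0.10)
y_hi, p_hi = grouped(10, 8, 0.90)
intercept, slope = calibration_intercept_slope(y_lo + y_hi, p_lo + p_hi)

# Hypothetical under-predicting model: predicts 20%, observed 40%.
y_u, p_u = grouped(10, 4, 0.20)
citl = calibration_in_the_large(y_u, p_u)
print(f"slope = {slope:.3f}, calibration-in-the-large = {citl:.3f}")
```

A slope below 1, as in the first toy example, indicates predictions that are too extreme, and a positive calibration-in-the-large indicates systematic under-prediction. The H-L test cannot convey either the direction or the extent of miscalibration.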
Added value of intra-operative variables
Six of the studies undertook a nested model comparison, comparing model performance when only pre-operative variables were included against model performance once intra-operative variables were added.12,13,22,32–34 All six demonstrated improved discriminatory ability once intra-operative variables were added to the initial model comprising solely pre-operative variables. Only two of the studies undertook analysis to determine whether the difference in AUC between models was statistically significant. Liu et al. developed models to predict LCOS, AKI and MOD. The AUCs improved significantly with the addition of intra-operative variables (LCOS: 0.57 to 0.82, p < .01; AKI: 0.69 to 0.78, p < .01; MOD: 0.66 to 0.77, p < .01). Durant et al. used data from 2905 patients to develop a model with a composite endpoint of mortality or re-operation. The discrimination of the model improved from 0.75 to 0.79 once intra-operative variables were added. The improvement was found to be statistically significant (p < .01).
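One way to assess whether adding intra-operative variables changes discrimination is to bootstrap the difference in AUC between the two nested models, evaluated on the same patients. The sketch below is a hypothetical illustration of that idea using a percentile bootstrap interval. It is not the statistical test used by the studies above, whose methods are not described here, and the function names, seed and toy data are assumptions.

```python
import random
from typing import Sequence, Tuple


def auc(outcomes: Sequence[int], risks: Sequence[float]) -> float:
    """Proportion of (event, non-event) pairs ranked correctly; ties count 0.5."""
    events = [r for y, r in zip(outcomes, risks) if y == 1]
    non_events = [r for y, r in zip(outcomes, risks) if y == 0]
    score = 0.0
    for e in events:
        for n in non_events:
            score += 1.0 if e > n else 0.5 if e == n else 0.0
    return score / (len(events) * len(non_events))


def bootstrap_auc_difference(
    outcomes: Sequence[int],
    base_risks: Sequence[float],
    full_risks: Sequence[float],
    n_boot: int = 2000,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (observed AUC difference, lower, upper) with a 95% percentile interval.

    Patients are resampled with replacement; both models are evaluated on
    the same resample so the comparison stays paired.
    """
    rng = random.Random(seed)
    n = len(outcomes)
    observed = auc(outcomes, full_risks) - auc(outcomes, base_risks)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [outcomes[i] for i in idx]
        if sum(y) in (0, n):
            continue  # AUC is undefined without both outcome classes
        base = [base_risks[i] for i in idx]
        full = [full_risks[i] for i in idx]
        diffs.append(auc(y, full) - auc(y, base))
    diffs.sort()
    lower = diffs[int(0.025 * len(diffs))]
    upper = diffs[int(0.975 * len(diffs)) - 1]
    return observed, lower, upper


# Hypothetical predicted risks from a pre-operative-only model and from a
# model that also includes intra-operative variables.
y = [0, 0, 0, 0, 1, 1, 1, 1]
pre_op_only = [0.20, 0.60, 0.40, 0.70, 0.30, 0.50, 0.80, 0.65]
with_intra_op = [0.10, 0.30, 0.20, 0.60, 0.40, 0.70, 0.90, 0.80]

diff, lo, hi = bootstrap_auc_difference(y, pre_op_only, with_intra_op)
print(f"AUC difference = {diff:.4f}, 95% interval = ({lo:.3f}, {hi:.3f})")
```

An interval that excludes zero would suggest a genuine change in discrimination. With a cohort as small as this toy example the interval is wide, which is why a formal comparison matters more than the point estimates alone.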
Study quality
Risk of bias for cardiac surgery prediction models (Determined using PROBAST framework).
Frequent criteria not met included the handling of missing data, which was associated with a high risk of bias in eleven studies, whilst a further nine studies did not discuss how missing data were handled. All the studies identified described internal validation of the models developed.3,12–34 Seventeen models were assessed by random split sampling, a commonly used approach that is recognised as inefficient because it reduces the available sample size.12–14,16,17,19–21,24,25,27,30–34 Two models used cross-validation3,28 and the other five models used bootstrapping.15,22,23,26,29
Quality in Prognostic Factor (PF) Studies (QUIPS tool).
Discussion
This systematic review has identified 24 CPMs that include intra-operative variables and were designed to predict short-term mortality and morbidity after adult cardiac surgery. The most common pre-operative variables were age and creatinine/eGFR, whilst the most frequent intra-operative variables were CPB time and RBC transfusion. The most common outcomes associated with the identified CPMs were acute kidney injury and mortality.
Overall model performance in terms of discrimination was broadly acceptable across the studies, with all models showing moderate to good discrimination on internal validation. Methodological issues were apparent in a number of the studies, with one-third of the models not reporting calibration. This is a common issue in CPM research and prevents a full assessment of model performance. Indeed, even when calibration is assessed, statistically unsound methods such as the H-L test are frequently utilised. Several models in this study have been externally validated;3,24,27 however, a full assessment of these external validations was beyond the scope of this review.
Some evidence was identified to suggest that model performance is improved with the addition of intra-operative variables. This is particularly apparent in the six nested models,12,13,22,32–34 where the models including intra-operative variables had better discrimination values than the baseline model, with improvements ranging from 0.025 to 0.25. Whilst only two studies strengthened this potentially beneficial effect by undertaking formal statistical analysis, in both cases the addition of intra-operative variables led to a statistically significant improvement in discriminatory ability. However, neither of these studies recorded calibration and both were single centre studies (n = 930 & n = 2905).
Most of the included models were observed to have a high or unclear risk of bias as a result of issues with reporting and methodological quality. These issues included inadequate sample sizes for model development and validation, split sampling for internal validation, omitting missing data at study initiation or analysis, univariable selection, and categorising continuous predictors. Only one model, by Kalisnik et al., 28 was assessed as low risk of bias. While high risk of bias does not mean that the models are unsuitable for use in clinical practice, they should generally be used with caution and only following successful external validation.
The findings of this review mirror the results of other systematic reviews of prediction models that have included pre-operative and intra-operative variables, although to the authors’ knowledge no such review exists within cardiac surgery. In a review by Grantham et al. 36 that examined combined intra-operative risk models in oesophagectomy surgery, no model could be confidently recommended for clinical use and all required further validation. A review of blood transfusion models in elective surgery by Dhiman et al. 37 concluded that the poor methodological quality and study reporting of models meant that none of the models could be considered for clinical practice without further research. A review of pre-operative and intra-operative scores used in colorectal surgery for surgical decision making by Venn et al. 38 found that calibration assessment was not always performed when assessing model performance and that, when it was, sub-optimal calibration metrics were used.
Few externally validated models that use both pre-operative and intra-operative variables to predict short-term outcomes appear to be used in routine clinical practice. The only model that incorporates intra-operative variables and is generalisable to different populations is the EuroSCORE II. 3 This model has undergone multiple external validations across a diverse range of time periods, geographical locations and patient populations. Despite the inclusion of intra-operative variables, the EuroSCORE II is designed to be used pre-operatively and is not intended to be used to facilitate post-operative management.
In the context of this work, precisely defining an intra-operative variable is of key importance. Whilst most variables are fairly easy to define as either “pre-operative” (e.g. demographics, comorbidities and investigations performed prior to surgery) or “intra-operative” (e.g. CPB and cross-clamp time), there are a number of variables which can potentially straddle both of these definitions. Type and extent of procedure fall within this area. Whilst the intended procedure (such as coronary artery bypass grafting) is discussed and confirmed prior to surgery, unexpected intra-operative findings or events can sometimes necessitate additional or alternative procedures being undertaken. Consequently, variables such as these should be considered as primarily pre-operative variables which may be modified intra-operatively.
This review has identified existing models developed for use in cardiac surgery, the risk of bias associated with them and their applicability to daily practice. It was conducted using the PRISMA approach to study search and selection. A limitation of this study is that a detailed review of external validations of the CPMs identified has not been performed. Another potential limitation is the decision to only include studies in English, which means that a number of relevant studies written in other languages may have been excluded. By design, the models identified in the review tended to focus on patients receiving coronary artery bypass graft, valvular, and combined valve and coronary artery bypass graft procedures, which may limit the generalisability of the findings. Despite this, the patient cohorts and outcome metrics across the studies remained relatively heterogeneous, which limits direct comparisons between models. 38 Although heterogeneous cohorts can limit direct comparisons of model performance between studies, heterogeneous validation cohorts are vital for a comprehensive assessment of CPM performance. In the future, better sharing of the multiple patient cohorts underpinning studies such as these could allow for models to be comprehensively evaluated and compared. 39
Although not included in this review because its study population did not meet the inclusion criteria, an important study on this topic is the analysis by Newland et al. 35 of Australian and New Zealand Collaborative Perfusion Registry (ANZCPR) data, which demonstrated that CPB parameters improve the prediction of 30-day mortality.
Conclusion
This systematic review has identified 24 models designed to predict short-term outcomes after cardiac surgery that include intra-operative variables. Whilst a number demonstrated acceptable model performance, all except one had a high or unclear risk of bias. Thus, issues with model design may explain their lack of use in facilitating intra-operative or post-operative patient management. At least two studies were identified that demonstrate that the inclusion of clinically relevant intra-operative variables in CPMs may improve model performance. Further work is required if intra-operative data are to be incorporated into CPMs designed to facilitate patient management.
Supplemental Material
Supplemental Material for A systematic review of cardiac surgery clinical prediction models that include intra-operative variables by Ceri Jones, Marcus Taylor, Matthew Sperrin and Stuart W. Grant in Perfusion.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
