Abstract
Objectives:
Machine learning has emerged as a potential tool for improving outcome prediction accuracy in anterior cruciate ligament (ACL) reconstruction (ACLR). A machine learning-based ACL revision prediction model has been developed using data from the Norwegian Knee Ligament Register (NKLR) but lacks external validation outside of Scandinavia. This study aimed to assess the external validity of the previously published ACL revision prediction model (https://swastvedt.shinyapps.io/calculator_rev/) using data from the Stability 1 randomized clinical trial. The hypothesis was that the model's performance in the Stability 1 cohort would be similar to its performance in the original study, indicating validity of the algorithm.
Methods:
This was a level 3 cohort study. The Cox Lasso prediction model from the original study was selected for external validation owing to its superior performance during model development and internal validation testing. The Stability 1 trial was an external randomized controlled trial that assigned patients to receive either hamstring tendon (HT) autograft or HT with lateral extra-articular tenodesis (LET). Patients from Stability 1 with all 5 predictors required by the Cox Lasso model were included. Since all patients in the Stability 1 trial received HT with or without LET, three configurations were tested: (1) all patients coded as HT, (2) HT + LET group coded as bone-patellar tendon-bone (BPTB) autograft, and (3) HT + LET group coded as unknown/other graft choice. The different configurations were tested because previous evaluation of the Stability 1 cohort by the Multicenter Orthopaedic Outcomes Network (MOON) suggests that failure rates of HT + LET are most similar to the failure rates of patients who receive BPTB. Concordance and calibration were calculated to determine model performance, with censoring of the time-to-event outcome.
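The three coding configurations and the concordance measure described above can be illustrated with a minimal sketch. It is not the study's code: the function names, the arm labels, and the choice of Harrell's concordance index are assumptions made for illustration, since this abstract does not state which concordance estimator was used.

```python
# Illustrative sketch only. Function names and arm labels are hypothetical,
# and Harrell's C is an assumed stand-in for the study's concordance measure.

def recode_graft(arm, config):
    """Map a trial arm ('HT' or 'HT+LET') to the graft category given to the
    prediction model under configuration 1, 2, or 3."""
    if arm == "HT":
        return "HT"
    if arm != "HT+LET":
        raise ValueError("unexpected trial arm: %r" % (arm,))
    mapping = {1: "HT", 2: "BPTB", 3: "unknown/other"}
    return mapping[config]


def harrell_c(times, events, risks):
    """Harrell's concordance index for right-censored data.

    times  : follow-up time for each patient
    events : 1 if revision was observed, 0 if censored
    risks  : model-predicted risk score (higher = earlier revision expected)

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event; ties in predicted risk count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Under these assumptions, each configuration would recode the HT + LET arm with `recode_graft`, obtain model risk scores for the recoded patients, and compare the resulting `harrell_c` values across the three configurations.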
Results:
In total, 591 patients from the Stability 1 trial were included, and 39 patients (6.6%) underwent revision surgery within 2 years. Model performance was best when patients randomized to HT + LET were coded as BPTB. Validation concordance was similar to that of the original NKLR prediction model for 1- and 2-year revision prediction (Stability: 0.71; NKLR: 0.68-0.69). The concordance confidence interval (CI) ranged from 0.63 to 0.79. The model was well calibrated for 1-year prediction, whereas the 2-year prediction demonstrated evidence of miscalibration.
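As a rough illustration of what calibration at a fixed horizon means for a censored outcome, the sketch below compares the mean predicted revision probability with a Kaplan-Meier estimate of the observed revision probability at that horizon. The abstract does not describe how calibration was assessed, so this approach and all function names are assumptions, not the study's method.

```python
# Illustrative sketch only. This is one simple calibration check under
# right censoring; names are hypothetical and not taken from the study.

def km_event_probability(times, events, horizon):
    """Kaplan-Meier estimate of the probability of an event by `horizon`."""
    survival = 1.0
    # Distinct times at which at least one event occurred, up to the horizon.
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - n_events / at_risk
    return 1.0 - survival


def calibration_gap(times, events, predicted, horizon):
    """Mean predicted event probability minus the Kaplan-Meier observed
    probability at `horizon`. Values near zero suggest good calibration
    on average; positive values suggest overprediction."""
    observed = km_event_probability(times, events, horizon)
    expected = sum(predicted) / len(predicted)
    return expected - observed
```

A fuller assessment would typically repeat this comparison within groups of patients ordered by predicted risk, rather than only for the cohort as a whole.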
Conclusions:
The most important finding of this study was that when patients in the Stability 1 trial who received HT + LET were coded as BPTB in the Norwegian prediction calculator, the model concordance was similar to that of the index study. However, the CI for the validation set was wider than that of the original model, suggesting that more data are required to definitively determine external validity. The finding that HT + LET behaved most similarly to BPTB supports the notion that the addition of LET can decrease the failure and subsequent revision rate of primary ACLR when HT is used. In addition, if further external validation efforts are consistent with this study, the prediction model may require updating to equate HT + LET to BPTB for the purposes of prospectively estimating revision surgery risk.
