Abstract
BACKGROUND:
Despite the promising effects of robot-assisted gait training (RAGT) on balance and gait in post-stroke rehabilitation, the optimal predictors of fall-related balance and effective RAGT attributes remain unclear in post-stroke patients at a high risk of fall.
OBJECTIVE:
We aimed to determine the most accurate clinical machine learning (ML) algorithm for predicting fall-related balance factors and identifying RAGT attributes.
METHODS:
We applied five ML algorithms (logistic regression, random forest, decision tree, support vector machine [SVM], and extreme gradient boosting [XGBoost]) to a dataset of 105 post-stroke patients undergoing RAGT. The variables included the Berg Balance Scale score, walking speed, steps, hip and knee active torques, functional ambulation categories, Fugl–Meyer assessment (FMA), the Korean version of the Modified Barthel Index, and fall history.
RESULTS:
The random forest algorithm excelled in predicting balance improvement (area under the receiver operating characteristic curve [AUC] = 0.91), outperforming the SVM (AUC = 0.76) and XGBoost (AUC = 0.71). Key determinants identified were knee active torque, age, step count, number of RAGT sessions, FMA, and hip torque.
CONCLUSION:
The random forest algorithm was the best prediction model for identifying fall-related balance and RAGT determinants, highlighting the importance of key factors for successful RAGT outcome performance in fall-related balance improvement.
Introduction
Post-stroke falls are a leading cause of long-term disability and are associated with hip and knee muscle weakness, impaired dynamic balance, and sensorimotor impairment (Kligyte et al., 2003; Marigold et al., 2004; Nobile et al., 2014). Without appropriate management, post-stroke falls can result in serious secondary problems (Batchelor et al., 2012; Dockery & Sommerville, 2015). Additionally, age, sex, sensorimotor recovery, dynamic balance, gait speed, and a history of falls have been identified as important demographic and clinical attributes of post-stroke falls (Campbell & Matthews, 2010; Foster et al., 2018).
To mitigate post-stroke falls, we developed a novel robot-assisted gait training (RAGT) system, which is designed to provide dynamic balance and gait training using impedance force and an assist-as-needed mode in individuals at a high risk of fall after stroke (Seo et al., 2018; Aprile et al., 2022; Calafiore et al., 2022). Current evidence suggests that RAGT is effective and promising for improving balance and gait function in post-stroke rehabilitation; however, fall-related and RAGT attributes have not been identified in post-stroke individuals at a high risk of fall (Tedla et al., 2019; Kuo et al., 2021; Loro et al., 2023). Moreover, the determinants among fall-related balance clinical outcome measures and RAGT parameters that contribute most to recovery outcomes in post-stroke patients at a high risk of fall remain unknown (Lu et al., 2021; Yang et al., 2021; Jonsdottir et al., 2023).
Fall-related balance clinical outcome measures include the Berg Balance Scale (BBS), Functional Ambulation Categories (FAC), Fugl–Meyer assessment (FMA), Korean version of the Modified Barthel Index (KBMI), and fall history (Maeda et al., 2009; Hiengkaew et al., 2012; Shin et al., 2013). The RAGT parameters encompass walking speed, steps, hip and knee active torques, and number of RAGT sessions (Tanaka et al., 2019; Park et al., 2022). Hence, there is a need to accurately predict and identify fall-related balance and gait factors and to determine the most effective RAGT attributes for post-stroke falls (Lee & Jung, 2017; Jonsdottir et al., 2023; Abdollahi et al., 2024).
Contemporary machine learning (ML) studies have demonstrated outstanding precision in predicting and identifying important determinants associated with the effectiveness of RAGT properties on gait recovery, which has helped improve clinical decision-making for robotic stroke rehabilitation (Kuo et al., 2021; Wardhana et al., 2023). Nevertheless, the accurate prediction of fall-related balance and gait attributes and the identification of RAGT attributes in post-stroke patients at a high risk of fall warrant further investigation. We aimed to determine which of five clinical ML algorithms (logistic regression, random forest, decision tree, support vector machine [SVM], and extreme gradient boosting [XGBoost]) most accurately predicts fall-related balance factors, and to identify the most effective RAGT attributes in post-stroke individuals at a high risk of fall.
Materials and methods
Data collection
This retrospective analysis involved data collected from 163 patients with subacute stroke who underwent Walkbot training at Myongji Choonhey Rehabilitation Hospital between March 2018 and December 2023. After ensuring data completeness and integrity, the sample was narrowed to 105 adult patients diagnosed with subacute or chronic stroke. The inclusion criteria were as follows: (1) first occurrence of ischemic or hemorrhagic stroke; (2) completion of at least 10 sessions of stroke rehabilitation intervention; (3) an initial BBS score of < 21; (4) an initial FAC score of < 4; and (5) no other complicating neurological conditions such as dementia or brain tumors. The exclusion criteria were as follows: (1) participation in any other medical or rehabilitation study within the past 6 months; (2) receipt of botulinum toxin injections within the last 3 months; (3) presence of severe verbal, cognitive, or visual deficits, as identified by the National Institutes of Health Stroke Scale; and (4) any prior surgical interventions that could influence balance or gait. Figure 1 illustrates the methodological flowchart used in this study. This study was approved by the Institutional Review Board of Myongji Choonhey Rehabilitation Hospital (MJCHIRB-2023-02) on October 04, 2023.

Flowchart of model development for predicting the effectiveness of robot-assisted gait training for patients with stroke.
The stroke rehabilitation intervention combined conventional stroke neurorehabilitation (CSN) and RAGT, each lasting 30 min per session, for a total of 60 min daily, 5 days a week. CSN incorporated proprioceptive neuromuscular facilitation (PNF) and neurodevelopmental treatment (NDT) techniques to improve sensorimotor recovery and function (Krukowska et al., 2016; Smedes et al., 2016; Smedes & da Silva, 2019).
Balance improvement was assessed using the BBS, which included 14 items rated on a scale of 0 (unable to perform) to 4 (performed with ease) (Wang et al., 2021; Alshahrani & Reddy, 2024). A high risk of fall in post-stroke patients is operationally defined as a BBS score < 20 (Berg, 1992; Eichler et al., 2022). Patients were categorized into high or low improvement groups based on the minimal clinically important difference, defined as a 12.5-point change in the BBS (Song et al., 2018). The distribution of changes in BBS scores was analyzed, with changes > 12.5 points categorized as high improvement (55 patients, 52.4%) and changes of ≤12.5 points as low improvement (50 patients, 47.6%).
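The grouping rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: only the 12.5-point threshold comes from the text, while the function name `label_improvement`, the constant name, and the example scores are hypothetical.

```python
# Minimal sketch of the dichotomization rule described above.
# Only the 12.5-point threshold comes from the text; all names and
# example scores below are hypothetical.
MCID_BBS = 12.5  # minimal clinically important difference, in BBS points


def label_improvement(bbs_first: float, bbs_last: float) -> int:
    """Return 1 (high improvement) if the BBS change exceeds the MCID, else 0."""
    change = bbs_last - bbs_first
    return 1 if change > MCID_BBS else 0


# Hypothetical (first-session, last-session) BBS scores for three patients.
patients = [(8, 25), (15, 27), (10, 22.5)]
labels = [label_improvement(first, last) for first, last in patients]
print(labels)
```

Note that a change of exactly 12.5 points falls in the low-improvement group, matching the "> 12.5" versus "≤12.5" split stated above.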
Prediction model selection and training
This phase aimed to predict balance improvement by selecting the optimal number of input sessions based on the highest area under the receiver operating characteristic (ROC) curve (AUC) in the test set. Ten-fold cross-validation on the training dataset was used to develop and validate Model 1 with five ML algorithms (logistic regression, random forest, decision tree, SVM, and XGBoost) to obtain a robust estimate of model performance. The predictive models were developed and evaluated using Python (version 3.11.0; Python Software Foundation).
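A comparison of this kind can be sketched with scikit-learn as follows. This is an illustrative sketch under stated assumptions, not the study's pipeline: the data are synthetic, the hyperparameters are arbitrary, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example depends on a single library. The source does not report the preprocessing or tuning that was actually used.

```python
# Sketch: compare five classifiers by 10-fold cross-validated ROC AUC.
# Assumptions (not from the study): synthetic data, arbitrary hyperparameters,
# and GradientBoostingClassifier used as a stand-in for XGBoost.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a 105-patient feature table with a binary outcome.
X, y = make_classification(
    n_samples=105, n_features=12, n_informative=5, random_state=0
)

models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Stratified folds keep the class ratio similar in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
mean_auc = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, auc in sorted(mean_auc.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: mean AUC = {auc:.3f}")
```

Scaling is wrapped in a pipeline for the logistic regression and SVM so that it is refit inside each fold, which avoids leaking information from the validation fold into training.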
Model evaluation
The model performance was evaluated based on accuracy, recall, precision, F1 score, and AUC (Wardhani et al., 2019). The accuracy was calculated as the sum of true positives and true negatives divided by the total number of predictions (Ali et al., 2023). Recall is the proportion of actual positives that are correctly identified, whereas precision is the proportion of positive predictions that are correct (Rostampour, 2023). The F1 score, which represents the harmonic mean of the precision and recall, was used for a balanced assessment of the model’s performance (Liao et al., 2022). The AUC-ROC demonstrates the discriminatory power of the model across all classification thresholds (Rostampour, 2023). Together, these metrics provide a comprehensive assessment of the model’s predictive performance, from its overall accuracy to its balance between precision and recall to its discriminative power, as illustrated by the AUC-ROC (Xiong et al., 2024).
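The metric definitions above can be written out directly in plain Python. The sketch below is illustrative only: the function names and the example counts and scores are hypothetical, and the AUC is computed through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case).

```python
# Sketch of the evaluation metrics defined above, using only plain Python.
# Function names and example numbers are hypothetical. The functions assume
# non-zero denominators (at least one positive prediction and one positive case).


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, scores) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


metrics = classification_metrics(tp=8, fp=2, tn=7, fn=3)
auc = roc_auc([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
print(metrics, auc)
```

The pairwise AUC is quadratic in the number of cases, which is fine for a small test set; library implementations use a sorted-threshold approach instead.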
Statistical analysis
We employed a descriptive statistical analysis to differentiate patients based on their level of improvement following RAGT, incorporating clinical data and parameters derived from the RAGT sessions. The BBS was used as a comparative measure of patient progress from the initial session to the final session before discharge. The dataset included continuous variables (such as age, number of RAGT sessions, FAC, FMA, KBMI, and fall history) and categorical variables (such as the affected brain side, sex, and diagnosis). Means and standard deviations were calculated for continuous variables, and frequencies and percentages were used for categorical variables. The independent t-test and chi-square test were applied for continuous and categorical variables, respectively, to compare the baseline characteristics between the improved and nonimproved groups. Statistical significance was set at p < 0.05. All statistical analyses were performed using the SPSS software (version 27, IBM, Chicago, IL).
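For readers who want to see what the chi-square comparison of a categorical variable between two groups involves, the following sketch computes the Pearson statistic for a 2x2 table with the standard library only. It is not the study's analysis, which was run in SPSS: the counts are hypothetical, no continuity correction is applied, and the function name `chi_square_2x2` is an illustrative choice.

```python
# Sketch: Pearson chi-square test for a 2x2 contingency table (1 degree of freedom).
# The counts are hypothetical and no continuity correction is applied.
import math


def chi_square_2x2(table):
    """Return (chi2, p_value) for a 2x2 table given as [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = low/high improvement, columns = category A/B.
table = [[30, 20], [20, 35]]
chi2, p = chi_square_2x2(table)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

The closed-form p-value holds only for one degree of freedom; larger tables need the general chi-square distribution from a statistics library.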
Results
RAGT parameters
Descriptive statistics for the continuous and categorical variables of the 105 patients, including the means and standard deviations, are provided in Tables 1 and 2. Significant differences were observed between the low- and high-improvement groups regarding patient age and history of falls. Regarding the RAGT parameters, significant differences were noted between the two groups in first left knee torque and last right knee torque.
Descriptive statistics of continuous variables of the improvement and no-improvement groups in RAGT
*p < 0.05; RAGT, robot-assisted gait training; FMA, Fugl–Meyer assessment; KBMI, Korean version of the Modified Barthel Index; LT, left; Rt, right; SD, standard deviation.
Descriptive statistics of categorical variables of the improvement and no-improvement groups in RAGT
*p < 0.05; RAGT, robot-assisted gait training; FAC, functional ambulation category; CVA, cerebrovascular accident.
We assessed five ML models (logistic regression, random forest, decision tree, SVM, and XGBoost) using patient data to predict balance improvement after Walkbot training. The models were evaluated using performance metrics including AUC, accuracy, sensitivity, and specificity (Table 3). Random forest was the best method, achieving an AUC of 0.803 and an accuracy of 68.9% during training; the evaluation of the test set demonstrated an accuracy of 79%, precision of 0.74, F1 score of 0.77, and a notable ROC AUC of 0.91 (Table 4 and Fig. 2).
Machine learning model predictive accuracy for BBS improvements
Prediction performance of the model using clinical data and RAGT parameters for improvement by ten-fold cross-validation. BBS, Berg Balance Scale; AUC, area under the curve; RAGT, robot-assisted gait training.
Machine learning model predictive accuracy of test set for BBS improvements
BBS, Berg Balance Scale; AUC, area under the curve.

Retrained machine learning model results. The receiver operating characteristic curve is used to evaluate the performance of the classification model.
As shown in Table 5, the precision to predict patients who would show low improvement (Class 0) was calculated at 0.80. The recall for this class was 0.89. For the prediction of high patient improvement (Class 1), the model achieved a marginally superior precision of 0.83. Conversely, recall was lower at 0.71. The resulting F1 score for this category stood at 0.77. The overall accuracy of the model was 0.81, indicating that it correctly predicted 81% of the test set outcomes. The macro-average precision was 0.82, with an almost parallel average recall of 0.80, yielding a macro-average F1 score of 0.81.
Classification efficacy in predicting balance improvement
Figure 3 shows that the last session knee torque is the attribute considered the most important within the random forest model, followed by the first-session knee torque, age, last step rate, number of RAGT sessions, post-FMA, first hip torque, last hip torque, KBMI, and pre-FMA.

The 10 most important features in the random forest model. RAGT, robot-assisted gait training; FMA, Fugl–Meyer assessment; KBMI, Korean version of the Modified Barthel Index.
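A feature ranking of this kind can be obtained from a fitted random forest as sketched below. This is an assumption-laden illustration rather than the study's procedure: the source does not state which importance measure was used, so the sketch uses scikit-learn's impurity-based `feature_importances_`; the data are synthetic, and the feature names are hypothetical placeholders attached to random columns, so the printed ranking carries no clinical meaning.

```python
# Sketch: rank features by impurity-based importance in a random forest.
# Assumptions: synthetic data; feature names are hypothetical placeholders,
# so the resulting order does not reflect the study's findings.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

feature_names = [
    "last_knee_torque", "first_knee_torque", "age", "last_step_rate",
    "n_ragt_sessions", "post_fma", "first_hip_torque", "last_hip_torque",
    "kbmi", "pre_fma", "walking_speed", "fac",
]

X, y = make_classification(
    n_samples=105, n_features=len(feature_names), n_informative=5, random_state=0
)

forest = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)

# Pair each name with its importance and keep the ten largest values.
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
top10 = ranked[:10]

for name, importance in top10:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances sum to 1 across features and can favor continuous or high-cardinality variables; permutation importance on held-out data is a common cross-check.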
Discussion
The random forest algorithm was excellent in the prediction of fall-related balance improvement. Most importantly, knee active torque or strength, followed by age, steps, number of RAGT sessions, FMA, and hip torque, were identified as major determinants. The present investigation is the first clinical ML study to highlight the importance of key factors for successful RAGT outcome performance in fall-related balance and gait improvement.
Five ML algorithms were implemented to predict and identify fall-related balance factors and RAGT determinants in a dataset of 105 patients with hemiparetic stroke. The fall-related balance factors, including the BBS score, and the RAGT determinants, encompassing speed, steps, and hip and knee active torques recorded during RAGT sessions, were used along with the clinical variables (FAC, FMA) to identify key predictors of RAGT neurorehabilitation. The random forest algorithm (AUC = 0.91) was excellent in predicting fall-related balance improvement, followed by SVM (AUC = 0.762) and XGBoost (AUC = 0.708), a finding consistent with earlier clinical ML evidence (Kuo et al., 2021; Campagnini et al., 2022). Kuo et al. (2021) demonstrated that the random forest algorithm (AUC = 0.879) performed best in predicting FAC improvement, followed by XGBoost (AUC = 0.854) and logistic regression (AUC = 0.853). Campagnini et al. (2022) confirmed that the random forest algorithm had the best overall accuracy (76.2%) among models predicting the functional prognosis of post-stroke patients, followed by SVM (72.6%) and logistic regression (64.1%). AUC values of ≥0.80 are considered to represent excellent discrimination and values between 0.70 and 0.79 acceptable discrimination, and these metrics are particularly valuable in situations that require greater precision (Mar et al., 2020). Moreover, the random forest algorithm demonstrated superior performance compared with the four other ML models in this study. This advantage is primarily attributed to the ensemble method, which effectively manages complex datasets and is robust to overfitting, outliers, and incomplete data (Liu et al., 2012; Schonlau & Zou, 2020).
The random forest model identified five major determinants of balance: knee active torque from the last and first sessions, followed by age, step rate, and number of RAGT sessions. Knee active torque or strength was the most important attribute, which was potentially enhanced by the application of RAGT. Similar to our findings, Marques et al. (2017) suggested that increased active knee torque improves balance performance. Age is another determining factor, as evidenced by older post-stroke patients having lower BBS scores at admission (Maeda et al., 2009). Additionally, Meyer et al. (2015) suggested that older age and greater stroke severity negatively affect functional and motor recovery. The last step rate and number of RAGT sessions were also important determinants of fall-related balance. The last step rate variable might be indicative of the degree of recovery of the patient’s gait asymmetry. Gait asymmetry is correlated with stride length and balance measures (r = 0.39 to 0.54), suggesting an association between gait asymmetry and falls after stroke (Lewek et al., 2014). Moreover, our results are consistent with those of previous studies showing a relationship between balance ability and step variables, including gait speed, step width, and symmetry, in patients with hemiplegic stroke (r = -0.36 to 0.63) (Lewek et al., 2014; Liu et al., 2016; An et al., 2017).
The number of RAGT sessions was identified as an important determinant. Earlier RAGT studies suggest that an increased number of RAGT sessions is positively associated with fall-related balance improvements (Swinnen et al., 2014; Chung, 2017). Straudi et al. (2020) found that stroke patients who achieved higher functional recovery spent more time in the hospital and received more RAGT sessions than those who did not. This finding supports the idea that the RAGT stroke rehabilitation strategy enhances hip and knee muscle strength (as evidenced by hip and knee torque) in the paretic limb, leading to improvements in fall-related balance control, gait symmetry, and gait speed.
Study limitations
The limitations of this study should be considered in future research. One major limitation is that our study included a heterogeneous sample and a variable number of RAGT intervention sessions. Another limitation is that we focused only on changes in fall-related balance in response to RAGT parameters and clinical outcome measures. Careful interpretation should be applied when generalizing our RAGT parameters for subacute post-stroke neurorehabilitation. Nevertheless, this is the first clinical evidence highlighting an important predictive model and identifying the cardinal determinants of RAGT intervention.
Conclusion
Our clinical ML analysis demonstrated that the random forest algorithm was the best prediction model for identifying fall-related balance factors and RAGT determinants, highlighting the importance of key factors for successful RAGT outcome performance in fall-related balance and gait improvement. Our results provide important clinical insights into how clinical ML can help accurately identify fall-related balance factors and important attributes of RAGT protocols when designing effective and sustainable RAGT strategies in post-stroke individuals at a higher risk of falls.
Footnotes
Acknowledgments
The authors have no acknowledgments.
Conflict of interest
The authors have no conflicts of interest to declare.
Funding
This study was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korean Government (MSIT) (Grant No. RS-2023-00221762); Regional Innovation Strategy (RIS) through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (MOE) (2022RIS-005); the “Brain Korea 21 FOUR Project”; the Korean Research Foundation for the Department of Physical Therapy in the Graduate School of Yonsei University (Grant No. 2021-51-0151); and the Institute for Project-Y Seed Grant of 2023 (Grant No. 2023-22-0277).
Ethics statement
This study was approved by the Institutional Review Board of Myongji Choonhey Rehabilitation Hospital (MJCHIRB-2023-02) on October 04, 2023.
Data availability
The datasets used and/or analyzed in the current study are available from the corresponding author upon reasonable request.
