Abstract
Background
With the aging population, muscle weakness and physical decline are pressing public health concerns. Health outcomes are influenced by both physiological factors and social determinants of health; however, the interplay between these remains underexplored.
Objective
This study aimed to identify risk factors for muscle weakness and physical decline in older adults, integrating physiological and social determinants of health variables, and to develop a predictive model for early risk assessment.
Methods
Using prospective data from the China Health and Retirement Longitudinal Study with 9 years of follow-up (baseline predictors from 2011, outcomes from 2020), we applied logistic regression, recursive feature elimination, and XGBoost algorithms to construct predictive models.
Results
Key risk factors included age, ethnicity, cognitive function, and physical activity. Social determinants of health variables such as marital status, life satisfaction, and educational level were significant predictors. SHapley Additive exPlanations analysis revealed that social determinants of health variables significantly enhanced model performance and interpretability.
Conclusion
Integrating social determinants of health with clinical indicators improves the prediction of muscle weakness and physical decline in older adults. The study highlights the need for personalized interventions that consider both physiological and social factors, offering valuable insights for public health policy and health management in the aging population.
Introduction
With population aging, muscle weakness and physical decline have become significant public health concerns. They impair quality of life and may lead to disability, falls, and other serious complications, placing a heavy burden on individuals, families, and society.1–3 Existing studies show that health status is shaped by physiological and metabolic mechanisms as well as by external factors such as social environment, economic status, educational level, and marital and family support. These external factors are collectively referred to as social determinants of health (SDOH). 4 SDOH partly explain the unequal health status among different groups and provide an important theoretical foundation for public health policies and health interventions.
For a long time, traditional statistical methods have mainly focused on individual physiological or lifestyle factors, such as age, underlying diseases, and lifestyle habits. Although these methods can identify some health risks, they are insufficient to fully capture the impact of complex social backgrounds on health. 5 In recent years, an increasing number of studies have focused on the combined effect of SDOH on health outcomes. 6 For example, marital status, income, and social support have been confirmed to be closely related to chronic disease management, mental health, and quality of life. 7 These findings suggest that analyzing individual factors alone cannot adequately capture the multidimensional interactions influencing health, highlighting the need for interdisciplinary research approaches to uncover these complex relationships. 8
The China Health and Retirement Longitudinal Study (CHARLS) provides comprehensive data resources for the study of SDOH, covering various aspects such as demographic characteristics, economic status, family structure, and social support among older adults. 9 These data not only enable a better understanding of individuals’ health status but also allow the exploration of health inequalities from the perspective of social environment and life background. 10 Based on this, this study uses CHARLS data, integrating statistical analysis with simplified machine learning methods to systematically investigate the factors influencing muscle weakness and physical decline in older adults. 11 The study aims were as follows: (a) to comprehensively identify multidimensional risk factors, including physiological and social environmental factors; (b) to explore the nonlinear relationships and interactions between SDOH variables and health outcomes; and (c) to develop a predictive model suitable for early risk assessment, providing a scientific basis for the formulation of personalized prevention and intervention strategies.
By adopting an interdisciplinary perspective, this study not only broadens the traditional boundaries of health research in the aging population but also offers novel ideas and approaches to address health inequalities among older adults. By integrating SDOH with traditional clinical indicators, this study aimed to reveal a more detailed spectrum of health risks, provide data for public health policy formulation, and drive innovation in health management models in the aging population.
Methods
This study investigated risk factors for muscle weakness 12 and physical decline 13 in older adults based on the CHARLS data, with a particular focus on the role of SDOH. Since 2011, CHARLS has used multistage probability sampling to cover 28 provinces, 150 counties, and 450 villages across the country, collecting basic information, family economic support, health status, health insurance, employment, assets, physical measurements, and blood samples from individuals aged ≥45 years through computer-assisted interviews. This prospective longitudinal cohort study utilized CHARLS data with a 9-year follow-up period. Baseline predictor variables were collected in 2011, and outcome measures were assessed in 2020 (wave 5), providing a clear temporal sequence for risk prediction modeling. The risk interval for both muscle weakness and physical decline was defined as the 9-year period from 2011 to 2020. After excluding participants with insufficient baseline data, missing key information, or no follow-up assessments, a total of 17,708 participants with complete baseline data from 2011 and outcome data from 2020 were included in the final analysis.
Muscle weakness was operationalized as a binary variable (0/1) based on grip strength below the 20th percentile for age and sex or walking speed <0.8 m/s. Regarding cutoff justification, the 20th percentile for grip strength was chosen to achieve optimal sensitivity/specificity in Asian populations (AWGS criteria), while walking speed of 0.8 m/s represents the international consensus threshold for mobility limitation and functional decline.
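The definition above can be sketched as follows. This is a minimal illustration, not the study's code: the field names (`age_group`, `sex`, `grip_kg`, `gait_m_s`), the nearest-rank percentile rule, and the toy records are assumptions, since the text does not state how the 20th percentile was computed within age and sex strata.

```python
import math
from collections import defaultdict

def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile (an assumed convention, not taken from the study)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def flag_muscle_weakness(records, grip_pct=20, speed_cutoff=0.8):
    """Return 0/1 per record: grip below the stratum percentile OR gait speed < cutoff."""
    strata = defaultdict(list)
    for r in records:
        strata[(r["age_group"], r["sex"])].append(r["grip_kg"])
    cutoffs = {key: percentile_nearest_rank(vals, grip_pct) for key, vals in strata.items()}
    flags = []
    for r in records:
        low_grip = r["grip_kg"] < cutoffs[(r["age_group"], r["sex"])]
        slow_gait = r["gait_m_s"] < speed_cutoff
        flags.append(int(low_grip or slow_gait))
    return flags

# Hypothetical single stratum: grip 14..23 kg, all walking at 1.0 m/s.
records = [
    {"age_group": "60-69", "sex": "F", "grip_kg": 14.0 + i, "gait_m_s": 1.0}
    for i in range(10)
]
records[9]["gait_m_s"] = 0.7  # strong grip but slow gait

flags = flag_muscle_weakness(records)
```

In this toy stratum the weakest-grip participant is flagged through the grip criterion and the last participant through the gait criterion alone, which shows that either condition is sufficient.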
Physical decline was defined as functional deterioration from 2011 to 2020, operationalized as any of the following: worsening of activities of daily living (ADL) performance (≥1 point increase in dependency), worsening of instrumental activities of daily living (IADL) performance (≥2 points increase in dependency), or a decrease in walking speed of ≥0.1 m/s. Regarding cutoff justification, these thresholds represent minimal clinically important differences validated for predicting institutionalization and mortality risk in longitudinal aging studies.
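A minimal sketch of this composite rule is shown below. It assumes that higher ADL and IADL scores indicate greater dependency and that each measure is available at both waves; the function and argument names are hypothetical.

```python
def physical_decline(adl_2011, adl_2020, iadl_2011, iadl_2020, gait_2011, gait_2020):
    """Return 1 if any of the three deterioration criteria is met, else 0."""
    eps = 1e-9  # tolerance so that e.g. 0.9 - 0.8 is treated as a 0.1 m/s decrease
    adl_worse = (adl_2020 - adl_2011) >= 1        # >=1 point more dependency
    iadl_worse = (iadl_2020 - iadl_2011) >= 2     # >=2 points more dependency
    gait_worse = (gait_2011 - gait_2020) >= 0.1 - eps  # slowed by >=0.1 m/s
    return int(adl_worse or iadl_worse or gait_worse)

stable = physical_decline(0, 0, 0, 1, 1.00, 0.95)      # no criterion met
adl_case = physical_decline(0, 1, 0, 0, 1.00, 1.00)    # ADL criterion
iadl_case = physical_decline(0, 0, 0, 2, 1.00, 1.00)   # IADL criterion
gait_case = physical_decline(0, 0, 0, 0, 0.90, 0.80)   # gait criterion
```

The small tolerance is a design choice for floating-point data: without it, a measured decrease of exactly 0.1 m/s can fall just below the threshold after subtraction.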
In addition to physiological and metabolic indicators, this study incorporated variables reflecting SDOH, such as hospitalization frequency, marital and fertility history, sex, age, length of hospital stay, body mass index (BMI), and ethnicity (Han or other). These variables are supported by existing literature demonstrating their correlation with health outcomes. This analysis aimed to reveal the independent effects of social environment, family support, and cultural background on health risks in older adults, thereby providing a basis for personalized prevention.
Statistical analysis
Categorical variables were compared using chi-square tests and presented as frequencies and percentages. Continuous variables were transformed into quartiles (Q1–Q4) to improve analytical stability and facilitate clinical interpretation.
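The quartile transformation can be illustrated with the Python standard library. This is a sketch under assumptions: the cut-point convention (the default "exclusive" method of `statistics.quantiles`) and the rule that values equal to a cut point fall in the lower quartile are illustrative choices, not details reported in the study.

```python
import bisect
import statistics

def to_quartiles(values):
    """Map each continuous value to a label Q1..Q4 using sample quartile cut points."""
    cuts = statistics.quantiles(values, n=4)  # three cut points
    labels = ["Q1", "Q2", "Q3", "Q4"]
    # bisect_left counts the cut points strictly below v, giving an index 0..3
    return [labels[bisect.bisect_left(cuts, v)] for v in values]

quartile_labels = to_quartiles([1, 2, 3, 4, 5, 6, 7, 8])
```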
Predictive models were developed using three complementary approaches: (a) multivariable logistic regression 14 with stepwise selection to identify independent risk factors; (b) recursive feature elimination (RFE) 15 for optimal feature selection; and (c) XGBoost 16 algorithms for capturing nonlinear relationships and complex interactions.
Model performance was comprehensively evaluated using discrimination metrics (area under the curve–receiver operating characteristic (AUC–ROC), C-index, sensitivity, and specificity) and calibration metrics (Hosmer–Lemeshow test and Brier score). Internal validation was performed using bootstrap resampling (1000 iterations) to assess model stability. Model interpretability was enhanced using SHapley Additive exPlanations (SHAP) analysis to quantify individual variable contributions.
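For readers unfamiliar with the two headline metrics, the sketch below computes AUC–ROC (as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case) and the Brier score (the mean squared difference between predicted probability and outcome). The toy labels and probabilities are hypothetical and unrelated to the CHARLS data.

```python
def auc_roc(y_true, y_prob):
    """AUC as the share of case/non-case pairs ranked correctly (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared error of predicted probabilities (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

y_true = [0, 0, 1, 1]
y_prob = [0.10, 0.40, 0.35, 0.80]

auc = auc_roc(y_true, y_prob)        # 3 of 4 pairs ranked correctly
brier = brier_score(y_true, y_prob)
```

For a binary outcome this pairwise AUC coincides with the C-index, which is why the two discrimination measures are expected to be close to each other.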
Statistical analyses were performed using R version 4.3.0 and Python 3.9 with XGBoost and SHAP packages. Statistical significance was set at p < 0.05.
For missing values of certain continuous variables (such as sleep duration, cognitive scores, and BMI) and binary variables (such as hypertension, diabetes, and sex), multiple imputation methods were applied. For binary variables, logistic regression imputation was used, and for continuous variables, K-nearest neighbor imputation was employed, ensuring that the imputed data matched the distribution of the original data.
This prediction model development study was conducted and reported in accordance with the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement guidelines. The large sample size (n = 17,708) provides adequate statistical power for developing stable prediction models with multiple predictors. The TRIPOD checklist documenting compliance with all relevant reporting items is provided as Supplementary Material.
This observational study was conducted and reported in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement guidelines. 17
Data protection
All participant data were deidentified prior to analysis to ensure complete anonymity and prevent individual identification in accordance with privacy protection requirements.
Results
Over the 9-year follow-up period (2011–2020), of the 17,708 participants with complete baseline and outcome data, 2968 (16.8%) developed muscle weakness and 10,224 (57.7%) experienced physical decline.
Baseline characteristics were compared between participants with and without each outcome (Tables 1 and 2). In the overall sample, 83.24% of the participants showed no muscle weakness, while 16.76% showed muscle weakness; moreover, 42.26% of the participants showed no physical decline, while 57.74% showed physical decline. Significant intergroup differences (p < 0.05) were observed for multiple indicators, including ethnicity, cardiovascular disease, stroke, marital status, exercise habits, life satisfaction, educational level, cognitive function, and ADL/IADL scores, suggesting that both outcomes are multifactorial.
Table 1. Baseline characteristics and risk factors for muscle weakness in older adults (n = 17,708).
*p < 0.05; *p < 0.001. Data are presented as n (%). χ2 tests were used for categorical comparisons. Q1–Q2: lower quartiles; Q3–Q4: upper quartiles; ADL: activities of daily living; IADL: instrumental activities of daily living; HbA1c: glycated hemoglobin.
Table 2. Baseline characteristics and risk factors for physical decline in older adults (n = 17,708).
*p < 0.05; *p < 0.001. Data are presented as n (%). χ2 tests were used for categorical comparisons. Q1–Q2: lower quartiles; Q3–Q4: upper quartiles; ADL: activities of daily living; IADL: instrumental activities of daily living; HbA1c: glycated hemoglobin.
Multifactor logistic regression analysis
In the univariate and multivariable logistic regression analyses (Supplementary Materials; Tables 3 and 4), we identified risk factors common to muscle weakness and physical decline: a history of stroke, disability status, and a higher frailty index all significantly increased the risk of both conditions (muscle weakness: odds ratios (ORs) of 2.08, 1.36, and 1.99; physical decline: ORs of 1.75, 1.45, and 1.50, respectively). Each outcome also had distinct risk factors. For muscle weakness, Han ethnicity (OR = 1.35), old age (OR = 2.35), and cancer (OR = 1.77) were major risk factors, while being female was protective (OR = 0.73). For physical decline, being female carried a higher risk (OR = 1.58), while minority ethnicity (OR = 0.60) and higher uric acid levels (OR = 0.84) were associated with lower risk. Both outcomes were significantly associated with physical activity, but in different ways: moderate-intensity exercise was associated with reduced odds of developing muscle weakness (OR = 0.65, indicating 35% lower odds than in sedentary participants), whereas physical decline was inversely associated with both vigorous exercise (OR = 0.75) and moderate exercise (OR = 0.72) relative to inactive individuals. Additionally, higher IADL scores were associated with an increased risk of both muscle weakness (OR = 1.06) and physical decline (OR = 1.10), indicating that reduced functional independence is associated with adverse health outcomes.
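The odds ratios above can be read with the help of a small sketch. It shows how a logistic regression coefficient maps to an OR, how an OR translates into a percentage change in odds, and how the same OR changes an absolute risk. The 20% baseline risk is a hypothetical value chosen for illustration, not an estimate from the study.

```python
import math

def odds_ratio_from_coef(beta):
    """A logistic regression coefficient is a log-odds ratio."""
    return math.exp(beta)

def percent_change_in_odds(odds_ratio):
    """OR = 0.65 corresponds to a 35% reduction in odds."""
    return (odds_ratio - 1.0) * 100.0

def apply_odds_ratio(baseline_prob, odds_ratio):
    """Convert a baseline probability to odds, scale by the OR, convert back."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)

change = percent_change_in_odds(0.65)     # moderate exercise, muscle weakness model
new_risk = apply_odds_ratio(0.20, 0.65)   # hypothetical 20% baseline risk
```

Under this hypothetical baseline, 35% lower odds corresponds to a risk of about 14% rather than 13%, a reminder that a change in odds is not the same as an equal change in risk.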
Table 3. Univariate and multivariable logistic regression results for muscle weakness.
BMI: body mass index; CI: confidence interval; OR: odds ratio; SE: standard error; ADL: activities of daily living; IADL: instrumental activities of daily living; TyG index: triglyceride–glucose index.
Bold values indicate statistical significance (p < 0.05).
Table 4. Univariate and multivariable logistic regression results for physical decline.
BMI: body mass index; CI: confidence interval; OR: odds ratio; SE: standard error; ADL: activities of daily living; IADL: instrumental activities of daily living; TyG index: triglyceride–glucose index.
Bold values indicate statistical significance (p < 0.05).
Contribution of SDOH variables
Among various risk factors, this study specifically focused on the role of SDOH,18,19 such as marital status, educational level, exercise habits, life satisfaction, and certain indicators of family economic support.20,21 These variables reflect the participants’ social environment and psychological state and also serve as indicators of social support and family functioning. Logistic regression analysis revealed the following results:
1. Marital status was significantly associated with both muscle weakness and physical decline. Being married was a protective factor against muscle weakness (lower OR), and the differing distribution of marital status across physical decline groups also suggests that social support helps maintain function.
2. Life satisfaction and educational level, which reflect an individual’s socioeconomic status, played an important role in the physical decline model, with higher life satisfaction and better social adaptability significantly lowering the risk.
3. Exercise habits reflect an individual’s lifestyle and indirectly capture social participation and family support. Their protective effect was evident in both models.
These findings reinforce the value of SDOH variables in predicting health risks, suggesting that intervention strategies should, in addition to emphasizing physiological and metabolic indicators, focus on improving social support, enhancing mental health, and promoting healthy lifestyles.
Model performance assessment
Both prediction models demonstrated good performance for clinical application (Figure 1). For muscle weakness prediction, the XGBoost model achieved an AUC–ROC of 0.79 (95% confidence interval (CI): 0.76–0.82) and a C-index of 0.78 (95% CI: 0.75–0.81), indicating good discriminative ability. At the optimal cutoff point (predicted probability >0.21), the model achieved 76.3% sensitivity and 78.1% specificity, with a positive predictive value of 40.2% and negative predictive value of 94.1%. Calibration assessment showed good agreement between predicted and observed outcomes (Hosmer–Lemeshow test: χ2 = 9.73, p = 0.284; Brier score = 0.125).

Figure 1. Nomogram models for physical decline and muscle weakness risk assessment in the older population.
For physical decline prediction, the model achieved an AUC–ROC of 0.74 (95% CI: 0.72–0.76) and a C-index of 0.73 (95% CI: 0.71–0.75), indicating acceptable discriminative performance. At the optimal cutoff point (predicted probability >0.58), the sensitivity was 71.8% and specificity was 73.2%, with a positive predictive value of 75.6% and negative predictive value of 69.1%. The model showed reasonable calibration (Hosmer–Lemeshow test: χ2 = 11.84, p = 0.158; Brier score = 0.212). The ROC curve validation analysis for both prediction models is presented in Figure 2.
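The threshold-based metrics reported for both models follow directly from a confusion matrix at the chosen cutoff. The sketch below uses hypothetical labels and predicted probabilities; only the cutoff of 0.21 is taken from the muscle weakness model.

```python
def threshold_metrics(y_true, y_prob, cutoff):
    """Classify as positive when predicted probability exceeds the cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p > cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical data: 3 cases and 5 non-cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.50, 0.30, 0.10, 0.40, 0.20, 0.15, 0.05, 0.02]

metrics = threshold_metrics(y_true, y_prob, cutoff=0.21)
```

Sensitivity and specificity are properties of the model at a given cutoff, whereas the predictive values also depend on how common the outcome is. This is consistent with the pattern in the reported results: the less frequent outcome (muscle weakness, 16.8%) has a low positive and high negative predictive value, while the more frequent outcome (physical decline, 57.7%) shows the reverse.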

Figure 2. ROC curve validation analysis of prediction models for physical decline and muscle weakness.
Internal validation using bootstrap resampling demonstrated model stability, with optimism-corrected AUC values of 0.77 for muscle weakness and 0.72 for physical decline prediction. Both values are about 0.02 below the corresponding apparent AUCs (0.79 and 0.74), indicating limited overfitting in internal validation.
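One common way to obtain an optimism-corrected AUC is to refit the model in each bootstrap sample, measure how much better it performs on that sample than on the original data, and subtract the average difference from the apparent AUC. The sketch below illustrates this procedure under stated assumptions: the text does not specify which bootstrap correction was used, the "model" here is a deliberately simple stand-in (selecting the single best feature), the data are simulated, and 200 resamples are used instead of 1000 to keep the example fast.

```python
import random

def auc(y, scores):
    pos = [s for t, s in zip(y, scores) if t == 1]
    neg = [s for t, s in zip(y, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def fit(X, y):
    """Toy stand-in for model fitting: pick the feature with the highest AUC."""
    return max(range(len(X[0])), key=lambda j: auc(y, [row[j] for row in X]))

def score(model, X, y):
    return auc(y, [row[model] for row in X])

def optimism_corrected_auc(X, y, n_boot=200, seed=0):
    rng = random.Random(seed)
    apparent = score(fit(X, y), X, y)
    n = len(y)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # AUC is undefined unless both classes are present
        Xb = [X[i] for i in idx]
        model_b = fit(Xb, yb)
        # optimism = performance on the bootstrap sample minus performance on the original data
        diffs.append(score(model_b, Xb, yb) - score(model_b, X, y))
    optimism = sum(diffs) / n_boot
    return apparent, optimism, apparent - optimism

# Simulated data: one weakly informative feature and two noise features.
data_rng = random.Random(1)
y = [i % 2 for i in range(60)]
X = [[data_rng.gauss(0.5 * t, 1), data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for t in y]

apparent, optimism, corrected = optimism_corrected_auc(X, y)
```

The key design point is that the whole fitting procedure, including any variable selection, is repeated inside each resample, so that the optimism estimate reflects every data-driven step.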
RFE and key variable screening
Using the RFE method, we further optimized the prediction model and identified the most predictive variables (Supplementary Materials; Table 5). The impact of feature count on prediction model performance is shown in Figure 3. For the muscle weakness model, RFE selected a history of stroke, cognitive ability, age, marital status, and executive function as key features, while for the physical decline model, smoking status and life satisfaction were selected as key variables. These results indicate the following:
1. The prediction of muscle weakness draws on both traditional physiological indicators (such as stroke and age) and social and cognitive factors (marital status, cognitive ability, and executive function). This suggests that social environment and family roles are important for maintaining muscle health in older adults.
2. For physical decline, smoking (a known negative lifestyle habit) and life satisfaction (an indicator reflecting individual social support and mental health) are important predictors, further supporting the impact of SDOH on health outcomes.
Table 5. Feature selection for muscle weakness and physical decline.
RMSE: root mean square error; RMSESD: root mean square error standard deviation; MAE: mean absolute error; MAESD: mean absolute error standard deviation.
*indicates the optimal number of features selected by the recursive feature elimination (RFE) algorithm.

Figure 3. Impact of feature count on the performance of muscle weakness and physical decline prediction models.
XGBoost and SHAP analyses
The binary classification prediction model built using the XGBoost algorithm, combined with SHAP analysis, provides an intuitive quantification of each variable’s contribution (Figures 4 and 5).
1. In the muscle weakness model, SHAP analysis showed the predominance of metabolic indicators (such as glycated hemoglobin (HbA1c), triglyceride–glucose index (TyG index), and TyG-BMI) and the frailty index; however, the SHAP values for age, marital status, and certain cognitive indicators (such as memory and executive function) were also substantial. These results suggest that although physiological indicators play a crucial role in prediction, social and cognitive factors provide additional information to the model, improving overall predictive performance.
2. For the physical decline model, SHAP analysis highlighted the importance of lifestyle factors, with the average SHAP value for smoking status being close to 0.1, indicating a substantial impact on risk, followed by uric acid levels, age, and frailty index. At the same time, SDOH variables such as life satisfaction showed a close nonlinear relationship with risk, further demonstrating the important role of social environment and psychological state in prediction.
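To make the meaning of a SHAP value concrete, the sketch below computes exact Shapley values for a small hand-written risk function by averaging each feature's marginal contribution over all feature orderings. It is an illustration of the underlying idea only: the function `toy_risk`, its coefficients, and the reference participant are hypothetical, and the SHAP package applied to an XGBoost model uses a tree-specific algorithm rather than this brute-force enumeration.

```python
from itertools import permutations

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to a baseline input."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = f(current)
        for j in order:
            current[j] = x[j]          # switch feature j from baseline to actual value
            updated = f(current)
            phi[j] += updated - previous
            previous = updated
    return [total / len(orders) for total in phi]

def toy_risk(z):
    """Hypothetical risk score with a smoking-by-age interaction."""
    smoker, age_decades, satisfaction = z
    return (0.10 * smoker + 0.05 * age_decades
            - 0.02 * satisfaction + 0.01 * smoker * age_decades)

x = [1, 7, 3]          # smoker, 70 years old, satisfaction score 3
baseline = [0, 6, 3]   # non-smoker, 60 years old, same satisfaction

phi = shapley_values(toy_risk, x, baseline)
```

The contributions sum to the difference between the prediction for this participant and the prediction for the reference input, and a feature that does not differ from the reference (here, satisfaction) receives a contribution of zero. The interaction term is shared between smoking and age, which is how nonlinear joint effects appear in per-feature attributions.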

Figure 4. SHAP importance ranking of key features in muscle weakness prediction: metabolism-dominated multifactorial influence pattern.

Figure 5. SHAP feature importance analysis for physical decline prediction: combined contribution of lifestyle and metabolic factors.
Through SHAP dependence analysis (Figures 6 and 7), we observed a bimodal distribution of age in the muscle weakness model: the risk sharply increased around the ages of 60 and 80 years, while in the physical decline model, the risk increased with aging in a wave-like pattern. Additionally, the increased risk in the higher value range of metabolic indicators interacted with social factors, creating a complex nonlinear relationship. Overall, SHAP results confirmed that SDOH variables in the model not only complement the information from physiological indicators but also significantly enhance the model’s interpretability.

Figure 6. Muscle weakness prediction model performance and nonlinear dependency of key features: a multidimensional analysis based on SHAP values.

Figure 7. SHAP dependency relationships between physical decline risk and key predictors: differential effects of smoking, age, and metabolic indicators.
Discussion
The 9-year prospective design, with clear temporal separation between predictor measurement (2011) and outcome assessment (2020), establishes the temporal ordering needed for risk prediction and supports the clinical relevance of our models. This extended follow-up period captures the natural progression of aging-related changes and supports the long-term predictive utility of integrated clinical and SDOH factors.
This study, based on CHARLS data, utilized logistic regression, RFE, XGBoost, and SHAP analysis to construct a predictive model integrating traditional clinical indicators with SDOH, 22 revealing the multidimensional risk spectrum of muscle weakness and physical decline in older adults. The models demonstrated good discrimination (AUC of 0.79 for muscle weakness, AUC of 0.74 for physical decline) and calibration performance, indicating their potential clinical utility for early risk identification in older adults. The results show that in addition to physiological and metabolic indicators such as HbA1c, TyG index, uric acid levels, and age, SDOH variables such as marital status, life satisfaction, exercise habits, and educational level were significantly associated with health risks, underscoring the interdisciplinary nature of these outcomes. Thus, these findings provide preliminary support for incorporating social and behavioral factors into health risk assessment.
Based on the risk assessment, the study proposed practical health intervention recommendations. Specifically, the inclusion of SDOH variables in the model makes the health risk assessment for older adults more comprehensive, aiding in the early identification of high-risk groups. In response to this finding, we recommend implementing key measures such as regular assessments, personalized exercise prescriptions, and the establishment of social support networks to improve social support, enhance life satisfaction, and promote healthy lifestyles, thereby reducing the risk of muscle weakness and physical decline. These recommendations provide specific operational strategies for healthcare and community health services.
Furthermore, the study highlights the importance of interdisciplinary collaboration. Relying solely on physiological indicators is insufficient to fully explain aging-related health issues. Incorporating social, psychological, and behavioral factors into the analysis not only promotes the deep integration of medicine and social sciences but also provides a solid scientific basis for the formulation of healthcare, social welfare, and public policies. 23 This interdisciplinary perspective provides novel ideas and strategies to address the challenges of aging, with high potential for application and dissemination.
Finally, the study also had some limitations. First, the model lacked explanatory power for some variables; second, the data were primarily limited to a specific socioeconomic context, which may affect the model’s generalizability. To address these limitations, future work should further validate the model’s temporal relationships and generalizability using multicenter and cross-regional data while incorporating wearable devices and remote monitoring technologies to explore the impact of additional socioeconomic variables on health in the aging population. Through these measures, the model’s accuracy and practical application value are expected to be further improved.
In conclusion, this study provides scientific evidence for improving the functional status and quality of life of older adults, emphasizing the key roles of risk assessment, interdisciplinary integration, and health interventions in addressing population aging and demonstrating the innovative potential of combining machine learning with SDOH.
Supplemental Material
Supplemental material (sj-pdf-1-imr-10.1177_03000605251379211, sj-pdf-2-imr-10.1177_03000605251379211, and sj-pdf-3-imr-10.1177_03000605251379211) for Cross-disciplinary risk prediction for muscle weakness and physical decline in older adults: A machine learning model integrating social determinants of health and clinical characteristics by Bowen Li, Wenjing Li and Chunxiao Wan in Journal of International Medical Research
Footnotes
Acknowledgements
We thank the China Health and Retirement Longitudinal Study team for providing data and training in using the datasets. We thank the students who participated in the survey for their cooperation. We thank all volunteers and the staff involved in this research.
Authorship contribution statement
Bowen Li and Wenjing Li analyzed the data and wrote the paper. Chunxiao Wan designed the research and was primarily responsible for the final content. All authors have read and approved the final manuscript.
Data availability statement
Declaration of conflicting interests
The authors declare no competing interest.
Ethical standards
This study was conducted in accordance with the Declaration of Helsinki (1975, as revised in 2024). The Biomedical Ethics Review Committee of Peking University approved the study (Approval no: IRB00001052-11015), and all participants provided written informed consent.
Funding
This study was supported by the Tianjin Key Medical Discipline Construction Project (Grant No. TJYXZDXK-3-014C) and Key Discipline of Tianjin Health Science and Technology Project (TJWJ2022XK007).
References
