Abstract

Sarcopenia is associated with an elevated burden of depressive symptoms, yet screening tools may have limited accuracy and generalizability in this population. We developed and validated an interpretable machine-learning model to predict the risk of depressive symptoms among middle-aged and older adults with sarcopenia using National Health and Nutrition Examination Survey (NHANES) 2007-2020 data. This cross-sectional study included 913 NHANES participants with sarcopenia aged ≥45 years. Candidate predictors were selected using Boruta followed by least absolute shrinkage and selection operator (LASSO). Multiple machine-learning models were developed and internally validated for discrimination, calibration, and clinical utility. SHapley Additive exPlanations (SHAP) were used to support interpretability. Reporting followed the TRIPOD+AI guidance. Nine predictors were retained after Boruta–LASSO selection. In the validation set, the logistic regression model showed the best overall performance (AUC 0.794; Brier score 0.065). SHAP analysis highlighted key contributors including education level, sleep disorder, sex, poverty-income ratio, blood urea nitrogen, osteoarthritis, white blood cell count, absolute lymphocyte count, and body mass index. The final model was presented as a clinically usable nomogram for individualized estimation of depressive symptoms risk. In summary, we developed and internally validated an interpretable machine-learning model for predicting depressive symptoms risk in middle-aged and older adults with sarcopenia using NHANES data. The nomogram may facilitate rapid risk stratification and targeted supportive care addressing both physical and mental health needs.

Keywords

depressive symptoms, sarcopenia, risk prediction, logistic regression, nomogram

Introduction

Sarcopenia is a geriatric syndrome characterized by progressive, generalized loss of skeletal muscle mass, decreased muscle strength, and decline in physical function, representing a common pathological state during the aging process.¹ The core pathophysiological changes involve multiple aspects, including muscle fiber atrophy, reduced muscle quantity, intramuscular fat infiltration, increased fibrosis, imbalance between muscle protein synthesis and catabolism, mitochondrial dysfunction, enhanced oxidative stress, and degeneration of the neuromuscular junction.² These changes collectively lead to a progressive loss of muscle physiological function, significant weakening of muscle strength, and a decline in physical activity capacity, ultimately manifesting as a frail state and an increased risk of functional impairment. Epidemiological studies worldwide indicate that the prevalence of sarcopenia shows a significant upward trend with age. It is estimated that approximately 10% to 16% of the elderly population globally is affected by sarcopenia to varying degrees.³ It is noteworthy that prevalence rates vary across different populations and diagnostic criteria. For instance, a study based on a community-dwelling elderly cohort in the United States (with an average participant age of approximately 70 years) reported a sarcopenia prevalence rate as high as 36.5% according to its adopted diagnostic criteria, highlighting the substantial burden of this syndrome among the elderly population.⁴

This high-burden state of physical dysfunction exerts profound impacts on mental health. Mental health management, particularly the early identification of and intervention for depressive symptoms, plays a decisive role in maintaining the autonomy, social integration, and subjective well-being of the elderly population.⁵ However, patients with sarcopenia face a significantly higher risk of depressive symptoms than the general elderly population because of progressive physical functional decline, reduced exercise tolerance, loss of social roles, and anxiety over deteriorating health. Evidence-based medical research indicates that the risk of developing depression or clinically significant depressive symptoms in sarcopenia patients can be 1.32 to 1.86 times higher than that of their non-sarcopenic peers.⁶ Depressive symptoms not only diminish patients’ motivation for rehabilitation exercise and for maintaining nutritional intake, thereby accelerating muscle loss and functional deterioration, but also significantly increase the risks of cognitive impairment, cardiovascular events, falls, fractures, and all-cause mortality.⁷,⁸ Meanwhile, the tendency toward social isolation induced by depression and the deterioration of health-related quality of life further exacerbate family caregiving burdens and public health expenditures.⁹ The bidirectional association between sarcopenia and depressive symptoms, which may co-occur and potentially reinforce each other, along with their compounded adverse health consequences, has become a major public health challenge to healthy aging in older adults. Therefore, early identification of individuals with sarcopenia who are at high risk of depressive symptoms, followed by prompt intervention, is of urgent importance for improving prognosis, enhancing quality of life, and alleviating societal healthcare burdens.

However, existing depression screening tools have significant limitations when applied to populations with sarcopenia. Although current clinical instruments (such as the GDS-15 and PHQ-9 scales) can be used to screen for depressive symptoms in older adults, their use in sarcopenia populations still faces challenges. Traditional screening scales rely primarily on patients’ subjective reports, making them susceptible to disease stigmatization, cognitive biases, overlapping somatic symptoms, and cultural factors, which may result in insufficient screening sensitivity and specificity in this population.¹⁰,¹¹ Nomograms constructed with traditional statistical methods, while capable of providing visualized risk stratification, are typically built upon specific samples and have limited generalizability. Moreover, they lack real-time and individualized risk interpretation capabilities, making it challenging to meet the clinical demand for rapid and precise decision-making.¹²

In contrast to traditional subjective scales, this study uses machine learning algorithms for risk prediction. Machine learning offers advantages in processing high-dimensional, heterogeneous data and can integrate multidimensional features, including demographic characteristics, lifestyle and behavioral factors, anthropometric parameters, nutritional intake data, clinical disease status, and laboratory test indicators. By identifying complex nonlinear relationships and higher-order interactions among variables, machine learning algorithms can enhance the accuracy of depressive symptoms risk prediction in patients with sarcopenia.¹³ The core innovation lies in integrating explainable artificial intelligence techniques, particularly SHapley Additive exPlanations (SHAP). SHAP provides a quantitative analysis of feature contributions to individualized predictions, clarifying the relative weight of specific factors in a given patient’s risk of depression. This explainability enhances the clinical credibility and operability of model outputs, offering a theoretical foundation for individualized risk assessment and precise decision support.¹⁴

Therefore, this study aims to use multi-source data on sarcopenia patients from the National Health and Nutrition Examination Survey (NHANES) database to construct and validate a risk prediction model for depressive symptoms based on multiple machine learning algorithms. The optimal predictive model will be selected through rigorous model comparison and evaluation. The study will further employ methods such as SHAP to analyze the key predictive factors in the optimal model and the way each contributes to the prediction, revealing the bio-psycho-social factors influencing depression risk in sarcopenia patients. The ultimate goal is to establish a precise and interpretable risk prediction tool for efficiently identifying individuals at high risk of depressive symptoms among sarcopenia patients. Early warning and risk stratification will provide clinicians with decision-making support to promptly initiate targeted psychological interventions, social support, or comprehensive management plans. This approach may help prevent or mitigate the onset and progression of depressive symptoms, improve the mental health outcomes and overall quality of life of sarcopenia patients, and ultimately reduce their risk of adverse health events and the associated socioeconomic burdens.

Materials and Methods

Study Design and Data Source

This study employed a cross-sectional research design, aiming to conduct diagnostic prediction of depressive symptoms in middle-aged and elderly patients with sarcopenia based on data from 6 consecutive survey cycles (2007-2020) of the NHANES. All 6 cycles contained the demographic, dietary, physical examination, laboratory, and questionnaire data required for this study, meeting the variable selection criteria of the research.

The NHANES, conducted by the National Center for Health Statistics (NCHS) of the Centers for Disease Control and Prevention (CDC), is a nationally representative, cross-sectional surveillance program designed to assess the health and nutritional status of the civilian, noninstitutionalized U.S. population. It employs a complex, stratified, multistage probability sampling design to ensure representativeness. Data collection involves detailed household interviews, comprehensive standardized physical examinations, and laboratory testing conducted in Mobile Examination Centers. NHANES provides critical data on a wide range of health indicators, including the prevalence of chronic and infectious diseases, nutrition biomarkers, anthropometric measurements, environmental exposures, and risk behaviors. Released in publicly accessible, biennial data cycles, NHANES serves as a foundational resource for epidemiological research, health trend monitoring, and informing public health policy in the United States. Study data are available through the official NHANES repository hosted by the NCHS: https://wwwn.cdc.gov/nchs/nhanes/default.aspx.¹⁵

Participants

The study population consisted of patients with sarcopenia. Previous studies have established dual-energy X-ray absorptiometry and bioelectrical impedance analysis as the gold standard methods for diagnosing and assessing sarcopenia. However, their practical application is limited by accessibility and operational convenience. To address this issue, some scholars have developed and validated a predictive equation for estimating appendicular skeletal muscle mass (ASM) based on NHANES data.¹⁶ That work reported a simplified equation without serological markers: ASM = 0.485 × 0.9998^age × 0.814^[female] × 1.006^height × weight^0.680, where [female] equals 1 for women and 0 for men. Compared with equations incorporating serological indicators, the simplified equation was slightly less accurate in estimating ASM, but the difference between the 2 was minimal. Based on the diagnostic criteria for sarcopenia established by the Foundation for the National Institutes of Health (FNIH), this study used the ratio of ASM to body mass index (BMI) (ASM/BMI) for diagnosis: males with ASM/BMI < 0.789 kg/(kg/m²) and females with ASM/BMI < 0.512 kg/(kg/m²) were defined as having sarcopenia.¹⁷ The inclusion criteria were age ≥45 years and fulfilment of the FNIH diagnostic criteria for sarcopenia.¹⁸ The exclusion criteria were: (1) missing PHQ-9 data, including participants who did not complete the relevant interview items; and (2) missing data in candidate predictors required for model development. We did not apply additional exclusions based on antidepressant use or treatments that may affect muscle mass, such as long-term systemic corticosteroid therapy, and medication exposures were not incorporated as predictors. The sample screening process is shown in Figure 1.
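As an illustration, the simplified ASM equation and the FNIH cutoffs described above can be sketched in Python. This is a minimal sketch rather than the authors' code: the function names are hypothetical, and it assumes age in years, height in centimetres, and weight in kilograms, which the text does not state explicitly.

```python
def estimate_asm_kg(age_years, is_female, height_cm, weight_kg):
    """Simplified ASM equation quoted in the text (no serological markers).

    Units (years, cm, kg) are an assumption, not stated in the source.
    """
    sex_factor = 0.814 if is_female else 1.0  # 0.814^[female]
    return (0.485
            * 0.9998 ** age_years
            * sex_factor
            * 1.006 ** height_cm
            * weight_kg ** 0.680)


def has_sarcopenia_fnih(age_years, is_female, height_cm, weight_kg):
    """Apply the ASM/BMI cutoffs given in the text (<0.789 men, <0.512 women)."""
    bmi = weight_kg / (height_cm / 100.0) ** 2
    ratio = estimate_asm_kg(age_years, is_female, height_cm, weight_kg) / bmi
    cutoff = 0.512 if is_female else 0.789
    return ratio < cutoff


# Invented example participants, for illustration only.
print(has_sarcopenia_fnih(70, False, 160, 110))
print(has_sarcopenia_fnih(60, False, 170, 85))
```

Because BMI sits in the denominator, a participant with high body weight relative to height can fall below the cutoff even with a large estimated ASM, which is consistent with the high mean BMI reported for this sample.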

Figure 1.

Sample screening and statistical analysis process framework diagram.

Sample Size

The sample size was estimated using the Events Per Variable (EPV) criterion.¹⁹ This criterion recommends that each predictor variable ultimately included in the model should correspond to at least 10 outcome events to ensure robust parameter estimation. Preliminary planning indicated that the final model might incorporate 8 to 10 predictors; we used the midpoint of 9 for the calculation. Based on existing epidemiological evidence, the prevalence of clinically relevant depressive symptoms among middle-aged and older adults with sarcopenia is approximately 15.82%.²⁰ Accordingly, with the EPV threshold set at 10 and an anticipated invalid/missing response rate of ~10%, the minimum required sample size for model development was calculated as: 10 × 9 ÷ 0.1582 ÷ 0.9 ≈ 632. Given that we planned a 7:3 split for model development and internal validation, the total required sample size was therefore at least 632 ÷ 0.7 ≈ 903. In the present study, 913 eligible participants were included and were divided by stratified sampling into a training set (n = 639) and a validation set (n = 274), ensuring that the training set exceeded the EPV-based minimum.
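The arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation as described in the text; the function and argument names are illustrative, and results are rounded to the nearest integer as in the text.

```python
def epv_sample_size(epv, n_predictors, prevalence, invalid_rate, train_fraction):
    """Return (development-set size, total size) under the EPV criterion."""
    # Events needed = epv * n_predictors; divide by prevalence to get
    # participants, then inflate for invalid or missing responses.
    n_development = epv * n_predictors / prevalence / (1.0 - invalid_rate)
    # The development set is only train_fraction of the full sample.
    n_total = n_development / train_fraction
    return n_development, n_total


n_development, n_total = epv_sample_size(
    epv=10, n_predictors=9, prevalence=0.1582,
    invalid_rate=0.10, train_fraction=0.7,
)
print(round(n_development), round(n_total))  # 632 903
```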

Outcome Variable

The Patient Health Questionnaire-9 (PHQ-9) scale was developed in 1999 by the American psychiatrist Spitzer et al, based on the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, and is a commonly used self-assessment tool for depression.²¹,²² The original authors designed the scale as a universal tool intended for public use (please refer to https://www.phqscreeners.com/ for details). The PHQ-9 contains 9 questions, each with 4 answer options scored from 0 to 3, giving a total score ranging from 0 to 27.²³ In this study, sarcopenia patients with a PHQ-9 total score ≥10 were considered to have depressive symptoms.²⁴
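The outcome definition above amounts to summing 9 item scores and applying a cutoff of 10. A minimal Python sketch follows; the function names are hypothetical, and the handling of invalid input is an assumption, since the study simply excluded participants with missing PHQ-9 data.

```python
def phq9_total(item_scores):
    """Sum the 9 PHQ-9 items, each scored 0 to 3 (total 0 to 27)."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 requires exactly 9 item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each PHQ-9 item must be scored 0, 1, 2, or 3")
    return sum(item_scores)


def has_depressive_symptoms(item_scores, cutoff=10):
    """Outcome label used in the text: PHQ-9 total score >= 10."""
    return phq9_total(item_scores) >= cutoff


# Invented response patterns, for illustration only.
print(has_depressive_symptoms([1, 1, 2, 1, 1, 1, 1, 1, 1]))  # total 10
print(has_depressive_symptoms([1, 1, 1, 1, 1, 1, 1, 1, 1]))  # total 9
```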

Predictor Variables

The predictor variables in this study can be categorized as follows. Demographic variables include age, gender, race, marital status, education level, and Poverty Income Ratio (PIR). Lifestyle and behavioral variables include alcohol consumption, smoking, physical activity, sedentary behavior, sleep duration, and sleep disorder. Anthropometric indicators include weight, height, body mass index (BMI), and waist circumference. Dietary and nutritional variables include protein intake and vitamin D intake. Clinical disease variables include hypertension, diabetes, osteoarthritis, cardiovascular disease (CVD), and stroke. Laboratory test indicators cover renal function, blood lipids, and inflammation-related markers (serum creatinine, blood urea nitrogen, uric acid, serum calcium, serum phosphorus, total cholesterol, triglyceride, white blood cell count, absolute lymphocyte count, monocyte count, and neutrophil count). Detailed definitions of each predictor variable are provided in Supplemental Materials 1.

Statistical Analysis and Feature Screening

All statistical analyses were performed using R software (version 4.4.3). Normally distributed continuous variables were expressed as mean ± standard deviation (SD), non-normally distributed continuous variables as M (Q₁, Q₃), and categorical variables as number (%). Univariate analyses employed the chi-square test, independent samples t-test, and Mann-Whitney test, with a 2-sided significance level set at P < .05.

This study used stratified random sampling to divide the dataset into training and validation sets at a ratio of 7:3. The training set was used for feature selection and model construction. During the feature selection phase, the Boruta algorithm based on random forests was first applied for preliminary screening. This algorithm identifies relevant features by comparing the importance scores of the original features with those of randomly generated shadow variables. Subsequently, Least Absolute Shrinkage and Selection Operator (LASSO) regression was used for secondary selection among the screened features. By introducing an L1 regularization penalty term, LASSO shrinks the regression coefficients of redundant features to zero, thereby reducing the dimensionality of the feature space and mitigating the risk of overfitting. The optimal regularization parameter λ was determined through 10-fold cross-validation to obtain the final predictive variables.
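The stratified 7:3 split described above can be sketched in plain Python. This is an illustrative sketch only: the study was run in R, so the function name, the rounding rule, and the random number generator here are assumptions, and the code will not reproduce the authors' exact allocation.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices so each outcome class keeps about the same ratio."""
    rng = random.Random(seed)
    indices_by_class = {}
    for index, label in enumerate(labels):
        indices_by_class.setdefault(label, []).append(index)

    train, validation = [], []
    for class_indices in indices_by_class.values():
        rng.shuffle(class_indices)
        n_train = round(train_fraction * len(class_indices))
        train.extend(class_indices[:n_train])
        validation.extend(class_indices[n_train:])
    return sorted(train), sorted(validation)


# Class counts taken from Table 1: 79 with and 834 without depressive symptoms.
labels = [1] * 79 + [0] * 834
train_idx, valid_idx = stratified_split(labels, train_fraction=0.7, seed=1234)
print(len(train_idx), len(valid_idx))
```

Stratifying on the outcome keeps the event rate nearly identical in both sets, which matters when events are as sparse as they are here.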

Model Development

Based on the final screened predictive variables, this study used a 10-fold cross-validated grid search to optimize the hyperparameters of 9 machine learning algorithms: AdaBoost, CatBoost, Gradient Boosting Machine (GBM), K-Nearest Neighbor (KNN), LightGBM, Neural Network (NN), Random Forest (RF), Support Vector Machine (SVM), and XGBoost. The logistic regression model was excluded from hyperparameter tuning. The tuning aimed to achieve optimal model performance and mitigate the risk of overfitting.

Model Evaluation and Interpretability

This study evaluated model performance on the validation set along 3 dimensions: discrimination, calibration, and clinical utility. For discrimination, the area under the receiver operating characteristic curve (AUC) and the C-index were adopted as primary metrics (for both, 0.5 indicates chance-level and 1 perfect discrimination), supplemented by accuracy, recall, specificity, precision, and the F1 score (all ranging from 0 to 1). Values closer to 1 for these metrics indicate stronger discrimination. Calibration was assessed using calibration curves and the Brier score (ranging from 0 to 1): calibration curves visually show the agreement between predicted probabilities and observed frequencies, whereas the Brier score quantifies the accuracy of predicted probabilities, with lower scores indicating better calibration. Clinical utility was evaluated using decision curve analysis (DCA), which calculates the net clinical benefit at varying decision thresholds to assess the model’s clinical application value. When the model’s decision curve lies above both the “intervene for all” and “intervene for none” reference lines, the model can improve clinical decision-making. Based on the results of this multi-dimensional evaluation, the optimal predictive model was determined.
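The three groups of metrics described above can be made concrete with a short Python sketch. It uses the standard definitions (pairwise-ranking AUC, mean squared error for the Brier score, and the usual net benefit formula from decision curve analysis), not the authors' R code; the toy labels and probabilities are invented for illustration.

```python
def auc(y_true, y_prob):
    """Probability that a random case is ranked above a random non-case."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in cases for q in controls)
    return wins / (len(cases) * len(controls))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening when predicted risk >= threshold."""
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


# Invented toy data: 10 participants, 3 with the outcome.
y_true = [0, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.15, 0.60, 0.08, 0.40, 0.70, 0.02]

threshold = 0.25
model_nb = net_benefit(y_true, y_prob, threshold)
# "Intervene for all" is the same formula with every prediction set to 1.
treat_all_nb = net_benefit(y_true, [1.0] * len(y_true), threshold)
treat_none_nb = 0.0
print(auc(y_true, y_prob), brier_score(y_true, y_prob), model_nb, treat_all_nb)
```

At a given threshold, a model is clinically useful in the DCA sense when its net benefit exceeds both reference strategies, as it does for this toy example.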

To enhance model interpretability, this study used the SHAP method, which is based on game theory, to analyze the optimal model. This approach quantifies feature contributions to prediction outcomes by calculating Shapley values and offers 4 advantages. First, it identifies key predictive features at the global level and explains individual sample predictions at the local level. Second, it ensures mathematical rigor in allocating feature contributions based on game-theoretic axioms. Third, it has broad model compatibility and is applicable to various algorithms, including tree models, linear models, and deep learning models. Finally, it provides intuitive visualizations of the direction and magnitude of feature influence. SHAP analysis can therefore improve model transparency by revealing the relationship between features and prediction outcomes and can inform model optimization and clinical application.
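To make the idea of Shapley-value attribution concrete, the following Python sketch computes exact Shapley values by enumerating feature coalitions for a small logistic model on the log-odds scale. It is a didactic sketch, not the SHAP library and not the authors' analysis: the intercept, coefficients, feature values, and baseline are all hypothetical, and "absent" features are simply replaced by a single baseline value, which is one common simplification.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, x, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    n = len(x)
    phi = [0.0] * n
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_j = [x[k] if (k in subset or k == j) else baseline[k]
                          for k in range(n)]
                without_j = [x[k] if k in subset else baseline[k]
                             for k in range(n)]
                phi[j] += weight * (value_fn(with_j) - value_fn(without_j))
    return phi


# Hypothetical model: NOT the coefficients fitted in the study.
INTERCEPT = -2.5
COEFS = [0.9, 0.6, 0.04]  # sleep disorder, female sex, BMI (illustrative)


def log_odds(features):
    return INTERCEPT + sum(c * v for c, v in zip(COEFS, features))


patient = [1.0, 1.0, 35.0]    # invented patient
baseline = [0.3, 0.3, 29.6]   # invented reference values

phi = shapley_values(log_odds, patient, baseline)
print(phi)
```

For a model that is linear on the log-odds scale, each value reduces to the coefficient times the feature's deviation from the baseline, and the values sum to the difference between the patient's and the baseline's log-odds. This additivity is what allows a logistic regression model to be explained with SHAP and displayed as a nomogram.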

TRIPOD+AI (EQUATOR) Reporting Guideline

This study followed the TRIPOD+AI reporting guideline (Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis – AI extension), an EQUATOR Network reporting guideline for prediction model development and validation using regression or machine learning methods, to ensure transparent and comprehensive reporting throughout model development, internal validation, performance evaluation, and interpretation.²⁵ The completed TRIPOD+AI checklist is provided as Supplemental Figure S1.

Results

Baseline Characteristics

This study ultimately included 913 patients who met the criteria. Using the random number generator in R software (version 4.4.3) with a random seed set to 1234, the subjects were randomly allocated to the training set (n = 639) and the validation set (n = 274) in a 7:3 ratio. Baseline characteristic analysis revealed that, except for sedentary behavior, which showed a statistically significant difference between the groups (P < .05), all other variables were evenly distributed (Table 1), indicating that the randomization scheme was generally effective.

Table 1.

Comparison of Baseline Data Between Training Set and Validation Set.

Variables	Total sample (n = 913)	Training set (n = 639)	Validation set (n = 274)	Statistic	P
Age, Mean ± SD	59.57 ± 9.62	59.40 ± 9.66	59.95 ± 9.52	t = 0.80	.426
Gender, n (%)				χ² = 2.42	.120
Male	607 (66.48)	435 (68.08)	172 (62.77)
Female	306 (33.52)	204 (31.92)	102 (37.23)
Race, n (%)				χ² = 0.96	.916
Mexican American	125 (13.69)	87 (13.62)	38 (13.87)
Other Hispanic	79 (8.65)	52 (8.14)	27 (9.85)
Non-Hispanic White	480 (52.57)	336 (52.58)	144 (52.55)
Non-Hispanic Black	168 (18.40)	120 (18.78)	48 (17.52)
Other race	61 (6.68)	44 (6.89)	17 (6.20)
Education level, n (%)				χ² = 0.18	.912
High school below	182 (19.93)	126 (19.72)	56 (20.44)
High school	248 (27.16)	172 (26.92)	76 (27.74)
High school above	483 (52.90)	341 (53.36)	142 (51.82)
Marital status, n (%)				χ² = 1.89	.388
Married/Living with partner	613 (67.14)	429 (67.14)	184 (67.15)
Widowed/Divorced/Separated	235 (25.74)	169 (26.45)	66 (24.09)
Unmarried	65 (7.12)	41 (6.42)	24 (8.76)
PIR, n (%)				χ² = 1.78	.411
Low-income	219 (23.99)	155 (24.26)	64 (23.36)
Middle-income	386 (42.28)	277 (43.35)	109 (39.78)
High-income	308 (33.73)	207 (32.39)	101 (36.86)
Alcohol consumption, n (%)				χ² = 0.40	.818
Never	279 (30.56)	197 (30.83)	82 (29.93)
Light	517 (56.63)	363 (56.81)	154 (56.20)
Heavy	117 (12.81)	79 (12.36)	38 (13.87)
Smoking, n (%)				χ² = 1.98	.371
Never	371 (40.64)	267 (41.78)	104 (37.96)
Ever	339 (37.13)	228 (35.68)	111 (40.51)
Current	203 (22.23)	144 (22.54)	59 (21.53)
Hypertension, n (%)				χ² = 1.31	.252
Yes	516 (56.52)	369 (57.75)	147 (53.65)
No	397 (43.48)	270 (42.25)	127 (46.35)
Diabetes, n (%)				χ² = 0.01	.934
Yes	205 (22.45)	143 (22.38)	62 (22.63)
No	708 (77.55)	496 (77.62)	212 (77.37)
Osteoarthritis, n (%)				χ² = 0.32	.570
Yes	356 (38.99)	253 (39.59)	103 (37.59)
No	557 (61.01)	386 (60.41)	171 (62.41)
CVD, n (%)				χ² = 1.51	.219
Yes	99 (10.84)	64 (10.02)	35 (12.77)
No	814 (89.16)	575 (89.98)	239 (87.23)
Stroke, n (%)				χ² = 0.37	.545
Yes	44 (4.82)	29 (4.54)	15 (5.47)
No	869 (95.18)	610 (95.46)	259 (94.53)
Physical activity, n (%)				χ² = 0.42	.516
Inactive	127 (13.91)	92 (14.40)	35 (12.77)
Active	786 (86.09)	547 (85.60)	239 (87.23)
Sedentary behavior, n (%)				χ² = 4.68	.031
Mild	759 (83.13)	520 (81.38)	239 (87.23)
Severe	154 (16.87)	119 (18.62)	35 (12.77)
Sleep duration, n (%)				χ² = 1.09	.581
Short	330 (36.14)	232 (36.31)	98 (35.77)
Normal	540 (59.15)	374 (58.53)	166 (60.58)
Longer	43 (4.71)	33 (5.16)	10 (3.65)
Sleep disorder, n (%)				χ² = 2.46	.117
Yes	296 (32.42)	197 (30.83)	99 (36.13)
No	617 (67.58)	442 (69.17)	175 (63.87)
Weight (kg), Mean ± SD	85.81 ± 19.72	85.80 ± 19.75	85.82 ± 19.67	t = 0.01	.989
Height (cm), Mean ± SD	170.02 ± 9.92	169.98 ± 9.93	170.11 ± 9.91	t = 0.19	.849
BMI (kg/m²), Mean ± SD	29.63 ± 6.23	29.62 ± 6.04	29.68 ± 6.66	t = 0.14	.893
Waist circumference (cm), Mean ± SD	102.98 ± 14.87	102.94 ± 14.76	103.07 ± 15.14	t = 0.12	.904
Protein intake (g), Mean ± SD	85.66 ± 35.05	85.40 ± 35.04	86.25 ± 35.12	t = 0.34	.737
Serum creatinine (mg/dl), Mean ± SD	0.93 ± 0.39	0.92 ± 0.25	0.94 ± 0.61	t = 0.78	.433
Uric acid (mg/dl), Mean ± SD	5.77 ± 1.37	5.77 ± 1.37	5.75 ± 1.38	t = −0.21	.832
Blood urea nitrogen (mg/dl), Mean ± SD	14.92 ± 5.29	14.79 ± 5.24	15.23 ± 5.40	t = 1.17	.244
Serum calcium (mg/dl), Mean ± SD	9.31 ± 0.35	9.31 ± 0.35	9.31 ± 0.36	t = −0.14	.886
Serum phosphorus (mg/dl), Mean ± SD	3.49 ± 0.54	3.48 ± 0.52	3.51 ± 0.57	t = 0.72	.474
Total cholesterol (mg/dl), Mean ± SD	196.97 ± 43.79	196.91 ± 43.72	197.10 ± 44.06	t = 0.06	.953
White blood cell count (1000 cells/µL), Mean ± SD	6.65 ± 1.95	6.65 ± 1.94	6.66 ± 1.97	t = 0.11	.916
Absolute Lymphocyte Count (1000 cells/µL), Mean ± SD	1.93 ± 0.69	1.92 ± 0.69	1.96 ± 0.69	t = 0.96	.338
Monocyte (1000 cells/µL), Mean ± SD	0.56 ± 0.21	0.57 ± 0.23	0.55 ± 0.17	t = −1.29	.197
Neutrophil count (1000 cells/µL), Mean ± SD	3.91 ± 1.52	3.91 ± 1.51	3.90 ± 1.55	t = −0.09	.930
Vitamin D intake (mg), M (Q₁, Q₃)	3.85 (2.05, 6.35)	3.95 (2.05, 6.47)	3.52 (2.02, 6.07)	Z = −0.74	.460
Triglyceride (mg/dl), M (Q₁, Q₃)	102.00 (73.00, 153.00)	102.00 (72.00, 153.00)	103.00 (74.00, 154.00)	Z = −0.52	.601
Depressive symptoms, n (%)				χ² = 0.03	.856
Yes	79 (8.65)	56 (8.76)	23 (8.39)
No	834 (91.35)	583 (91.24)	251 (91.61)

Note. t = t-test; Z = Mann-Whitney test; χ² = Chi-square test; SD = standard deviation; M = median; Q₁ = 1st quartile; Q₃ = 3rd quartile; BMI = body mass index; PIR = ratio of family income to poverty; CVD = cardiovascular disease.

Among the 639 sarcopenia patients in the training set, 56 (8.76%) were in the depressive symptoms group. Compared with the non-depressive symptoms group, the depressive symptoms group exhibited the following characteristics: (1) Demographic aspects: significantly higher proportions of females (53.57% vs 29.85%; P < .001), low education level (below high school: 35.71% vs 18.18%; P = .003), and low income (poverty income ratio ≤1.0: 44.64% vs 22.30%; P < .001); (2) Clinical features: significantly higher prevalence of hypertension (80.36% vs 55.57%; P < .001), osteoarthritis (60.71% vs 37.56%; P < .001), and sleep disorders (60.71% vs 27.96%; P < .001); (3) Anthropometric and laboratory indicators: lower average height (167.11 ± 9.76 cm vs 170.25 ± 9.92 cm; P = .024) and higher lymphocyte count (2.12 ± 0.67 vs 1.90 ± 0.69; P = .020). No significant differences were observed between the 2 groups in race, marital status, or cardiovascular disease (all P > .05; see Table 2).

Table 2.

Distribution of Baseline Data in the Training Set.

Variables	Total data (n = 639)	Depressive symptoms (n = 56)	Non-depressive symptoms (n = 583)	Test statistic	P
Age, Mean ± SD	59.40 ± 9.66	56.98 ± 7.76	59.63 ± 9.80	t = 2.38	.020
Gender, n (%)				χ² = 13.23	<.001
Male	435 (68.08)	26 (46.43)	409 (70.15)
Female	204 (31.92)	30 (53.57)	174 (29.85)
Race, n (%)				χ² = 3.49	.480
Mexican American	87 (13.62)	11 (19.64)	76 (13.04)
Other Hispanic	52 (8.14)	6 (10.71)	46 (7.89)
Non-Hispanic White	336 (52.58)	25 (44.64)	311 (53.34)
Non-Hispanic Black	120 (18.78)	9 (16.07)	111 (19.04)
Other race	44 (6.89)	5 (8.93)	39 (6.69)
Education level, n (%)				χ² = 11.61	.003
High school below	126 (19.72)	20 (35.71)	106 (18.18)
High school	172 (26.92)	16 (28.57)	156 (26.76)
High school above	341 (53.36)	20 (35.71)	321 (55.06)
Marital status, n (%)				χ² = 2.82	.245
Married/Living with partner	429 (67.14)	32 (57.14)	397 (68.10)
Widowed/Divorced/Separated	169 (26.45)	19 (33.93)	150 (25.73)
Unmarried	41 (6.42)	5 (8.93)	36 (6.17)
PIR, n (%)				χ² = 15.00	<.001
Low-income	155 (24.26)	25 (44.64)	130 (22.30)
Middle-income	277 (43.35)	21 (37.50)	256 (43.91)
High-income	207 (32.39)	10 (17.86)	197 (33.79)
Alcohol consumption, n (%)				χ² = 2.73	.255
Never	197 (30.83)	21 (37.50)	176 (30.19)
Light	363 (56.81)	26 (46.43)	337 (57.80)
Heavy	79 (12.36)	9 (16.07)	70 (12.01)
Smoking, n (%)				χ² = 3.27	.195
Never	267 (41.78)	20 (35.71)	247 (42.37)
Ever	228 (35.68)	18 (32.14)	210 (36.02)
Current	144 (22.54)	18 (32.14)	126 (21.61)
Hypertension, n (%)				χ² = 12.86	<.001
Yes	369 (57.75)	45 (80.36)	324 (55.57)
No	270 (42.25)	11 (19.64)	259 (44.43)
Diabetes, n (%)				χ² = 2.25	.134
Yes	143 (22.38)	17 (30.36)	126 (21.61)
No	496 (77.62)	39 (69.64)	457 (78.39)
Osteoarthritis, n (%)				χ² = 11.45	<.001
Yes	253 (39.59)	34 (60.71)	219 (37.56)
No	386 (60.41)	22 (39.29)	364 (62.44)
CVD, n (%)				χ² = 0.03	.855
Yes	64 (10.02)	6 (10.71)	58 (9.95)
No	575 (89.98)	50 (89.29)	525 (90.05)
Stroke, n (%)				χ² = 0.42	.519
Yes	29 (4.54)	4 (7.14)	25 (4.29)
No	610 (95.46)	52 (92.86)	558 (95.71)
Physical activity, n (%)				χ² = 0.60	.440
Inactive	92 (14.40)	10 (17.86)	82 (14.07)
Active	547 (85.60)	46 (82.14)	501 (85.93)
Sedentary behavior, n (%)				χ² = 0.04	.837
Mild	520 (81.38)	45 (80.36)	475 (81.48)
Severe	119 (18.62)	11 (19.64)	108 (18.52)
Sleep duration, n (%)				χ² = 9.14	.010
Short	232 (36.31)	24 (42.86)	208 (35.68)
Normal	374 (58.53)	25 (44.64)	349 (59.86)
Longer	33 (5.16)	7 (12.50)	26 (4.46)
Sleep disorder, n (%)				χ² = 25.71	<.001
Yes	197 (30.83)	34 (60.71)	163 (27.96)
No	442 (69.17)	22 (39.29)	420 (72.04)
Weight (kg), Mean ± SD	85.80 ± 19.75	86.83 ± 22.41	85.70 ± 19.50	t = −0.41	.684
Height (cm), Mean ± SD	169.98 ± 9.93	167.11 ± 9.76	170.25 ± 9.92	t = 2.27	.024
BMI (kg/m²), Mean ± SD	29.62 ± 6.04	31.16 ± 8.30	29.47 ± 5.76	t = −1.49	.142
Waist circumference (cm), Mean ± SD	102.94 ± 14.76	105.19 ± 16.65	102.73 ± 14.57	t = −1.19	.233
Protein intake (g), Mean ± SD	85.40 ± 35.04	75.69 ± 34.70	86.33 ± 34.95	t = 2.18	.030
Serum creatinine (mg/dl), Mean ± SD	0.92 ± 0.25	0.85 ± 0.20	0.93 ± 0.25	t = 2.16	.031
Uric acid (mg/dl), Mean ± SD	5.77 ± 1.37	5.75 ± 1.44	5.78 ± 1.36	t = 0.16	.873
Blood urea nitrogen (mg/dl), Mean ± SD	14.79 ± 5.24	13.46 ± 4.82	14.92 ± 5.26	t = 1.99	.047
Serum calcium (mg/dl), Mean ± SD	9.31 ± 0.35	9.37 ± 0.35	9.31 ± 0.35	t = −1.32	.187
Serum phosphorus (mg/dl), Mean ± SD	3.48 ± 0.52	3.51 ± 0.62	3.48 ± 0.51	t = −0.36	.724
Total cholesterol (mg/dl), Mean ± SD	196.91 ± 43.72	195.52 ± 45.06	197.04 ± 43.62	t = 0.25	.803
White blood cell count (1000 cells/µL), Mean ± SD	6.65 ± 1.94	7.19 ± 1.89	6.59 ± 1.94	t = −2.21	.027
Absolute lymphocyte count (1000 cells/µL), Mean ± SD	1.92 ± 0.69	2.12 ± 0.67	1.90 ± 0.69	t = −2.34	.020
Monocyte (1000 cells/µL), Mean ± SD	0.57 ± 0.23	0.58 ± 0.18	0.57 ± 0.24	t = −0.34	.736
Neutrophil count (1000 cells/µL), Mean ± SD	3.91 ± 1.51	4.22 ± 1.43	3.88 ± 1.52	t = −1.63	.104
Vitamin D intake (mg), M (Q₁, Q₃)	3.95 (2.05, 6.47)	3.65 (1.54, 6.66)	3.95 (2.05, 6.43)	Z = −0.64	.520
Triglyceride (mg/dl), M (Q₁, Q₃)	102.00 (72.00, 153.00)	137.00 (82.25, 190.00)	100.00 (72.00, 148.00)	Z = −2.09	.036

Predictive Variable Screening

The training set was used for feature selection and model construction through a 2-step process. First, the Boruta algorithm, a random forest-based method, performed initial screening by comparing feature importance against shadow variables, retaining 14 potential predictors: gender, race, education level, sleep disorder, PIR, osteoarthritis, weight, height, BMI, waist circumference, blood urea nitrogen, white blood cell count, absolute lymphocyte count, and neutrophil count. Subsequently, LASSO regression with L1 regularization was applied for secondary selection, with the optimal regularization parameter (λ.min = 0.006674245) determined via 10-fold cross-validation to minimize prediction error while maintaining model parsimony. This final step identified 9 key predictive features: education level, sleep disorder, gender, PIR, blood urea nitrogen, osteoarthritis, white blood cell count, absolute lymphocyte count, and BMI. The selection process is illustrated in Figure 2.

Figure 2.

Feature selection workflow: (A) initial feature screening using the Boruta algorithm, (B) secondary feature selection using the LASSO algorithm (binomial deviance vs log(λ)); dotted vertical lines indicate λ.min and λ.1se, and (C) 9 predictors with non-zero coefficients at λ.min were retained for model development, providing a parsimonious set of clinically available variables.

Model Construction and Performance Evaluation

Optimal hyperparameters for the 9 machine-learning algorithms other than logistic regression were obtained by grid search with 10-fold cross-validation under the resampling settings. The optimal settings for each algorithm were as follows: AdaBoost (number of iterations mfinal = 2, maximum tree depth maxdepth = 2), CatBoost (number of trees tree_count = 3, learning rate learning_rate = 0.03, number of features feature_count = 7), GBM (number of trees n.trees = 100, interaction depth interaction.depth = 2, shrinkage rate shrinkage = 0.01, minimum number of observations n.minobsinnode = 5), KNN (number of nearest neighbors kmax = 14, distance metric distance = 1), LightGBM (minimum data min_data = 1, learning rate learning_rate = 1, number of threads num_threads = 2, verbosity level verbosity = 1, number of iterations num_iterations = 5, early stopping rounds early_stopping_round = 3), NN (number of hidden layer nodes size = 5, weight decay decay = 0.6), RF (number of candidate features per split mtry = 2), SVM (kernel parameter Sigma = 0.1, regularization parameter C = 0.5), XGBoost (number of iterations nrounds = 10, maximum tree depth max_depth = 4, learning rate eta = 0.1, minimum loss reduction gamma = 0.5, feature sampling ratio colsample_bytree = 0.5, minimum child weight min_child_weight = 1, sample sampling ratio subsample = 0.6). A risk prediction model was then fitted for each algorithm using its optimal hyperparameters. Detailed parameter settings can be found in Supplemental Materials 2.
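To make the 10-fold resampling concrete, the following sketch assigns the training set (56 events among 639 participants, as reported in the Discussion) to 10 folds. Stratifying by outcome is an assumption made here for illustration; the text does not state whether the folds were stratified, and the function name is hypothetical.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Assign each sample index to one of k folds, spreading each class evenly."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Training set composition reported in the article: 56 events, 583 non-events.
labels = [1] * 56 + [0] * 583
folds = stratified_folds(labels, k=10)
events_per_fold = [sum(labels[i] for i in fold) for fold in folds]
```

Under this scheme each held-out fold contains only 5 or 6 events, so fold-level performance estimates used to rank hyperparameters are likely to be noisy.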

This study evaluated the performance of the risk prediction models built with the 10 machine-learning algorithms. Discrimination was assessed primarily by the AUC and the concordance index (C-index) in the validation set, with the following results: LR (0.794), SVM (0.695), GBM (0.766), NN (0.789), RF (0.736), XGBoost (0.743), KNN (0.695), Adaboost (0.644), LightGBM (AUC 0.531, C-index 0.529), and CatBoost (0.773). Confusion-matrix metrics (accuracy, recall, specificity, precision, and F1 score) indicated that the LR model offered the best overall balance in the validation set, with the highest F1 score (0.420), although XGBoost had slightly higher accuracy (0.850) and specificity (0.876) at the cost of lower recall (0.565) (ROC curves for each model are shown in Figure 3A and B, and detailed metrics are provided in Table 3). Because discrimination alone does not show whether predicted probabilities are accurate, this study further evaluated calibration using calibration curves and the Brier score (calibration curves for each model are shown in Figure 3C and D, and Brier scores are provided in Table 3). The calibration curves were generated by binning samples based on predicted risk and comparing the mean predicted risk in each bin with the observed event rate. Points in the plot represent the observed event rates for each risk bin, connected by line segments. Under perfect calibration, all points lie on the 45° diagonal reference line, indicating that predicted risks match observed risks. LR and NN were the best calibrated, with most points close to the reference line except in the highest-risk bin. The Brier score also favored LR (0.065) and NN (0.066) over the other models in the validation set.
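The two calibration measures described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: equal-size bins ordered by predicted risk are assumed because the binning rule is not specified, and the function names are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_bins(y_true, y_prob, n_bins=5):
    """Order samples by predicted risk, split them into n_bins groups of similar
    size, and return (mean predicted risk, observed event rate) for each group."""
    pairs = sorted(zip(y_prob, y_true))
    size = len(pairs) / n_bins
    out = []
    for b in range(n_bins):
        chunk = pairs[round(b * size): round((b + 1) * size)]
        if chunk:
            mean_pred = sum(p for p, _ in chunk) / len(chunk)
            obs_rate = sum(y for _, y in chunk) / len(chunk)
            out.append((mean_pred, obs_rate))
    return out


# Reference point: a model that predicts the training prevalence (56/639) for
# everyone has a Brier score of p * (1 - p).
prevalence = 56 / 639
baseline = brier_score([1] * 56 + [0] * 583, [prevalence] * 639)
```

Because the outcome is uncommon, this uninformative baseline already has a Brier score near 0.08, so Brier scores in Table 3 are best read relative to that reference rather than in absolute terms.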

Figure 3.

Comparison of model performance across 10 algorithms: (A and B) ROC curves in the training and validation sets illustrate discrimination and highlight potential overfitting when training performance does not generalize, (C and D) Calibration curves evaluate agreement between predicted and observed risks, which is essential for clinical risk communication, and (E and F) Decision curve analysis (DCA) summarizes net benefit across threshold probabilities and links model output to potential clinical decision-making. Clinical implication: Net benefit across clinically plausible thresholds suggests the model may help guide decisions about whom to screen and refer, while thresholds should be tailored to local resources and care pathways.
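For a binary outcome, the AUC equals the C-index: the proportion of event/non-event pairs in which the participant with the event receives the higher predicted risk. The sketch below shows this pairwise definition on toy data; the function name and the data are illustrative only.

```python
def concordance_auc(y_true, y_score):
    """Fraction of (event, non-event) pairs in which the event has the higher
    score; tied scores count as half a concordant pair."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 event/non-event pairs are ranked correctly.
toy_auc = concordance_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the training and validation ROC curves are compared.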

Table 3.

Performance Comparison Results of 10 Machine Learning Models.

Data set	Model	Accuracy	Recall	Specificity	Precision	F1 score	C index	Brier score
Train	LR	0.757	0.714	0.762	0.223	0.340	0.789	0.071
	SVM	0.973	0.982	0.973	0.775	0.866	0.985	0.052
	GBM	0.795	0.696	0.804	0.255	0.373	0.816	0.072
	NN	0.721	0.821	0.712	0.215	0.341	0.839	0.067
	RF	1.000	1.000	1.000	1.000	1.000	1.000	0.022
	XGBoost	0.834	0.804	0.837	0.321	0.459	0.878	0.091
	KNN	0.961	1.000	0.957	0.691	0.818	0.982	0.045
	Adaboost	0.684	0.786	0.674	0.188	0.303	0.747	0.073
	LightGBM	0.915	0.732	0.933	0.512	0.603	0.889	0.034
	CatBoost	0.818	0.607	0.839	0.266	0.370	0.770	0.268
Valid	LR	0.828	0.739	0.837	0.293	0.420	0.794	0.065
	SVM	0.533	0.826	0.506	0.133	0.229	0.695	0.074
	GBM	0.774	0.783	0.773	0.240	0.367	0.766	0.070
	NN	0.810	0.739	0.817	0.270	0.395	0.789	0.066
	RF	0.748	0.696	0.753	0.205	0.317	0.736	0.073
	XGBoost	0.850	0.565	0.876	0.295	0.388	0.743	0.098
	KNN	0.715	0.696	0.717	0.184	0.291	0.695	0.081
	Adaboost	0.606	0.696	0.598	0.137	0.229	0.644	0.089
	LightGBM	0.544	0.652	0.534	0.114	0.194	0.529	0.157
	CatBoost	0.708	0.783	0.701	0.194	0.310	0.773	0.268

Note. Train = training set; Valid = validation set; LR = logistic regression; SVM = support vector machine; GBM = Gradient Boosting Machine; NN = Neural Network; RF = random forest; XGBoost = eXtreme Gradient Boosting; KNN = K-Nearest Neighbor; Adaboost = Adaptive Boosting; LightGBM = Light Gradient Boosting Machine; CatBoost = Categorical Boosting.
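The relationship between the metrics in Table 3 can be checked from a confusion matrix. The counts below are not reported in the article; they are back-calculated here as one set of counts consistent with the LR training-set row, given 56 events among 639 training participants, and should be read as an assumption.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Standard classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Back-calculated (assumed) counts for LR in the training set:
# 56 events (tp + fn) and 583 non-events (fp + tn).
lr_train = confusion_metrics(tp=40, fn=16, fp=139, tn=444)
rounded = {name: round(value, 3) for name, value in lr_train.items()}
```

With these counts the rounded values reproduce the LR training row (accuracy 0.757, recall 0.714, specificity 0.762, precision 0.223, F1 score 0.340). The low precision reflects the low prevalence of the outcome: even with reasonable recall and specificity, most participants flagged by the model do not have depressive symptoms.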

Additionally, the clinical utility of the models was assessed through DCA (Figure 3E and F). The DCA curves plot threshold probability (ie, the minimum predicted risk at which intervention would be considered) on the x-axis and the net benefit of model-based decisions on the y-axis. Outside the threshold probability range of .42 to .63, the LR model showed higher net benefit than the “no intervention” and “full intervention” baselines and outperformed the other models, indicating better clinical applicability at those thresholds. The further a model’s DCA curve lies above both baselines, the greater its potential to improve clinical decisions. Based on a comprehensive evaluation of all performance metrics, the LR model demonstrated the best overall performance, combining good discrimination, calibration, and clinical utility, and was therefore selected as the final predictive model in this study.
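Net benefit in DCA weighs true positives against false positives using the odds of the threshold probability. The sketch below uses the standard net-benefit formula with entirely hypothetical counts; it is meant to show how the model curve and the two baselines are computed, not to reproduce Figure 3E and F.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of intervening on model-positive patients at a threshold
    probability: TP/n - FP/n * threshold / (1 - threshold)."""
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(events, n, threshold):
    """'Full intervention' baseline: every event is a true positive and every
    non-event is a false positive."""
    return net_benefit(events, n - events, n, threshold)


# Hypothetical cohort: 1000 patients, 100 with the outcome.
n, events, threshold = 1000, 100, 0.10
model_nb = net_benefit(tp=70, fp=200, n=n, threshold=threshold)
treat_all_nb = net_benefit_treat_all(events, n, threshold)
treat_none_nb = 0.0  # 'no intervention' baseline is zero by definition
```

In this toy cohort the full-intervention baseline has zero net benefit because the threshold equals the prevalence, while the model retains a positive net benefit; a model is useful at a given threshold only when its curve lies above both baselines.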

Model Interpretability

To thoroughly interpret the results of the LR model, this study employed the SHAP method for visual analysis, with the outcomes displayed in Figure 4. Figure 4D presents the SHAP feature importance ranking of the LR model. The analysis reveals that education level, sleep disorder, gender, PIR, blood urea nitrogen, osteoarthritis, white blood cell count, absolute lymphocyte count, and BMI are the 9 key features influencing depressive symptoms in middle-aged and elderly sarcopenia patients. Among these, education level, sleep disorder, gender, PIR, blood urea nitrogen, and osteoarthritis are the 6 most predictive features.

Figure 4.

Additional SHAP visualizations for the logistic regression model: (A) Beeswarm plot of SHAP values ranked by mean absolute importance, (B) SHAP waterfall plot illustrating how each feature shifts the prediction for an individual participant, (C) SHAP force plot showing feature contributions for a representative participant, and (D) Global feature importance ranking based on mean absolute SHAP values. Clinical interpretation: SHAP highlights which routinely available features most strongly drive predicted risk, supporting targeted screening and modifiable risk-factor management alongside rehabilitation.

As shown in Figure 4A, the SHAP beeswarm plot summarizes the distribution of SHAP values for each predictor across participants, thereby illustrating both the direction and magnitude of each predictor’s contribution to depressive symptoms risk. This visualization improves clinical interpretability by linking routinely measured features to individualized risk estimates.

Additionally, this study utilized SHAP waterfall plots (Figure 4B) and force plots (Figure 4C) to illustrate the relationship between individual sample features and their risk of depressive symptoms. In the plots, purple denotes lower feature values, and yellow denotes higher feature values, visually reflecting the direction and magnitude of each feature’s contribution to the model’s prediction for specific samples.

Taking the example patient in the figure: this male patient with sarcopenia has no sleep disorder, a high school or higher education level, a low poverty-income ratio, and osteoarthritis, with a white blood cell count of 6.1 × 1000 cells/µL, a BMI of 27.2 kg/m², a blood urea nitrogen level of 13 mg/dL, and an absolute lymphocyte count of 2.3 × 1000 cells/µL. The model estimates that this patient has approximately a 2.19% probability of having depressive symptoms.
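For a logistic regression model, patient-level SHAP values have a simple closed form when features are treated as independent: each contribution is the coefficient multiplied by the deviation of the patient's value from the feature mean, on the log-odds scale. The sketch below illustrates this additivity. The intercept, coefficients, and feature means are hypothetical and are not the fitted values from this study; only the patient's sleep disorder status, BUN, and BMI are taken from the example above.

```python
import math


def linear_shap(coefs, means, x):
    """SHAP values of a linear model under feature independence:
    phi_j = beta_j * (x_j - mean_j), expressed on the log-odds scale."""
    return {name: coefs[name] * (x[name] - means[name]) for name in coefs}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical model (illustrative values only, not the fitted model).
intercept = -2.0
coefs = {"sleep_disorder": 1.1, "bun_mg_dl": -0.04, "bmi": 0.03}
means = {"sleep_disorder": 0.4, "bun_mg_dl": 15.0, "bmi": 29.0}
# Patient values for these three features as described in the text.
patient = {"sleep_disorder": 0.0, "bun_mg_dl": 13.0, "bmi": 27.2}

base_value = intercept + sum(coefs[k] * means[k] for k in coefs)
phi = linear_shap(coefs, means, patient)
log_odds = base_value + sum(phi.values())  # base value plus contributions
probability = sigmoid(log_odds)

# The reported 2.19% probability corresponds to log-odds of about -3.8.
reported_log_odds = math.log(0.0219 / (1.0 - 0.0219))
```

The base value plus the sum of the contributions recovers the model's log-odds exactly, which is the property that waterfall and force plots display. Whether the published plots use the log-odds or the probability scale is not stated in the text.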

Clinical Applications of Predictive Models

To facilitate the routine clinical implementation of the optimal logistic regression model, we translated it into a nomogram (Figure 5). By aligning each patient’s values for the 9 predictors on the nomogram, clinicians can derive an individualized predicted probability of depressive symptoms. In clinical practice, a higher predicted risk may be used to prioritize confirmatory assessment, such as administration of PHQ-9, and to prompt referral for mental health evaluation or psychosocial support within an integrated sarcopenia management pathway.

Figure 5.

Nomogram for predicting the probability of depressive symptoms in middle-aged and elderly patients with sarcopenia. Clinical use: This nomogram is intended as a screening-support tool to prioritize confirmatory symptoms assessment, such as the PHQ-9, and appropriate referral or psychosocial/behavioral intervention when predicted risk is elevated.
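A nomogram is a graphical rescaling of the logistic regression linear predictor. The sketch below shows one common construction, in which the predictor with the widest effect range spans 0 to 100 points and total points map back to a probability. The intercept, coefficients, and value ranges are hypothetical and do not correspond to Figure 5; the function names are illustrative.

```python
import math


def nomogram_points(coefs, ranges, x):
    """Map each predictor value to points so that the predictor with the
    largest |beta| * range spans the full 0-100 scale."""
    spans = {k: abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs}
    max_span = max(spans.values())
    points = {}
    for k, beta in coefs.items():
        lo, hi = ranges[k]
        # Zero points sit at the end of the range with the lowest risk.
        ref = lo if beta >= 0 else hi
        points[k] = beta * (x[k] - ref) / max_span * 100.0
    return points, max_span


def probability_from_points(total_points, intercept, coefs, ranges, max_span):
    """Convert total points back to a predicted probability."""
    ref_lp = intercept + sum(
        b * (ranges[k][0] if b >= 0 else ranges[k][1]) for k, b in coefs.items()
    )
    lp = ref_lp + total_points * max_span / 100.0
    return 1.0 / (1.0 + math.exp(-lp))


# Hypothetical model and axis ranges (illustrative values only).
intercept = -3.0
coefs = {"sleep_disorder": 1.0, "bun_mg_dl": -0.05, "bmi": 0.04}
ranges = {"sleep_disorder": (0, 1), "bun_mg_dl": (5, 45), "bmi": (18, 50)}
patient = {"sleep_disorder": 0.0, "bun_mg_dl": 13.0, "bmi": 27.2}

points, max_span = nomogram_points(coefs, ranges, patient)
total = sum(points.values())
risk = probability_from_points(total, intercept, coefs, ranges, max_span)
```

Reading points off each axis and summing them is therefore equivalent to evaluating the regression equation directly, which is why the nomogram preserves the model's predictions while remaining usable without software.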

Discussion

Depressive symptoms are common among middle-aged and older adults with sarcopenia and may complicate rehabilitation and long-term functional outcomes. In our training set, 56 of 639 participants (8.76%) met the criterion for clinically relevant depressive symptoms, which falls within the range reported in cross-sectional studies of sarcopenia (8.09%-40%).²⁶ Importantly, depressive symptoms can reduce adherence to medical and behavioral treatment plans, including exercise-based rehabilitation, and may thereby worsen disability and downstream healthcare burden.^27-29 Therefore, early identification of depressive symptoms in sarcopenia is clinically meaningful as part of integrated care aimed at preserving health-related quality of life and functional well-being.^30,31 We developed and internally validated prediction models using 10 machine-learning algorithms, and logistic regression demonstrated the best overall performance in the validation set (AUC 0.794; Brier score 0.065). Feature selection combining Boruta and LASSO identified 9 key predictors, and SHAP analysis provided transparent, patient-level interpretability, supporting clinical trust and mechanistic hypothesis generation.^32,33 Collectively, these findings indicate that an interpretable and well-calibrated model can serve as a screening-support tool to flag individuals who may benefit from confirmatory assessment and timely psychosocial or behavioral interventions before advanced depression develops.³⁴ Importantly, because NHANES is cross-sectional, the model output should be interpreted as a risk marker for concurrent depressive symptoms burden rather than a causal pathway, and the temporal sequence between sarcopenia and depressive symptoms cannot be determined.

Key implications

Previously known: Sarcopenia is associated with a higher burden of depressive symptoms and functional decline.

What this study adds: An interpretable, internally validated logistic regression-based nomogram to predict depressive symptoms risk in middle-aged and older adults with sarcopenia using NHANES 2007-2020.

What to do next: Use the nomogram as a screening-support aid to prioritize confirmatory assessment and integrated physical–mental health interventions, particularly in socially vulnerable subgroups.

This study identified educational level as a core predictor of depressive symptoms in middle-aged and elderly patients with sarcopenia. Specifically, a lower educational level was associated with a higher predicted risk of depressive symptoms, which is consistent with research on the depressive trajectory in sarcopenia patients.³⁵ Li et al also reported a negative correlation between educational attainment and depressive symptoms.³⁶ Educational attainment may modulate depressive symptoms risk through biological, psychological, and social pathways. At the biological level, higher educational attainment enhances cognitive reserve and promotes neural compensatory mechanisms, thereby delaying degeneration in brain regions associated with depression.³⁷ At the psychological level, greater health literacy among highly educated individuals facilitates early identification of depressive tendencies and mitigates emotional deterioration caused by illness uncertainty.³⁸ At the societal level, improving educational levels can help alleviate depressive symptoms in the elderly by enhancing their cognitive abilities and improving their economic security.³⁹ Educational level may introduce diagnostic bias; nevertheless, as an intervenable social determinant, it retains public health significance. Future research should employ standardized tools to control for bias and develop stratified intervention strategies: prioritizing community-based mental health programs for sarcopenia patients with low education levels while integrating family-community support systems to disrupt the depression pathway.

Sleep disorder was the second most important predictor of depressive symptoms in middle-aged and elderly patients with sarcopenia, with those reporting sleep disorders showing a higher predicted risk. This finding aligns with research on the neuroendocrine mechanisms underlying the comorbidity of sarcopenia and depression.⁴⁰ The proposed pathway is that sleep disruption leads to hypothalamic-pituitary-adrenal (HPA) axis dysfunction, elevated cortisol levels, suppressed growth hormone release, accelerated muscle protein breakdown, and neuroinflammation, which together promote the onset of depression.⁴¹ Because sleep disorders are modifiable, future research should test whether improving sleep quality, for example through cognitive behavioral therapy, reduces depressive symptoms in this population.

This study found that gender is the third largest predictive factor for depressive symptoms in middle-aged and elderly patients with sarcopenia, with women having a significantly higher risk of depression than men. This result is consistent with previous research conclusions on gender differences in depressive symptoms.⁴² It may be related to the dual-pathway impact of declining estrogen levels in postmenopausal women on disease progression, which not only significantly inhibits muscle satellite cell activity to accelerate sarcopenia progression, but also markedly reduces hippocampal brain-derived neurotrophic factor expression to increase susceptibility to depression.^43,44 In addition, the significantly prolonged daily caregiving time for women leads to a decline in physical function, weakens role performance capacity, and results in a noticeable reduction in social network size, exacerbating social isolation.⁴⁵ It is recommended to prioritize female sarcopenia patients as a key population for depression screening, develop gender-differentiated intervention plans, and reconstruct the care responsibility allocation model through community support systems.

This study found that income level is the fourth largest predictive factor for depressive symptoms in middle-aged and elderly patients with sarcopenia, with low-income status showing a significant positive correlation with increased depression risk. These findings align with the theoretical framework that economic resources influence the comorbidities of chronic diseases.^46,47 The primary mechanism is the barrier to healthcare accessibility: low-income groups experience insufficient management of sarcopenia and accelerated physical function decline due to limitations in nutritional supplementation, exercise equipment, and health monitoring.⁴⁸ Future research should focus on expression bias in depression assessment among low-income patients, and it is recommended to include low-income sarcopenia patients in active depression screening systems.

This study found that blood urea nitrogen (BUN) level is an important predictor of depressive symptoms in middle-aged and older patients with sarcopenia, with lower BUN levels associated with higher predicted risk. BUN reflects nitrogen balance and is influenced by dietary protein intake, hepatic urea production, hydration status, and renal clearance; low BUN may therefore signal protein-energy undernutrition and impaired muscle anabolism, which could exacerbate sarcopenia-related vulnerability.⁴⁹ Nevertheless, BUN is non-specific and may be confounded by comorbidities and acute illness; thus, its contribution should be interpreted as a risk marker rather than a causal pathway.

Osteoarthritis was also a prominent predictor of depressive symptoms in middle-aged and older patients with sarcopenia. Chronic pain, mobility limitation, and reduced physical activity can accelerate muscle loss and impair participation in rehabilitation, contributing to an interlinked burden of worsening function and depressive symptoms.^30,50 Integrating pain control, joint function rehabilitation, and mental health screening may therefore be particularly relevant for sarcopenic patients with osteoarthritis.

In addition to the top predictors, elevated white blood cell count, increased absolute lymphocyte count, and higher BMI levels were associated with higher predicted depressive symptoms risk in adults with sarcopenia. These hematologic measures are readily available but highly non-specific and can be influenced by infection, chronic disease, medications, and other comorbidities. Nevertheless, accumulating evidence supports a role for systemic low-grade inflammation and immune dysregulation in depressive symptoms, potentially through effects on neurotransmitter metabolism, neuroendocrine function, and synaptic plasticity.^51,52 Higher BMI may contribute through metabolic dysfunction and adiposity-related inflammation,⁵³ and through psychosocial pathways such as body-image distress and weight stigma.⁵⁴ Overall, these findings suggest that integrating routine clinical and laboratory indicators can help identify higher-risk individuals, while emphasizing the need for careful clinical interpretation and prospective validation.

Model comparison highlighted that more complex algorithms are not necessarily better for this clinical prediction task. Several non-linear models achieved very high apparent performance in the training set but showed marked degradation in the validation set, consistent with overfitting in a setting with a relatively small number of outcome events. In contrast, logistic regression provided robust discrimination, excellent calibration, and transparent coefficients that facilitate clinical acceptance. This observation is consistent with evidence from clinical prediction research showing that machine-learning methods do not reliably outperform logistic regression, particularly when predictors are structured and sample sizes are modest.⁵⁵

To enhance translational usefulness, we presented the final logistic regression model as a nomogram. Rather than serving only as a statistical visualization, the nomogram can function as a screening-support tool in sarcopenia care: clinicians can estimate an individual’s predicted risk and, when the predicted probability exceeds a prespecified threshold that should be calibrated and externally validated in future studies, initiate confirmatory symptoms assessment such as administration of PHQ-9, and refer patients for evidence-based mental health evaluation or psychosocial interventions. Depression screening with systems in place for accurate diagnosis, effective treatment, and appropriate follow-up is recommended for adults, including older adults.³⁴ Future work should determine clinically actionable thresholds, evaluate net benefit in pragmatic settings, and externally validate the model across independent cohorts and health systems.

This study has several limitations. First, the cross-sectional design of NHANES precludes causal inference and does not establish temporal ordering; thus, model outputs should be interpreted as risk markers or correlates rather than evidence of causal pathways. Second, sarcopenia was defined using a BMI-adjusted appendicular skeletal muscle mass index based on dual-energy X-ray absorptiometry, without direct assessments of muscle strength or physical performance; future work should adopt the European Working Group on Sarcopenia in Older People 2 diagnostic framework to improve phenotypic accuracy. Third, several laboratory predictors, including blood urea nitrogen, white blood cell count, and absolute lymphocyte count, are non-specific and may be affected by comorbidities or acute conditions; although these variables may enhance discrimination, their clinical interpretation warrants caution and should be evaluated in prospective settings. Fourth, NHANES primarily samples community-dwelling individuals; therefore, our findings may not fully generalize to institutionalized populations. In addition, standardized cognitive assessments were only available in selected NHANES cycles, which limited our ability to consistently account for cognitive impairment across all survey years. Medication exposures that may influence depressive symptoms or muscle mass, including antidepressant therapy and long-term systemic corticosteroids, were neither used as exclusion criteria nor incorporated as predictors, because information on indication and cumulative duration is limited and not consistently harmonizable across cycles; residual confounding may remain. Future studies should incorporate medication data and perform sensitivity analyses to further evaluate clinical specificity. Fifth, despite internal validation, the number of outcome events was relatively limited, and external validation is required. Validation in independent multicenter cohorts, together with prospective follow-up studies, is needed to assess transportability, refine decision thresholds, and quantify clinical impact.

Conclusion

This study developed an interpretable machine-learning model for predicting depressive symptoms risk in middle-aged and older patients with sarcopenia using NHANES data. The best-performing logistic regression model was built on 9 predictors and was presented as a nomogram for rapid bedside risk estimation, which may support risk-stratified intervention and earlier attention to the interlinked burden of sarcopenia and depressive symptoms.

Supplemental Material

Supplemental material (sj-docx-2-inq-10.1177_00469580261436992, sj-docx-3-inq-10.1177_00469580261436992, and sj-pdf-1-inq-10.1177_00469580261436992) for Internally Validated Logistic Regression Nomogram for Depressive Symptoms Risk Prediction in Middle-Aged and Older Adults With Sarcopenia: Cross-Sectional Study by Enguang Li, Fangzhu Ai, Ping Tang, Hongjuan Wen and Botang Guo in INQUIRY: The Journal of Health Care Organization, Provision, and Financing

Footnotes

Acknowledgements

We thank the National Health and Nutrition Examination Survey (NHANES) for making the data freely available, and we are grateful to the study participants.

List of Abbreviations

PHQ-9 The Patient Health Questionnaire-9

NHANES National Health and Nutrition Examination Survey

NCHS National Center for Health Statistics

PIR Poverty Income Ratio

BMI Body Mass Index

SD Standard deviation

KNN K-nearest neighbor

LR Logistic regression

RF Random forest

SVM Support vector machine

ROC Receiver Operating Characteristic

AUC Area under the curve

DCA Decision curve analysis

SHAP Shapley Additive exPlanations

CVD Cardiovascular disease

CDC Centers for Disease Control and Prevention

LASSO Least Absolute Shrinkage and Selection Operator

GBM Gradient Boosting Machine

NN Neural network

ORCID iDs

Enguang Li

Fangzhu Ai

Hongjuan Wen

Botang Guo

Ethical Considerations

NHANES data collection procedures and protocols were reviewed and approved by the National Center for Health Statistics (NCHS) Ethics Review Board (ERB), including Protocol #2005-06 (continuation approved September 19, 2007), Protocol #2011-17 (approved November 10, 2011), and Protocol #2018-01 (approved October 26, 2017), with annual continuation reviews for subsequent survey cycles. The study procedures were conducted in accordance with the Declaration of Helsinki and its subsequent amendments. This study is a secondary analysis of publicly available, de-identified NHANES data; therefore, additional institutional review board approval for this secondary analysis was not required. The original ERB approval documentation is provided as a separate “Research Ethics Documentation” file.

Consent to Participate

Written informed consent was obtained from all participants (and parental permission/assent for minors, as applicable).

Author Contributions

EL: Conceptualization, Methodology, Formal analysis, Data Curation, Writing – Original Draft, and Writing – Review & Editing. FA: Conceptualization, Methodology, Formal analysis, Data Curation. PT: Conceptualization, Methodology, Formal analysis, Data Curation. BG and HW: Conceptualization, Formal analysis, Supervision, Writing – Original Draft, and Writing – Review & Editing.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the Science and Technology Project of the Jilin Provincial Administration of Traditional Chinese Medicine (No.2024260), the Changchun University of Traditional Chinese Medicine Theme Case Project (No.2024YJ03), the Shenzhen Key Medical Discipline Construction Fund (No. SZXK062), the 2025 Thematic Case Project of the Development Center for Degree and Graduate Education, Ministry of Education (No. ZT-2510199001), and the Shenzhen Philosophy and Social Science Planning Project (No. SZ2024C018).

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The data analyzed in this study are publicly available from the NHANES website.

Supplemental Material

Supplemental material for this article is available online.

References

1. Yang, Jiang, Yang, Chen. Sarcopenia and nervous system disorders. J Neurol. 2022;269:5787-5797. doi:10.1007/s00415-022-11268-8

2. Ryall, Schertzer, Lynch GS. Cellular and molecular mechanisms underlying age-related skeletal muscle wasting and weakness. Biogerontology. 2008;9:213-228. doi:10.1007/s10522-008-9131-0

3. Yuan, Larsson SC. Epidemiology of sarcopenia: prevalence, risk factors, and consequences. Metabolism. 2023;144:155533. doi:10.1016/j.metabol.2023.155533

4. Brown, Harhay MN. Sarcopenia and mortality among a population-based sample of community-dwelling older adults. J Cachexia Sarcopenia Muscle. 2016;7:290-298. doi:10.1002/jcsm.12073

5. Al Gilani, Tingö, Kihlgren, Schröder. Mental health as a prerequisite for functioning as optimally as possible in old age: a phenomenological approach. Nurs Open. 2021;8:2025-2034. doi:10.1002/nop2.698

6. Tong, et al. Prevalence of depression in patients with sarcopenia and correlation between the two diseases: systematic review and meta-analysis. J Cachexia Sarcopenia Muscle. 2022;13:128-144. doi:10.1002/jcsm.12908

7. Xian, Chai, Gong, et al. The relationship between healthy lifestyles and cognitive function in Chinese older adults: the mediating effect of depressive symptoms. BMC Geriatr. 2024;24:299. doi:10.1186/s12877-024-04922-5

8. Song, Zhang, Song, Zhao. Association between changes in depressive symptoms and falls: the China health and retirement longitudinal study (CHARLS). J Affect Disord. 2023;341:393-400. doi:10.1016/j.jad.2023.09.004

9. Inoue, Haseda, Shiba, Tsuji, Kondo. Social isolation and depressive symptoms among older adults: a multiple bias analysis using a longitudinal study in Japan. Ann Epidemiol. 2023;77:110-118. doi:10.1016/j.annepidem.2022.11.001

10. Tian, Yang, Tang, Guo. Develop and validate machine learning models to predict the risk of depressive symptoms in older adults with cognitive impairment. BMC Psychiatry. 2025;25:219. doi:10.1186/s12888-025-06657-y

11. Saczynski, Beiser, Seshadri, Auerbach, Wolf. Depressive symptoms and risk of dementia: the Framingham Heart Study. Neurol. 2010;75:35-41. doi:10.1212/WNL.0b013e3181e62138

12. Kattan MW. Nomograms are superior to staging and risk grouping systems for identifying high-risk patients: preoperative application in prostate cancer. Curr Opin Urol. 2003;13:111-116. doi:10.1097/00042307-200303000-00005

13. Heo, Yoon, Park, Kim, Nam, Heo JH. Machine learning-based model for prediction of outcomes in acute stroke. Stroke. 2019;50:1263-1265. doi:10.1161/STROKEAHA.118.024293

14. Cao. Creating machine learning models that interpretably link systemic inflammatory index, sex steroid hormones, and dietary antioxidants to identify gout using the SHAP (SHapley Additive exPlanations) method. Front Immunol. 2024;15:1367340. doi:10.3389/fimmu.2024.1367340

15. Paulose-Ram, Graber, Woodwell, Ahluwalia. The National Health and Nutrition Examination Survey (NHANES), 2021–2022: adapting data collection in a COVID-19 environment. Am J Public Health. 2021;111:2149-2156. doi:10.2105/ajph.2021.306517

16. Shi, Chen, Jiang, Chen, Liao, Huang. A more accurate method to estimate muscle mass: a new estimation equation. J Cachexia Sarcopenia Muscle. 2023;14:1753-1761. doi:10.1002/jcsm.13254

17. Studenski, Peters, Alley, et al. The FNIH sarcopenia project: rationale, study description, conference recommendations, and final estimates. J Gerontol A Biol Sci Med Sci. 2014;69:547-558. doi:10.1093/gerona/glu010

18. Gao, Jia, Zhao, Han. The effect of activity participation in middle-aged and older people on the trajectory of depression in later life: National Cohort Study. JMIR Public Health Surveill. 2023;9:e44682. doi:10.2196/44682

19. Peduzzi, Concato, Kemper, Holford, Feinstein AR. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol. 1996;49:1373-1379. doi:10.1016/s0895-4356(96)00236-3

20. Liu, Tong, et al. The association between sarcopenia and incident of depressive symptoms: a prospective cohort study. BMC Geriatr. 2024;24:74. doi:10.1186/s12877-023-04653-z

21. Spitzer, Kroenke, Williams JB. Validation and utility of a self-report version of PRIME-MD: the PHQ primary care study. Primary care evaluation of mental disorders. Patient Health Questionnaire. JAMA. 1999;282:1737-1744. doi:10.1001/jama.282.18.1737

22. Kroenke, Spitzer, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16:606-613. doi:10.1046/j.1525-1497.2001.016009606.x

23. Steffens DC. Treatment-resistant depression in older adults. N Engl J Med. 2024;390:630-639. doi:10.1056/NEJMcp2305428

24. Vandelaar, Jiang, Saini, et al. PHQ-9 and SNOT-22: elucidating the prevalence of depression in chronic rhinosinusitis. Otolaryngol Head Neck Surg. 2020;162:142-147. doi:10.1177/0194599819886852

25. Collins, Moons KGM, Dhiman, et al. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ. 2024;385:e078378. doi:10.1136/bmj-2023-078378

26.

Ganggaya

Vanoh

Ishak

WRW

. Prevalence of sarcopenia and depressive symptoms among older adults: a scoping review. Psychogeriatrics. 2024;24:473-495. doi:10.1111/psyg.13060

27.

DiMatteo

Lepper

Croghan

TW.

Depression is a risk factor for noncompliance with medical treatment: meta-analysis of the effects of anxiety and depression on patient adherence. Arch Intern Med. 2000;160:2101-2107. doi:10.1001/archinte.160.14.2101

28.

Chisholm

Sweeny

Sheehan

, et al. Scaling-up treatment of depression and anxiety: a global return on investment analysis. Lancet Psychiatry. 2016;3:415-424. doi:10.1016/S2215-0366(16)30024-4

29.

GBD 2019 Mental Disorders Collaborators. Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet Psychiatry. 2022;9:137-150. doi:10.1016/S2215-0366(21)00395-3

30.

Cruz-Jentoft

Bahat

Bauer

, et al. Sarcopenia: revised European consensus on definition and diagnosis. Age Ageing. 2019;48:601. doi:10.1093/ageing/afz046

31.

Reis

JMS

Alves

Vogt

. According to revised EWGSOP sarcopenia consensus cut-off points, low physical function is associated with nutritional status and quality of life in maintenance hemodialysis patients. J Ren Nutr. 2022;32:469-475. doi:10.1053/j.jrn.2021.06.011

32.

Singh

Lanchantin

Sekhon

Attend and predict: understanding gene regulation by selective attention on chromatin. Adv Neural Inf Process Syst. 2017;30:6785-6795.

33.

Kui

Pintér

Molontay

, et al. EASY-APP: an artificial intelligence model and application for early and easy prediction of severity in acute pancreatitis. Clin Transl Med. 2022;12:e842. doi:10.1002/ctm2.842

34.

Barry

Nicholson

Silverstein

, et al. Screening for depression and suicide risk in adults: US Preventive Services Task Force Recommendation Statement. JAMA. 2023;329:2057-2067. doi:10.1001/jama.2023.9297

35.

Liu

Chen

, et al. Relationship between sarcopenia and the trajectories of depressive symptoms among Chinese older adults: the mediating effect of social participation. Int J Behav Med. Published online April 30, 2025. doi:10.1007/s12529-025-10366-x

36.

Sun

Luo

Huang

Associations between education levels and prevalence of depressive symptoms: NHANES (2005-2018). J Affect Disord. 2022;301:360-367. doi:10.1016/j.jad.2022.01.010

37.

Stern

What is cognitive reserve? Theory and research application of the reserve concept. J Int Neuropsychol Soc. 2002;8:448-460.

38.

Zhang

Uncertainty in illness: theory review, application, and extension. Oncol Nurs Forum. 2017;44:645-649. doi:10.1188/17.ONF.645-649

39.

Zhao

Wang

Lou

, et al. The effect of education level on depressive symptoms in Chinese older adults-parallel mediating effects of economic security level and subjective memory ability. BMC Geriatr. 2024;24:635. doi:10.1186/s12877-024-05233-5

40.

Häuser

Ablin

Fitzcharles

, et al. Fibromyalgia. Nat Rev Dis Primers. 2015;1:15022-20150813. doi:10.1038/nrdp.2015.22

41.

Antonijevic

HPA axis and sleep: identifying subtypes of major depression. Stress. 2008;11:15-27. doi:10.1080/10253890701378967

42.

Michas

Magriplis

Micha

, et al. Sociodemographic and lifestyle determinants of depressive symptoms in a nationally representative sample of Greek adults: the Hellenic National Nutrition and Health Survey (HNNHS). J Affect Disord. 2021;281:192-198. doi:10.1016/j.jad.2020.12.013

43.

Seko

Fujita

Kitajima

Nakamura

Imai

Ono

Estrogen receptor β controls muscle growth and regeneration in young female mice. Stem Cell Reports. 2020;15:577-586. doi:10.1016/j.stemcr.2020.07.017

44.

Spencer-Segal

Tsuda

Mattei

, et al. Estradiol acts via estrogen receptors alpha and beta on pathways important for synaptic plasticity in the mouse hippocampal formation. Neurosci. 2012;202:131-146. doi:10.1016/j.neuroscience.2011.11.035

45.

Berkman

Glass

Brissette

Seeman

TE.

From social integration to health: Durkheim in the new millennium. Soc Sci Med. 2000;51:843-857. doi:10.1016/s0277-9536(00)00065-4

46.

Link

Phelan

Social conditions as fundamental causes of disease. J Health Soc Behav. 1995;35:80-94.

47.

Adler

Newman

Socioeconomic disparities in health: pathways and policies. Health Aff. 2002;21:60-76. doi:10.1377/hlthaff.21.2.60

48.

Braveman

Gottlieb

The social determinants of health: it’s time to consider the causes of the causes. Public Health Rep. 2014;129(Suppl 2):19-31. doi:10.1177/00333549141291s206

49.

Fouque

Kalantar-Zadeh

Kopple

, et al. A proposed nomenclature and diagnostic criteria for protein-energy wasting in acute and chronic kidney disease. Kidney Int. 2008;73:391-398. doi:10.1038/sj.ki.5002585

50.

Hunter

Bierma-Zeinstra

Osteoarthritis. Lancet. 2019;393:1745-1759. doi:10.1016/S0140-6736(19)30417-9

51.

Miller

Raison

CL.

The role of inflammation in depression: from evolutionary imperative to modern treatment target. Nat Rev Immunol. 2016;16:22-34. doi:10.1038/nri.2015.5

52.

Sakai

Kobayashi

Lymphocyte ‘homing’ and chronic inflammation. Pathol Int. 2015;65:344-354. doi:10.1111/pin.12294

53.

Shelton

Miller

AH.

Eating ourselves to death (and despair): the contribution of adiposity and inflammation to depression. Prog Neurobiol. 2010;91:275-299. doi:10.1016/j.pneurobio.2010.04.004

54.

Bianciardi

Di Lorenzo

Niolu

, et al. Body image dissatisfaction in individuals with obesity seeking bariatric surgery: exploring the burden of new mediating factors. Riv Psichiatr. 2019;54:8-17. doi:10.1708/3104.30935

55.

Christodoulou

Collins

Steyerberg

Verbakel

Van Calster

A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12-22. doi:10.1016/j.jclinepi.2019.02.004