Abstract
Background:
With advances in cancer treatment, the number of cancer survivors has increased, bringing attention to long-term complications such as alterations in bone mineral density (BMD). Although survivors are at elevated risk for low BMD, prior studies have focused on specific cancer types and relied on traditional regression models, which are limited in capturing complex inter-variable relationships. This study aimed to examine the causal relationships among factors affecting BMD in cancer survivors and age-matched controls using causal Bayesian network (CBN) modeling.
Methods:
Data from the 2010-2011 Korea National Health and Nutrition Examination Survey (KNHANES) V were analyzed. We included 227 cancer survivors and 681 age- and sex-matched controls. Associations between BMD and variables such as age, sex, body composition, smoking, fracture history, and vitamin D were assessed using linear regression. A CBN model was then applied to evaluate probabilistic dependencies and potential causal relationships between variables and femoral neck BMD.
Results:
Among all participants, age, sex, smoking, fracture history, body fat percentage, muscle mass, and cancer history were significantly associated with femoral neck BMD. In cancer survivors, age (β = −0.032, P < .001) and sex (β = −0.680, P < .001) showed negative associations with BMD, whereas higher muscle mass (β = 0.073, P < .001) was a strong positive predictor. Smoking (β = −0.779, P = .005) and previous fractures (β = −0.507, P = .003) were also linked to lower BMD. The CBN model identified direct effects of age and muscle mass on BMD, with indirect effects from sex, smoking, and fracture history. Among women aged >60 years, greater muscle mass appeared particularly protective.
Conclusion:
Causal Bayesian network modeling identified muscle mass as a key modifiable factor influencing BMD among cancer survivors. These findings highlight the importance of muscle-preserving lifestyle interventions, including resistance exercise and adequate protein intake, in survivorship care. The CBN approach provides a framework for identifying individualized risk pathways and can support personalized bone-health management strategies in clinical practice.
Introduction
With advancements in cancer treatment, cancer survivorship has become a critical focus in modern health care, leading to a growing population of individuals living beyond cancer diagnosis. 1 While this represents a major clinical achievement, cancer survivors often face long-term health challenges related to both the disease and its treatment. 2 Among these, changes in bone mineral density (BMD) have emerged as a significant concern due to their association with increased fracture risk and reduced quality of life.
Bone mineral density is a key indicator of bone strength, influenced by multiple factors such as genetics, lifestyle, hormonal status, and medical interventions.3-6 In cancer survivors, BMD alterations may result from chemotherapy-induced bone loss, hormonal changes, physical inactivity, or nutritional deficiencies. 7 These metabolic alterations accelerate bone loss, leading to osteoporosis and fractures that impair long-term quality of life and functional independence.
Previous studies have extensively examined bone health in specific groups of cancer survivors, often focusing on individual risk factors like chemotherapy or hormonal therapy.8,9 Most of these studies have used traditional regression-based approaches,10,11 which primarily capture unidirectional associations and may not fully represent the complex, interrelated effects of multiple determinants of bone health. In particular, the combined influence of muscle mass, adiposity, and metabolic factors on BMD has been insufficiently explored in large, population-based studies of cancer survivors.
A causal Bayesian network (CBN) provides an advanced analytical framework that models probabilistic dependencies and potential causal pathways among multiple interrelated factors. 12 Therefore, this study applied CBN modeling to data from the Korea National Health and Nutrition Examination Survey (KNHANES) to examine the causal relationships between demographic, lifestyle, and clinical variables affecting BMD in cancer survivors. By identifying key modifiable determinants and causal pathways, this study aims to provide insights that support personalized interventions to preserve bone health and reduce long-term complications in this population.
Materials and Methods
Study population
Data were obtained from the KNHANES, a nationwide, cross-sectional survey conducted by the Division of Chronic Disease Surveillance of the Korea Centers for Disease Control and Prevention (KCDC). The KNHANES uses a stratified, multistage clustered probability sampling design to represent the non-institutionalized civilian population of South Korea. This study analyzed data from the KNHANES V (2010-2011), which are publicly available on the official KNHANES website (http://knhanes.kdca.go.kr). All data are fully anonymized and de-identified prior to release, ensuring that no individual participant can be identified. Detailed information on the survey design and data collection procedures has been described previously. 13
Among 17 476 participants in the 2010-2011 KNHANES, we excluded those with missing body-weight data (n = 7445), those without dual-energy X-ray absorptiometry (DXA) measurements (n = 1317), participants younger than 19 years of age (n = 1198), and those lacking information on cancer diagnosis (n = 58). After exclusions, 7458 participants were eligible for analysis. Participants with missing data for key variables were excluded from the respective analyses (complete-case analysis). Because the proportion of missing data was minimal across included variables, no imputation was performed.
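The exclusion counts above can be checked with a short calculation. The sketch below only re-adds the numbers reported in this paragraph; the dictionary labels are shorthand introduced here, not KNHANES variable names.

```python
# Exclusion counts as reported in the text (2010-2011 KNHANES).
total_participants = 17476
exclusions = {
    "missing body-weight data": 7445,
    "no DXA measurement": 1317,
    "younger than 19 years": 1198,
    "no cancer-diagnosis information": 58,
}

# Participants remaining after all four exclusion steps.
eligible = total_participants - sum(exclusions.values())
print(eligible)  # 7458, matching the eligible sample stated in the text
```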
Cancer survivors were defined as individuals who answered “yes” to the question, “Have you ever been diagnosed with cancer by a physician?” Among these, 227 participants (84 males and 143 females) were classified as cancer survivors. To create a comparable control group, we estimated propensity scores using a logistic regression model that included age, sex, and smoking status as covariates. A 1:3 case-control matching was then performed using the nearest neighbor method without replacement, resulting in 681 matched individuals (264 males and 417 females) in the healthy control group. A 1:3 matching ratio was chosen to balance statistical power and matching quality. Previous methodological studies have shown that while increasing the number of controls may enhance precision, gains beyond a 1:3 ratio are generally minimal and may increase the risk of poorer match quality.14,15 A flow diagram of the participant selection process is presented in Figure 1. The KNHANES protocol was approved by the Institutional Review Board of the KCDC (No: 2010-02CON-21C, 2011-02CON-06C). Written informed consent was obtained from all participants. As this study used publicly available, de-identified data, no additional approval was required for secondary data analysis.
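The 1:3 nearest-neighbor matching without replacement described above can be sketched as follows. This is a minimal illustration, not the authors' SAS or Stata procedure: it assumes propensity scores have already been estimated, processes cases in the order given, and applies no caliper. The function name `match_nearest_neighbors` and all scores are hypothetical.

```python
from typing import Dict, List


def match_nearest_neighbors(
    case_scores: Dict[str, float],
    control_scores: Dict[str, float],
    ratio: int = 3,
) -> Dict[str, List[str]]:
    """Greedy 1:ratio nearest-neighbor matching on propensity score, without replacement."""
    available = dict(control_scores)  # controls that have not been used yet
    matches: Dict[str, List[str]] = {}
    for case_id, case_score in case_scores.items():
        # Rank the remaining controls by absolute distance in propensity score.
        ranked = sorted(available, key=lambda cid: abs(available[cid] - case_score))
        chosen = ranked[:ratio]
        for cid in chosen:
            del available[cid]  # "without replacement": each control is used once
        matches[case_id] = chosen
    return matches


# Hypothetical propensity scores, for illustration only.
cases = {"case_1": 0.30, "case_2": 0.60}
controls = {
    "c1": 0.28, "c2": 0.33, "c3": 0.35,
    "c4": 0.58, "c5": 0.61, "c6": 0.65, "c7": 0.90,
}
matched = match_nearest_neighbors(cases, controls)
```

With 227 cases and a ratio of 3, a procedure of this kind yields at most 227 × 3 = 681 controls, which agrees with the matched control group reported here.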

Figure 1. Flowchart of participant selection.
The reporting of this study adheres to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines for cross-sectional studies. 16 The completed STROBE checklist has been provided as Supplementary File 1.
Study variables
Anthropometric measurements
Height and weight were measured by trained technicians following standardized KNHANES protocols while participants wore light clothing and no shoes. Height and weight were recorded using calibrated equipment, with precision to 0.1 cm and 0.1 kg, respectively.
Body mass index (BMI) was calculated as weight in kilograms divided by the square of height in meters.
BMD measurement: DXA
Bone mineral density (g/cm²) was measured using DXA (Hologic Discovery, Hologic Inc., Bedford, Massachusetts) at several skeletal sites, including the lumbar spine (L2-L4), femoral neck, trochanter, and Ward’s triangle. All measurements were performed according to standardized procedures established by the KCDC, with equipment calibration and quality control regularly performed by survey staff. Femoral neck BMD was selected as the primary outcome, as it provides consistent diagnostic accuracy across sexes and cancer types, is less affected by degenerative spine changes, and is a strong predictor of osteoporotic fracture risk. 17
Definition of osteoporosis
The diagnosis of osteoporosis was based on the World Health Organization (WHO) criteria, which classify bone status using T-scores derived from DXA measurements of the hip or spine. A T-score reflects how many standard deviations an individual’s BMD differs from the mean BMD of a young reference population. According to WHO definitions:
Normal bone mass: T-score ⩾ −1.0
Low bone mass (osteopenia): T-score between −1.0 and −2.5
Osteoporosis: T-score ⩽ −2.5
World Health Organization criteria also allow a clinical diagnosis of osteoporosis in individuals who experience low-trauma or fragility fractures, even if DXA measurements are unavailable. 18
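The T-score definition and WHO cut-offs listed above can be written as a small sketch. The young-adult reference mean and SD used in the example call are hypothetical placeholders, not the reference values used in KNHANES, and the function names are introduced here for illustration.

```python
def t_score(bmd: float, young_mean: float, young_sd: float) -> float:
    """Number of SDs by which a BMD value differs from the young reference mean."""
    return (bmd - young_mean) / young_sd


def classify_t_score(t: float) -> str:
    """Apply the WHO cut-offs listed in the text."""
    if t <= -2.5:
        return "osteoporosis"
    if t < -1.0:
        return "low bone mass (osteopenia)"
    return "normal"


# Hypothetical reference values (g/cm²), for illustration only.
example_t = t_score(bmd=0.65, young_mean=0.85, young_sd=0.10)
example_class = classify_t_score(example_t)
```

This sketch covers only the densitometric criteria; the clinical diagnosis based on fragility fracture is not modeled.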
Statistical analysis
Data are presented as the mean ± SD for continuous variables and as numbers with percentages for categorical variables. Participant characteristics according to cancer history were compared using generalized estimating equations (GEEs) clustered on the matched pair. 19 To evaluate the associations between study variables and femoral neck BMD, complex-sample linear regression analyses were conducted using both univariable and multivariable models. The multivariable model included sex, age, smoking status, history of fracture, regular exercise, serum calcium level, body fat percentage, appendicular skeletal muscle mass, and cancer history as covariates. Given the high correlation between obesity and body fat percentage, obesity was excluded from the model to avoid multicollinearity. Subgroup analyses stratified by cancer status were also performed using multivariable linear regression.
All statistical analyses were performed with SAS version 9.4 (SAS Institute Inc., Cary, North Carolina) and STATA version 18.0 (Stata Corporation, College Station, Texas). P values of <.05 were considered to indicate statistical significance.
Bayesian network analysis
Data preprocessing
In the multivariable linear regression analysis, 8 variables (age, sex, smoking, fracture, vitamin D, body fat percentage, muscle mass, and cancer) showed statistically significant associations with femoral neck BMD T-score. Therefore, these 8 variables, along with femoral neck BMD T-score, were included in the Bayesian network analysis. All variables were continuous or ordinal. Continuous variables were standardized into z-scores and subsequently discretized as follows: z < −1 (low, coded as 0), −1 ⩽ z ⩽ 1 (normal range, coded as 1), and z > 1 (high, coded as 2). 20 Ages <50 years, 50 to 60 years, and >60 years were coded as 0, 1, and 2, respectively. The Bayesian Network Inference with Java Objects (BANJO) software was used for CBN structure learning. BANJO applies a data-driven approach utilizing Bayesian network frameworks to obtain directed graphical relationships among variables.20,21 Discretization thresholds followed a standardized, threshold-based approach commonly applied in prior Bayesian network studies to minimize bias and maintain interpretability.22,23
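The discretization rule described above can be sketched as follows. The text does not state whether the sample or population SD was used for standardization; the sample SD is assumed here, and the function names and input values are hypothetical.

```python
from statistics import mean, stdev
from typing import List


def discretize_z(values: List[float]) -> List[int]:
    """Standardize to z-scores, then code z < -1 as 0, -1 <= z <= 1 as 1, z > 1 as 2."""
    mu = mean(values)
    sigma = stdev(values)  # assumption: sample SD (not specified in the text)
    states = []
    for v in values:
        z = (v - mu) / sigma
        if z < -1:
            states.append(0)
        elif z > 1:
            states.append(2)
        else:
            states.append(1)
    return states


def discretize_age(age_years: float) -> int:
    """Code age as 0 (<50 years), 1 (50 to 60 years), or 2 (>60 years)."""
    if age_years < 50:
        return 0
    if age_years <= 60:
        return 1
    return 2


# Hypothetical measurements, for illustration only.
muscle_states = discretize_z([10.0, 20.0, 20.0, 20.0, 30.0])
```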
Learning the CBN structure and learning causal Bayesian parameters
Twelve independent BANJO runs were executed (3 runs each at 3, 6, 12, and 24 hours; total of 135 hours). Among the 12 best log-likelihood structures reported by each run, the network with the highest log-likelihood score (denoted as S1; note that S1 includes 9 variables, each representing clinical information) was selected as the final model. Parameter learning was performed using the GeNIe graphical network interface (version 3.0.R2; BayesFusion), which estimated conditional probability tables from the dataset.
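Once a structure is fixed, a conditional probability table for a node can be estimated from the discretized data. The sketch below uses plain maximum-likelihood frequency counting; it is an assumption made for illustration and is not a description of the algorithm GeNIe applies, which the text does not specify. The records and the function name `estimate_cpt` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def estimate_cpt(
    records: List[Dict[str, int]], child: str, parents: List[str]
) -> Dict[Tuple[int, ...], Dict[int, float]]:
    """Estimate P(child | parents) by relative frequencies in the data."""
    counts: Dict[Tuple[int, ...], Counter] = defaultdict(Counter)
    for row in records:
        parent_state = tuple(row[p] for p in parents)
        counts[parent_state][row[child]] += 1
    cpt = {}
    for parent_state, child_counts in counts.items():
        total = sum(child_counts.values())
        cpt[parent_state] = {s: n / total for s, n in child_counts.items()}
    return cpt


# Hypothetical discretized records (states 0, 1, 2), for illustration only.
records = [
    {"age": 2, "muscle": 0, "bmd": 0},
    {"age": 2, "muscle": 0, "bmd": 0},
    {"age": 2, "muscle": 0, "bmd": 1},
    {"age": 0, "muscle": 2, "bmd": 1},
    {"age": 0, "muscle": 2, "bmd": 2},
]
cpt = estimate_cpt(records, child="bmd", parents=["age", "muscle"])
```

Parent configurations that never occur in the data receive no entry in this sketch; practical tools usually handle such cases with priors or smoothing.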
Results
General characteristics
A total of 227 cancer survivors and 681 age-matched controls participated in the study. General characteristics of cancer survivors and matched controls are shown in Table 1. The mean BMI, smoking, fracture history, exercise, serum calcium, serum vitamin D, and bone density did not differ significantly between cancer survivors and age-matched healthy controls.
Table 1. Baseline characteristics of cancer survivors and age-matched controls.
Values are presented as mean ± standard deviation (SD) for continuous variables and number (percentage) for categorical variables. P-values were calculated using the generalized estimating equation (GEE) to account for matched pairs.
Linear regression analyses were conducted for femoral neck BMD T-score in all participants (Table 2). In univariable analysis, sex, age, smoking, fracture history, exercise, obesity, serum calcium, body fat percentage, muscle mass, and cancer history were significantly associated with femoral neck BMD. In the multivariable analysis, exercise and serum calcium were no longer significantly associated with femoral neck BMD after adjusting for other covariates. In contrast, serum vitamin D (E = 0.009, SE = 0.004, P = .030) remained significantly associated with femoral neck BMD.
Table 2. Univariable and multivariable linear regression analysis for femoral neck bone mineral density (BMD).
Multivariable models were adjusted for sex, age, smoking status, fracture history, regular exercise, serum calcium, body fat percentage, muscle mass, and cancer history. Values are presented as regression coefficient (estimate) ± standard error (std. err).
Table 3 shows the results of multivariable linear regression analyses stratified by cancer status. In the age-matched control group (non-cancer group), sex (E = −0.461, SE = 0.198, P = .020), age (E = −0.052, SE = 0.005, P < .001), and muscle mass (E = 0.051, SE = 0.007, P < .001) were significantly associated with femoral neck BMD. In the cancer survivor group, sex (E = −0.680, SE = 0.290, P = .021), age (E = −0.032, SE = 0.009, P < .001), muscle mass (E = 0.073, SE = 0.014, P < .0001), smoking (E = −0.779, SE = 0.275, P = .005), and fracture history (E = −0.507, SE = 0.168, P = .003) were significantly associated with femoral neck BMD.
Table 3. Multivariable linear regression analysis stratified by cancer status.
Multivariable models were adjusted for sex, age, smoking status, fracture history, regular exercise, serum calcium, body fat percentage, muscle mass, and cancer history. Values are presented as regression coefficient (estimate) ± standard error (Std. err).
Bayesian network analysis
All 12 CBNs derived from the BANJO analysis produced an identical log-likelihood score (−4970.9305) and network structure. The final CBN structure is presented in Figure 2. In this structure, age and muscle mass were identified as plausible direct causes of femoral neck BMD, while muscle mass was directly affected by sex. Fracture history and body fat percentage were influenced by age and sex, respectively. In addition, age and vitamin D appeared to influence cancer status in the current dataset. When femoral neck BMD was set as the target node, age, muscle mass, and sex were key determinants in the posterior probability distribution for femoral neck BMD.

Figure 2. Causal Bayesian network (CBN) structure for femoral neck bone mineral density (BMD).
Depending on whether age was set to <50 or >60 years, the probabilities of the low (state 0) and normal (state 1) femoral neck BMD states changed markedly (Figure 3A and B), suggesting a negative association between age and femoral neck BMD. In addition, the probability associated with smoking experience also decreased (Figure 3C and D), indicating that both aging and female sex may negatively affect femoral neck BMD. However, even when conditioning on both age (>60 years) and sex (female), muscle mass maintained a positive association with femoral neck BMD (Figure 4). In other words, among women aged >60 years, increased muscle mass through physical activity may have a protective effect on bone health.
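The conditional queries described above can be illustrated on a three-edge fragment of the reported structure (sex → muscle mass → BMD ← age). The sketch below computes the probability of low BMD by summing over the unobserved muscle-mass node. All probabilities are hypothetical placeholders chosen for illustration, not estimates from this study, and the states are reduced to two levels for brevity.

```python
from typing import Optional

# Hypothetical conditional probability tables, for illustration only.
P_MUSCLE_GIVEN_SEX = {
    "male": {"low": 0.2, "high": 0.8},
    "female": {"low": 0.7, "high": 0.3},
}
P_LOW_BMD_GIVEN_AGE_MUSCLE = {
    ("<50", "low"): 0.20,
    ("<50", "high"): 0.05,
    (">60", "low"): 0.60,
    (">60", "high"): 0.30,
}


def p_low_bmd(age: str, sex: str, muscle: Optional[str] = None) -> float:
    """P(BMD = low | age, sex[, muscle]) in the fragment sex -> muscle -> BMD <- age."""
    if muscle is not None:
        # With muscle mass observed, sex carries no further information about BMD.
        return P_LOW_BMD_GIVEN_AGE_MUSCLE[(age, muscle)]
    # Otherwise marginalize over muscle mass: sum_m P(m | sex) * P(low | age, m).
    return sum(
        P_MUSCLE_GIVEN_SEX[sex][m] * P_LOW_BMD_GIVEN_AGE_MUSCLE[(age, m)]
        for m in ("low", "high")
    )


older_women = p_low_bmd(">60", "female")
older_women_high_muscle = p_low_bmd(">60", "female", muscle="high")
```

Under these placeholder values, fixing muscle mass to the high state lowers the probability of low BMD for women aged >60 years, which mirrors the qualitative pattern reported for Figure 4.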

Figure 3. Conditional probability changes of femoral neck BMD according to age and sex. Panels (A to D) illustrate changes in posterior probabilities across network nodes following conditional manipulations of the age and sex nodes.

Figure 4. Conditional network analysis for multivariable scenarios.
Discussion
This study assessed the interactions between BMD and various health-related and demographic variables in cancer survivors using a CBN model. The analysis highlighted the complex interplay between muscle mass and other determinants of BMD in this group.
The amount of biological data collected in the medical field is increasing and becoming more complex each year. To elucidate the underlying mechanisms of disease phenotypes, analytical tools capable of modeling causal interconnections among complex datasets are essential. Because available clinical data are often incomplete, missing data and uncertainty must be considered in analysis. However, traditional regression methods have limited ability to capture layered correlations or direct causal relationships among predictors. In contrast, Bayesian network analysis—being graphical and intuitive for clinicians—can more clearly represent hierarchical and causal dependencies among variables.24,25 Furthermore, Bayesian networks are especially well suited for reasoning under uncertainty. 26 In the present study, the CBN structure (Figure 2) enabled intuitive visualization of causal pathways among factors influencing BMD. The model also allowed personalized probabilistic predictions of the target outcome based on different conditional states of each variable.
Our findings regarding the associations between BMD and multiple factors are consistent with those of prior studies. In particular, we confirmed that low muscle mass is a significant risk factor for reduced BMD among cancer survivors.27-30 Muscle mass and BMD have both been identified as valuable imaging markers across various clinical settings.31-33 In addition, BMD has demonstrated predictive value in several cancer types.34-36 We also found that sex and age were key determinants of BMD in cancer survivors, consistent with a previous study. 27 Interestingly, the direction of the association between smoking and BMD reversed after multivariable adjustment. While smoking initially appeared positively associated with BMD in univariable analysis, this association became negative after adjusting for confounders such as age, sex, and muscle mass. This suggests that the unadjusted relationship was confounded by demographic and body composition factors, as smokers in our cohort tended to be younger and more muscular. This adjusted negative association aligns with previous studies showing that smoking adversely affects bone metabolism and increases osteoporosis risk.37,38 Beyond demographic and behavioral variables, our analysis also considered the potential impact of body fat percentage, fracture history, and vitamin D levels, all of which are known to influence bone health in cancer survivors.
Previous research has established several risk factors for low BMD in cancer survivors, particularly the effects of chemotherapy and hormonal treatments on bone metabolism.39-41 Our findings corroborate these results while extending them through the application of a CBN approach, which enables the identification of both direct and indirect associations among multiple variables. The final CBN structure illustrated direct effects of age and muscle mass on femoral neck BMD, with indirect influences from sex, smoking, and fracture history. Clinically, this suggests that maintaining or improving muscle mass may protect against bone loss, even among older survivors with smoking history or prior fractures. The network also identified age as a central node connecting multiple lifestyle and biological factors, emphasizing the need for integrated geriatric and survivorship management strategies.
One of the major strengths of this study is the use of a large, nationally representative dataset, which enhances the generalizability of the findings. In addition, the application of CBN modeling provides a novel and robust framework for exploring the multifactorial nature of bone health. Although this study achieved a high prediction accuracy, several limitations should be noted. First, because this analysis was based on cross-sectional KNHANES data, temporal or causal directions among variables cannot be definitively established. Consequently, the causal pathways inferred by the CBN should be interpreted as probabilistic rather than deterministic. Second, discretization of continuous variables required for CBN learning may have caused minor information loss at threshold boundaries; however, we applied standardized, threshold-based discretization commonly used in prior studies to minimize potential bias.22,42 Third, reliance on self-reported cancer history and lifestyle factors may introduce recall bias and residual confounding despite statistical adjustments. Fourth, since KNHANES data represent only the Korean population, generalizability to other ethnic groups may be limited. Finally, the absence of detailed data on cancer type, treatment modality, and duration of survivorship restricts further stratified analysis. From a clinical standpoint, our findings highlight the importance of preserving skeletal muscle mass through resistance and weight-bearing exercises (such as brisk walking, stair climbing, and light strength training) combined with adequate protein and vitamin D intake, particularly among older or postmenopausal female cancer survivors. Integrating such evidence-based lifestyle interventions into primary-care-based survivorship programs could mitigate bone loss, reduce fracture risk, and improve long-term quality of life and functional independence.
Further studies should validate these findings using longitudinal designs and diverse populations to establish temporal relationships and enhance external validity. Expanding the CBN framework to include additional behavioral and biochemical factors, such as dietary intake, inflammatory biomarkers, or treatment history, may further refine the prediction and guide personalized interventions. Ultimately, deepening our understanding of bone health determinants in cancer survivorship will support evidence-based strategies to promote healthier aging and improve long-term outcomes in this growing population.
Conclusion
This study successfully developed a prediction model for low BMD in cancer survivors and age-matched controls using a CBN. The findings highlight the critical role of age, sex, and muscle mass in influencing BMD. By providing a comprehensive understanding of the multifactorial determinants of bone health, our model offers valuable insights for developing targeted, evidence-based interventions to prevent osteoporosis and related complications among cancer survivors. These insights can inform clinical practice and public health strategies aimed at improving the long-term health and quality of life of cancer survivors.
Supplemental Material
Supplemental material, sj-docx-1-onc-10.1177_11795549251411101 for Prediction Model for Low Bone Mineral Density in Cancer Survivors and Age-Matched Controls Using a Causal Bayesian Network: A Nationwide Population-Based Study in Korea by Sujeong Han, Sung-Bae Park, Sohee Oh and Bumjo Oh in Clinical Medicine Insights: Oncology
Footnotes
Acknowledgements
The authors thank the staff of the KCDC for providing access to the KNHANES data. Portions of this manuscript were refined for grammar and clarity using an AI-based language tool (ChatGPT, OpenAI). No scientific data, analyses, or figures were generated or modified using this tool.
Ethical Consideration
The study was conducted using publicly available and fully de-identified data from the KNHANES. All participants provided written informed consent at the time of the national survey. The survey protocol was approved by the Institutional Review Board of the KCDC (IRB No. 2010-02CON-21C and 2011-02CON-06C). As this analysis used secondary anonymized data, no additional institutional review board approval was required.
Author Contributions
SH and SP contributed equally to this work. SH conceptualized the study, interpreted the findings, and drafted and revised the manuscript. SP designed and implemented the CBN analysis and contributed to the methodology and data interpretation. SO performed the baseline statistical analyses and assisted with interpretation of the regression results. BO supervised the overall research process and approved the final version of the manuscript as the corresponding author. All authors read and approved the final manuscript.
Funding
The authors received no research funding for this study. The article processing charge (APC) was supported by the Ministry of Science and ICT, South Korea (Grant No. 1711179413, RS-2022-00164620).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
Supplemental Material
Supplemental material for this article is available online.
