Predicting fracture risk for elderly osteoporosis patients by hybrid machine learning model

Abstract

Background and Objective

Osteoporotic fractures significantly impact individuals' quality of life and exert substantial pressure on the social pension system. This study aims to develop prediction models for osteoporotic fracture and uncover potential risk factors based on Electronic Health Records (EHR).

Methods

Data of patients with osteoporosis were extracted from the EHR of Xinhua Hospital (July 2012–October 2017). Demographic and clinical features were used to develop prediction models based on 12 independent machine learning (ML) algorithms and 3 hybrid ML models. To facilitate a nuanced interpretation of the results, a comprehensive importance score was conceived, incorporating various perspectives to effectively discern and mine critical features from the data.

Results

A total of 8530 patients with osteoporosis were included for analysis, of which 1090 cases (12.8%) were fracture patients. The hybrid model that synergistically combines the Support Vector Machine (SVM) and XGBoost algorithms demonstrated the best predictive performance in terms of accuracy and precision (above 90%) among all benchmark models. Blood Calcium, Alkaline phosphatase (ALP), C-reactive Protein (CRP), Apolipoprotein A/B ratio and High-density lipoprotein cholesterol (HDL-C) were statistically found to be associated with osteoporotic fracture.

Conclusions

The hybrid machine learning model can be a reliable tool for predicting the risk of fracture in patients with osteoporosis. It is expected to assist clinicians in identifying high-risk fracture patients and implementing early interventions.

Keywords

Fracture risk assessment, machine learning, data mining, fracture prevention, osteoporosis

Introduction

Osteoporosis is a metabolic bone disease characterized by a decrease in the amount of bone tissue per unit volume. Affected patients are prone to fracture because of declining bone density and bone quality, damage to bone microstructure, and increased bone fragility.^1–3 Osteoporosis is highly prevalent among the elderly, afflicting 36% of the older population in China,⁴ and significantly increases the likelihood of fragility fractures (medically recognized as osteoporotic fractures). These fractures pose substantial challenges to the public health system^5,6 and lead to a diminished quality of life and reduced life expectancy for patients.³

Despite the widespread occurrence and severe consequences of osteoporotic fractures, the rates of diagnosis and treatment for these patients in China are disappointingly low.² The subtle onset of bone density loss, which often shows no clear symptoms initially, complicates early intervention efforts. As a result, many patients discover their diminished bone density only when addressing unrelated health concerns. Leveraging existing clinical data from Electronic Health Records (EHR) may therefore aid in early diagnosis and timely intervention. These records, compiled during each patient check-up, encompass a wide range of information including demographics, medical history, and vital signs.⁷ The FRAX tool,⁸ developed by the World Health Organization, is widely used for assessing fracture risk. Nevertheless, its application is confined to specific demographic groups (e.g. ages 40–90), and it falls short of identifying emerging risk factors.

Machine learning (ML) strategies have the potential to bolster clinical decision-making processes by facilitating accurate and efficient early predictions of osteoporotic fractures. For example, Atkinson et al. assessed fracture risk using gradient boosting machine (GBM) models based on bone images.⁹ Forgetta et al. used several machine learning models to predict fracture risk from genotypes.¹⁰ Chen et al. developed a hybrid deep learning model to predict the risk of fracture for patients with diabetes.¹¹ Even without desired information such as bone mineral density (BMD), these models can potentially reveal indicators of fracture risk, offering valuable insights for preventive healthcare initiatives. However, no model based on EHR data has been established for elderly Chinese patients with osteoporosis, especially those older than 90 years.

The purpose of this study was to develop prediction models for osteoporotic fracture and uncover potential risk factors based on EHR. Twelve independent ML algorithms and three hybrid ML models were applied to EHR data from elderly Chinese patients with osteoporosis. The hybrid XGBoost and SVM model had the best predictive performance in terms of accuracy and precision (above 90%) among all models. In addition, the 20 most important features according to the comprehensive importance score, which represent high-risk factors for fractures, were interpreted from a clinical point of view. These findings offer novel insights into the prediction of fractures in patients with osteoporosis.

Material and methods

Patients and data

We obtained the EHR of outpatients and inpatients in Xinhua Hospital, which consists of diagnosis data, medical records, vital signs, inspection index, medication history, and so on. There are a total of 14418 patients in this dataset. The workflow of our study is shown in Figure 1.

Figure 1.

The flow chart of this study.

Inclusion criteria and exclusion criteria

We initially selected patients over the age of 60 with osteoporosis as subjects, totaling 8530 individuals. Patients were selected on the basis of the International Classification of Diseases (ICD)-10 code M81.9 in their EHR. In addition, we referred to the guideline definitions to identify and include patients who potentially met the diagnostic criteria for osteoporosis, and excluded patients with fractures caused by trauma.¹² Patients who had osteoporotic fractures were labeled 1, and all others 0. In our database, the average time from the initial check-up to the occurrence of fracture in patients with osteoporotic fractures was 109 days, with the longest being 5 years. Additionally, abnormal records caused by input errors, such as systolic blood pressure >300 mmHg, age of menopause <30, or height <30 cm, were removed from the dataset. The flowchart of data extraction is shown in Figure 2.

Figure 2.

The flowchart of patient selection.

In terms of feature selection, we obtained patients' age and gender as well as histories of smoking, drinking, and fractures. From inspection data, we obtained parameters of blood and urine tests, such as blood calcium and blood glucose. From vital signs data, we obtained patients' height, weight, and blood pressure. From physicians' orders and medication data, we obtained each patient's medication status and history, covering glucocorticoids, aromatase inhibitors, thyroid hormones, proton pump inhibitors, and other medicines that might affect bone metabolism. From symptom data, we obtained information about arthralgia, weight loss, and puffiness. From bone density data, we collected the bone density of the lumbar spine (L1–L4).

We started by incorporating all possible features to mine the potential risk factors of the elderly with osteoporosis and then discarded features with too few non-null values. As a result, 146 factors were obtained, which can be divided into six categories: demographic data, medical history, inspection data, illness, medicine, and symptoms. Next, we formulated the task of identifying fracture risk in elderly osteoporosis patients as a binary classification problem and trained machine learning models on these features.

Variable selection

The raw data went through a data-wrangling pipeline including missing value handling and unbalanced label processing.

Missing data is a common occurrence in clinical research, where the value of the variables of interest is not measured or recorded for all subjects in the sample.¹³ Our dataset exhibits a high rate of missing data, necessitating the establishment of an appropriate data handling criterion. Deleting all records with missing values would render the dataset extremely small, while imputing all missing values would result in most values being inferred, rendering the analysis results unreliable. XGBoost, with its capability to automatically handle missing values, was applied for the pilot study. The experiment demonstrated that the model exhibits optimal generalization capabilities when records with a feature missing rate exceeding 55% are deleted. After eliminating records where the missing rate of any feature exceeded 55%, a total of 1913 samples were obtained for subsequent training.
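The record-level filter described above can be sketched as follows. This is a minimal illustration that assumes the 55% criterion refers to the fraction of a record's features that are missing and that missing values are stored as `None`; the helper names and the toy records are hypothetical.

```python
def row_missing_rate(record):
    """Fraction of features in one record (a dict) whose value is missing."""
    return sum(value is None for value in record.values()) / len(record)


def drop_sparse_records(records, max_missing=0.55):
    """Keep only records whose missing rate does not exceed max_missing."""
    return [r for r in records if row_missing_rate(r) <= max_missing]


# Toy records (hypothetical values, three features each).
records = [
    {"age": 80, "calcium": None, "alp": None},   # 2/3 missing -> dropped
    {"age": 70, "calcium": 2.2, "alp": None},    # 1/3 missing -> kept
    {"age": 65, "calcium": 2.1, "alp": 90.0},    # complete    -> kept
]
kept = drop_sparse_records(records)
```

In practice the threshold would be tuned, as in the pilot study, by comparing model performance across candidate cut-offs.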

Several methods were used to fill in the missing values, including layered average filling, missMDA,¹⁴ and random forest regression.¹⁵ XGBoost was applied to compare the three filling methods, with the Area Under the Curve (AUC) as the measure. Figure 3 shows that random forest regression filling performed best of the three.

Figure 3.

Receiver operating characteristic curve of three filling methods.

The effect of random forest regression filling (blue) was better than missMDA (green) and average filling method (pink).

The labels in our dataset are extremely unbalanced, which can lead to poor predictive performance. One efficient and flexible strategy for solving this problem is to employ sampling techniques before training a classification model.¹⁶ We randomly split the dataset into a training set and a testing set, and applied oversampling and undersampling only to the training set, leaving the testing set unprocessed. In general, combining oversampling of the minority class with undersampling of the majority class yields better classifier performance. Once again, XGBoost was applied to evaluate several sampling methods. Figure 4 shows that the combination of SMOTE and Tomek links outperformed the rest.

Figure 4.

Comparison of several sampling methods’ performance.
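The split-then-resample procedure can be illustrated with a simplified, standard-library sketch. It is not the implementation used in this study: the neighbour count `k`, the choice to balance the classes exactly, the removal of both members of each Tomek link, and all function names are assumptions made for illustration. The default test fraction of 0.2 corresponds to a 4:1 split.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        t = rng.random()
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link, i.e. every pair of samples
    from different classes that are each other's nearest neighbour."""
    n = len(X)
    nn = [min((j for j in range(n) if j != i),
              key=lambda j: sq_dist(X[i], X[j])) for i in range(n)]
    drop = {i for i in range(n) if nn[nn[i]] == i and y[i] != y[nn[i]]}
    keep = [i for i in range(n) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


def split_and_resample(X, y, test_fraction=0.2, seed=0):
    """Split first, then resample the training part only.
    Label 1 is assumed to be the minority (fracture) class."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_test = int(len(X) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
    X_te, y_te = [X[i] for i in test], [y[i] for i in test]  # left untouched
    minority = [x for x, label in zip(X_tr, y_tr) if label == 1]
    n_new = (len(y_tr) - len(minority)) - len(minority)
    new = smote(minority, n_new, rng=rng)
    X_bal, y_bal = remove_tomek_links(X_tr + new, y_tr + [1] * len(new))
    return X_bal, y_bal, X_te, y_te
```

Because resampling happens after the split, no synthetic sample derived from a test record can leak into training.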

Statistical analysis

We conducted descriptive statistical analysis and significance testing on the dataset to ascertain whether there were any statistically significant differences between the indicators of the fracture and non-fracture groups. We categorized the characteristics into two groups: categorical and numerical variables. For characteristics meeting the relevant test assumptions, we used the chi-square test (categorical variables) and the t-test (numerical variables). Otherwise, we applied non-parametric tests. Further details are presented in Table 1.

Table 1.

Characteristics of the study cohort.

Variables	Fracture (n = 1090)	Non-fracture (n = 7440)	Missing rate (%)	p value
Demography
Age, years, mean ± SD(range)	78.75 ± 9.56 (60–100)	74.7 ± 9.61 (60–109)	0	<0.001
Female, n (%)	853(78.26)	5804(78.01)	0	0.89
Weight, kg, mean ± SD(range)	57.88 ± 10.3(35.5–89.5)	58.31 ± 9.61(32–93)	92.83	0.34
Height, cm, mean ± SD(range)	162.25 ± 5.7(145–170)	159.98 ± 7.12(140–180)	96.67	0.04
Medical history
Smoke, n (%)	29(2.66)	305(4.1)	51.49	0.04
Vaccinated, n (%)	299(27.97)	2734(28.51)	49.57	0.74
Drink, n(%)	13(1.19)	106(1.42)	52.06	0.64
Fracture, n(%)	98(8.99)	144(1.94)	0	<0.001
Inspection
Calcium-blood, mmol/L, mean ± SD(range)	2.18 ± 0.29(1.07–3.32)	2.17 ± 0.28(0.43–3.22)	78.47	0.55
C-reactive protein, mg/L, mean ± SD(range)	26.09 ± 31.21(1–174)	23.11 ± 24.82(1–141)	84.69	0.53
Na-blood, mmol/L, mean ± SD(range)	140.42 ± 3.31(125–148)	140.23 ± 3.36(113–154.37)	71.15	0.25
Apolipoprotein A/B ratio, %, mean ± SD(range)	1.63 ± 0.68(0.7–3.55)	1.66 ± 1.21(0.4–33.4)	85.21	0.77
Alkaline phosphatase, U/L, mean ± SD(range)	99.45 ± 43.18(27.00–381.00)	86.99 ± 62.99(16.00–1779.33)	79.13	<0.001
α2-globulin, g/L, mean ± SD(range)	9.85 ± 1.99(4.20–15.10)	9.79 ± 2.47(5.8–22.6)	93.29	0.23
HDL-C, mmol/L, mean ± SD(range)	1.36 ± 0.37(0.64–3.15)	1.47 ± 0.40(0.38–3.88)	84.89	0.01
Total Cholesterol (TC), mmol/L, mean ± SD(range)	4.44 ± 0.94(2.75–6.76)	4.8 ± 1.2(2.01–12.55)	84.76	0.01
DXA value of Lumbar spine 1, mean ± SD(range)	−1.39 ± 1.48(−3.96–2.98)	−1.5 ± 1.11(−3.91–2.99)	50.33	0.77
DXA value of Lumbar spine 2, mean ± SD(range)	−1.85 ± 1.3(−4–2.7)	−1.77 ± 1.21(−4–2.94)	50.36	0.2
DXA value of Lumbar spine 3, mean ± SD(range)	−1.79 ± 1.32(−3.99–2.99)	−1.53 ± 1.3(−4–2.98)	50.36	<0.01
DXA value of Lumbar spine 4, mean ± SD(range)	−1.51 ± 1.44(−3.96–2.94)	−1.26 ± 1.39(−4–2.99)	50.36	<0.01
Complications
Heart disease, n (%)	61(5.6)	564(7.58)	0	0.14
Diabetes, n (%)	144(13.47)	2141(22.33)	0	<0.001
Symptoms
Weight loss, n (%)	98(9.17)	982(10.24)	0	0.98
Joint Pain, n(%)	3(0.28)	6(0.08)	0	0.18
Medicines
Acarbose, n (%)	35(3.27)	417(4.35)	0	0.23
Omeprazole, n (%)	126(11.56)	189(2.54)	0	<0.001
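
As an illustration of the comparisons for categorical variables, the sketch below applies a Pearson chi-square test to a 2 × 2 table, using the omeprazole counts from Table 1. The absence of a continuity correction is an assumption, since the exact test variant is not specified, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) for the table [[a, b], [c, d]]. Returns (statistic, p)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Omeprazole users / non-users in the fracture and non-fracture groups.
chi2, p_value = chi_square_2x2(126, 1090 - 126, 189, 7440 - 189)
```

For these counts the p value falls far below 0.001, in line with the corresponding table entry.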

Models

The task of identifying fracture risk is formulated as a classification problem by training machine learning models on features collected from real-world data. The structure of the hybrid XGBoost and SVM model is presented in Figure 5. XGBoost was used for the feature transformation and SVM was used as the classifier to predict fracture risk.

Figure 5.

Hybrid model structure diagram of XGBoost and SVM.
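
One common way to realize a tree-based feature transformation, assumed here purely for illustration, is to map each sample to the leaf it reaches in every tree of the ensemble and to one-hot encode those leaf indices; the resulting sparse binary vector is then passed to the downstream classifier (here, the SVM). The toy sketch below uses two hand-written trees instead of a trained XGBoost model, and the tree layout, thresholds, and function names are all hypothetical.

```python
def leaf_index(tree, x):
    """Walk a tree given as nested tuples (feature, threshold, left, right).
    Leaves are plain integers numbered 0..n_leaves-1 within the tree."""
    while not isinstance(tree, int):
        feature, threshold, left, right = tree
        tree = left if x[feature] < threshold else right
    return tree


def count_leaves(tree):
    if isinstance(tree, int):
        return 1
    return count_leaves(tree[2]) + count_leaves(tree[3])


def leaf_one_hot(trees, x):
    """Concatenate one one-hot block per tree marking the leaf reached by x."""
    encoded = []
    for tree in trees:
        block = [0] * count_leaves(tree)
        block[leaf_index(tree, x)] = 1
        encoded.extend(block)
    return encoded


# Two toy trees over the features [age, blood calcium] (made-up thresholds).
toy_trees = [
    (0, 75.0, 0, 1),                    # age < 75 -> leaf 0, else leaf 1
    (1, 2.1, 0, (0, 85.0, 1, 2)),       # calcium < 2.1 -> leaf 0, else split on age
]
encoded = leaf_one_hot(toy_trees, [80.0, 2.3])
```

With a trained ensemble, the same encoding would be produced from the leaf indices reported by the boosting library, and its dimension would equal the total number of leaves across all trees.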

Results

The data set was divided into a training set and a testing set at a ratio of 4:1. We oversampled the training set using the Synthetic Minority Oversampling Technique (SMOTE) to generate minority-class samples, and then undersampled it using Tomek links to avoid overfitting. We repeated the experiment 100 times and averaged the metric scores. The results of each model are shown in Table 2.

Table 2.

Model performance metrics.

Models	Accuracy	Precision	Recall	F1
BP	72.43	68.8	75.18	68.71
LR	85.07	79.1	83.45	80.75
DT	83.76	77.55	81.72	79.1
KNN	80.99	74.68	79.49	76.19
SVM	88.64	84.96	82.87	83.77
RF	88.56	84.36	83.55	83.91
ERT	89.37	85.59	84.64	85.05
GBDT	88.12	83.27	84.56	83.82
AdaBoost	86.66	81.24	83.91	82.32
CatBoost	82.92	77.21	81.14	78.3
XGBoost	89.4	85.4	85.19	85.24
MLP	87.49	82.34	84.15	83.11
XGBoost + MLP	90.34	87.75	83.44	85
XGBoost + LR	90.08	87.49	82.86	85
XGBoost + SVM	90.6	89.1	82.79	85.1
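
For reference, the four metrics reported in Table 2 can be computed from the confusion matrix and averaged over repeated runs as sketched below. The sketch assumes that the fracture class (label 1) is the positive class, which is not stated explicitly; the function names and toy labels are illustrative.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 with label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def average_metrics(runs):
    """Average each metric over a list of per-run metric dictionaries."""
    return {name: sum(run[name] for run in runs) / len(runs)
            for name in runs[0]}


# Toy example: 3 fracture and 5 non-fracture cases.
metrics = classification_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                                 [1, 1, 0, 1, 0, 0, 0, 0])
```

Note that an F1 score averaged over runs need not equal the harmonic mean of the averaged precision and recall.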

We formulated a strategy to identify critical factors for fracture, which integrates importance scores from four perspectives: weight, the number of times a feature is used to split the data across all trees; gain, the average gain across all splits in which the feature is used; cover, the average coverage across all splits in which the feature is used; and SHAP (SHapley Additive exPlanations) values. The feature importance scores for each metric are shown in Figures 6 and 7.

Figure 6.

The feature importance score is computed by three methods. (a) the feature importance score of weight; (b) the feature importance score of gain; (c) the feature importance score of cover.

Figure 7.

SHAP diagram of important features.

To get the comprehensive feature importance score, we first normalized the values obtained by the four methods. In this way, all values are scaled between 0 and 1.

$$x^{*} = \frac{x - \min(x)}{\max(x) - \min(x)}$$

Then, these four scores are linearly combined to generate the comprehensive feature importance score.

$$\text{comprehensive feature importance score} = \alpha \times \text{weight} + \beta \times \text{gain} + \gamma \times \text{cover} + \theta \times \text{SHAP}$$

Optimal coefficients were determined by evaluating various combinations. For each combination, a composite feature importance score was generated and ranked, and the top 20 features were isolated. These features were then fed into the model to determine accuracy, precision, recall, and F1-score.
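
The search over coefficient combinations can be sketched as an exhaustive grid. The grid step of 0.1 and the constraint that the four coefficients sum to 1 are assumptions made for illustration, and `evaluate` is a hypothetical callback that would retrain the model on the top 20 features for a given combination and return a score to maximize.

```python
from itertools import product


def coefficient_grid(steps=10):
    """All (alpha, beta, gamma, theta) on a 1/steps grid that sum to 1."""
    grid = []
    for a, b, c in product(range(steps + 1), repeat=3):
        d = steps - a - b - c
        if d >= 0:
            grid.append((a / steps, b / steps, c / steps, d / steps))
    return grid


def best_coefficients(evaluate, steps=10):
    """Return the combination with the highest score under evaluate()."""
    return max(coefficient_grid(steps), key=evaluate)
```

Working with integer tenths and dividing at the end avoids accumulating floating-point error while enumerating the grid.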

Our model achieved the best results when coefficients for cover and gain were set at 0.3, and weight and SHAP were set at 0.2. The top-ranked features, in order, were Calcium (Ca)-blood, Apolipoprotein A/B ratio-blood, Weight loss, C Reactive Protein (CRP)-blood, Vaccination history, High-density lipoprotein cholesterol (HDL-C)-blood, Alpha2globulin (alpha2-GLO)-blood, Alkaline phosphatase (ALP)-blood, Heart disease, Retinol binding protein (RBP)-blood, Smoking history, Sodium (Na)-blood, Bilirubin (BIL)-blood, Weight, Alkaline granulocyte count-blood, DXA value of Lumbar spine 1, White blood cell esterase-urine, Insulin (2 h)-blood, Age, and DXA value of L1-L2.
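
The normalization and linear combination can be sketched as follows, using the coefficients reported above (0.2 for weight, 0.3 for gain, 0.3 for cover, 0.2 for SHAP). The function names and the toy importance values are hypothetical, and the SHAP input is assumed to be one non-negative summary value per feature, such as the mean absolute SHAP value.

```python
def min_max(values):
    """Scale a list of values to [0, 1]; a constant list maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def comprehensive_scores(weight, gain, cover, shap,
                         coefs=(0.2, 0.3, 0.3, 0.2)):
    """Linear combination of the four min-max normalized importance scores.
    coefs = (alpha, beta, gamma, theta) for weight, gain, cover, SHAP."""
    alpha, beta, gamma, theta = coefs
    columns = [min_max(col) for col in (weight, gain, cover, shap)]
    return [alpha * w + beta * g + gamma * c + theta * s
            for w, g, c, s in zip(*columns)]


def top_k(names, scores, k=20):
    """Names of the k features with the highest comprehensive score."""
    order = sorted(range(len(names)), key=lambda i: scores[i], reverse=True)
    return [names[i] for i in order[:k]]


# Toy example with three features and made-up raw importance values.
names = ["feature_a", "feature_b", "feature_c"]
scores = comprehensive_scores(weight=[10, 5, 0], gain=[1.0, 3.0, 2.0],
                              cover=[100, 300, 200], shap=[0.5, 0.1, 0.3])
ranked = top_k(names, scores, k=2)
```

Normalizing each score separately keeps a metric with a large raw scale, such as cover, from dominating the combination.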

For comparison, all features were input into the model, but only the top 20 features as obtained above are illustrated in Figure 8. When using only these 20 features, the model was able to achieve a superior classification effect. This suggests that these 20 features are predictive of fractures in elderly patients with osteoporosis, and thus could be considered risk factors for fracture.

Figure 8.

Top 20 important features and comparison between 20 features and all features. (a) top 20 features ranked by comprehensive feature importance score; (b) the predictive performance of the top 20 features and all features. The top features achieved better results than inputting all features.

Discussion

In medical applications, where high dimensionality and limited data are common, high prediction accuracy and model stability are crucial, and fusing XGBoost and SVM into a hybrid model offers several advantages. First, the hybrid could enhance accuracy by capitalizing on the different aspects of the data that each model excels at capturing. Second, it could improve robustness, given XGBoost's aptitude for handling noise and outliers and SVM's resistance to overfitting when appropriately kernelized and regularized. Third, the combined model can manage imbalanced data, a frequent concern in medical scenarios, by adjusting the weights on minority classes in XGBoost and applying resampling techniques such as SMOTE before training the SVM. Finally, the hybrid model provides a level of interpretability through the feature importance of its XGBoost component, offering insight into which factors drive predictions. XGBoost's capacity for capturing complex non-linear relationships and SVM's proficiency in binary and multiclass classification are particularly notable, and both have demonstrated superior performance across diverse tasks.¹⁷

Our comprehensive feature importance score combined four sources of information to mine the top 20 important features (shown in Figure 8). These features play important roles in the model and could be viewed as risk factors for fracture. Some of the features¹⁸ found during this study are well known to be associated with osteoporosis or fractures, such as smoking history,¹⁹ weight loss,²⁰ DXA value of Lumbar spine 1, weight, age, sodium-blood and DXA value of L1-L2.⁸ Several other factors appear to be related to osteoporosis but have not been well demonstrated in prior studies and deserve attention: (i) bone metabolism markers: Calcium-blood and Alkaline phosphatase (ALP)-blood, (ii) inflammatory response markers: C reactive protein-blood, and (iii) indicators of lipid metabolism: Apolipoprotein A/B ratio-blood and High-density lipoprotein cholesterol (HDL-C)-blood. The detailed SHAP diagram is shown in Figure 9.

Figure 9.

The SHAP value of Calcium-blood, Alkaline phosphatase-blood, C-Reactive Protein-blood, Apolipoprotein A/B ratio, and High-density lipoprotein cholesterol.

Our model predicted that blood calcium (Ca) may be a high-risk factor for fracture. Although 99% of the body's calcium is stored in the bones, with the remaining 1% in the blood, blood calcium is much easier to measure than bone calcium; it is therefore a routine laboratory test for patients with osteoporosis. For patients with primary osteoporosis, blood calcium values are usually within the reference range. It is well established²¹ that blood calcium and parathyroid hormone regulate each other: a decrease in the concentration of blood calcium ions triggers secondary secretion of parathyroid hormone by parathyroid chief cells, which stimulates osteoclast proliferation, inhibits osteoblast activity, and results in osteoporosis. A recent study²² suggested that patients with osteoporosis have significantly lower serum calcium than controls. Likewise, in our model, the predicted probability of fracture increases as blood calcium levels decrease.

Alkaline phosphatase (ALP) is a hydrolase enzyme responsible for removing phosphate groups from many molecules, including nucleotides, proteins, and alkaloids.²³ It is a bone formation marker among the biochemical markers of bone turnover and reflects osteoblast activity and bone formation status. ALP plays an important role in the differential diagnosis of several skeletal disorders, determination of bone turnover types, monitoring of treatment adherence, and evaluation of drug efficacy,²⁴ and ALP levels are usually normal or mildly elevated in patients with primary osteoporosis. A recent study has also shown that serum total ALP activity >129 U/L can serve as an indicator for osteoporosis in males,²⁵ consistent with what our model predicts.

C-reactive protein (CRP), a non-specific and highly sensitive inflammatory biomarker, participates in the immune response.²⁶ Inflammatory processes are involved in a wide variety of physical health problems, and systemic chronic inflammation often increases with age.²⁷ Studies^28,29 also show that older individuals have higher circulating levels of cytokines, chemokines, and acute phase proteins, and greater expression of genes involved in inflammation. Moreover, systemic chronic inflammation is persistent and ultimately causes collateral damage to tissues and organs over time, for example by inducing osteoporosis. In addition to age, physical inactivity was found to be directly associated with increased anabolic resistance, increased CRP levels, and increased levels of proinflammatory cytokines in healthy individuals.³⁰ These effects, in turn, promote several inflammation-related pathophysiologic alterations, including osteoporosis.^31,32 Furthermore, several studies showed that increased CRP was linked to an increased rate of osteoporotic fracture.^33,34

The association between lipid and bone metabolism has become an increasing focus of interest in recent years.³⁵ A study by Dennison et al.,³⁶ investigating the correlation between BMD and lipid profiles, observed that total spine BMD was inversely correlated with levels of apoA but positively associated with levels of apoB in males and females, which agrees with our model results. However, regarding HDL-C, the trends in our model do not exactly match what has been reported. On the one hand, Ackert-Bicknell et al. concluded that there is sufficient evidence that bone metabolism and HDL-C are genetically linked, and that HDL-C can interact directly with osteoblasts and osteoclasts.³⁷ Yamaguchi et al.³⁸ investigated the correlation between plasma lipid levels and BMD and found that low levels of HDL-C were associated with an increased risk of vertebral fracture, similar to the trend predicted by our model. Nevertheless, another study reported the opposite result.³⁶ On the other hand, Hsu et al.³⁹ analyzed the association between plasma lipid profile and BMD, bone mineral content, and osteoporotic fractures in 7137 Chinese males, 4585 premenopausal females, and 2248 postmenopausal females, and detected no significant correlation between whole-body bone mineral content and levels of HDL-C. Similarly, another study showed no association between HDL-C and BMD.⁴⁰ Several factors may be responsible for these discrepancies. Firstly, differences in non-modifiable characteristics of the subjects, including age, sex, and medication history, may have introduced bias. Secondly, differences in modifiable characteristics, including cigarette and alcohol consumption and physical activity, may also have led to bias. Finally, the use of different models may have affected the results. Therefore, the relationship between lipid profile and bone metabolism warrants further investigation.

To the best of our knowledge, some of the remaining factors predicted by our model, such as heart disease,¹⁸ Retinol binding protein (RBP)-blood,⁴¹ and Bilirubin (BIL)-blood,^42,43 also seem to have some relationship with fractures and future work is suggested to focus more on these indicators.

Several limitations of our work need to be mentioned. Firstly, our study is retrospective and observational in nature. Although we incorporated various clinical features in an attempt to make prospective predictions, it is more accurate to describe this work as essentially retrospective (i.e. focused on the discrimination or detection of an event that has already occurred). Secondly, because the raw data are imbalanced, we performed oversampling and undersampling of the training set, which might lead to deviation from the true value. Thirdly, the patient data came from a single center in China. Therefore, further research with large samples and multiple centers is necessary to validate our model's performance.

Conclusions

To sum up, the hybrid model incorporating XGBoost and SVM has demonstrated utility as a predictive tool for assessing fracture risk in elderly patients. Furthermore, our study has identified several risk factors for fracture, providing valuable new insights for elderly patients with osteoporosis in clinical practice.

Footnotes

Acknowledgments

We extend our gratitude to all those who provided valuable insights and recommendations for this work.

Contributorship

The experiments were designed by Xf G and Aj X, and performed by Mh L, Zc M, and Jw R. Data analysis was conducted by XW, Yl C and Mh L. Mh L, XW, Xd X, and Aj X drafted the manuscript.

Conflicting interests

The authors declare that there is no conflict of interest.

Ethical approval

All procedures followed complied with the ethical standards of the responsible committee on human experimentation (institutional and national) and the Helsinki Declaration of 1975, as revised in 2000(5). Informed consent was obtained from all patients included in the study.

Funding

This work was funded by the National Key R&D Program of China (2020YFC2005502), the National Natural Science Foundation of China (82072142), and the Science and Technology Commission of Shanghai Municipality (Project No. 19401900500).

Guarantor

Aj X.

ORCID iD

Menghan Liu

References

1. LeBoff, Greenspan, Insogna, et al. The clinician's guide to prevention and treatment of osteoporosis. Osteoporos Int 2022; 33: 2049–2102.

2. Xia. The epidemiology of osteoporosis, associated fragility fractures, and management gap in China. Arch Osteoporos 2019; 14: 32.

3. Zeng, Wang, et al. The prevalence of osteoporosis in China, a nationwide, multicenter DXA survey. J Bone Miner Res 2019; 34: 1789–1797.

4. Chen. Prevalence of osteoporosis in China: a meta-analysis and systematic review. BMC Public Health 2016; 16: 1039.

5. Lin, Xiong, Peng, et al. Epidemiology and management of osteoporosis in the People's Republic of China: current perspectives. Clin Interv Aging 2015; 10: 1017–1033.

6. Bikbov, Purcell, Levey, et al. Global, regional, and national burden of chronic kidney disease, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet 2020; 395: 709–733.

7. Menachemi, Collum. Benefits and drawbacks of electronic health record systems. Risk Manag Healthc Policy 2011; 4: 47–55.

8. Kanis, Hans, Cooper, et al. Interpretation and use of FRAX in clinical practice. Osteoporos Int 2011; 22: 2395–2411.

9. Atkinson, Therneau, Melton 3rd, et al. Assessing fracture risk using gradient boosting machine (GBM) models. J Bone Miner Res 2012; 27: 1397–1404.

10. Forgetta, Keller-Baruch, Forest, et al. Machine learning to predict osteoporotic fracture risk from genotypes. BioRxiv 2018: 413716.

11. Chen, Yang, Gao, et al. Hybrid deep learning model for risk prediction of fracture in patients with diabetes and osteoporosis. Front Med 2022; 16: 496–506.

12. Siris, Adler, Bilezikian, et al. The clinical diagnosis of osteoporosis: a position statement from the National Bone Health Alliance Working Group. Osteoporos Int 2014; 25: 1439–1443.

13. Austin, White, Lee, et al. Missing data in clinical research: a tutorial on multiple imputation. Can J Cardiol 2021; 37: 1322–1331.

14. Josse, Husson. missMDA: a package for handling missing values in multivariate data analysis. J Stat Softw 2016; 70: 1–31.

15. Hong, Lynn. Accuracy of random-forest-based imputation of missing data in the presence of non-normality, non-linearity, and interaction. BMC Med Res Methodol 2020; 20: 1–12.

16. Liu, Blekas, Tsoumakas. Multi-label sampling based on local label imbalance. Pattern Recognit 2022; 122: 108294.

17. Chang, Liu, et al. A new hybrid XGBSVM model: application for hypertensive heart disease. IEEE Access 2019; 7: 175248–175258.

18. LeBoff, Greenspan, Insogna, et al. The clinician's guide to prevention and treatment of osteoporosis. Osteoporos Int 2022; 33: 2049–2102.

19. Yoon, Maalouf, Sakhaee. The effects of smoking on bone metabolism. Osteoporos Int 2012; 23: 2081–2092.

20. Van Loan, Johnson, Barbieri. Effect of weight loss on bone mineral content and bone mineral density in obese women. Am J Clin Nutr 1998; 67: 734–738.

21. Chiodini, Bolland. Calcium supplementation in osteoporosis: useful or harmful? Eur J Endocrinol 2018; 178: D13–D25.

22. Shahida, Rehman, Ilyas, et al. Determination of blood calcium and lead concentrations in osteoporotic and osteopenic patients in Pakistan. ACS Omega 2021; 6: 28373–28378.

23. Kuo T-R, Chen C-H. Bone biomarker for the clinical assessment of osteoporosis: recent developments and future perspectives. Biomark Res 2017; 5: 1–9.

24. Kyd, De Vooght, Kerkhoff, et al. Clinical usefulness of bone alkaline phosphatase in osteoporosis. Ann Clin Biochem 1998; 35: 717–725.

25. Fink, Litwack-Harrison, Taylor, et al. Clinical utility of routine laboratory testing to identify possible secondary causes in older men with osteoporosis: the Osteoporotic Fractures in Men (MrOS) Study. Osteoporos Int 2016; 27: 331–338.

26. Furman, Campisi, Verdin, et al. Chronic inflammation in the etiology of disease across the life span. Nat Med 2019; 25: 1822–1832.

27. Franceschi, Garagnani, Vitale, et al. Inflammaging and ‘Garb-aging’. Trends Endocrinol Metab 2017; 28: 199–212.

28. Furman, Chang, Lartigue, et al. Expression of specific inflammasome gene modules stratifies older individuals into two extreme clinical and immunological states. Nat Med 2017; 23: 174–184.

29. Ferrucci, Fabbri. Inflammageing: chronic inflammation in ageing, cardiovascular disease, and frailty. Nat Rev Cardiol 2018; 15: 505–522.

30. Fedewa, Hathaway, Ward-Ritacco. Effect of exercise training on C reactive protein: a systematic review and meta-analysis of randomised and non-randomised controlled trials. Br J Sports Med 2017; 51: 670–676.

31. Redlich, Smolen. Inflammatory bone loss: pathogenesis and therapeutic intervention. Nat Rev Drug Discovery 2012; 11: 234–250.

32. Straub, Cutolo, Pacifici. Evolutionary medicine and bone loss in chronic inflammatory diseases—a theory of inflammation-related osteopenia. In: Seminars in arthritis and rheumatism, 2015, pp.220–228: Elsevier.

33. Schett, Kiechl, Weger, et al. High-sensitivity C-reactive protein and risk of nontraumatic fractures in the Bruneck study. Arch Intern Med 2006; 166: 2495–2501.

34. Eriksson, Movérare-Skrtic, Ljunggren, et al. High-sensitivity CRP is an independent risk factor for all fractures and vertebral fractures in elderly men: the MrOS Sweden study. J Bone Miner Res 2014; 29: 418–423.

35. Tian. Lipid metabolism disorders and bone dysfunction-interrelated and mutually regulated. Mol Med Rep 2015; 12: 783–794.

36. Dennison, Syddall, Aihie Sayer, et al. Lipid profile, obesity and bone mineral density: the Hertfordshire Cohort Study. J Assoc Physicians 2007; 100: 297–303.

37. Ackert-Bicknell. HDL cholesterol and bone mineral density: is there a genetic link? Bone 2012; 50: 525–533.

38. Yamaguchi, Sugimoto, Yano, et al. Plasma lipids and osteoporosis in postmenopausal women. Endocr J 2002; 49: 211–217.

39. Hsu Y-H, Venners, Terwedow, et al. Relation of body composition, fat mass, and serum lipids to osteoporotic fractures and bone mineral density in Chinese men and women. Am J Clin Nutr 2006; 83: 146–154.

40. Jia, Cheng. Correlation analysis between risk factors, BMD and serum osteocalcin, CatheK, PINP, β-crosslaps, TRAP, lipid metabolism and BMI in 128 patients with postmenopausal osteoporotic fractures. Eur Rev Med Pharmacol Sci 2022; 26: 7955–7959.

41. Huang, Zhou, Wang, et al. Retinol-binding protein 4 is positively associated with bone mineral density in patients with type 2 diabetes and osteopenia or osteoporosis. Clin Endocrinol (Oxf) 2018; 88: 659–664.

42. Zhao, Zhang, Quan, et al. Systematic influence of circulating bilirubin levels on osteoporosis. Front Endocrinol (Lausanne) 2021; 12: 719920.

43. Jurado, Parés, Peris, et al. Bilirubin increases viability and decreases osteoclast apoptosis contributing to osteoporosis in advanced liver diseases. Bone 2022; 162: 116483.