Abstract

Determining the factors that contribute to a reliable prediction of metabolic syndrome will provide a deeper understanding of the medical indices involved in the prediction and assist in early diagnosis and treatment of patients. The study examined the optimal number of National Cholesterol Education Program Adult Treatment Panel III (NCEP ATP III) indices needed to make a reliable prediction of the syndrome, whether each of the five NCEP ATP III indices for predicting the syndrome is equally important, and whether a reliable prediction can be made using calculated blood pressure indices (estimated mean arterial pressure and pulse pressure) instead of NCEP ATP III blood pressure indices. The results show that the NCEP ATP III indices for determination of the syndrome are not equally important. Moreover, the importance of the indices and their prediction quality vary according to gender. Optimal results are obtained by using all five NCEP ATP III indices for prediction.

Keywords

gender, machine learning, metabolic syndrome, National Cholesterol Education Program Adult Treatment Panel III, prediction

Introduction

Metabolic syndrome is among the most common public health problems worldwide and a major contributor to cardiovascular disease and diabetes.¹ The syndrome is a prominent risk factor for minor ischemic stroke, and patients who have suffered a minor ischemic stroke and are diagnosed with metabolic syndrome are at high risk of subsequent vascular events.² Metabolic syndrome is also associated with an increased risk of cardiovascular disease and mortality. According to the results of the National Health and Nutrition Examination Survey (NHANES) conducted in 2011–2016 among adults aged 20 and over in the US, the incidence of metabolic syndrome, as defined by the National Cholesterol Education Program (NCEP) Adult Treatment Panel (ATP) III, grows significantly with age. No significant differences in the prevalence of metabolic syndrome between men and women in each age group¹ were found.³ Studies in nursing homes in Europe found that the prevalence of metabolic syndrome was high among the elderly population and increased with age. In these studies, in contrast, gender was seen to play a major role in predicting the prevalence of metabolic syndrome.⁴⁻⁶ Women apparently become more susceptible to metabolic syndrome with aging; thus separate guidelines may be required for men and women.⁷ The prevalence rose commensurately with age, as suggested elsewhere,⁸ and was significantly higher among older people in the overall sample and among women. For men, however, the change in prevalence with increasing age was not significant.⁹

The National Institutes of Health’s (NIH) definition of the NCEP ATP III is the one most used and widely accepted by the international community and the WHO. According to this definition, the presence of metabolic syndrome is confirmed when at least three of the following five indicators are present:

(i) Waist circumference is at least 102 cm in men and at least 88 cm in women

(ii) Blood triglycerides level is higher than 150 mg/dL

(iii) HDL cholesterol (HDL-C) level in the blood is lower than 40 mg/dL in men and 50 mg/dL in women

(iv) Blood pressure is at least 130/85 mm Hg

(v) Fasting glucose blood level is higher than 100 mg/dL.¹⁰
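The five-indicator rule above can be expressed as a short function. This is a minimal sketch rather than the study's code: the `Measurements` class and its field names are hypothetical, and the blood pressure indicator is assumed to be met when either the systolic or the diastolic value reaches its cutoff.

```python
from dataclasses import dataclass


@dataclass
class Measurements:
    """Hypothetical container for one subject's values (names are illustrative)."""
    is_male: bool
    waist_cm: float
    triglycerides_mg_dl: float
    hdl_mg_dl: float
    systolic_mm_hg: float
    diastolic_mm_hg: float
    fasting_glucose_mg_dl: float


def count_ncep_criteria(m: Measurements) -> int:
    """Count how many of the five indicators listed above are present."""
    criteria = [
        m.waist_cm >= (102 if m.is_male else 88),
        m.triglycerides_mg_dl > 150,
        m.hdl_mg_dl < (40 if m.is_male else 50),
        # Assumption: "at least 130/85" is met if either component reaches its cutoff.
        m.systolic_mm_hg >= 130 or m.diastolic_mm_hg >= 85,
        m.fasting_glucose_mg_dl > 100,
    ]
    return sum(criteria)


def has_metabolic_syndrome(m: Measurements) -> bool:
    """The syndrome is confirmed when at least three indicators are present."""
    return count_ncep_criteria(m) >= 3
```

For example, a man with a 105 cm waist, triglycerides of 160 mg/dL and a diastolic pressure of 86 mm Hg meets three indicators under these assumptions, even if his HDL-C and fasting glucose are within range.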

Machine learning algorithms have been shown to have the potential to significantly help solve health problems by developing classification systems that can help physicians diagnose and predict disease onset in early stages. Extracting knowledge from medical data, however, is challenging because these data may be heterogeneous, disorganized, high-dimensional, and contain “noise” and anomalies.¹¹ Today, clinicians and researchers are able to utilize machine learning to positively impact patient outcomes in a clinically meaningful way.¹²

The prevalence of metabolic syndrome as a worldwide public health problem with high morbidity, mortality, and costs highlights the urgency of efforts to identify and modify risk factors for the syndrome.¹⁰ Improving its prognosis through machine learning will improve the quality of life and even survival of men and women. Determining the main factors that contribute to the prediction of the syndrome will provide a deeper understanding of existing medical indices (NCEP ATP III and others) and assist in the implementation of early treatment of patients.

Our work, using machine learning, examines the primary tools used for predicting the syndrome. We seek to determine the optimal number of NCEP ATP III indices needed to make a reliable prediction, and whether each of the five NCEP ATP III indices used to predict the syndrome is equally important. If the indices are not equally important, we aim to discover their order of importance. We further intend to find out whether it would be beneficial to use calculated blood pressure indices instead of NCEP ATP III blood pressure indices (systolic and diastolic blood pressure) for syndrome prediction and diagnosis. Finally, we investigate the effect, if any, of gender on the indices’ importance and prediction value. To the best of our knowledge, we are the first researchers seeking to elucidate the above issues.

Our work emphasizes the importance of the order and the contribution of each NCEP ATP III index to the prediction of metabolic syndrome according to gender, as well as the need for at least four variables for a sufficient prediction.

Methods

In this study we analyzed a large dataset comprising the medical records compiled over time of a Spanish sample that underwent periodic health examinations at their workplace. Based on the dataset, we built and examined three datasets: a dataset for men and women, a dataset for men only and a dataset for women only. Using the medical indices in the datasets, we derived from them additional, calculated medical indices that are widely used in medical practices. We sought to determine which of these indices were most important for predicting metabolic syndrome for three different sets of groups (men and women, men only, and women only)¹³ using the ExtraTreesClassifier¹⁴ and univariate selection¹⁵ methods. The importance of a feature refers to its contribution to higher chances of having metabolic syndrome. That is, as the importance of a feature increases, its contribution to the chance of having metabolic syndrome increases too. We predicted metabolic syndrome based on indices selected in the order of their importance ranking according to the ExtraTreesClassifier method.

Random forest,¹⁶ Naive Bayes,¹⁷ k-nearest neighbor (KNN),¹⁸ decision tree (CART)¹⁹ and logistic regression²⁰ algorithms were applied and examined to select a classification algorithm for predicting metabolic syndrome.

In this examination, we found that a random forest algorithm with the Gini impurity criterion partitioned the data more effectively. Further, including a larger number of trees in the forest (n_estimators) did not significantly increase the computational complexity required to generate a prediction, nor did it noticeably increase the bias of the model. K-nearest neighbors (KNN) with uniform weights reduces variance, although at the cost of larger bias. A decision tree (CART) with the Gini impurity criterion makes a purer separation, and the “best” splitter chooses the best split. Logistic regression with the ‘liblinear’ solver performs well with high dimensionality. For each prediction made, the five algorithms were examined, and the prediction was evaluated according to four performance measurements: sensitivity, precision, F1-score, and MCC (Matthews correlation coefficient). From among these, we identified the classification algorithm with the highest F1-score and selected it for making further predictions. The F1-score was chosen because it balances sensitivity and precision. Using the selected algorithm, we predicted metabolic syndrome using four and five NCEP ATP III criteria. We additionally examined the contribution of blood pressure indices to the prediction of metabolic syndrome. We also examined the prediction of the syndrome using a neural network algorithm. Its performance measurements, however, were lower than those of the other algorithms, unlike the high performance reported elsewhere.²¹,²²
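The selection step described above, comparing five classifiers and keeping the one with the highest F1-score, can be sketched with scikit-learn as follows. This is an illustration under stated assumptions and not the study's code: the data are synthetic (about 9% positives), the Gaussian variant of Naive Bayes and five-fold cross-validation are assumptions, and only the hyperparameters named in the text (Gini criterion, uniform weights, "best" splitter, liblinear solver) are taken from it.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the medical dataset: 5 features, roughly 9% positive class.
X, y = make_classification(
    n_samples=600, n_features=5, n_informative=4, n_redundant=0,
    weights=[0.91], random_state=0,
)

candidates = {
    "random_forest": RandomForestClassifier(n_estimators=50, criterion="gini", random_state=0),
    "naive_bayes": GaussianNB(),  # assumption: the Naive Bayes variant is not stated in the text
    "knn": KNeighborsClassifier(weights="uniform"),
    "cart": DecisionTreeClassifier(criterion="gini", splitter="best", random_state=0),
    "logistic_regression": LogisticRegression(solver="liblinear"),
}

# Mean F1-score over 5 stratified folds for each candidate (fold count is an assumption).
f1_by_model = {
    name: cross_val_score(model, X, y, cv=5, scoring="f1").mean()
    for name, model in candidates.items()
}

# Keep the classifier with the highest mean F1-score.
best_model = max(f1_by_model, key=f1_by_model.get)
print(best_model, round(f1_by_model[best_model], 3))
```

Which classifier wins on synthetic data says nothing about the medical dataset; the sketch only shows the mechanics of the comparison.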

Our dataset is an open-source free dataset in accordance with the Creative Commons Attribution Non-Commercial (CC BY-NC 4.0) license. Participants of the research in which that data was collected provided their written informed consents to participate. The protocol of that study complied with the Declaration of Helsinki for conducting medical research involving human subjects, authorized by Mallorca Health Management Ethical Review Committee of GESMA.

Study population

A free-to-use medical dataset was selected. The dataset contained records of 60,799 subjects who underwent periodic health examinations at their workplace in Spain between 2012 and 2016.²³ Medical dataset indices included personal information and health habits (age, gender and smoking), anthropometric measurements (body fat percentage, ABSI index, BMI, waist circumference and waist circumference-to-height ratio), systolic and diastolic blood pressure, blood measurements (total cholesterol, LDL cholesterol, HDL-C, fasting glucose and triglycerides) and the classification of whether or not the individual had metabolic syndrome.¹⁴ Based on the dataset, we built and examined three datasets: a dataset for men and women, a dataset for men only and a dataset for women only. Calculated medical indices were added to each dataset (pulse pressure, estimated mean arterial pressure, HDL cholesterol to blood triglyceride level, non-HDL-C, and total cholesterol to blood triglyceride level).

Measurement

A confusion matrix was calculated based on subjects’ metabolic syndrome classification in the datasets, and performance measurements were calculated from it (accuracy, sensitivity, precision, F1-score, and MCC). The prevalence of metabolic syndrome in the study population, a young working population (mean age 40, standard deviation 10.3), was 9%, which is neither low nor high; a threshold of 0.5 is therefore appropriate.

Data analysis

Feature importance of the indices used in the datasets

Using the ExtraTreesClassifier and univariate selection methods, we determined which indices were most important. We applied these methods to all three datasets for examining gender effects. The preferred classification algorithm for predicting metabolic syndrome per subgroup was selected based on the highest F1-score performance measure. Therefore, boxplots were prepared for each dataset for the classification algorithms we used: random forest, naïve Bayes, decision tree (CART), KNN and logistic regression. As mentioned above, for evaluation of the quality of metabolic syndrome prediction, performance measures were calculated: accuracy, sensitivity, precision, F1-score and MCC.
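The two ranking methods named above can be sketched with scikit-learn's `ExtraTreesClassifier` and `SelectKBest` with the `chi2` score function. This is a minimal sketch on synthetic data; the feature names, sample size and number of trees are hypothetical and are not taken from the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical feature names standing in for the dataset's indices.
feature_names = ["triglycerides", "waist", "hdl_c", "glucose", "sbp", "dbp"]

X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, n_redundant=0, random_state=0
)
X = X - X.min(axis=0)  # chi2 requires non-negative feature values

# Tree-based ranking: impurity-based importances, normalized to sum to 1.
forest = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(X, y)
tree_ranking = [feature_names[i] for i in np.argsort(forest.feature_importances_)[::-1]]

# Univariate ranking: chi-square score of each feature against the class label.
selector = SelectKBest(score_func=chi2, k="all").fit(X, y)
chi2_ranking = [feature_names[i] for i in np.argsort(selector.scores_)[::-1]]

print("ExtraTrees:", tree_ranking)
print("chi-square:", chi2_ranking)
```

In scikit-learn's implementation the chi-square score grows with the magnitude of the feature values, which is worth keeping in mind when raw scores are compared across indices measured in different units.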

Improving metabolic syndrome prediction using four and five National Cholesterol Education Program Adult Treatment Panel III criteria

Determining metabolic syndrome as defined by NCEP ATP III requires using at least three of five criteria (blood triglyceride level, waist circumference, HDL-C, blood glucose level, and blood pressure). We evaluated the predictive quality of the syndrome using three, four and five indices according to NCEP ATP III, with respect to sensitivity, precision, F1-score and MCC.

Combinations of three and four indices using the men and women dataset were selected according to their importance ranking by the ExtraTreesClassifier method. The following are the combinations we used (Figure 1):

Three most important indices: blood triglyceride level, waist circumference and HDL-C.

Four most important indices: blood triglyceride level, waist circumference, HDL-C and blood glucose level.

Five most important indices: blood triglyceride level, waist circumference, HDL-C, blood glucose level and blood pressure.

Figure 1.

Feature importance ranking of the men and women dataset indices.

A specific algorithm was selected according to the highest F1-score performance measurement:

For the three indices: the Naive Bayes algorithm was selected for the men and women dataset, the random forest algorithm was selected for the men dataset and the KNN algorithm was selected for the women dataset.

Four and five indices: the random forest algorithm was selected.
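The comparison of three, four and five indices can be sketched as a loop that trains a random forest on the first k columns of a feature matrix whose columns are assumed to be already ordered by importance. The data below are synthetic, and the 80/20 stratified split and the number of trees are assumptions, since the text does not state them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, matthews_corrcoef, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 5 columns, assumed to be sorted from most to least important.
X, y = make_classification(
    n_samples=1000, n_features=5, n_informative=5, n_redundant=0,
    shuffle=False, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0  # split ratio is an assumption
)

results = {}
for k in (3, 4, 5):
    # Train and evaluate using only the k highest-ranked indices.
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train[:, :k], y_train)
    pred = model.predict(X_test[:, :k])
    results[k] = {
        "sensitivity": recall_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
        "mcc": matthews_corrcoef(y_test, pred),
    }

for k, scores in results.items():
    print(k, {name: round(value, 3) for name, value in scores.items()})
```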

Contribution of blood pressure indices to the prediction of metabolic syndrome

We predicted metabolic syndrome according to blood pressure indices (systolic and diastolic blood pressure from the medical dataset; and additional calculated indices – estimated mean arterial pressure and pulse pressure), in two index groupings ranked by importance using the ExtraTreesClassifier method. The aim was to find the contribution of calculated blood pressure indices to the prediction of metabolic syndrome. The two groups for which metabolic syndrome was predicted using NCEP ATP III blood pressure indices and calculated blood pressure indices were:

Index Group 1: blood triglyceride level, blood glucose level, waist circumference, HDL-C (NCEP ATP III blood pressure indices; hereinafter referred to as Group 1 base indices) and estimated mean arterial pressure and pulse pressure.

Index Group 2: blood triglyceride level, blood glucose level, non-HDL-C (NCEP ATP III blood pressure indices; hereinafter referred to as Group 2 base indices) and estimated mean arterial pressure and pulse pressure.

For Groups 1 and 2, the chosen algorithm was random forest for the men and women dataset, for the men dataset and for the women dataset. Performance measurements obtained with the Group 1 and Group 2 base indices plus estimated mean arterial pressure or pulse pressure were compared to those obtained with the same base indices plus systolic and diastolic blood pressure.

Results

Feature importance of the dataset’s indices

We examined which indices are most important as ranked by the ExtraTreesClassifier and univariate selection methods for men and women, for men only, and for women only.

Feature importance of indices using the ExtraTreesClassifier method

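Abbreviations: SBP – Systolic Blood Pressure; DBP – Diastolic Blood Pressure; PP – Pulse Pressure; MAP – Mean Arterial Pressure, $\mathrm{MAP} = \mathrm{DBP} + \tfrac{1}{3}(\mathrm{SBP} - \mathrm{DBP})$; ABSI index – $\mathrm{WC} / (\mathrm{BMI}^{2/3} \cdot \mathrm{height}^{1/2})$, where WC is waist circumference; BF – % body fat, $\mathrm{BF} = 1.2 \times \mathrm{BMI} + 0.23 \times \mathrm{age} - 10.8 \times \mathrm{gender} - 5.4$, with age in years and gender coded as women = 0, men = 1; BMI – body weight (kg) divided by height (m) squared, in kg/m².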
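The calculated indices in the abbreviation list can be written as small functions. This is a sketch under assumptions: the function names are hypothetical, pulse pressure and non-HDL-C use their standard definitions (SBP minus DBP, and total cholesterol minus HDL-C), which the text does not spell out, and the units expected by ABSI (metres here) are an assumption.

```python
def pulse_pressure(sbp: float, dbp: float) -> float:
    """Pulse pressure: systolic minus diastolic pressure (standard definition)."""
    return sbp - dbp


def mean_arterial_pressure(sbp: float, dbp: float) -> float:
    """Estimated MAP = DBP + 1/3 (SBP - DBP), as in the abbreviation list."""
    return dbp + (sbp - dbp) / 3


def absi(waist_m: float, bmi: float, height_m: float) -> float:
    """ABSI = WC / (BMI^(2/3) * height^(1/2)); metre units are an assumption."""
    return waist_m / (bmi ** (2 / 3) * height_m ** 0.5)


def body_fat_percent(bmi: float, age_years: float, is_male: bool) -> float:
    """BF% = 1.2*BMI + 0.23*age - 10.8*gender - 5.4, with women = 0 and men = 1."""
    gender = 1 if is_male else 0
    return 1.2 * bmi + 0.23 * age_years - 10.8 * gender - 5.4


def non_hdl_cholesterol(total_cholesterol: float, hdl_c: float) -> float:
    """Non-HDL-C: total cholesterol minus HDL-C (standard definition)."""
    return total_cholesterol - hdl_c


# Example: a reading of 120/80 mm Hg.
print(pulse_pressure(120, 80))                    # 40
print(round(mean_arterial_pressure(120, 80), 2))  # 93.33
```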

Figure 1 shows the feature importance ranking of the men and women dataset indices. We observed that the five NCEP ATP III indices ranked from high to low were: blood triglyceride level, waist circumference, HDL-C, blood glucose level and blood pressure.

Figure 2 shows the feature importance ranking of the men only dataset indices. We observe that the five NCEP ATP III indices ranked from high to low are: blood triglyceride level, waist circumference, HDL-C, blood glucose level and blood pressure.

Figure 2.

Feature importance ranking of the men only dataset indices.

Figure 3 shows the feature importance ranking of the women only dataset indices. We observe that the five NCEP ATP III indices ranked from high to low are: blood triglyceride level, blood glucose level, waist circumference, blood pressure and HDL-C.

Figure 3.

Feature importance ranking of the women only dataset indices.

Feature importance of the three dataset indices using the univariate selection (chi-square) method

Table 1 shows the feature importance ranking of the men and women dataset indices, men dataset indices and women dataset indices. In the men and women dataset, the NCEP ATP III indices ranked, from high to low, as follows: blood triglyceride level, blood glucose level, waist circumference, blood pressure and HDL-C. In the men dataset, they ranked, from high to low: blood triglyceride level, blood glucose level, waist circumference, systolic blood pressure, HDL-C and diastolic blood pressure. In the women dataset, they ranked, from high to low: blood triglyceride level, blood glucose level, systolic blood pressure, waist circumference, diastolic blood pressure and HDL-C.

Table 1.

Importance ranking of the men and women dataset indices, men dataset indices and women dataset indices.

Indices importance ranking
Men and women dataset		Men dataset		Women dataset
Index	Score	Index	Score	Index	Score
Triglycerides	877436.942533	Triglycerides	648823.065104	Triglycerides	112657.416795
Non-HDL-C	52798.383516	Non-HDL-C	34348.414534	Non-HDL-C	14422.246800
Glucose	22776.178589	Cholesterol_T	15364.574418	Cholesterol_T	6389.477791
Cholesterol_T	22657.521459	Glucose	14360.559868	Glucose	6239.517463
WC	15766.598103	WC	7300.416230	LDL-C	5290.576749
SBP	12612.989822	BF	6597.848465	SBP	4273.184994
MAP	9642.882240	SBP	5864.072681	BF	4046.122076
DBP	8160.987073	HDL-C	5012.470530	WC	3978.778154
HDL-C	7883.210757	MAP	4649.953371	MAP	3286.917929
Age	7544.145942	Age	4644.378537	DBP	2794.961227
LDL-C	7264.404023	DBP	4050.177852	Age	2722.420296
BF	6236.529538	LDL-C	2959.284444	BMI	2397.088005
BMI	5251.197409	BMI	2471.674382	HDL-C	1640.998895
PP	4469.961428	Cholesterol_T/HDL-C	2129.171535	PP	1485.030853
Cholesterol_T/HDL-C	3005.343230	PP	1854.830010	Cholesterol_T/HDL-C	582.227929
HDL-C/Triglycerides	1459.169708	HDL-C/Triglycerides	980.056328	HDL-C/Triglycerides	317.825789
Gender	318.237946	WHtR	45.476815	WHtR	26.608217
WHtR	85.277491	Smoke	39.491117	Smoke	4.373936
Smoke	29.068373	ABSI	0.567439	ABSI	0.068289
ABSI	1.093250

We found that the most important and leading index according to both methods for the three datasets was the NCEP ATP III blood triglyceride level index. Aside from this index, the importance rankings of the other indices generated by the ExtraTreesClassifier and univariate selection (chi-square) methods differ across the three datasets. The HDL-C index was ranked the lowest by the univariate selection (chi-square) method, and the diastolic blood pressure index was ranked lowest by the ExtraTreesClassifier. Of the calculated blood pressure indices, pulse pressure was the least important measure in both methods.

Improving metabolic syndrome prediction by using four and five National Cholesterol Education Program Adult Treatment Panel III indices

By definition, three of the five NCEP ATP III metrics are sufficient to determine metabolic syndrome. Our study shows that adding a fourth or fifth NCEP ATP III index to predict the syndrome, instead of just three, improves the quality of prediction of metabolic syndrome, noticeably for women.

From the results shown in Table 2, we can see that in the men and women dataset, adding the fourth index increased sensitivity and precision by about 12 percentage points each (compared to using three indices), and adding the fifth index increased sensitivity by 8.63 and precision by 15.6 percentage points (compared to using four indices). In the men dataset, adding a fourth index to predict metabolic syndrome increased sensitivity by 15.56 and precision by 5.92 percentage points (compared to three indices), and adding a fifth index increased sensitivity by 10.96 and precision by 19.4 percentage points (compared to four indices). In the women dataset, adding a fourth index increased sensitivity by 21.48 and precision by 10.59 percentage points (compared to using three indices), and adding the fifth index increased sensitivity by 20.74 and precision by 23.41 percentage points (compared to using four indices).

Table 2.

Prediction quality according to using various numbers of indices in the men and women dataset, men dataset and women dataset.

		Dataset	Prediction quality
# Of indices	Index Name	Dataset	Sensitivity (%)	Precision (%)	F1-score (%)	MCC (%)
3	(1) Blood triglyceride level	Men and women	57.59	67.90	62.32	59.11
	(2) Waist circumference	Men	63.89	74.13	68.63	65.11
	(3) HDL-C	Women	49.26	65.20	56.12	54.65
4	(1) + (2) + (3) + (4) Blood glucose levels	Men and women	69.63	79.90	74.41	72.23
		Men	79.45	80.05	79.75	77.12
		Women	70.74	75.79	73.18	71.81
5	(1) + (2) + (3) +(4) +(5) Blood pressure	Men and women	78.26	95.50	86.02	85.27
		Men	90.41	99.45	94.72	94.20
		Women	91.48	99.20	95.18	95.02

With each addition of an index to the metabolic syndrome prediction, the performance measurements increased, but by different amounts for the three datasets. The improvement in prediction quality after adding a fourth index compared to three indices ranged from 12.04 to 21.48 percentage points in sensitivity and from 5.92 to 12 percentage points in precision. The improvement after adding a fifth index compared to four indices ranged from 8.63 to 20.74 percentage points in sensitivity and from 15.6 to 23.41 percentage points in precision. The corresponding confusion matrices are given in Table 3.
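The improvements quoted above are absolute differences between percentages in Table 2. A short worked example, using the sensitivity and precision values reported for the men and women dataset and for the women dataset (the dictionary layout and the `gain` helper are hypothetical), reproduces them:

```python
# (sensitivity %, precision %) by number of indices, copied from Table 2.
table2 = {
    "men_and_women": {3: (57.59, 67.90), 4: (69.63, 79.90), 5: (78.26, 95.50)},
    "women": {3: (49.26, 65.20), 4: (70.74, 75.79), 5: (91.48, 99.20)},
}


def gain(dataset: str, k_from: int, k_to: int) -> tuple:
    """Absolute change in sensitivity and precision, in percentage points."""
    s0, p0 = table2[dataset][k_from]
    s1, p1 = table2[dataset][k_to]
    return round(s1 - s0, 2), round(p1 - p0, 2)


print(gain("men_and_women", 3, 4))  # (12.04, 12.0)
print(gain("men_and_women", 4, 5))  # (8.63, 15.6)
print(gain("women", 3, 4))          # (21.48, 10.59)
print(gain("women", 4, 5))          # (20.74, 23.41)
```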

Table 3.

Confusion matrices of the predictions using various numbers of indices in the men and women dataset, men dataset and women dataset.

# Of indices	Index name	Dataset	True positives	False positives	True negatives	False negatives
3	(1) Blood triglyceride level	Men and women	641	303	10,744	472
	(2) Waist circumference	Men	513	179	5984	290
	(3) HDL-C	Women	133	71	4854	137
4	(1) + (2) + (3) + (4) Blood glucose levels	Men and women	775	195	10,852	338
		Men	638	159	6004	165
		Women	191	61	4864	79
5	(1) + (2) + (3) +(4) + (5) Blood pressure	Men and women	871	41	11,006	242
		Men	726	4	6159	77
		Women	247	2	4923	23
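The performance measurements can be recomputed from the confusion-matrix counts. In the sketch below, the four counts of the three-index row for the men and women dataset (641, 303, 10,744 and 472) are read as true positives, false positives, true negatives and false negatives, in that order. This reading is an assumption, chosen because it reproduces the values reported in Table 2 for the same row (57.59, 67.90, 62.32 and 59.11); the `performance` helper is hypothetical.

```python
import math


def performance(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, precision, F1-score and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"sensitivity": sensitivity, "precision": precision, "f1": f1, "mcc": mcc}


# Three indices, men and women dataset (column order as assumed above).
m = performance(tp=641, fp=303, tn=10744, fn=472)
print({name: round(100 * value, 2) for name, value in m.items()})
```

The same check applied to the four-index row for men (638, 159, 6004, 165) gives a precision of about 80.05%, consistent with the 5.92-point gain over three indices stated in the text.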

Contribution of blood pressure indices to metabolic syndrome prediction

Group 1

Table 4 presents the quality of prediction with Group 1 base indices and the blood pressure index in the men and women dataset.

Table 4.

Group 1 performance measurements in the men and women dataset.

	Prediction quality
Index Name	Sensitivity (%)	Precision (%)	F1-score (%)	MCC (%)
Base indices + systolic & diastolic blood pressure	78.26	95.50	86.02	85.27
Base indices + estimated mean arterial pressure	76.01	93.27	83.76	82.82
Base indices + pulse pressure	72.42	85.84	78.56	76.93

When predicting using the estimated mean arterial pressure instead of systolic and diastolic blood pressure, sensitivity and precision each decreased by about 2.2 percentage points. When predicting using pulse pressure instead of systolic and diastolic blood pressure, sensitivity decreased by 5.84 and precision by 9.66 percentage points (Table 5).

Table 5.

Group 1 confusion matrices in the men and women dataset.

Index name	True positives	False positives	True negatives	False negatives
Base indices + systolic & diastolic blood pressure	871	41	11,006	242
Base indices + estimated mean arterial pressure	846	61	10,986	267
Base indices + pulse pressure	806	133	10,914	307

Group 2

Table 6 presents the quality of prediction with Group 2 base indices and the blood pressure index in the men and women dataset.

Table 6.

Group 2 performance measurements in the men and women dataset.

	Prediction quality
Index Name	Sensitivity (%)	Precision (%)	F1-score (%)	MCC (%)
Base indices + systolic & diastolic blood pressure	61.55	79.56	69.40	67.49
Base indices + estimated mean arterial pressure	60.38	78.41	68.22	66.14
Base indices + pulse pressure	55.97	73.82	63.67	61.24

When predicting using the estimated mean arterial pressure instead of systolic and diastolic blood pressure, sensitivity and precision each decreased by about 1.2 percentage points. When predicting using pulse pressure instead of systolic and diastolic blood pressure, sensitivity decreased by 5.58 and precision by 5.74 percentage points (Table 7).

Table 7.

Group 2 confusion matrices in the men and women dataset.

Index name	True positives	False positives	True negatives	False negatives
Base indices + systolic & diastolic blood pressure	685	176	10,871	428
Base indices + estimated mean arterial pressure	672	185	10,862	441
Base indices + pulse pressure	623	221	10,826	490

Discussion

Feature importance of the datasets’ indices

The study showed that the ExtraTreesClassifier and univariate selection (chi-square) methods rank the importance of indices differently for men and women taken as one group and for men and women viewed as separate groups. The NCEP ATP III criteria were not equally important when making a prediction. In both methods, the most highly ranked NCEP ATP III criterion for men and women as a group and for men and women separately was blood triglyceride level, in agreement with earlier reports.²⁴,²⁵ This contrasts with a report in which blood triglyceride level in men and waist-to-height ratio in women showed the strongest predictive strength for the syndrome.²⁶ Of the remaining criteria, waist circumference was in the top five for all datasets. The waist-to-height ratio index has been found useful in identifying metabolic syndrome²⁷,²⁸ and to have important diagnostic value for metabolic syndrome in older adults.²⁹⁻³¹ Of the top five indices ranked by each method, two appear on both methods’ lists (blood triglyceride level and waist circumference), three appear on only one list and another three appear only on the other list.

Contribution of blood pressure indices to metabolic syndrome prediction

To the best of our knowledge, our study is the first to examine calculated blood pressure measures (estimated mean arterial pressure and pulse pressure) when predicting metabolic syndrome using machine learning. According to the study findings, predictions using the NCEP ATP III blood pressure indices are more accurate than predictions using the estimated mean arterial pressure or pulse pressure index. Further, the prediction of metabolic syndrome is more accurate when using estimated mean arterial pressure than when using pulse pressure.

The fact that the dataset was homogeneous (residents of Spain, all of whom were employed) constitutes a limitation of the research. Therefore, it is recommended that a future study be conducted that examines the prediction and importance of NCEP ATP III criteria and other indices for the prediction of metabolic syndrome in other populations based on additional datasets.

The results of the study can be used with clinical informatics considerations, to assist in making monitoring and treatment decisions for patients through digital health systems. Digital Health interventions can be beneficial for improving systolic blood pressure and anthropometric outcomes like waist circumference.³² Automatized follow-up and alert systems provide support for the control and reduction of diseases associated with high blood pressure. Integration of new physiological variables for monitoring is feasible, and will broaden the scope for the early detection of chronic diseases including those associated with the metabolic syndrome, which could result in a reduction in their frequency.³³ The use of E-Health apps in most cases is voluntary, but persistence to goals can set the stage for long-term use, thereby promoting more favorable health outcomes.³⁴ The use of digital health-based lifestyle interventions may increase engagement and persistence in self-healthcare,³⁵ which is essential in the case of the syndrome.

Conclusions

In this study, methods for predicting the syndrome through machine learning were examined. Using data drawn from a Spanish sample, the study examined the optimal number of NCEP ATP III indices needed to make a reliable prediction of the syndrome; whether each of the five NCEP ATP III indices is equally important for predicting the syndrome and, if not, their order of importance; whether a reliable prediction can be made using calculated blood pressure indices (estimated mean arterial pressure and pulse pressure) instead of NCEP ATP III blood pressure indices; and the effect of gender on the indices’ importance and on the prediction.

According to the NCEP ATP III definition, at least three criteria are to be selected, with the assumption that all five indices for determination of the syndrome are of equal importance. In this study we show that this may not be the case. Moreover, the importance and prediction quality of the indices, including the NCEP ATP III indices as well as personal information, health habits, anthropometric measurements and blood measurements, vary according to gender. Optimal results are obtained by using all five indices for prediction, and a good prediction can be attained even when using just four instead of three NCEP ATP III indices.

For predicting metabolic syndrome, the NCEP ATP III blood pressure indices (systolic and diastolic blood pressure) are better than the calculated indices (estimated mean arterial pressure and pulse pressure), and the estimated mean arterial pressure index is better than the pulse pressure index.

Future research could deepen the understanding of gender and other demographic data such as age for the purpose of accurate and focused prediction of metabolic syndrome. It could further study different ethnic groups as well as working and non-working subjects, with the aim of improving medical care. A SHAP chart could deepen the understanding of the positive and negative contribution of each feature to metabolic syndrome prediction. Analyzing the data and predicting the syndrome for specific groups of medical and other indices, and addressing the difficulty and costs of producing each index or set of indices separately, may improve the cost–benefit ratio of diagnosing metabolic syndrome. Predicting the syndrome and investing in prevention efforts among high-risk populations could reduce the risk of mortality, morbidity and decline in quality of life.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Elad Avizohar

Note

References

Hosseini-Esfahani

Alafchi

Cheraghi

, et al. Using machine learning techniques to predict factors contributing to the incidence of metabolic syndrome in Tehran: cohort study. JMIR Public Health Surveill 2021; 7: e27304. DOI: 10.2196/27304

Wang

Tong

Chen

, et al. Metabolic syndrome is a strong risk factor for minor ischemic stroke and subsequent vascular events. PLoS One 2016; 11: e0156243. DOI: 10.1371/journal.pone.0156243

Hirode

Wong

. Trends in the prevalence of metabolic syndrome in the United States, 2011-2016. JAMA 2020; 323(24): 2526–2528. DOI: 10.1001/jama.2020.4501

Wen

, et al. Predictive modeling the probability of suffering from metabolic syndrome using machine learning: a population-based study. ahead of print 2021, retrieved from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3854655

Hosseini

Bahadoran

Moslehi

, et al. Metabolic syndrome: findings from 20 years of the Tehran lipid and glucose study. Int J Endocrinol Metabol 2018; 16: e84771. DOI: 10.5812/ijem.84771

Alvero-Cruz

Fernández Vázquez

Martínez Blanco

, et al. Sex differences for predicting metabolic syndrome by adipose dysfunction markers in institutionalized elderly. Eur J Cardiovasc Nurs 2020; 20(6): 534–539. DOI: 10.1093/eurjcn/zvaa036

. Sex differences in risk factors for metabolic syndrome in the Korean population. Int J Environ Res Publ Health 2020; 17(24). DOI: 10.3390/ijerph17249513

Park

Cho

. Evolutionary attribute ordering in Bayesian networks for predicting the metabolic syndrome. Expert Syst Appl 2012; 39(4): 4240–4249.

Manaf

MRA

Nawi

Tauhid

, et al. Prevalence of metabolic syndrome and its associated risk factors among staffs in a Malaysian public university. Sci Rep 2021; 11: 8132. DOI: 10.1038/s41598-021-87248-1

10.

Romero-Saldaña

Tauler

Vaquero-Abellán

, et al. Validation of a non-invasive method for the early detection of metabolic syndrome: a diagnostic accuracy test in a working population. BMJ Open 2018; 8: e020476. DOI: 10.1136/bmjopen-2017-020476

11. Romero-Saldaña, Fuentes-Jiménez, Vaquero-Abellán, et al. New non-invasive method for early detection of metabolic syndrome in the working population. Eur J Cardiovasc Nurs 2016; 15(7): 549–558. DOI: 10.1177/1474515115626622

12. Gutiérrez-Esparza, Infante Vázquez, Vallejo, et al. Prediction of metabolic syndrome in a Mexican population applying machine learning algorithms. Symmetry 2020; 12(4): 581. DOI: 10.3390/sym12040581

13. Bursa, Renda (eds). Information technology in bio- and medical informatics. New York, NY: Springer International Publishing, 2014.

14. Geurts, Ernst, Wehenkel. Extremely randomized trees. Mach Learn 2006; 63(1): 3–42.

15. Huynh-Thu, Saeys, Wehenkel, et al. Statistical interpretation of machine learning-based feature importance scores for biomarker discovery. Bioinformatics 2012; 28(13): 1766–1774.

16. Biau, Scornet. A random forest guided tour. Test 2016; 25(2): 197–227.

17. Kaur, Oberai. A review article on Naive Bayes classifier with various smoothing techniques. Int J Comput Sci Mobile Comput 2014; 3(10): 864–868.

18. Taunk, Verma, et al. A brief review of nearest neighbor algorithm for learning and classification. In: International conference on intelligent computing and control systems (ICCS), 15–17 May 2019, Madurai, India: IEEE.

19. Lavanya, Rani. Performance evaluation of decision tree classifiers on medical datasets. Int J Comput Appl 2011; 26(4): 1–4.

20. Maalouf. Logistic regression in data analysis: an overview. Int J Data Anal Tech Strat 2011; 3(3): 281–299.

21. Chen, Xiong, Ren. Evaluating the risk of metabolic syndrome based on an artificial intelligence model. In: Abstract and applied analysis. New York, NY: Hindawi, 2014.

22. Hirose, Takayama, Hozawa, et al. Prediction of metabolic syndrome using artificial neural network system based on clinical data including insulin resistance index and serum adiponectin. Comput Biol Med 2011; 41(11): 1051–1056.

23. Romero-Saldaña, Tauler, Vaquero-Abellán, et al. Data from validation of a non-invasive method for the early detection of metabolic syndrome: a diagnostic accuracy test in a working population. BMJ Open 2018; 8(10): e020476. DOI: 10.5061/dryad.cb51t54

24. Karimi-Alavijeh, Jalili, Sadeghi. Predicting metabolic syndrome using decision tree and support vector machine methods. ARYA Atheroscler 2016; 12(3): 146.

25. Miller, Fridline. Development and validation of metabolic syndrome prediction and classification-pathways using decision trees. J Metab Syndrome 2015; 4(173): 2167.

26. Kim, Nam, Heo. Identification of metabolic syndrome based on anthropometric, blood and spirometric risk factors using machine learning. Appl Sci 2020; 10(21): 7741.

27. Suliga, Ciesla, Głuszek-Osuch, et al. The usefulness of anthropometric indices to identify the risk of metabolic syndrome. Nutrients 2019; 11(11): 2598.

28. Kammar-García, Hernández-Hernández, López-Moreno, et al. Risk and diagnosis of the metabolic syndrome in apparently healthy young adults by means of the waist-height. Rev Med Hosp Gen Mex 2019; 82(4): 179–186.

29. Oliveira CCD, Costa EDD, Roriz AKC, et al. Predictors of metabolic syndrome in the elderly: a review. Int J Cardiovasc Sci 2017; 30: 343–353.

30. Yang, Xin, Feng, et al. Waist-to-height ratio is better than body mass index and waist circumference as a screening criterion for metabolic syndrome in Han Chinese adults. Medicine 2017; 96(39): 8192.

31. Khosravian, Bayani, Hosseini, et al. Comparison of anthropometric indices for predicting the risk of metabolic syndrome in older adults. Rom J Intern Med 2020; 59(1): 43–49.

32. Chen, Shao, et al. Effect of electronic health interventions on metabolic syndrome: a systematic review and meta-analysis. BMJ Open 2020; 10(10): e036927.

33. Urrea, Venegas. Automatized follow-up and alert system for patients with chronic hypertension. Health Inf J 2020; 26(4): 2625–2636. DOI: 10.1177/1460458219900446

34. Guo, Shan, Ali Khan. What are the impetuses behind E-health applications’ self-management services’ ongoing adoption by health community participants? Health Inf J 2023; 29(1): 14604582231152801. DOI: 10.1177/14604582231152801

35. Lee, Kim, et al. Effective prevention and management tools for metabolic syndrome based on digital health-based lifestyle interventions using healthcare devices. Diagnostics 2022; 12(7): 1730.

Predicting metabolic syndrome using machine learning – Analysis of commonly used indices
