Abstract
Globally, heart disease (HD) persists as a major contributor to mortality rates, requiring accurate and efficient diagnostic models. While machine learning has shown promise in early detection, challenges such as missing data, class imbalance, suboptimal feature selection, and inefficient hyperparameter tuning hinder predictive accuracy and reliability. Many existing models fail to preprocess medical datasets effectively, leading to biased and computationally expensive predictions. To address these issues, this study proposes a robust hybrid framework for HD prediction. The Balanced Imputation-Normalization Framework incorporates K-Nearest Neighbors (KNN) imputation, StandardScaler normalization, and the Synthetic Minority Oversampling Technique (SMOTE). KNN imputation handles missing data effectively, ensuring reliable representation, while StandardScaler normalization standardizes feature values to enhance model stability. SMOTE addresses class imbalance by generating synthetic samples to augment the minority class. Feature selection is optimized using the Hungarian algorithm, which systematically selects the most relevant attributes while reducing redundancy. Additionally, Bayesian optimization fine-tunes hyperparameters to improve classification performance. For prediction, an ensemble learning approach combines Random Forest (RF), Decision Tree (DT), K-Nearest Neighbors (KNN), Naïve Bayes (NB), and Extreme Gradient Boosting (XGBoost). The Voting Ensemble aggregates predictions using hard and soft voting mechanisms, improving robustness and generalization. Experimental results on benchmark heart disease datasets show that XGBoost attained the highest accuracy of 96.43%, followed by the Voting Ensemble at 95.66%, significantly outperforming traditional models and demonstrating that ensemble learning effectively improves accuracy and reduces computational complexity.
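The preprocessing-and-ensemble pipeline outlined above can be sketched with scikit-learn. This is a minimal illustration, not the authors' implementation: a synthetic imbalanced dataset stands in for the benchmark heart disease data, SMOTE (from the separate imblearn package) and the Hungarian-algorithm feature selector are omitted, and sklearn's GradientBoostingClassifier is substituted for XGBoost so the sketch has no dependencies beyond scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import KNNImputer
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import (RandomForestClassifier, VotingClassifier,
                              GradientBoostingClassifier)
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a heart disease dataset: 13 features, 70/30 class split.
X, y = make_classification(n_samples=400, n_features=13,
                           weights=[0.7, 0.3], random_state=0)

# Simulate missing entries, then impute them with KNN as in the framework.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan
X = KNNImputer(n_neighbors=5).fit_transform(X)

# StandardScaler normalization step.
X = StandardScaler().fit_transform(X)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Soft-voting ensemble over the five base learners named in the abstract
# (GradientBoosting stands in for XGBoost here).
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=0)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("knn", KNeighborsClassifier()),
        ("nb", GaussianNB()),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    voting="soft",  # averages predicted probabilities; "hard" majority-votes labels
)
ensemble.fit(X_tr, y_tr)
acc = ensemble.score(X_te, y_te)
```

Switching `voting="soft"` to `"hard"` reproduces the other aggregation mechanism mentioned in the abstract; soft voting typically performs better when the base learners produce well-calibrated probabilities.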
