Abstract
Objectives
Cardiovascular Disease (CVD) remains one of the leading causes of global mortality, accounting for millions of deaths annually. Early and accurate diagnosis plays a critical role in reducing mortality and healthcare burden. However, conventional diagnostic approaches often suffer from misdiagnosis, delayed treatment, and increased medical costs. Machine Learning (ML) has shown significant potential in supporting clinical decision-making for early CVD detection. Nevertheless, ML models often face challenges such as computationally expensive parameter tuning and susceptibility to local minima. This study aims to address these challenges by proposing a bio-inspired optimization framework to enhance diagnostic accuracy and efficiency.
Methods
This study employs Bacterial Colony Optimization (BCO) to optimize the hyperparameters of ten machine learning classifiers: Logistic Regression, Support Vector Machine (SVM), K-Nearest Neighbors, Multilayer Perceptron, Naïve Bayes, Random Forest (RF), Decision Tree, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine, and AdaBoost. Principal Component Analysis (PCA) is integrated to handle feature dimensionality and multicollinearity. Experiments were conducted using the Cleveland Heart Disease dataset (CLE) and the IEEE DataPort dataset (HGR), applying a rigorous 5-fold Cross-Validation (CV) strategy to ensure reliability and stability.
Results
Experimental findings demonstrate that the integration of PCA, BCO, and ML classifiers significantly improves prediction performance compared to baseline models. The BCO-optimized RF model achieved the highest mean accuracy of 92.02% (95% CI: 89.93–94.10) on the HGR dataset, outperforming the baseline accuracy of 91.26%. Similarly, the BCO-SVM model achieved a mean accuracy of 85.79% on the CLE dataset. Confidence interval analysis further confirmed enhanced model stability and reduced prediction variance.
Conclusion
The proposed framework effectively enhances CVD diagnosis by improving classification accuracy and stability. By efficiently exploring the search space and mitigating local minima limitations, the framework provides a statistically robust and clinically reliable decision-support tool for early cardiovascular risk detection.
Keywords
Introduction
Cardiovascular Diseases (CVDs) comprise the conditions that affect the heart’s normal functionality and its coronary vessels, leading to interruption of blood circulation. CVDs are responsible for a significant share of deaths globally, currently affecting almost 26 million people worldwide, and the number rises every year. 1 According to the World Health Organization (WHO), the prevalence of CVDs has grown substantially over the past few decades; they now account for 32% of global fatalities, with heart attacks and strokes responsible for the majority (85%). 2 In the United States, for example, heart disease remains the leading cause of death for people of all genders and all racial and ethnic origins. Alarmingly, a person in the US dies of CVD every 33 seconds. 3
The rise in CVD patients places a significant burden on healthcare infrastructure globally. 4 Early identification of CVDs is therefore imperative so that treatment and counseling can begin promptly. Risk factors for CVDs include poor nutrition, inadequate exercise, obesity, diabetes, alcohol consumption, and cigarette smoking. 5 CVDs detected at an advanced stage often require major interventions such as angiography or bypass surgery, which can be painful for patients. Age, sex, family history of cardiovascular disease, blood pressure, cholesterol, and diabetes all influence the prognosis of heart disease. 6 Additionally, contributory factors such as hypertension, arrhythmia, and hyperlipidemia make early signs of heart disease difficult to diagnose. 7 The urgent need for early CVD identification and the reduction of heart disease related fatalities has fueled the creation of sophisticated CVD detection systems. Artificial Intelligence (AI) enhances the healthcare system by enabling earlier disease diagnosis, saving lives. 8 Compared with traditional techniques, these intelligent systems not only enhance the detection process but also reduce human error. They can accurately predict cardiovascular problems by examining patients’ medical data and establishing correlations between different health characteristics. 9
ML has significant transformational potential in healthcare. Its impressive advancement is partly due to its exceptional ability to analyze vast amounts of data, exceeding human analytical capabilities.10,11 This capacity, combined with the speed and accuracy of machine learning, has enabled a plethora of AI-driven healthcare applications that provide creative answers to a range of clinical problems. A number of ML approaches have been exploited to identify cardiovascular conditions, yet some of the resulting predictive models face issues that need to be addressed; imbalanced datasets, for instance, frequently lead to biased predictions. 12 Researchers have investigated hybrid approaches that combine several methods, such as neural networks and other machine learning algorithms, to improve prediction accuracy and address these difficulties. These studies provide valuable information, but the diversity of datasets, approaches, and results underscores how difficult this prediction task is. Alongside these advancements, further study is necessary to improve the efficacy of the models now used in cardiovascular disease prediction. The wide range of ML applications in this field emphasizes the importance of further research to improve the precision, dependability, and generalizability of predictive models, ultimately leading to better patient care and therapeutic interventions.13–16
The use of Deep Learning (DL) and ML in the prediction of CVDs is essential to modern healthcare because these methods can greatly increase the precision of risk factor identification, aid in the early detection of possible complications, and make it easier to create individualized treatment plans. Such technologies improve patient outcomes and lessen the burden of disease by facilitating prompt and proactive medical treatments. As a result, a great deal of study has been done to investigate and identify the best techniques for heart disease prediction.
Comparative summary of state-of-the-art studies in cardiovascular disease prediction.
The field of medical diagnostics is quickly moving toward intelligent hybrid systems that blend sophisticated optimization methods with machine learning. According to a thorough analysis demonstrated in Ref. 48, performing feature selection and hyperparameter tuning simultaneously can increase diagnostic accuracy by 12–15% compared with sequential approaches. Cardiology is not the only field seeing this trend; endocrinology and reproductive health show comparable developments. Optimization techniques such as Genetic Algorithms (GA) and Particle Swarm Optimization (PSO) are increasingly used to manage complicated biological data, as seen in recent studies on psoriasis prediction and hormonal diseases.49,50 To represent nonlinear biological dynamics, novel techniques in In Vitro Fertilization (IVF) have even started combining Artificial Neural Networks with Fractional Order Models (FOM). 51 The study in Ref. 52 exploits a hybrid ML approach with fractional calculus for predicting diabetes risk, and a hyper-tuned RBF SVM is used to detect breast cancer in Ref. 53.
However, significant difficulties arise as these models get increasingly intricate. One significant issue is the “fairness-accuracy trade-off.” An attempt to increase demographic fairness frequently results in a slight decrease in overall model accuracy, as stated in a recent systematic study of bias in AI. 54
Current research establishes that integrating ML with sophisticated optimization approaches is essential for enhancing the precision of CVD prediction. While previous studies have investigated a variety of feature selection techniques, swarm intelligence algorithms, and ensemble models to improve diagnostic performance, a critical gap remains. Achieving consistently high accuracy across heterogeneous datasets while preserving computational economy is a persistent challenge, often due to the tendency of standard algorithms to trap in local optima. Motivated by the need for a clinically dependable tool, this study utilizes Bacterial Colony Optimization (BCO) to optimize the hyperparameters of 10 distinct ML classifiers. This approach not only maximizes predictive power but also guarantees a robust, efficient framework capable of supporting reliable decision making in real world heart disease diagnosis.
Considering the increasing prevalence of CVDs around the world, creating precise and effective prediction systems is crucial. Traditional ML approaches show promising results; however, their efficacy largely relies on choosing the best parameters, which can be difficult given the complexity and diversity of medical data. The large dimensionality and intrinsic multicollinearity of medical data make parameter selection a hard nonlinear optimization problem, as highlighted in recent work, and typical approaches like Grid Search frequently fall short. Traditional algorithms sometimes struggle with feature redundancy or become stuck in local optima, resulting in unstable predictions that are difficult to generalize across diverse patient cohorts. To improve the performance of 10 ML classifiers, Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Multilayer Perceptron (MLP), Naive Bayes (NB), Random Forest (RF), Decision Tree (DT), XGBoost, Light Gradient Boosting Machine (LGBM), and Adaptive Boosting (AdaBoost), for the prediction of heart disease, this study employs the BCO technique. By combining classification and optimization, the proposed procedure seeks to improve the accuracy, robustness, and dependability of CVD prediction. The key contributions of this study are outlined below:
1. The proposed framework effectively addresses multicollinearity and performs a global search for optimal hyperparameters across 10 different ML classifiers.
2. The study validates the efficacy of BCO optimization using statistical validation tests.
3. The results of the proposed framework are compared with different optimizers for a robust evaluation.
4. The framework has the potential to serve as a decision-support tool for medical professionals by improving prediction reliability, enabling early identification and prompt management of individuals at risk of cardiovascular illness.
The remainder of the article proceeds as follows: the Methodology section presents the proposed framework, including a detailed description of BCO, the 10 ML classifiers, and a brief discussion of the datasets and features. The Results section reports the outcomes of all experiments with and without BCO. The Discussion section rigorously analyzes these results, and the Conclusion highlights the contributions of this study along with directions for further investigation.
Methodology
Feature reduction using Principal Component Analysis
A dimensionality reduction approach, PCA, is applied first in the proposed framework to optimize the feature space and handle multicollinearity across clinical characteristics. Principal components are selected according to the cumulative explained variance ratio, with a target threshold of 95%. This criterion guarantees that the great majority of the datasets’ useful information is kept while the low-variance noise that might cause overfitting is eliminated. The initial feature sets are thus reduced to a smaller number of orthogonal components determined by this threshold.
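As a minimal sketch of this step, the snippet below applies the 95% cumulative-variance rule with scikit-learn; the feature matrix X and its dimensions are hypothetical stand-ins, not the study’s data.

```python
# Minimal PCA sketch: keep the fewest components whose cumulative
# explained variance exceeds 95% (illustrative, not the paper's exact code).
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

def reduce_features(X, variance_threshold=0.95):
    # Standardize first so wide-range features (e.g., cholesterol)
    # do not dominate the principal components.
    X_scaled = StandardScaler().fit_transform(X)
    # A float in (0, 1) tells scikit-learn to choose the smallest number
    # of components reaching that cumulative explained variance.
    pca = PCA(n_components=variance_threshold)
    X_reduced = pca.fit_transform(X_scaled)
    return X_reduced, pca

# Stand-in data: 300 patients, 13 clinical attributes
X = np.random.rand(300, 13)
X_reduced, pca = reduce_features(X)
print(pca.n_components_, pca.explained_variance_ratio_.cumsum()[-1])
```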
The BCO method then searches in this uncorrelated coordinate system, which offers a more efficient search space. Compared with using the raw, unadjusted features, this transformation allows better convergence and more robust hyperparameter tuning. All subsequent classification and optimization steps use the resulting components as input, guaranteeing a balance between diagnostic precision and computational efficiency (Figure 1).
Proposed working procedure.
Hyperparameter tuning: Bacterial colony optimization
BCO, introduced by Niu and Wang, 55 is a population based heuristic optimization algorithm that simulates the foraging and communication behaviors of E. coli bacteria. Unlike Bacterial Foraging Optimization (BFO), which relies on individual swimming and tumbling, BCO utilizes bacterial communication to guide the search, thereby increasing both convergence efficiency and solution quality. 56 This study utilizes BCO to optimize classifier parameters for heart disease detection, achieving superior clinical precision while mitigating overfitting.
The BCO lifecycle progresses through chemotaxis, communication, elimination, reproduction, and migration, with each stage playing a crucial role in balancing exploration and exploitation during optimization. The behavior of the traditional BCO and its key procedural phases are described in the following:
Chemotaxis and communication
In BCO, chemotaxis is combined with communication throughout the optimization process. The chemotactic behavior of bacteria over their lifetime can generally be classified into two modes: tumbling and swimming. In the tumbling phase, the bacterium introduces a stochastic disturbance to its orientation, allowing random turbulence to affect the optimal search direction. As a result, the position of each bacterium is updated according to Equation (1), where the turbulence direction and the optimal search direction jointly determine the movement. 55
In contrast, the bacterium moves without turbulence and smoothly in the direction of optimal exploitation during the swimming phase, updating its position according to Equation (2). The BCO employs an adaptive chemotaxis step, developed to guide bacterial movement during the search process, as defined by Equation (3).
This formulation balances stochastic exploration and directed convergence, with tumbling promoting randomness to escape local optima and swimming guiding bacteria toward promising regions (Pbest and Gbest).
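For reference, the commonly cited forms of Equations (1)–(3) in Niu and Wang’s formulation 55 can be sketched as follows; the notation is paraphrased and may differ from the original presentation:

```latex
% Eq. (1): tumbling -- guided step plus a stochastic turbulence term
x_i(t) = x_i(t-1) + C(i)\left[f_i\,(G_{best} - x_i(t-1))
       + (1 - f_i)\,(P_{best,i} - x_i(t-1)) + turb_i\right]

% Eq. (2): swimming -- the same guided step without turbulence
x_i(t) = x_i(t-1) + C(i)\left[f_i\,(G_{best} - x_i(t-1))
       + (1 - f_i)\,(P_{best,i} - x_i(t-1))\right]

% Eq. (3): adaptive chemotaxis step, decreasing linearly over iterations
C(i) = C_{min} + \frac{iter_{max} - iter_j}{iter_{max}}\,(C_{max} - C_{min})
```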
Elimination and reproduction
Each bacterium is assigned an energy level reflecting its search capability and fitness in heart disease detection. The decision rules for elimination and reproduction, which determine the removal of less fit bacteria and the reproduction of fitter ones, are defined by Equation (4), ensuring the colony progressively converges toward optimal solutions. 55
Migration
In BCO, bacterial migration moves individuals to new random positions when specific conditions, such as energy level, similarity among bacteria, or chemotaxis efficiency, are met, as described by Equation (5). 55
This process helps to avoid local optima, strengthens global search ability, and ensures efficient exploration.
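The migration rule (Equation (5)) is commonly written as a uniform reinitialization within the search bounds; the following is a paraphrased sketch rather than the original notation:

```latex
% Eq. (5): migration -- reinitialize bacterium i uniformly within [lb, ub]
x_i = lb + rand(0,1)\cdot(ub - lb)
```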
Hypothesized advantages of BCO in hyperparameter tuning
In contrast to traditional optimization algorithms that employ independent or velocity-driven parameter updates, BCO is hypothesized to navigate the intricate, nonconvex search spaces associated with the heterogeneous classifiers studied here. The communication mechanism in BCO plays a critical role in mitigating premature convergence. When tuning parameters for classifiers such as SVM and RF, the search space contains numerous local optima, regions of high accuracy that do not represent the global optimum, which can mislead the search. Bacterial agents exchange information on nutrient concentration, which corresponds to model accuracy for the classifier, and on orientation.39,40 This collaborative communication phase enables the population to distinguish local from global optimal areas and prevents stagnation in suboptimal hyperparameter regions, which is commonly observed in Particle Swarm Optimization. 38 Additionally, during tumbling and swimming, communication guides the chemotaxis process by steering the swarm away from low-performing areas of the search space. For high-dimensional classifiers, where random exploration can incur needless computational expense, this targeted direction enhances search efficiency. 57
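To make the tuning loop concrete, the sketch below runs a simplified BCO-style search over SVM hyperparameters (C, γ), using 5-fold CV accuracy as the fitness signal. It is an illustration under assumed parameter ranges, not the study’s exact implementation: the chemotaxis step is simplified, and the elimination, reproduction, and migration phases are collapsed into a single resampling rule.

```python
# Simplified BCO-style hyperparameter search for an SVM (illustrative sketch).
# Fitness = 5-fold cross-validation accuracy; Pbest/Gbest guide the colony.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
LOW = np.array([-2.0, -4.0])   # assumed bounds: log10(C), log10(gamma)
HIGH = np.array([4.0, 1.0])

def fitness(pos, X, y):
    C, gamma = 10.0 ** pos
    return cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=5).mean()

def bco_search(X, y, colony_size=10, iters=20, c_max=0.6, c_min=0.05):
    pos = rng.uniform(LOW, HIGH, size=(colony_size, 2))
    fit = np.array([fitness(p, X, y) for p in pos])
    pbest, pbest_fit = pos.copy(), fit.copy()
    g = pbest[pbest_fit.argmax()].copy()
    for t in range(iters):
        # Adaptive chemotaxis step, shrinking linearly over iterations
        step = c_min + (iters - t) / iters * (c_max - c_min)
        for i in range(colony_size):
            f = rng.random()  # balance Gbest vs Pbest attraction
            # Tumble (add turbulence) or swim (no turbulence)
            turb = rng.normal(0.0, 0.3, 2) if rng.random() < 0.5 else 0.0
            move = f * (g - pos[i]) + (1 - f) * (pbest[i] - pos[i]) + turb
            pos[i] = np.clip(pos[i] + step * move, LOW, HIGH)
            fit[i] = fitness(pos[i], X, y)
            if fit[i] > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i].copy(), fit[i]
        # Crude elimination/reproduction: worst bacterium respawns near the best
        worst, best = pbest_fit.argmin(), pbest_fit.argmax()
        pos[worst] = np.clip(pbest[best] + rng.normal(0.0, 0.1, 2), LOW, HIGH)
        g = pbest[pbest_fit.argmax()].copy()
    return 10.0 ** g, pbest_fit.max()

# Usage with a stand-in dataset:
# from sklearn.datasets import load_breast_cancer
# X, y = load_breast_cancer(return_X_y=True)
# (C_best, gamma_best), acc = bco_search(X, y)
```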
Evaluation methods
Support Vector Machine
In this research, the SVM was applied as a classifier to detect the presence of cardiac disease. SVM derives an optimal hyperplane that separates the two classes (patients with and without heart disease). Given a training dataset $D = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ denotes a patient’s feature vector and $y_i \in \{-1, +1\}$ the disease status, the classifier learns a separating function $f(x) = w^{\top}x + b$. Since nonlinear correlations are frequently observed in patient health data, 58 kernel functions are employed. The resulting decision function is $f(x) = \operatorname{sign}\left(\sum_{i=1}^{N}\alpha_i y_i K(x_i, x) + b\right)$.
SVM performance, governed by C and γ, was optimized using BCO: candidate (C, γ) pairs are evaluated by accuracy, and the colony iteratively favors high-performing candidates.
Logistic regression
LR is a widely used and reliable supervised method for binary classification problems in medical research, including the prediction of heart disease. Whereas linear regression predicts continuous values, LR predicts the probability of a category, which makes it especially valuable in medical diagnostics, where the primary objective is to determine whether a disease is present or absent. The model predicts the probability of an outcome (e.g., the existence of heart disease) from a linear combination of input features (age, blood pressure, cholesterol, electrocardiogram (ECG) results, blood sugar levels, etc.) weighted by coefficients 59: $z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$. This linear predictor is passed through the sigmoid function, $\sigma(z) = 1/(1 + e^{-z})$, to produce a probability between 0 and 1, and a decision threshold of 0.5 is then applied to assign the class label.
K-Nearest Neighbors
The KNN algorithm is employed for heart disease detection by classifying patients into two groups, positive and negative cases, depending on their medical data.60,61 In this context, clinical attributes such as age, blood pressure, cholesterol levels, resting ECG, and chest pain type are represented as a feature vector $X = (x_1, x_2, \dots, x_n)$. The affinity between two records, X and Y, is calculated using the Euclidean distance: $d(X, Y) = \sqrt{\sum_{i=1}^{n}(x_i - y_i)^2}$.
In heart disease datasets, the feature ranges differ: for example, cholesterol spans 100 to 190, while age ranges from 40 to 80, which potentially biases distance calculations. To ensure that all features contribute equally to the distance metric, continuous attributes are standardized. 62
The class of an unknown patient is derived from the labels of its closest neighbors. The neighborhood size k and the distance metric both affect KNN’s performance. To optimize these parameters, BCO iteratively evaluates candidate (k, d) solutions by classification accuracy to determine the optimal configuration (k*, d*) for reliable prediction of heart disease.
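As an illustration of why the standardization step matters here, the short sketch below pipelines scaling with KNN so that wide-range features such as cholesterol do not dominate the Euclidean distance; the value of k is a placeholder, since in this study it is chosen by BCO.

```python
# Standardize features before KNN so each attribute contributes comparably
# to the Euclidean distance (illustrative pipeline; k tuned by BCO in the paper).
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

knn = make_pipeline(StandardScaler(),
                    KNeighborsClassifier(n_neighbors=7, metric="euclidean"))
# knn.fit(X_train, y_train); knn.predict(X_test)
```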
Multilayer Perceptron
The MLP, a variant of artificial neural networks, analyzes heart disease factors such as age, blood pressure, cholesterol level, blood sugar, and electrocardiographic observations through multiple layers of interconnected neurons, capturing nonlinear dependencies to provide accurate estimates of heart disease risk. It is structured as a feedforward network composed of an input layer, multiple hidden layers, and an output layer. 63
For a neuron j in layer l, the output is computed as $a_j^{(l)} = \phi\left(\sum_i w_{ji}^{(l)} a_i^{(l-1)} + b_j^{(l)}\right)$, where $\phi$ is the activation function, $w_{ji}^{(l)}$ are the connection weights, and $b_j^{(l)}$ is the bias.
The output layer generates the predicted class label for a patient.
The MLP’s performance depends on hyperparameters such as the learning rate, the number of hidden neurons, and the activation functions. The optimal configuration (θ*) for reliable heart disease prediction is determined by BCO, which evaluates candidate configurations (θ) by classification accuracy.
Naïve Bayes
NB is a supervised learning algorithm based on Bayes’ theorem, favored for its simplicity, computational efficiency, and consistent performance even on smaller datasets. 64 The model assumes conditional independence among patient attributes, allowing the multivariate joint distribution to be represented as the product of individual probabilities.
For a patient with a feature vector $X = (x_1, \dots, x_n)$, the posterior probability of class $C_k$ is $P(C_k \mid X) = \frac{P(C_k)\prod_{i=1}^{n} P(x_i \mid C_k)}{P(X)}$. As $P(X)$ is constant across all classes, the classification simplifies to $\hat{y} = \arg\max_{C_k} P(C_k)\prod_{i=1}^{n} P(x_i \mid C_k)$.
To improve performance, BCO is utilized to choose the best parameters, such as feature subsets, smoothing factors, and variance estimates. Each candidate configuration is evaluated by classification accuracy, enhancing generalization while maintaining computational efficiency.
Random Forest
RF is a resilient ensemble learning method that generates multiple decision trees, each trained on a bootstrapped sample of patient records and a randomly selected subset of clinical features. This random selection of data and features produces a diverse set of trees, mitigating overfitting and enhancing generalization. 65 Each tree predicts heart disease status, and the final classification is aggregated by majority voting, boosting diagnostic accuracy and stability over a single decision tree. Figure 2 illustrates how the RF classifier operates: the predictions of multiple decision trees are combined through majority voting to produce the final class. In the context of heart disease detection, the dataset is expressed as $D = \{(x_i, y_i)\}_{i=1}^{N}$.
Random Forest process to form a reliable ensemble from multiple trees using majority voting.
Each decision tree is built by determining the best splits according to the Gini Index, $Gini = 1 - \sum_{k} p_k^2$, where $p_k$ is the proportion of samples of class k at the node.
Majority voting is used to infer the final prediction for an unseen record x: $\hat{y} = \operatorname{mode}\{h_1(x), h_2(x), \dots, h_T(x)\}$, where $h_t$ denotes the prediction of the t-th tree.
To improve prediction performance, BCO is utilized to optimize RF hyperparameters, such as the maximum depth and the number of trees, based on classification metrics such as accuracy or F1-score.
Decision tree
A DT is a supervised classification method that recursively splits patient data on clinical attributes to perform classification. At each decision node, the algorithm identifies the feature that maximizes class separation based on an impurity measure, 66 typically the Gini Index, as defined in Equation (14). The tree grows recursively until predefined stopping criteria, such as maximum tree depth or minimum samples per node, are satisfied, at which point the leaf nodes denote the predicted class (presence or absence of heart disease). At a leaf node, the final prediction for an unseen record x is determined by the majority class.
Similar to RF, BCO is employed to optimize key DT hyperparameters, including maximum depth and minimum samples per split, enriching DT performance based on classification metrics.
Extreme Gradient Boosting
XGBoost is a robust ensemble learning method that sequentially develops a number of weak classifiers, such as decision trees, with each successive tree trained to minimize the residual errors produced by the preceding trees. By repeatedly focusing on misclassified samples, XGBoost minimizes bias and increases predictive accuracy. 67 It also employs regularization to prevent overfitting and a learning rate to control each tree’s contribution, 68 which makes it effective for the detection of heart disease using clinical attributes.
Figure 3 demonstrates the working procedure of XGBoost: sequentially constructing decision trees, optimizing residual errors, and applying regularization to enhance predictive performance. XGBoost constructs decision trees with a structure parallel to the DT model described earlier in this section; consequently, node splitting in XGBoost also relies on impurity measures such as the Gini Index, as previously defined in Equation (14).
XGBoost process of constructing an optimized ensemble classifier.
The final prediction is obtained by combining the outputs of all trees in the ensemble, weighted by the learning rate: $\hat{y} = \sum_{t=1}^{T} \eta\, f_t(x)$, where $f_t$ is the t-th tree and $\eta$ the learning rate.
Similar to its application in DT and RF, BCO is applied to XGBoost to optimize hyperparameters, such as the learning rate, the maximum depth of the tree, and the number of trees, improving the performance of the model based on classification metrics.
Light Gradient Boosting Machine
LGBM is a tree-based gradient-boosted ensemble method similar to XGBoost, but it differs in its tree construction. It employs a leaf-wise tree growth strategy with depth constraints, 69 selecting the leaf that maximizes loss reduction at each step, rather than the level-wise growth used in XGBoost. As a result, LGBM captures complex feature interactions more effectively, trains faster with lower memory consumption, and achieves enhanced accuracy on large-scale, high-dimensional datasets. Similar to XGBoost models, LGBM utilizes clinical features, specifying splits based on impurity measures.
To further improve diagnostic performance, BCO is utilized to tune key hyperparameters such as learning rate, maximum depth, and number of trees.
Adaptive Boosting
AdaBoost is an ensemble learning technique that combines multiple weak classifiers, commonly shallow decision trees, to construct a strong classifier. In contrast to RF, which constructs trees independently and combines their votes, AdaBoost trains decision trees sequentially, with each successive classifier assigning higher weight to misclassified patient samples, 70 forcing the model to address difficult cases. Figure 4 illustrates the AdaBoost procedure for generating a strong classifier from multiple weak learners.
Process of AdaBoost in forming a strong classifier from weak learners.
For the training data $(x_i, y_i)$, where $y_i \in \{-1, +1\}$, AdaBoost assigns initial equal weights $w_i = 1/N$. At each iteration t, a weak classifier $h_t(x)$ is trained, and its weighted error is computed as $\epsilon_t = \sum_{i=1}^{N} w_i\,\mathbb{1}\!\left[h_t(x_i) \neq y_i\right]$.
The weight of the classifier is then computed as $\alpha_t = \frac{1}{2}\ln\!\left(\frac{1 - \epsilon_t}{\epsilon_t}\right)$.
Finally, the strong classifier is derived as $H(x) = \operatorname{sign}\!\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$.
Classifiers with lower error are assigned higher weights, enabling the model to focus on samples that are hard to classify and iteratively improve accuracy.
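As a quick worked example of this weighting, a weak learner with weighted error $\epsilon_t = 0.2$ receives:

```latex
\alpha_t = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}
         = \frac{1}{2}\ln\frac{0.8}{0.2} = \frac{1}{2}\ln 4 \approx 0.693
```

By contrast, a near-random learner with $\epsilon_t = 0.5$ gets $\alpha_t = 0$ and therefore contributes no vote.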
As with DT, RF, XGBoost, and LGBM, BCO is used with AdaBoost to adjust key hyperparameters, such as the number of weak classifiers and the learning rate, enhancing robustness and classification performance in the detection of heart disease.
Dataset description
Dataset description.
Feature details of CLE.
Feature details of HGR.
Results
Parameter tuning
Parameter setting for BCO.
Hyperparameter Search Space with best parameters.
Classification performance
Performance of ML classifiers without optimization technique for the CLE dataset.
Performance of ML classifiers with BCO optimization for the CLE dataset.
Performance improvement (%) after BCO optimization for the CLE dataset.
Performance of ML classifiers without optimization technique for the HGR dataset.
Performance of ML classifiers with BCO optimization for the HGR dataset.
Performance improvement (%) after BCO optimization for the HGR dataset.

Performance comparison for CLE dataset (50% training samples).

Performance comparison for CLE dataset (80% training samples).

Performance comparison for HGR dataset (50% training samples).

Performance comparison for HGR dataset (80% training samples).
Experimental results (mean ± std) for CLE 5 fold cross validation.
Experimental results (mean ± std) for HGR 5 fold cross validation.
Experimental results (95% CI) for CLE 5 fold cross validation.
Experimental results (95% CI) for HGR 5 fold cross validation.

Accuracy comparison for CLE using 5 fold cross validation.

Precision comparison for CLE using 5 fold cross validation.

Recall comparison for CLE using 5 fold cross validation.

F1 score comparison for CLE using 5 fold cross validation.

Accuracy comparison for HGR using 5 fold cross validation.

Precision comparison for HGR using 5 fold cross validation.

Recall comparison for HGR using 5 fold cross validation.

F1 score comparison for HGR using 5 fold cross validation.
Discussion
Train test split evaluation
BCO consistently improves ML classifier performance, as demonstrated by the experiments conducted on the CLE and HGR datasets. Notable improvement is observed for the ensemble approaches. On the CLE dataset, BCO optimization produced significant gains for XGBoost and SVM, with XGBoost attaining an accuracy of 91.80% compared with 86.89% without BCO at 80% training samples. XGBoost performed well relative to the other classifiers at both 50% and 80% training samples on CLE. The performance of LGBM also increased from 77.63% to 86.84% with BCO optimization at 50% training samples, and AdaBoost showed a notable improvement in classification accuracy from 78.69% to 88.52% on CLE. Similarly, BCO-driven optimization yielded impressive results on the HGR dataset: SVM improved from 87.82% to 90.76% accuracy with 80% training data, and XGBoost achieved 94.12% accuracy compared with 92.02% without BCO.

After optimization, XGBoost consistently performed well on both datasets, obtaining the best accuracy metrics and demonstrating that the optimization was especially effective for algorithms with complicated parameter spaces. In addition, BCO showed strong generalization ability by preserving or raising ROC-AUC scores on both datasets for most of the classifiers; RF reaches a ROC-AUC of 97.08% on HGR, and NB reaches 97.08% on CLE. The increased F1 scores for most classifiers demonstrate the positive impact of BCO on the precision-recall trade-off; XGBoost shows noteworthy F1 scores of 91.53% and 94.40% for the CLE and HGR datasets, respectively, at 80% training samples. Gains in the Kappa statistic confirm that the classification improvements exceed chance agreement.

The detailed performance of the investigated classifiers on the CLE dataset is shown in Tables 7 and 8, without and with BCO optimization, respectively. For the HGR dataset, Tables 10 and 11 show the results before and after BCO optimization. The positive values in Tables 9 and 12 signify the improvement in classifier performance after applying BCO optimization for the CLE and HGR datasets, respectively. The performance comparison for both datasets can be observed in Figures 5–8. These steady performance improvements on two different datasets confirm that BCO is a reliable hyperparameter optimization method that can handle a variety of problem domains and classifier architectures.
5 fold cross validation evaluation
To prove the efficacy of the proposed framework and avoid overfitting, 5-fold cross validation is adopted alongside the train-test split. The comparative analysis of the experimental findings shows that the proposed BCO-optimized architecture consistently improves the prediction performance of the ML classifiers on both the CLE and HGR datasets. The BCO-optimized models performed better, in both accuracy and stability, than their unoptimized counterparts and conventional optimization methods. The BCO-SVM model outperformed the Grid Search (85.13%) and PSO (84.47%) benchmarks with the highest peak accuracy of 85.79% on the CLE dataset. Similarly, the BCO-optimized RF clearly outperformed the unoptimized RF model (91.26%) on the HGR dataset, achieving a leading accuracy of 92.02%. Notably, the BCO-XGBoost classifier showed strong accuracy improvements, rising from 89.92% to 91.26% on the HGR dataset and from a baseline of 81.50% to 84.48% on the CLE dataset. The optimized models also maintained lower standard deviations, indicative of superior stability. This consistent superiority implies that the BCO algorithm successfully traverses the high-dimensional search space to find global optima that conventional techniques like Grid Search might overlook.
Beyond mean accuracy, the 95% CI offers important information about the dependability and worst-case performance of the optimized models. The experimental results show that the BCO framework considerably tightens these intervals, suggesting better stability. The BCO-SVM model, for example, not only increased accuracy but also produced a tighter, nonoverlapping range of [88.83, 90.84] on the HGR dataset, whereas the unoptimized SVM classifier had a broad CI of [83.82, 88.62]. This lack of overlap strongly suggests a significant improvement. The BCO-XGBoost model also showed a notable increase in the lower limit of the confidence interval on the CLE dataset, from 74.46% (no optimization) to 80.15% with BCO. This shift shows that while the baseline model is prone to considerable volatility, the BCO-optimized model maintains a high performance floor even in the worst-case cross-validation folds. By raising the lower limits and reducing the interval widths across key classifiers, the proposed approach guarantees a more consistent and clinically reliable diagnostic result.
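For clarity on how such intervals can be obtained from 5-fold results, the snippet below computes a normal-approximation 95% CI from per-fold accuracies; the fold scores are stand-ins, and the paper does not specify its exact CI procedure.

```python
# 95% CI from k-fold accuracies via the normal approximation (illustrative).
import numpy as np

def ci95(fold_scores):
    scores = np.asarray(fold_scores)
    mean = scores.mean()
    half = 1.96 * scores.std(ddof=1) / np.sqrt(len(scores))  # half-width
    return mean, (mean - half, mean + half)

mean, (lo, hi) = ci95([0.90, 0.92, 0.91, 0.93, 0.89])  # stand-in folds
print(f"mean={mean:.4f}, 95% CI=({lo:.4f}, {hi:.4f})")
```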
Statistical validation of results
A rigorous statistical analysis framework is utilized to validate the robustness of the proposed BCO-optimized classifiers. Because performance metrics over cross-validation folds may not always follow a normal distribution, the Shapiro-Wilk test is used to determine whether the differences between the baseline and BCO-optimized accuracy scores are normally distributed.75,76 The Shapiro-Wilk test statistic W is defined as $W = \frac{\left(\sum_{i=1}^{n} a_i x_{(i)}\right)^2}{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$, where $x_{(i)}$ are the ordered sample values, $a_i$ are constants derived from the expected order statistics of a normal distribution, and $\bar{x}$ is the sample mean.
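A typical way to run this check is sketched below with SciPy, using hypothetical fold-score arrays; the choice of paired test conditional on the normality result is a common follow-up, not necessarily the paper’s exact procedure.

```python
# Normality check on per-fold accuracy differences, then a paired test
# (paired t-test if differences look normal, Wilcoxon signed-rank otherwise).
import numpy as np
from scipy import stats

baseline = np.array([0.84, 0.86, 0.83, 0.85, 0.84])  # stand-in fold accuracies
bco      = np.array([0.88, 0.89, 0.87, 0.90, 0.88])
diff = bco - baseline

W, p_norm = stats.shapiro(diff)
if p_norm > 0.05:                      # differences plausibly normal
    stat, p = stats.ttest_rel(bco, baseline)
else:                                  # nonparametric fallback
    stat, p = stats.wilcoxon(bco, baseline)
print(f"W={W:.3f}, normality p={p_norm:.3f}, test p={p:.4f}")
```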
Statistical validation of results for CLE.
Statistical validation of results for HGR.
Feature importance and model interpretability
A post hoc feature importance analysis using 5-fold cross validation is conducted to verify the clinical transparency of the proposed framework and to ensure the results are resilient to data-splitting artifacts. For the CLE dataset, the model’s decision logic is primarily driven by sophisticated diagnostic markers; the most important predictors, as shown in Figure 17, are the number of major vessels (ca), chest pain type (cp), and the thallium stress test result (thal). The model performs well because it prioritizes variables like thal and ca, direct indicators of heart disease, over generic risk factors like age or cholesterol. The SHAP analysis in Figure 18 validates the model’s clinical plausibility, showing that lower thalach and thal values indicating reversible defects are powerful drivers of positive disease prediction.
Feature importance stability for CLE.
Shap summary for CLE.

Figure 19 shows a distinct ECG- and symptom-driven prediction profile for the HGR dataset. The analysis identifies the slope of the peak exercise ST segment (STslope) as the most important factor, outperforming all others in information gain. This is clinically relevant, as ST segment abnormalities under stress are a key indicator of myocardial ischemia. Secondary factors include chest pain type and exercise angina, demonstrating the model’s reliance on symptomatic presentation. The SHAP summary in Figure 20 identifies exercise-induced angina (exerciseangina) and abnormal ST slopes as powerful predictors of heart disease, whereas maximum heart rate acts as a protective factor. This demonstrates that the framework robustly learns to prioritize physiological stress responses over static demographic characteristics.
Feature importance stability for HGR.
Shap summary for HGR.

Computational cost analysis
To assess the practical feasibility of the proposed approach, the computational burden of BCO-based hyperparameter tuning is systematically recorded for the 10 classifiers. A workstation with 8 GB RAM and an 11th-generation Intel i5 processor is used for the optimization, and the resulting execution times stay within reasonable bounds for clinical implementation. The BCO process takes an average of 169.07 seconds per classifier for the CLE dataset, with the RF model requiring the longest duration at 480.72 seconds. The mean optimization time rises to 251.46 seconds for the larger HGR dataset, with the SVM classifier taking at most 745.82 seconds. Optimizing all 10 classifiers takes around 28 minutes for CLE and 42 minutes for HGR. The computational complexity of the framework is governed primarily by the BCO phase, $O(I \cdot S \cdot f(n))$, where I denotes the number of iterations, S the colony size, and f(n) the training cost of the base classifier. These findings show that the BCO technique converges effectively while balancing high diagnostic accuracy against a manageable computing cost on a typical consumer-grade workstation.
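For intuition, plugging illustrative values (not the study’s reported settings) into this bound makes the cost concrete:

```latex
T_{tune} \approx I \cdot S \cdot f(n) = 30 \times 15 \times f(n) = 450\, f(n),
```

that is, roughly 450 full trainings of the base classifier for a single tuning run under those assumed settings.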
Conclusion
This study conducts a rigorous investigation into ML for classifying CVDs. Thorough experimental testing on two different datasets, CLE and HGR, shows that the proposed BCO framework produces statistically significant performance gains across classifiers, substantiating the fundamental contribution. The experiments establish BCO as an effective metaheuristic approach for hyperparameter tuning: BCO-optimized classifiers consistently outperform their default-parameter counterparts, boosting XGBoost in particular to accuracies of 91.80% and 94.12% on the CLE and HGR datasets, respectively, under the train-test split mechanism. The 5-fold cross validation, together with statistical validation tests and 95% CIs, confirms BCO as a significant performance enhancer for different ML classifiers. The increases in accuracy and F1 score are critical from a clinical standpoint because they translate into a quantifiable decrease in false negatives, lowering the risk that heart disease patients are overlooked during routine screenings. By focusing on the physiological stress responses identified in our SHAP analysis, this framework can assist practitioners in prioritizing high-risk patients when included in clinical workflows as a decision-support tool. The next crucial stages in guaranteeing the framework’s adaptability to diverse patient groups are integration into Electronic Health Record (EHR) systems and prospective validation in real clinical settings. Future research will build on these contributions by exploring adaptive BCO variants, in which parameters such as population size and run length self-adjust based on convergence dynamics, and by developing hybrid models that combine BCO with other metaheuristics, such as Particle Swarm Optimization, to improve search efficiency for even more intricate automated ML pipelines.
Footnotes
ORCID iDs
Ethical considerations
This research relies on secondary analysis of previously established, completely anonymized, publicly available datasets and therefore did not require ethical approval or informed consent. The study does not involve human participants and is not subject to formal Institutional Review Board (IRB) review.
Author contributions
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
