Abstract
Background:
Coronary artery disease (CAD) remains one of the leading causes of death globally. Traditional manual scoring methods using non-contrast computed tomography (NCCT) are time-consuming, subjective, and dependent on expert readers. To overcome these limitations, this research introduces an AI-driven model that predicts and classifies coronary artery calcium scores more efficiently and accurately. Convolutional Neural Networks (CNNs) are a crucial deep learning tool for detecting cardiovascular diseases (CVDs) from ECG images because they automatically extract complex patterns and hierarchical features. DenseNet201 has been used effectively for CVD detection from ECG imagery, demonstrating high accuracy in classifying cardiac conditions, particularly in multi-class scenarios. InceptionV3 is also widely used for CVD detection from electrocardiogram (ECG) imagery, where its fine-tuned architecture is used to classify cardiac conditions.
Objectives:
To develop a deep learning-based model for automatic classification and prediction of coronary artery calcium scores; to enhance accuracy using an improved BiGRU model incorporating attention and probabilistic weights; and to reduce the error and bias in current automatic scoring systems and improve clinical decision-making.
Design:
The study designs a novel architecture named HeProbAtt BiGRU Net. The model performs both classification (healthy vs non-healthy) and regression on NCCT image data.
Methods:
Data collection (14 127 NCCT slices from Tabriz University of Medical Sciences), preprocessing, model development, and performance evaluation. Metrics: accuracy, precision, recall, F1-score, ROC-AUC, MAE, and RMSE.
Results:
The proposed model outperformed all compared models. Classification: accuracy = 99%, F1-score = 99%, ROC-AUC = .99. Regression: MAE = .065, RMSE = .145. The inclusion of attention and probabilistic weights enhanced learning efficiency and decision precision. Visualization tools (e.g., loss curves, confusion matrix, ROC) showed stable and high-performing learning behavior.
Conclusion:
The HeProbAtt BiGRU Net provides a highly accurate, automated, and efficient method for coronary artery calcium scoring. Its hybrid framework allows real-time classification and regression, aiding clinicians in early CAD diagnosis. Future work could include validation on larger, multi-center datasets and incorporation of clinical explainability features.
Keywords
Introduction
The World Health Organization (WHO) states that cardiovascular diseases are the primary cause of death globally. They are estimated to claim 17.9 million lives a year, or 32% of all deaths worldwide. Heart attacks (myocardial infarctions) and strokes are responsible for approximately 85% of these deaths. Numerous lives can be saved if cardiovascular disease is identified and treated at an early stage. 1 Numerous diagnostic techniques are used in the medical system to find heart diseases. These include echocardiograms (echo), which employ ultrasonic waves to make images of the heart, and electrocardiograms (ECGs), which monitor the electrical function of the heart. Computed tomography (CT) and cardiac magnetic resonance imaging (MRI) offer fine-grained cross-sectional pictures of the heart’s anatomy. Additionally, blood tests are used to detect biomarkers indicative of heart conditions, contributing to comprehensive cardiovascular assessment and diagnosis.
Artificial intelligence (AI) is transforming cardiovascular medicine by improving the interpretation of electrocardiograms (ECGs). ECGs are essential tools for diagnosing heart conditions, but interpreting them accurately requires considerable expertise. Deep learning, and convolutional neural networks in particular, has made it feasible to analyze ECGs more effectively. These AI systems can detect subtle signals and patterns that even highly skilled human experts might miss. As a result, ECGs have become even more powerful as non-invasive biomarkers. AI models have been created to detect a variety of cardiac disorders, such as hypertrophic cardiomyopathy (a thickening of the heart muscle), silent atrial fibrillation (a heart rhythm disorder that may not cause symptoms), and left ventricular dysfunction (a deterioration of the heart’s pumping ability). Thus, AI is advancing our knowledge of cardiac health and enhancing our capacity to identify and manage cardiovascular disorders. 2 A related article focuses on identifying people who are at high risk of sudden cardiac death (SCD). Even though sudden cardiac death is common everywhere, few effective risk assessment techniques are available today. Because they are inexpensive and easily accessible, ECGs present a potential option. However, developing a computational model capable of accurately predicting SCD risk from ECG data presents a significant computational and methodological challenge. 3
One review summarizes the revolutionary potential of artificial intelligence (AI) in healthcare, especially in cardiology. It draws attention to the confluence of powerful computing platforms and digitized medical data, which makes it possible to create AI models for a range of healthcare applications. Specifically, it discusses the emergence of machine learning and its applications in healthcare innovation, including disease detection, risk stratification, and therapy selection, with a particular emphasis on electrocardiogram (ECG) analysis. 4
The Matthews correlation coefficient (MCC), Cohen’s kappa, and GDR are 3 essential performance measures employed in classification tasks. They are especially informative for imbalanced datasets, such as those used for detecting cardiovascular disease from electrocardiogram (ECG) images.
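As an illustration of why these measures suit imbalanced data, the sketch below computes the multi-class MCC and Cohen’s kappa directly from a confusion matrix. The function name `mcc_and_kappa` and the toy matrix are hypothetical and are not taken from this study; GDR is omitted because its definition is not given here.

```python
import numpy as np

def mcc_and_kappa(cm):
    """Multi-class MCC and Cohen's kappa from a confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    cm = np.asarray(cm, dtype=float)
    s = cm.sum()            # total number of samples
    c = np.trace(cm)        # correctly classified samples
    t = cm.sum(axis=1)      # true count per class
    p = cm.sum(axis=0)      # predicted count per class

    # Multi-class generalization of the Matthews correlation coefficient
    denom = np.sqrt((s**2 - np.sum(p**2)) * (s**2 - np.sum(t**2)))
    mcc = (c * s - np.dot(t, p)) / denom if denom > 0 else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement
    p_o = c / s
    p_e = np.dot(t, p) / s**2
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else 0.0
    return mcc, kappa

# Hypothetical 3-class confusion matrix (illustrative values only)
toy_cm = [[8, 2, 0],
          [1, 3, 1],
          [0, 1, 4]]
mcc, kappa = mcc_and_kappa(toy_cm)
print(f"MCC = {mcc:.3f}, kappa = {kappa:.3f}")
```

Both scores equal 1 for perfect agreement and fall to about 0 when predictions are no better than chance, regardless of how unevenly the classes are populated.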
This work uses deep learning, especially transfer learning, to improve the accuracy of predicting major cardiovascular problems from ECG images. It builds on prior research to create and test robust predictive models for detecting these problems. The study examines Convolutional Neural Network (CNN), InceptionV3, and DenseNet201 architectures to assess how well they can predict CVD. The goal is to show how transfer learning techniques can improve cardiovascular health screening and thereby allow early intervention that lowers CVD-related morbidity and mortality.
Related Work
By investigating demographic biases in the performance of deep learning models for predicting heart failure, the authors of reference 5 add to the body of previous literature. Their study underscores the importance of addressing these biases to avoid perpetuating disparities in healthcare outcomes. By assessing how age, sex, race, and ethnicity affect model performance, the study emphasizes the need for tailored strategies to reduce algorithmic biases. This aligns with broader efforts in the field to develop more equitable and effective machine learning tools for prognosticating and managing heart failure, thereby improving healthcare equity and patient outcomes.
One study highlighted the growing interest in computational diagnostic techniques for electrocardiogram (ECG) signal analysis aimed at early detection of cardiovascular diseases. These techniques, comprising data preprocessing, feature engineering, classification, and application stages, have advanced considerably. In particular, end-to-end models have streamlined the analysis process by integrating feature extraction and classification. While traditional machine learning methods remain fundamental, deep learning innovations present opportunities for automated optimization and multitasking. 6
The work of Rashed-Al-Mahfuz et al 7 adds to the growing number of studies that use deep learning methods to analyze ECG data for cardiovascular diagnosis. It addresses the need for precise identification of important elements in ECG waveforms by proposing a novel ECG beat classifier built on a tailored Convolutional Neural Network (CNN). By leveraging time-frequency representations, the model enhances interpretability and classification accuracy. This research aligns with previous work on computational methods for ECG analysis, further advancing the automation of cardiovascular diagnosis systems and offering potential applications in clinical settings.
According to Rath et al, 8 heart disease (HD) is a major worldwide health concern that takes a large toll on lives compared with other illnesses. Electrocardiogram (ECG) signals are vital for diagnosis, and early and precise identification is essential for saving lives. 9 However, imbalanced HD data presents many difficulties for conventional DL and ML algorithms, which reduces detection accuracy. To generate additional synthetic data, this research focuses on the Generative Adversarial Network (GAN) model while investigating other appropriate DL and ML models. Long short-term memory (LSTM) and GAN are combined to create an ensemble model that performs better in HD identification than the separate models.
One study contributes to the literature by offering a detailed analysis of machine learning technologies for heart disease detection. It assesses different methods and models for heart disease prediction, emphasizing performance indicators such as accuracy, precision, recall, and F1-score. By comparing these methods, the study aims to enhance early detection of heart diseases, facilitating effective treatment and minimizing adverse outcomes. This research aligns with the broader trend of utilizing machine learning in medical diagnostics, emphasizing its potential for improving healthcare outcomes through advanced predictive modeling. 10
Detecting rare genetic heart diseases, particularly those caused by mutations in the Phospholamban gene, presents a substantial challenge. While deep learning techniques hold promise in signal processing, they often demand vast datasets for effective training. In scenarios with limited data availability, transfer learning emerges as a potential solution to boost accuracy. One study suggests applying transfer learning from a model originally trained for gender recognition to develop an ECG-based mutation identification technique. The resulting model identifies mutations better than models trained from scratch. This approach underscores the efficacy of transfer learning in leveraging information from disparate datasets to refine the accuracy of rare disease detection models. 11
One study examined the use of deep neural networks and ECG sensors for the classification of heart diseases. The authors pointed out that cardiovascular disorders, the most common cause of death globally, require early and effective detection. The study proposed a generalized approach that uses a deep neural network structure based on MobileNet v2 to handle ECG images in different formats. The method sought to distinguish 4 classes: myocardial infarction, irregular heartbeat, history of myocardial infarction, and normal heart function. The system’s high accuracy was manually verified by cardiologists, who recommended its use for screening cardiac disorders. This study highlights the potential of DL to improve the identification of heart disease. 12
One study assessed the effectiveness of machine learning algorithms in detecting heart disease (HD) using electrocardiogram-based arrhythmia datasets. Several approaches are examined, and 8 are assessed in both balanced and imbalanced class scenarios. Evaluation metrics include accuracy, precision, recall, and F1-score. The paper emphasizes the difficulty posed by imbalanced datasets and suggests the Synthetic Minority Over-sampling Technique (SMOTE) to improve overall and per-class accuracy by balancing the data. This research contributes valuable insights into algorithm performance and dataset structuring for improved HD detection. 13
Another study used a mixture of ML and DL approaches to predict cardiac disease. The authors implemented several algorithms on the UCI Machine Learning Heart Disease dataset, which contains important features, to compare their efficacy. Preprocessing involved handling irrelevant features with Isolation Forest and normalizing the dataset. The evaluation metrics included accuracy and other metrics. The findings revealed that ML algorithms performed well when data preprocessing was applied, demonstrating the efficacy of ML in heart disease prediction. 14
CardioNet is an efficient system for classifying ECG arrhythmias using deep learning and transfer learning techniques. The system leverages a DenseNet architecture that was pre-trained on a separate dataset to enhance the classification of heartbeats in ECG data. By fine-tuning DenseNet with augmented ECG datasets, CardioNet achieves faster and more robust classification of various arrhythmias, addressing the challenges posed by imbalanced datasets. The approach proved effective in detecting different irregular heartbeats, showcasing the potential of transfer learning to improve automated arrhythmia detection and enhance cardiac care diagnostics. 15
Convolutional neural networks applied to electrocardiogram (ECG) signals have been proposed as a novel method for automatic identification of cardiovascular disease (CVD). The proposed CNN-based design processes ECG data directly, providing an efficient means of identifying heart problems automatically. This deep learning methodology demonstrates superior accuracy and low implementation complexity, effectively capturing the distinct characteristics of heart disease from ECG data. According to this research, the CNN strategy works better than existing state-of-the-art techniques, highlighting the opportunity to improve automated CVD identification and management. 16
One review covers the use of DL algorithms in electrocardiogram diagnosis of cardiovascular disease (CVD). It describes the mechanics, evolution, and potential uses of 4 common algorithms in ECG diagnostics: stacked auto-encoders, deep belief networks, convolutional neural networks, and recurrent neural networks. The review systematically examines the strengths and weaknesses of each algorithm, highlighting how they can be more accurate and efficient than expert manual classification. This analysis offers important insights into how deep learning is developing in the field of ECG diagnosis and suggests prospective directions for further study. 17
A method for converting ECG signals into binary images was presented to improve the diagnosis of cardiac arrhythmias. In the publication “From ECG signals to images: a transformation-based strategy for DL,” the authors investigated this technique using pre-trained convolutional neural network (CNN) models. Transfer learning was used to extract and concatenate features from these pre-trained models in order to expedite the classification process. Applying support vector machine (SVM) classifiers to these features, they achieved excellent accuracy in the binary classification of ventricular arrhythmias, demonstrating the efficacy of this transformation-based approach in cardiac diagnostics. 18
Other authors highlight the pressing global concern of cardiovascular diseases (CVDs), which drives the need for automated screening tools that use electrocardiogram (ECG) signals for timely detection. ECG-based methods provide a non-invasive and efficient means to detect CVDs. Numerous studies have examined the classification of CVDs from ECG data using deep learning techniques, especially CNNs. These efforts seek to improve screening precision and effectiveness, thereby supporting early intervention and treatment approaches that reduce CVD-related morbidity and mortality. 19
Another study compares feature selection methods for classification algorithms in heart disease prediction and reports that deep learning achieves good accuracy. The research evaluates how feature selection affects the effectiveness and precision of predictive models, offering important insights into best practices for cardiovascular disease prediction tasks. Through systematic analysis, the authors show that choosing relevant features is important for improving the performance of cardiovascular disease prediction models, and they shed light on practical approaches for using ECG data in healthcare analytics. 20
One review investigates advances in methods for extracting features from ECG signals for use in AI and digital health applications. To increase the performance of machine learning and deep learning models, it highlights techniques in the time, frequency, time-frequency, decomposition, and sparse domains. According to recent research, compaction and dimensionality reduction are crucial for effective classification, detection, and automated applications. To handle very large biomedical datasets and increase the precision and dependability of ECG-based health evaluations, the article emphasizes the necessity of robust feature extraction. 21
One article describes how the Internet of Things (IoT) has improved ECG monitoring systems, emphasizing the move to TinyML-based on-device classification. It examines the drawbacks of conventional techniques that rely on sending data to external servers, highlighting their significant resource usage and latency. According to recent research, effective real-time ECG classification is possible on embedded devices such as an Arduino prototype by deploying machine learning models directly on them. For continuous health monitoring, this strategy promises considerable gains in response time, resource usage, and overall system efficiency. 22
Another study reports developments in the use of machine learning to diagnose occlusion myocardial infarction (OMI) from ECG data in patients without ST-elevation. The limits of traditional diagnostic methods hamper early detection and delay therapy. Recent research emphasizes the creation of machine learning models that greatly increase diagnostic sensitivity and accuracy. These models, which have been evaluated by clinical specialists, provide better risk categorization and decision support than commercial systems and traditional approaches. Combining such models with clinical judgment promises improved patient outcomes through prompt and accurate diagnosis and treatment. 23
In their assessment of deep learning (DL) applications in the classification of ECG arrhythmias, the authors of reference 24 identify trends, obstacles, and prospects. ECG databases, preprocessing methods, deep learning approaches, assessment paradigms, and performance measures are all examined. The article highlights the widespread use of convolutional neural networks and the prominence of the MIT-BIH Arrhythmia Database. The difficulties include managing noise, data augmentation, and performance degradation in inter-patient assessments. To improve clinical implementation, the survey recommends that future research concentrate on using a variety of databases, applying sophisticated preprocessing, incorporating novel DL models, and enhancing inter-patient paradigm performance.
One survey underscores the significance of electrocardiogram (ECG) data in cardiovascular diagnosis, while also stressing its increasing application to the categorization of illnesses and medical conditions. While machine learning techniques are gaining popularity, researchers still primarily concentrate on classification techniques. The study investigates various machine learning techniques, including Gaussian NB, Random Forest, Logistic Regression, Linear Discriminant Analysis, and Dummy Classifier, for automated ECG data categorization. The results show how well the various algorithms distinguish between those with cardiac illness and those without, with the Gaussian NB classifier achieving the best classification accuracy. 25
Research Gap Identified
The literature points to a research gap in the advancement of deep learning methods that use electrocardiogram (ECG) images for the diagnosis of cardiovascular disease (CVD). Existing studies show promise, but algorithms still need to be refined and optimized for better diagnostic accuracy and efficiency. Additionally, integrating transfer learning techniques with deep learning models for CVD detection from ECG data, especially to overcome small dataset limitations, is underexplored. Addressing these gaps could lead to robust, scalable solutions for early CVD detection and intervention.
Cardiovascular disease detection from ECG imagery has several limitations: loss of raw signal information, limited trustworthiness and interpretability of models, variable quality of data and annotations, and limited generalizability across devices and conditions. Further problems include multi-class accuracy, image resolution and quality, regulatory and ethical issues, computational requirements, imbalanced datasets, and the need for clinical validation. A CNN can perform poorly on ECG images when the images are unclear or uninformative, when the dataset is too small or noisy, or when the ECG signals were not properly preprocessed.
Technologies
This section provides technical background on the deep learning architectures used in this study.
Convolutional Neural Network
Convolutional neural networks (CNNs) are artificial neural networks designed primarily to analyze grid-like input, such as images and videos. In contrast to traditional neural networks, CNNs are exceptionally good at recognizing spatial patterns and hierarchies in data. This is achieved by a series of computational layers, including convolutional, pooling, and fully connected layers. Convolutional layers apply a collection of filters, or “kernels,” across the input data in order to detect features such as edges, textures, or shapes. The spatial dimensions are then downsampled by pooling layers. Figure 1 shows the CNN architecture.

The CNN architecture.
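The two operations described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the network used in this study: the 6 × 6 toy image, the 1 × 2 difference kernel, and the helper names `conv2d` and `max_pool` are all assumptions made for the example.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the operation CNN libraries call convolution."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Slide the kernel over the image and sum the elementwise products
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fit are dropped."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Toy image: dark left half, bright right half (a vertical edge)
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A 1x2 kernel that responds to left-to-right intensity increases
kernel = np.array([[-1.0, 1.0]])

features = conv2d(image, kernel)   # responds only along the edge
pooled = max_pool(features)        # downsampled feature map
print(features.shape, pooled.shape)
```

The convolution output is nonzero only where the edge lies, and pooling keeps that response while halving the spatial resolution, which is the behavior the layer stack exploits at larger scale.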
DenseNet201
“Densely Connected Convolutional Networks 201,” or DenseNet-201, is a deep learning architecture that has significantly improved computer vision applications. It was introduced as an extension of the original DenseNet with the goal of building deeper and more effective neural networks. DenseNet-201 features dense connectivity, in which each layer is connected in a feed-forward fashion to every subsequent layer. 26 This connectivity mitigates the vanishing gradient problem and enhances gradient flow, which in turn leads to better training and model performance. Figure 2 shows the architecture of DenseNet-201.

Architecture of DenseNet-201.
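One consequence of dense connectivity is that the number of feature maps grows linearly inside each dense block, because every layer concatenates its output onto everything before it. The sketch below traces that growth. The block sizes (6, 12, 48, 32), growth rate of 32, 64 initial channels, and 0.5 transition compression are the commonly published DenseNet-201 settings and are assumptions here, since they are not stated in this article; `densenet_channels` is a hypothetical helper.

```python
def densenet_channels(block_layers=(6, 12, 48, 32), growth_rate=32,
                      init_channels=64, compression=0.5):
    """Return the channel count at the end of each dense block."""
    channels = init_channels
    trace = []
    for idx, n_layers in enumerate(block_layers):
        # Each layer appends `growth_rate` new feature maps to the running stack
        channels += n_layers * growth_rate
        trace.append(channels)
        # A transition layer between blocks compresses the channel count
        if idx < len(block_layers) - 1:
            channels = int(channels * compression)
    return trace

channels_per_block = densenet_channels()
print(channels_per_block)
```

Under these assumed settings, the final block ends with 1920 feature maps, which would be the width of the vector passed to the classification head.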
InceptionV3
Google created the convolutional neural network (CNN) architecture known as InceptionV3. 27 It is specifically intended for computer vision applications involving image recognition and classification. With approximately 23 million trainable parameters, InceptionV3 features deep layers and Inception modules, which allow for efficient learning and representation of complex visual patterns. Figure 3 shows the architecture of InceptionV3.

Architecture of InceptionV3.
Proposed Work
The proposed system uses deep learning techniques, notably Convolutional Neural Networks (CNNs) and transfer learning procedures, to overcome the shortcomings of manual ECG interpretation and current automated solutions.
The system architecture for ECG detection is depicted in Figure 4.
1. ECG dataset: The project begins with an ECG dataset, which contains electrocardiogram (ECG) signals collected from patients. This data serves as the input for the entire pipeline.
2. Data visualization: Before preprocessing, data visualization is performed. Visualizing the data helps in understanding the distribution, patterns, and any anomalies present in the ECG signals.
3. Pre-processing: The raw ECG data undergoes preprocessing to clean and prepare it for model training. These steps involve noise reduction, normalization, and possibly feature extraction to enhance the quality and relevance of the data for the models.
4. Transfer learning: Transfer learning is applied using 3 different models, namely CNN, DenseNet201, and InceptionV3. Each model undergoes training and testing with the processed ECG data.
• CNNs are effective for image and signal data, making them suitable for analyzing ECG signals.
• DenseNet201 is a deep learning model known for its dense connections, improving gradient flow and enabling efficient training with fewer parameters.
• InceptionV3 is a deep convolutional network that achieves high accuracy with efficient computation, making it well-suited for complex signal data like ECG.
5. Model testing: Each of the 3 transfer learning models is tested with the processed ECG data. This step evaluates the performance of each model in detecting and classifying ECG signals.
6. Model result analysis: The results from the 3 models are analyzed and compared. The analysis includes accuracy, precision, recall, F1 score, and Receiver Operating Characteristic – Area Under Curve (ROC-AUC) metrics. Comparing these metrics helps determine the most effective model for ECG detection and classification.
The system architecture involves preprocessing the ECG dataset, applying transfer learning with 3 different pre-trained models (CNN, DenseNet201, and InceptionV3), testing each model, and then comparing their performance using various evaluation metrics. This approach leverages the strengths of different models to achieve accurate and reliable ECG signal detection and classification.
Data pre-processing: The ECG dataset undergoes several pre-processing steps to enhance its quality and suitability for model training. This includes noise reduction to eliminate irrelevant signals, normalization to standardize signal amplitude, and segmentation to extract relevant features from the ECG signals. These steps ensure the data is clean, consistent, and ready for effective feature extraction and model input.

System architecture.
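The three pre-processing steps named above (noise reduction, amplitude normalization, and segmentation) can be sketched for a 1-D signal as follows. The specific choices are assumptions made for illustration only: a moving-average filter stands in for noise reduction, z-scoring for normalization, and fixed-length windows for segmentation. The article does not specify which methods were used, and the synthetic sine-plus-noise signal is not ECG data.

```python
import numpy as np

def moving_average(signal, window=5):
    """Smooth a 1-D signal with a simple moving average (a basic noise filter)."""
    kernel = np.ones(window) / window
    return np.convolve(signal, kernel, mode="same")

def zscore(signal):
    """Standardize amplitude to zero mean and unit variance."""
    centered = signal - signal.mean()
    std = signal.std()
    return centered / std if std > 0 else centered

def segment(signal, length):
    """Split a signal into non-overlapping windows, dropping any remainder."""
    n = len(signal) // length
    return signal[: n * length].reshape(n, length)

# Synthetic stand-in for a noisy recording (hypothetical data)
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 1000)
raw = np.sin(t) + 0.3 * rng.standard_normal(t.size)

clean = zscore(moving_average(raw))
windows = segment(clean, 250)
print(windows.shape)
```

For image inputs, the analogous normalization step would typically rescale pixel intensities to a fixed range before the images are passed to the network.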
Bar Chart Visualization of ECG Dataset Class Counts After Augmentation
The bar graph in Figure 5 visualizes the distribution of augmented ECG dataset samples across 4 target classes: Abnormal Heartbeat (233 samples), History of Myocardial Infarction (172 samples), Myocardial Infarction (239 samples), and Normal (284 samples). The purpose of data augmentation is to address class imbalance, enhancing the dataset’s diversity and improving model training robustness. While the Normal class remains the most prevalent and History of Myocardial Infarction the least, the overall distribution is more balanced post-augmentation. This balanced dataset is crucial for training accurate and unbiased machine learning models for ECG detection and classification.

Visualizing data shape after data augmentation using bar-chart.
Pie Chart Visualization of ECG Dataset Class Counts After Augmentation
The pie chart in Figure 6 displays the distribution of ECG samples across 4 classes after data augmentation. The Normal class comprises the largest portion at 30.60%, followed by Myocardial Infarction at 25.75% and Abnormal Heartbeat at 25.11%. The History of MI class represents the smallest segment at 18.53%. This distribution is more balanced than the original, indicating effective augmentation. The balanced proportions across classes help in training machine learning models by reducing bias and improving the model’s ability to generalize and perform accurately in detecting and classifying ECG signals.

Visualizing data shape after data augmentation using pie-chart.
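The percentages in the pie chart follow directly from the class counts reported for the bar chart. A short check, using only those counts (the variable names are illustrative):

```python
# Class counts after augmentation, as reported for Figure 5
counts = {
    "Abnormal Heartbeat": 233,
    "History of MI": 172,
    "Myocardial Infarction": 239,
    "Normal": 284,
}

total = sum(counts.values())
# Share of each class as a percentage, rounded to two decimals
percentages = {name: round(100 * n / total, 2) for name, n in counts.items()}
# Ratio of the largest to the smallest class, a simple imbalance indicator
imbalance_ratio = max(counts.values()) / min(counts.values())

for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")
print(f"total = {total}, largest/smallest = {imbalance_ratio:.2f}")
```

The counts sum to 928 samples, and the largest class is about 1.65 times the size of the smallest.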
Data Splitting
The dataset was divided into training and testing subsets to prepare for model training and evaluation, with 80% of the samples used for training and 20% for testing. One-hot encoding was applied to the target labels. The training set has 742 samples, each of which is 512 × 512 pixels with 3 color channels. The testing set has 186 samples with the same dimensions. The original class distribution is preserved in both the training set and the testing set, which ensures that the evaluation of the model’s performance is fair and accurate.
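A stratified 80/20 split with one-hot labels can be sketched as below. The per-class rounding rule, the random seed, and the helper names `stratified_split` and `one_hot` are assumptions; the article does not describe the exact splitting procedure. Only label indices are split here, since the images themselves are not needed to illustrate the bookkeeping.

```python
import numpy as np

# Class counts after augmentation (classes indexed 0..3 in the order
# Abnormal Heartbeat, History of MI, Myocardial Infarction, Normal)
class_counts = [233, 172, 239, 284]
labels = np.repeat(np.arange(4), class_counts)

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices so each class keeps roughly the same share in both subsets."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = round(len(idx) * test_fraction)  # assumed rounding rule
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

def one_hot(labels, num_classes):
    """Encode integer labels as one-hot row vectors."""
    return np.eye(num_classes)[labels]

train_idx, test_idx = stratified_split(labels)
y_train = one_hot(labels[train_idx], 4)
y_test = one_hot(labels[test_idx], 4)
test_support = np.bincount(labels[test_idx], minlength=4)
print(y_train.shape, y_test.shape, test_support)
```

With this rounding rule the split yields 742 training and 186 testing samples, matching the subset sizes stated above.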
Experimental Results
This section evaluates the performance of the convolutional neural network (CNN), InceptionV3, and DenseNet201 models in ECG detection. Classification reports, ROC-AUC curves, accuracy plots, and loss plots are used to assess each model across several metrics. These analyses provide insight into how accurately the models classify ECG signals and guide improvements for enhanced diagnostic precision.
True positive (TP): The number of positive samples correctly identified.
True negative (TN): The number of negative samples correctly identified.
False positive (FP): The number of negative samples incorrectly identified as positive.
False negative (FN): The number of positive samples incorrectly identified as negative.
Formulas for accuracy, precision, recall, and F1-score are given below:
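In terms of the 4 counts defined above, the standard textbook definitions of these metrics can be written as follows. For the 4-class problem they are assumed to be applied per class in a one-vs-rest manner.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
\text{F1-score}  &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align}
```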
Convolutional Neural Network
The classification report in Table 1 for the CNN model shows its performance across 4 categories: “Abnormal Heartbeat,” “History of MI,” “Myocardial Infarction,” and “Normal.” The precision, recall, and F1-score metrics indicate how well the model can categorize examples within each category. For “Abnormal Heartbeat” and “Myocardial Infarction,” the model reaches recall scores of .85 and .67 and precision values of .61 and .46, respectively, so recall is comparatively strong for these 2 categories while precision is moderate. However, it struggles with classifying “History of MI” and “Normal” instances, as indicated by lower precision and recall scores for these categories. The model’s overall accuracy across all classes is 51%, indicating that there is room for improvement, especially in classes where recall is lower.
Classification Report of CNN.
As depicted in Figure 7, the confusion matrix includes 4 classes. In the “Abnormal Heartbeat” category, 40 cases were correctly identified, while 13 cases were incorrectly placed in this category. In the “History of MI” class, 5 instances were correctly classified while the remaining 29 were misclassified. In the “Myocardial Infarction” class, 32 instances were correctly classified, with 16 misclassified. Lastly, in the “Normal” category, 18 instances were correctly classified, while 20 were misclassified. These figures summarize how well the model differentiates between the classes and point out areas for improvement, such as lowering the number of misclassifications, especially in the “History of MI” and “Normal” categories.

Confusion matrix of CNN.
The area under the curve (AUC) values of the ROC curves in Figure 8 vary by class. Abnormal Heartbeat has the highest AUC at .936. History of MI follows with an AUC of .883, and Normal and Myocardial Infarction both have an AUC of .856. Higher AUC values indicate greater discriminating capability, and these numbers show how well the model differentiates between classes.

ROC-AUC curve of CNN.
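Per-class AUC values such as these are typically obtained in a one-vs-rest manner: each class in turn is treated as positive and all others as negative. The sketch below computes AUC through its rank interpretation, the probability that a randomly chosen positive receives a higher score than a randomly chosen negative. The helper names and the toy scores are hypothetical and do not come from the models in this study.

```python
import numpy as np

def binary_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def one_vs_rest_auc(labels, probs):
    """One AUC per class, treating that class as positive and the rest as negative."""
    labels = np.asarray(labels)
    probs = np.asarray(probs, dtype=float)
    return [binary_auc(labels == k, probs[:, k]) for k in range(probs.shape[1])]

# Toy binary example (hypothetical scores)
auc_toy = binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Toy 3-class example with predicted class probabilities per row
toy_labels = [0, 0, 1, 1, 2, 2]
toy_probs = [[0.7, 0.2, 0.1],
             [0.5, 0.3, 0.2],
             [0.2, 0.6, 0.2],
             [0.4, 0.4, 0.2],
             [0.1, 0.2, 0.7],
             [0.3, 0.3, 0.4]]
per_class_auc = one_vs_rest_auc(toy_labels, toy_probs)
print(auc_toy, per_class_auc)
```

Because AUC depends only on how scores are ranked, a class can have a high AUC even when its recall at the default decision threshold is low.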
Figure 9 illustrates the accuracy of the ECG detection model across training epochs. The red line represents training accuracy, which typically increases as the model learns patterns within the training data. The blue line represents validation accuracy, which measures the effectiveness of the model on unseen data. A gap between the two curves may indicate that the model is overfitting, that is, depending excessively on the training data and generalizing poorly. Even though the model’s accuracy seems promising, overfitting must be prevented by closely monitoring validation accuracy. Apart from accuracy, other important metrics for a thorough assessment of the model’s performance include sensitivity and specificity, which capture the model’s ability to distinguish positive from negative cases for each class.

Accuracy plot graph of CNN.
The loss plot in Figure 10 is used to monitor the performance of the ECG classification model during training. Loss, a measure of model performance, is calculated as the disparity between the predicted and actual class labels. The y-axis shows loss, while the x-axis shows epochs. The graph shows training loss (red line) and validation loss (blue line) over epochs. Both losses decline, which indicates that the model is learning. A gap between the training and validation losses may indicate overfitting, in which the model fits the training data too closely. Notably, an increase in validation loss after a certain number of epochs could signal the onset of overfitting, which is why loss trends should be monitored to optimize model performance and generalization.

Loss plot graph of CNN.
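For a multi-class classifier, the loss tracked in such a plot is commonly categorical cross-entropy. The study does not name its loss function, so the following is only a minimal sketch under that assumption; the function names and the example probability vectors are hypothetical and chosen for illustration.

```python
import math

def categorical_cross_entropy(probabilities, true_class, eps=1e-12):
    """Loss for one sample: negative log of the probability given to the true class."""
    # Clamp to avoid log(0) when the model assigns zero probability.
    p = min(max(probabilities[true_class], eps), 1.0)
    return -math.log(p)

def mean_loss(batch_probs, batch_true):
    """Average the per-sample losses, which is what a loss curve plots per epoch."""
    losses = [categorical_cross_entropy(p, y) for p, y in zip(batch_probs, batch_true)]
    return sum(losses) / len(losses)

# Hypothetical softmax outputs over the 4 classes; the true class is index 0.
confident_correct = categorical_cross_entropy([0.90, 0.05, 0.03, 0.02], 0)
confident_wrong = categorical_cross_entropy([0.02, 0.90, 0.05, 0.03], 0)
print(f"confident and correct: {confident_correct:.3f}")
print(f"confident and wrong:   {confident_wrong:.3f}")
```

A confident wrong prediction is penalized far more heavily than a confident correct one, which is why a rising validation loss can appear before validation accuracy visibly drops.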
InceptionV3 model
The classification report in Table 2 shows strong performance in all 4 cardiac disease categories. With 47 cases, the precision, recall, and F1-score for Abnormal Heartbeat are all .94. Based on 34 cases, History of MI has a precision of .84, recall of .79, and F1-score of .82. With 48 cases, Myocardial Infarction attains a precision of .96, perfect recall of 1.00, and an F1-score of .98. With 57 cases, the Normal class has a precision, recall, and F1-score of .91 each. Across 186 cases, the macro and weighted averages are .91 and .92, respectively, and the model’s overall accuracy is 92%. These measurements show that the model performs well in identifying and categorizing ECG signals.
Classification Report of InceptionV3.
Figure 11 displays the confusion matrix, which covers 4 classes. In the “Abnormal Heartbeat” category, 44 cases were correctly identified, whereas 3 were misclassified. Within the “History of MI” category, 27 cases were correctly classified, while 7 were not. For “Myocardial Infarction,” all 48 cases were classified correctly. Finally, 52 cases in the “Normal” group were correctly identified, whereas 5 were misclassified. These findings demonstrate that the model classifies most cases correctly across all categories, with the “Myocardial Infarction” class performing especially well with perfect classification.

Confusion matrix of InceptionV3.
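The confusion-matrix counts reported for InceptionV3 can be checked against the classification report with a few lines of arithmetic. The sketch below assumes that each "misclassified" count refers to cases whose true label is that class (false negatives), so that correct plus misclassified equals the class support; the variable and function names are hypothetical.

```python
# (correct, misclassified) per true class, as reported for InceptionV3.
counts = {
    "Abnormal Heartbeat": (44, 3),
    "History of MI": (27, 7),
    "Myocardial Infarction": (48, 0),
    "Normal": (52, 5),
}

def per_class_recall(counts):
    """Recall for each class = correct / (correct + misclassified)."""
    return {name: c / (c + m) for name, (c, m) in counts.items()}

def overall_accuracy(counts):
    """Accuracy = total correct predictions / total cases."""
    correct = sum(c for c, _ in counts.values())
    total = sum(c + m for c, m in counts.values())
    return correct / total

recall = per_class_recall(counts)
accuracy = overall_accuracy(counts)
for name, value in recall.items():
    print(f"{name}: recall = {value:.2f}")
print(f"Overall accuracy = {accuracy:.2f}")
```

Under that assumption, the per-class recalls round to .94, .79, 1.00, and .91, and the overall accuracy is 171/186, or about 92%, in line with the values in the classification report.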
The ROC curves displayed in Figure 12 assess how well the InceptionV3 model classifies the 4 heart conditions: abnormal heartbeat, history of myocardial infarction (MI), myocardial infarction, and normal. Myocardial Infarction has the highest AUC (1.000) of all classes. Normal has an AUC of .991, Abnormal Heartbeat has an AUC of .989, and History of MI has the lowest at .948. Higher values indicate better discriminatory power, and these AUC values show how well the model differentiates between the cardiac conditions.

ROC-AUC curve of InceptionV3.
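Per-class AUC values such as these are typically obtained one-vs-rest: each class in turn is treated as positive and the remaining classes as negative. The study does not describe its exact procedure, so the following is a hedged sketch of that convention using the rank-based (Mann-Whitney) definition of AUC. The function names and the toy probabilities are hypothetical and are not the study's data.

```python
def binary_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as one half (the Mann-Whitney U formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def one_vs_rest_auc(probabilities, true_classes, class_index):
    """Treat one class as positive and all other classes as negative."""
    scores = [row[class_index] for row in probabilities]
    labels = [1 if y == class_index else 0 for y in true_classes]
    return binary_auc(scores, labels)

# Hypothetical softmax outputs for six ECG images over the 4 classes.
toy_probs = [
    [0.70, 0.10, 0.10, 0.10],
    [0.20, 0.50, 0.20, 0.10],
    [0.05, 0.05, 0.85, 0.05],
    [0.10, 0.20, 0.10, 0.60],
    [0.40, 0.30, 0.20, 0.10],
    [0.30, 0.35, 0.05, 0.30],
]
toy_true = [0, 1, 2, 3, 1, 0]

for k in range(4):
    print(f"class {k}: AUC = {one_vs_rest_auc(toy_probs, toy_true, k):.3f}")
```

An AUC of 1.000 means every positive case received a higher score than every negative case for that class, while .5 corresponds to chance-level ranking. AUC is a proportion between 0 and 1, not a percentage.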
In the accuracy plot of the model shown in Figure 13, the x-axis represents the number of training epochs and the y-axis displays accuracy. Training accuracy is shown by the blue line, and it usually rises over epochs as the model picks up on patterns in the data. Validation accuracy, which evaluates how effectively the model generalizes to new data, is represented by the green line. A gap between these lines suggests that the model may be overfitted, performing well on the training data but poorly on the validation data. To address overfitting, consider training for fewer epochs, collecting more data, using regularization techniques, or experimenting with different model architectures. Additionally, while accuracy is a key performance metric, sensitivity and specificity should also be considered to fully evaluate model performance.

Accuracy plot graph of InceptionV3.
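One common way to "train for fewer epochs" is early stopping, which halts training once validation loss stops improving. The study does not state that early stopping was used, so the sketch below is only an illustration of the idea. The function name, the `patience` and `min_delta` parameters, and the loss values are all hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (best_epoch, stop_epoch) using 0-based epoch indices.

    Training stops once validation loss has failed to improve by more than
    min_delta for `patience` consecutive epochs. stop_epoch is None if the
    criterion never triggers.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    return best_epoch, None

# Hypothetical validation losses: improving at first, then rising (overfitting).
toy_val_losses = [1.20, 0.95, 0.80, 0.74, 0.76, 0.79, 0.83, 0.90]
best, stop = early_stopping_epoch(toy_val_losses, patience=3)
print(f"best epoch: {best}, stop at epoch: {stop}")
```

In practice the weights from the best epoch are usually restored after stopping, so the deployed model corresponds to the point just before the validation curve begins to diverge from the training curve.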
The loss plot in Figure 14 displays the training and validation losses of the ECG classification model over several epochs. Loss measures the model’s performance by calculating the difference between the predicted and actual class labels. Training loss (red line) typically decreases as the model learns from the data, while validation loss (blue line) assesses generalization to unseen data. Ideally, both losses should decrease and track each other closely. In this graph both losses decrease over time, which shows that the model is learning, but a gap between them points to overfitting, a situation in which the model works well on the training data but finds it difficult to generalize to new data.

Loss plot graph of InceptionV3.
DenseNet201 model
The classification report in Table 3 shows that the DenseNet201 model performs well, achieving an overall accuracy of 91% across 4 classes. With 47 cases, the model’s precision, recall, and F1-score for Abnormal Heartbeat are .95, .85, and .90, respectively. With 34 cases, History of MI has a precision of .88, recall of .82, and F1-score of .85. With a precision of .94, recall of .96, and F1-score of .95 over 48 cases, Myocardial Infarction exhibits strong performance. With 57 cases, the Normal class has a precision of .87, recall of .96, and F1-score of .92. The model performs well in identifying ECG signals, as evidenced by the macro averages of .91 for precision, .90 for recall, and .90 for F1-score; the weighted averages are consistently .91.
Classification Report of DenseNet201.
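The macro and weighted averages in such a report differ only in how the classes are weighted: the macro average treats every class equally, while the weighted average weights each class by its number of cases. The sketch below recomputes both from the per-class values reported for DenseNet201. Because the inputs are already rounded to 2 decimals, the results can differ from the report in the last digit. The names used are hypothetical.

```python
# Per class: (support, precision, recall, F1) as reported for DenseNet201.
report = {
    "Abnormal Heartbeat": (47, 0.95, 0.85, 0.90),
    "History of MI": (34, 0.88, 0.82, 0.85),
    "Myocardial Infarction": (48, 0.94, 0.96, 0.95),
    "Normal": (57, 0.87, 0.96, 0.92),
}

def macro_average(report, metric_index):
    """Unweighted mean of a metric over the classes."""
    values = [row[metric_index] for row in report.values()]
    return sum(values) / len(values)

def weighted_average(report, metric_index):
    """Mean of a metric weighted by each class's support (row[0])."""
    total = sum(row[0] for row in report.values())
    return sum(row[0] * row[metric_index] for row in report.values()) / total

for index, metric in ((1, "precision"), (2, "recall"), (3, "F1-score")):
    print(
        f"{metric}: macro = {macro_average(report, index):.3f}, "
        f"weighted = {weighted_average(report, index):.3f}"
    )
```

When the smallest class (here History of MI, with 34 cases) is also the weakest, the macro average tends to sit at or below the weighted average, which makes it the more conservative summary for imbalanced data.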
The confusion matrix depicted in Figure 15 covers 4 classes. For Abnormal Heartbeat, 40 instances were correctly classified, while 7 were misclassified. In the History of MI category, 28 instances were correctly classified, with 6 misclassified. Myocardial Infarction saw 46 instances correctly classified and 2 misclassified. Lastly, for the Normal class, 55 instances were correctly classified, with 2 misclassified. These figures summarize the model’s performance in each of the 4 classes, with particularly high accuracy in recognizing Normal instances.

Confusion matrix of DenseNet201.
The ROC curves depicted in Figure 16 show area under the curve (AUC) values that vary across classes. With an AUC of .999, Myocardial Infarction is the class for which the model best discriminates between positive and negative cases. Following closely is Abnormal Heartbeat with an AUC of .993, indicating effective classification performance. Normal shows an AUC of .986, while History of MI has an AUC of .982, both demonstrating considerable discriminatory power but slightly lower than the former classes. Higher AUC values indicate better class separation and reflect the model’s capacity to distinguish between the classes.

ROC-AUC curve of DenseNet201.
Figure 17 shows the accuracy of the ECG classification model over several training epochs. The red line shows the accuracy of the model on the training data, which reflects its capacity to learn and improve at classifying ECG signals. This accuracy typically rises as the epochs go by. The blue line represents the accuracy on the validation data, which is used to assess how well the model generalizes to new data. An appreciable discrepancy between the training and validation accuracies points to the possibility of overfitting, a phenomenon in which the model over-learns the patterns in the training data but finds it difficult to generalize to new data. In machine learning, overfitting is a prevalent problem because it makes the model less accurate when applied to unseen data.

Accuracy plot graph of DenseNet201.
The loss plot is shown in Figure 18. The y-axis shows the loss value, which represents the difference between the predicted and actual class labels, while the x-axis represents the number of epochs. Lower loss indicates better model performance. The red line, representing training loss, decreases with the number of epochs, indicating an improvement in the model’s ability to fit the training set. The blue line, which represents validation loss, indicates how well the model applies to new data. Although both losses gradually decrease, a gap between them may indicate overfitting, a situation in which the model finds it difficult to generalize beyond the training set.

Loss plot graph of DenseNet201.
Comparative Analysis of Model Accuracies in ECG Detection
The comparative analysis in Figure 19 presents the accuracies of 3 models (CNN, InceptionV3, and DenseNet201) in ECG detection tasks. The basic CNN model has an accuracy of 51%, which indicates that it is not very effective at capturing the complex patterns observed in ECG data. More sophisticated convolutional neural network designs, namely InceptionV3 and DenseNet201, achieve much higher accuracies of 92% and 91%, respectively. These findings suggest that the advanced architectures of InceptionV3 and DenseNet201 allow for improved feature extraction and classification of electrocardiogram (ECG) data, so that their detection accuracy is superior to that of the basic CNN model. Taken as a whole, the findings highlight the significance of using deeper, pre-trained neural network designs to achieve reliable ECG detection.

Comparison graph.
Discussion
Cardiovascular disease detection in ECG imagery is a subject that combines cardiology with machine learning and image processing. Early identification and labelling of cardiac conditions can make a big difference in how many people survive. Electrocardiogram (ECG) readings are important for finding cardiovascular disease (CVD) because they are a common, inexpensive, and painless way to check the electrical activity of the heart. In this study, deep learning models were used to predict 4 categories: abnormal heartbeat, myocardial infarction, history of myocardial infarction, and normal heart function. An openly accessible dataset of ECG images from patients with cardiac problems was used for the analysis. The class distribution is Abnormal Heartbeat (233 samples), History of Myocardial Infarction (172 samples), Myocardial Infarction (239 samples), and Normal (284 samples). The purpose of data augmentation is to address class imbalance, enhancing the dataset’s diversity and improving model training robustness. The Normal class comprises the largest portion at 30.60%, followed by Myocardial Infarction at 25.75% and Abnormal Heartbeat at 25.11%. The History of MI class represents the smallest segment at 18.53%. This distribution shows a more balanced dataset compared to the original, indicating effective augmentation. A basic Convolutional Neural Network (CNN), InceptionV3, and DenseNet201 were tested in this study, and the models differed widely in accuracy. The CNN was only 51% accurate, whereas DenseNet201 reached 91%. With an accuracy of 92%, InceptionV3 stood out as the best model.
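The class shares quoted in the discussion follow directly from the sample counts. The short sketch below reproduces them and adds the ratio of the largest to the smallest class as a simple imbalance indicator. That ratio is not reported in the study and is included here only as an illustrative assumption about how balance might be summarized; the names are hypothetical.

```python
# Sample counts per class, as given in the discussion.
class_counts = {
    "Abnormal Heartbeat": 233,
    "History of MI": 172,
    "Myocardial Infarction": 239,
    "Normal": 284,
}

def class_shares(counts):
    """Percentage of the dataset that each class represents."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}

def imbalance_ratio(counts):
    """Largest class size divided by smallest class size (1.0 = perfectly balanced)."""
    return max(counts.values()) / min(counts.values())

shares = class_shares(class_counts)
for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
print(f"Imbalance ratio: {imbalance_ratio(class_counts):.2f}")
```

With 928 samples in total, the shares come out at 25.11%, 18.53%, 25.75%, and 30.60%, matching the percentages stated in the text.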
Conclusion and Future Work
This work demonstrates the potential to improve the identification of cardiovascular disease by applying deep learning and transfer learning methods to ECG imaging. Cardiovascular diseases (CVDs) are the leading cause of death worldwide, so it is essential to diagnose them as early and as precisely as possible. ECGs are frequently used to identify CVDs, and an ECG can reveal a wide range of cardiac disorders, such as an enlarged heart, coronary artery disease, heart valve disease, fast, slow, or irregular heart rhythm, and heart defects. However, manual interpretation of ECGs is not only time-consuming but also prone to human error. By leveraging pre-trained models and fine-tuning them on specialized datasets, the proposed system demonstrates significant improvements in diagnostic accuracy and efficiency. The models differed substantially in accuracy: the CNN reached just 51%, whereas DenseNet201 reached 91%. With the highest accuracy, 92%, InceptionV3 stood out as the most successful option. This scalable solution can integrate into existing healthcare infrastructure, aiding clinicians in early detection and intervention. Broad adoption, which would contribute to improved patient outcomes and a lower burden of cardiovascular disease, requires further refinement and validation on larger datasets. Real-time ECG data streaming, predictive analytics for early risk assessment, and collaboration with medical professionals are all potential areas for future study.
Footnotes
Authors Contribution
Ayeesha Soudagar: review and editing (equal). Savita K. Shetty and Saifullah Khalid: Conceptualization (lead); writing – original draft (lead); formal analysis (lead); writing – review and editing (equal). Shashidhara H. S.: Software (lead); writing – review and editing (equal). Niranjanamurthy Mudligiriyappa: Methodology (lead); writing – review and editing (equal). Anurag Sinha and Syed Immamul Ansarullah: Conceptualization (supporting); Writing – original draft (supporting); Writing – review and editing (equal).
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
