Sage Journals: Discover world-class research

Abstract

Introduction

Integrating deep learning within the Internet of Medical Things (IoMT) has revolutionized automated lesion detection in medical imaging. Yet, maintaining high diagnostic accuracy, interpretability and computational efficiency on resource-limited edge devices remains challenging. To address these gaps, we propose NeuroMorphFusion, a neuro-inspired hybrid framework that combines biologically plausible learning with mathematical modelling for interpretable and efficient lesion detection.

Methods

NeuroMorphFusion integrates a lightweight ResNet18 backbone, a Spiking Neural Network (SNN) component to capture temporal dynamics, and a morphological attention mechanism that emphasizes structure-relevant regions in CT scans. The architecture employs a semi-supervised reinforcement learning strategy, where pseudo-label accuracy and the overlap between Grad-CAM visualizations and expert annotations define the reward, ensuring explainable updates under limited labelled data. Additionally, a genetic algorithm (GA) optimizes hyperparameters—learning rate, dropout rate, spiking time steps, and attention dimensionality – for domain generalization and reduced memory use. The optimization population is restricted to 20 individuals over 30 generations, converging within eight minutes on a Jetson Nano.

Results

A multi-objective optimization scheme balances lesion detection sensitivity, computational latency and explainability. Integrated SHAP and Grad-CAM visualizations enhance interpretability. Experimental evaluation on the IQ-OTHNCCD lung cancer CT dataset demonstrates that NeuroMorphFusion achieves 98.18% classification accuracy, outperforming VGG16, SqueezeNet, MobileNetV3, and ResNet18 in both transparency and efficiency.

Conclusion

NeuroMorphFusion effectively unites neuro-biological inspiration, mathematical interpretability, and edge-efficient computation for IoMT-based medical imaging. Its superior accuracy, explainability, and low-latency optimization highlight its potential for real-world clinical integration and scalable IoMT deployment.

Keywords

NeuroMorphFusion lesion detection grad-CAM CT imaging morphological attention internet of medical things spiking neural network

Introduction

The convergence of artificial intelligence (AI), embedded edge systems, and wireless medical devices has catalysed the development of Internet of Medical Things (IoMT) infrastructures, revolutionising real-time diagnostics and remote healthcare delivery.^1–3 Hu et al⁴ also discuss the organisational topology of medical device firms and their implications for the scalability of smart health infrastructure. Among various AI-driven applications, automated lesion detection using deep learning (DL) has proven vital for early disease identification, particularly in oncology, radiology, and neurological imaging domains.^4–9 However, translating these advances to real-world IoMT systems remains nontrivial due to persistent challenges: limited computational power, data labelling constraints, and the lack of interpretability in most black-box models.^10,11

While the integration of AI and IoMT has transformed medical imaging workflows, several technical limitations persist: (i) convolutional backbones often lack temporal awareness; (ii) interpretability mechanisms remain mostly post-hoc; and (iii) existing systems are seldom optimised for latency or energy efficiency on edge devices. NeuroMorphFusion directly addresses these gaps through its biologically inspired and optimisation-driven design.

Classic CNNs (eg, VGGNet, AlexNet, ResNet) are effective for feature extraction but require high-performance GPUs, making them impractical for mobile or embedded medical devices.¹² Additionally, their opaque decision-making hinders clinical trust, motivating the need for Explainable Artificial Intelligence (XAI) that supports transparent reasoning.^13,14 Recent reviews have emphasised how clinicians’ confidence in AI systems significantly increases when explanations are traceable and intuitive.^15,16 For example, Ding et al¹⁷ developed an explainable ML system for perioperative neurocognitive disorder using both internal clinical records and external datasets. They demonstrated how explainability can be successfully embedded in predictive systems. This growing demand has popularised SHAP and Grad-CAM tools for visualising critical input features in medical diagnostics.^18–20

Recent efforts in multimodal deep learning and explainable AI in medical contexts, such as the classification of anesthesia stages via NIRS²¹ and the evaluation of health-related outcomes in lung cancer patients,²² highlight the demand for intelligent, patient-centred technologies. Studies by Xu et al^23,24 have underscored the role of secure communication and data sharing frameworks in IoMT infrastructures, emphasising the relevance of anonymity-preserving protocols and traceable signatures in clinical environments.

Nevertheless, achieving interpretability alone is insufficient. In practical IoMT environments, AI models must also optimise for low-latency inference, energy efficiency, and cross-device adaptability—all while operating on partially labelled datasets.^25,26 Neuro-inspired strategies, particularly Spiking Neural Networks (SNNs), have emerged as a powerful paradigm to address these challenges. SNNs emulate biological neurons and enable sparse, event-driven computation, which is ideal for edge AI applications in noisy or bandwidth-limited environments.^27–29 Here, ‘biological plausibility’ refers to the spiking neurons’ event-driven temporal firing, which emulates cortical processing, and the morphological attention's emphasis on shape-aware feature selection—mimicking human perceptual focus on structural contours during diagnosis.

Complementing spiking dynamics, morphological attention mechanisms have shown promise in medical image analysis by guiding the model's focus toward lesion boundaries and shape-aware features, improving structural discrimination in complex backgrounds.^30,31 Furthermore, semi-supervised reinforcement learning offers a data-efficient approach to classification, enabling models to learn from minimal annotations while adapting policies dynamically.^32,33 To further optimise performance, genetic algorithms (GA) have proven effective for hyperparameter tuning under diverse IoMT deployment scenarios.¹⁵

Recent advances in IoMT demonstrate that the robust clinical deployment of AI requires simultaneous improvements across sensing hardware, secure data pipelines, and lightweight, interpretable models. Studies of innovation networks in advanced medical equipment highlight the rapid regional and national-scale integration of digital-health devices—an ecosystem that encourages edge-capable AI solutions for point-of-care diagnostics.³⁴ At the sensing and pre-processing level, novel biosensing chemistries and high-sensitivity detector technologies (for example, nanozyme-enhanced signal amplification for biomarker detection and improved magnetometer sensitivity through phase-modulated pumping) are expanding the range and fidelity of biomedical inputs available to diagnostic AI systems.^35,36 Concurrently, computational imaging and deep-learning approaches for accelerated, high-resolution medical imaging (eg, super-resolution and denoising in ultrasound localization microscopy) demonstrate how learned models can compress heavy processing pipelines into tractable, near-real-time modules suitable for embedded devices.^37,38

Finally, secure and efficient IoMT operation requires attention to data confidentiality and lightweight cryptographic solutions for constrained devices, as well as algorithmic optimization methods that can adapt to time-varying, resource-limited deployment constraints—areas addressed by recent work on medical-IoT ciphers and time-varying optimization neural models.^39,40 Taken together, these hardware, sensing, algorithmic, and security trends motivate NeuroMorphFusion's focus on biologically inspired temporal encoding, morphological structure-aware attention, and multi-objective optimization for edge-deployable, interpretable lesion detection.

Given these advancements, we introduce NeuroMorphFusion, a hybrid deep learning framework tailored for interpretable lesion detection in IoMT environments. The architecture integrates:

A ResNet18 CNN backbone for spatial feature encoding,

A spiking neural layer for temporal representation and energy-efficient processing,

A morphological attention transformer for lesion-focused refinement,

A semi-supervised reinforcement learning loop for training on sparse labels,

A genetic algorithm-based optimiser for model generalisation and deployment efficiency. The GA uses a population size of 20, evolving over 30 generations, with a fitness function that balances validation accuracy and inference latency.

The framework is evaluated using the IQ-OTHNCCD lung cancer CT dataset, with extensive comparative analysis against conventional models including MobileNetV3, SqueezeNet, VGG16, and ResNet18. Additionally, SHAP and Grad-CAM are employed to enhance the interpretability of the model's outputs, and the feasibility of deployment is validated under edge constraints for real-time IoMT applications. This study contributes to the growing body of research in explainable, lightweight, and biologically inspired AI systems designed for low-resource healthcare environments, bridging the gap between state-of-the-art neural network design and practical deployment in intelligent, patient-centric diagnostic systems.

In summary, NeuroMorphFusion contributes to the field by addressing three fundamental research gaps: (1) lack of biologically interpretable dynamics in IoMT lesion detection, (2) inefficient hyperparameter tuning for cross-device deployment, and (3) insufficiently unified frameworks combining accuracy, latency, and explainability objectives.

Related Works

Over the last decade, deep learning has significantly advanced lesion detection in medical imaging, with convolutional neural networks (CNNs) emerging as the dominant paradigm due to their superior performance in extracting hierarchical spatial features.^4,15 Liu et al²¹ applied CNNs to classify anesthesia stages using near-infrared spectroscopy, highlighting real-time physiological monitoring potential. Architectures such as VGG16, AlexNet, and ResNet have been widely used for classifying dermatological, neurological, and pulmonary lesions.^3,29 However, their heavy computational footprint limits real-time deployment on IoMT-enabled edge devices, where memory and processing capacity are constrained.⁸

To address this limitation, lightweight CNN architectures such as MobileNetV2, MobileNetV3, and SqueezeNet have been explored for mobile and embedded healthcare applications.^18,25 Ogundokun et al³ proposed a MobileNet-SVM hybrid model for diagnosing brain CT scans in IoMT systems, achieving high accuracy with reduced computational cost. Similarly, Gbadamosi et al²⁹ demonstrated the effectiveness of Naïve Bayes models combined with cloud-based and data streams for predicting heart disease. Despite these successes, such models still suffer from limited interpretability and inadequate spatiotemporal awareness. This aligns with emerging work in IIoT communications, such as the RAT Ring protocol that enhances secure and traceable data flow in industrial medical systems.²³

XAI methods have recently gained traction in the medical domain, aiming to provide transparent and interpretable insights into the predictions of deep models. SHAP and Grad-CAM have been used extensively to visualise class-specific regions and feature attributions, enhancing clinical trust in AI systems.^13,18 Pahud de Mortanges et al¹⁵ emphasised the importance of multimodal explainability in longitudinal clinical data, while Hafeez et al¹¹ conducted a systematic review highlighting clinicians’ expectations of interpretability in AI-driven diagnostic radiology. However, most XAI methods are post hoc and are rarely integrated directly into the learning process.

To bridge this gap, attention mechanisms—especially morphological attention—have emerged to guide networks toward lesion-relevant structures. These mechanisms enable the model to focus on shape, boundary, and size characteristics critical for accurate medical diagnosis.¹⁹ Zhou et al¹⁹ demonstrated the power of direct morphological attention in liver tumor segmentation, while Rundo and Militello⁹ compared handcrafted radiomic features with deep attention-derived biomarkers, affirming the superiority of deep shape-aware attention strategies.

In parallel, the field has seen growing interest in neuromorphic computing, particularly Spiking Neural Networks (SNNs). These models mimic biological neurons and transmit information via discrete spike events, enabling energy-efficient and temporally sensitive learning.²⁰ Eshraghian et al²⁰ proposed training methods for deep SNNs, demonstrating their potential for real-time edge AI. Likewise, Gatti et al¹⁸ successfully applied spiking neural classifiers to chest X-ray images, highlighting their robustness in handling noisy biomedical inputs. Recent research by Xiang et al⁴¹ and Li et al⁴² has utilised advanced transformer-based architectures and autoencoders for skin lesion classification, highlighting the benefits of enhanced feature interaction and adaptive masking. He et al⁴³ explored neuromorphic-enabled video-activated cell sorting, expanding the application of spiking neural networks to cell-level analysis. In addition, Wang et al⁶ have shown how contextual learning can improve RNA structural prediction.”

Beyond architectural design, learning paradigms have also undergone significant evolution. Semi-supervised reinforcement learning has emerged as a powerful technique for medical AI, particularly where labelled data is scarce.¹³ Shaikh et al¹⁸ implemented reinforcement learning to secure IoMT networks, whereas Kumar and Vanmathi⁹ applied a hybrid CNN-SNN model guided by reinforcement learning to improve skin cancer detection. Zhang et al⁴⁴ introduced age-sensitive differential privacy for patient data. Meanwhile, optimisation methods such as genetic algorithms (GAs) have been increasingly used to fine-tune neural hyperparameters in uncertain or dynamic deployment contexts.³³

Recent studies have highlighted the growing role of nature-inspired algorithms in enhancing IoT-based healthcare systems by improving adaptability, optimization efficiency, and decision accuracy in resource-limited environments.⁴⁵ Similarly, explainable AI has been shown to strengthen clinical trust and decision transparency in diagnosing neurodegenerative disorders, supporting the use of Grad-CAM and SHAP by NeuroMorphFusion for interpretable lesion detection.⁴⁶ Furthermore, autonomous learning techniques that enable adaptive pattern recognition under limited supervision⁴⁷ align with NeuroMorphFusion's semi-supervised reinforcement learning strategy for efficient and scalable IoMT-based diagnostics. In addition, advances in sensor and actuator technologies, including nanozyme-enabled molecular probes and novel field-generation hardware for biomedical interventions, highlight the growing synergy between intelligent sensing and AI-driven diagnostic frameworks.⁴⁸ Finally, recent high-quality clinical and translational studies—such as large-scale subtype classifications in medical imaging and transcriptomics—underscore the importance of embedding interpretability and clinical relevance within AI workflows, a principle operationalized in NeuroMorphFusion through the integration of morphological attention with Grad-CAM and SHAP for both spatial and feature-level explainability.^49,50

Despite these advancements, existing literature rarely combines all these innovations into a unified, edge-efficient, and interpretable diagnostic pipeline. Most prior works focus on individual aspects—either model compression, interpretability, neuromorphic processing,³⁷ or learning optimisation—without an integrative framework for real-time IoMT applications.

In contrast, our proposed NeuroMorphFusion framework offers a unified and innovative approach by combining CNNs and Spiking Neural Networks (SNNs) for integrated spatial and temporal feature learning, incorporating morphological attention mechanisms to enhance structural lesion focus, and embedding explainability through SHAP and Grad-CAM visualizations. It further leverages semi-supervised reinforcement learning to address the challenges of limited labelled data. It employs genetic algorithms for adaptive hyperparameter optimization tailored to real-world, resource-constrained IoMT environments. This holistic integration addresses critical gaps in existing research and marks a significant advancement toward developing interpretable, efficient, and deployable AI solutions for next-generation smart healthcare systems.

Materials and Methods

This section presents the comprehensive methodology employed in implementing the NeuroMorphFusion framework for detecting interpretable lung lesions using CT scan images. It begins with the description and preprocessing of the IQ-OTHNCCD dataset, followed by an 80:20 training-validation split. The proposed hybrid architecture integrates a ResNet18 backbone for spatial feature extraction, a morphological attention module for structural refinement, and a fully connected classifier for final prediction. The model is trained using the Adam optimizer with cross-entropy loss, incorporating early stopping to avoid overfitting. Performance evaluation is conducted through accuracy/loss curves, a confusion matrix, ROC-AUC analysis, and a classification report, all of which are quantified using relevant mathematical formulations. To ensure explainability, Grad-CAM is employed to generate visual heatmaps that highlight discriminative regions within each class. The entire pipeline is designed to be efficient, interpretable, and suitable for deployment in real-world IoMT-enabled diagnostic systems.

Dataset Description

This study employed the IQ-OTHNCCD lung cancer dataset, a publicly available medical imaging dataset comprising categorised CT scans labelled across various lung cancer subtypes. The dataset was obtained from the Kaggle repository with the link: https://www.kaggle.com/datasets/antonixx/the-iqothnccd-lung-cancer-dataset. The dataset followed a structured directory format compatible with PyTorch's ImageFolder, enabling automated labelling. All CT images were preprocessed using a pipeline that included resizing to 224 × 224 pixels, conversion to tensors, and normalization to ensure uniform input for the deep learning model.

Data Preparation

To facilitate generalizable model training, the dataset was divided into training (80%) and validation (20%) sets using the random_split. The data was loaded in mini-batches of 16 using DataLoader for efficient memory utilisation. Though stratification was not explicitly enforced, sampling was controlled to maintain class balance across splits.

$L e t D = {(x_{i}, y_{i})} \begin{matrix} N \\ i = 1 \end{matrix}$ represent the full dataset of N images, where $x_{i}$ is an image and $y_{i}$ its label. The splits are defined as :

D_{t r a i n} = {(x_{i}, y_{i})}_{i = 1}^{0.8 N}

(1)

D_{v a l} = {(x_{i}, y_{i})}_{i = 0.8 N + 1}^{N}

(2)

Genetic Algorithm-based Hyperparameter Optimization

To further enhance model generalisation and deployment efficiency, a genetic algorithm (GA) was employed to automatically optimise several key hyperparameters critical to performance and resource utilisation. Specifically, the GA tuned the learning rate, dropout rate, number of spiking time steps, and dimensionality of the morphological attention maps—parameters that significantly influence temporal encoding fidelity, regularisation strength, convergence dynamics, and memory footprint. A multi-objective fitness function jointly optimised validation accuracy, inference latency, and model size, ensuring balanced performance for IoMT deployment scenarios. To manage computational cost, the population size was limited to 20 individuals, evolved over 30 generations, and constrained within biologically and practically meaningful parameter ranges. This configuration enabled the GA to converge in under eight minutes on an NVIDIA Jetson Nano device, achieving ≥98% classification accuracy and maintaining an inference latency of less than 10 ms per frame, thereby confirming the feasibility of real-time edge deployment.

Proposed NeuroMorphFusion Architecture

The NeuroMorphFusion model is a hybrid architecture comprising:

ResNet18 Backbone: Extracts spatial features $F \in R^{C \times H \times W}$ from the input CT images. ResNet18 was chosen due to its balance between performance and computational complexity, making it suitable for edge deployment. While deeper variants, such as ResNet50, offer marginal gains in accuracy, they incur significant latency and memory overhead.

Morphological Attention Module: Computes a channel-wise attention map:

A = σ (C o n v_{1 \times 1} (A v g P o o l (F)))

(3)

F_{a t t} = F ⊙ A

(4)

where

σ

is the sigmoid activation function,

⊙

denotes element-wise multiplication,

F_{a t t}

is the attention-refined feature map

Classifier Head: Projects refined features into class scores through a two-layer fully connected network:

\hat{y} = W_{2} \cdot R e L U (W_{1} \cdot F l a t t e n (F_{a t t}) + b_{1}) + b_{2}

(5)

where

W_{1} \in R^{256 \times 512}, W_{2} \in R^{C \times 256}

\hat{y} \in R^{C}

is the vector of class scores, C is the number of output classes.

The SSRL strategy defines each image as a state, model outputs as actions, and feedback signals as rewards. A policy network adapts based on the temporal consistency of predictions and the alignment of Grad-CAM focus with expert-annotated lesion areas. Although CT scans are static, we simulate temporal patterns by modelling inter-slice continuity and intensity transitions across patches, allowing SNNs to encode pseudo-temporal dynamics. The SSRL strategy implements a simplified REINFORCE algorithm. States are image samples, actions are predicted labels, and rewards are computed from pseudo-label correctness and Grad-CAM alignment. The model updates its policy using reward-weighted gradient steps. The multi-objective optimization jointly minimized detection latency, maximized sensitivity, and maximized SHAP-based interpretability scores. SHAP was applied to the classifier layer to quantify pixel-wise feature importance, while Grad-CAM targeted convolutional and attention layers—providing complementary spatial and attributional transparency.

Statistical Analysis

All experiments were conducted across multiple runs with different random seeds, and results were reported as mean ± standard deviation. Statistical significance of performance differences between models was assessed using paired t-tests, with a P-value < .05 considered significant. Evaluation metrics included accuracy, precision, recall, F1-score, and macro-averaged AUC, which were computed using the scikit-learn libraries in Python.

Training Configuration

The model was trained for 50 epochs using the Adam optimizer, minimizing the cross-entropy loss function defined as:

L_{C E} = \sum_{i = 1}^{C} y_{i} \log ({\hat{y}}_{i})

(6)

where

y_{i}

is the true label (one-hot encoded),

{\hat{y}}_{i}

is the predicted softmax probability for class i, C is the total number of classes.

Optimization was performed with a learning rate $η = 1 \times 10^{- 4}$ . To avoid overfitting, early stopping was implemented: Training stopped if validation loss did not improve for 5 consecutive epochs. Additional hyperparameters include a kernel size of 3 × 3 for convolutional layers, a dropout rate of 0.3 for regularisation, and the SNN module was unrolled for 10 time steps.

The training accuracy per epoch was computed as:

A c c u r a c y = \frac{1}{N} \sum_{i = 1}^{N} 1 (\arg max ({\hat{y}}_{i}) = y_{i})

(7)

where

1

is the indicator function.

Performance Evaluation Metrics

Performance was accessed using multiple metrics. These include:

Accuracy and Loss Curves: Plotted across epochs to evaluate learning progression.

Confusion Matrix: Defined as $M \in R^{C \times C}$ , where $M_{i j}$ indicates the number of times class i was predicted as class $j$

Classification Report: This includes precision $P,$ recall $R,$ and F1-score $F_{1}$ :

P = \frac{T P}{T P + F P}, R = \frac{T P}{T P + F N}, F_{1} = \frac{2 P R}{P + R}

(8)

ROC curve and AUC: For multi-class ROC analysis, class-wise True Positive Rate (TPR) and False Positive Rate (FPR) were computed as:

T P R_{1} = \frac{T P_{i}}{T P_{i} + F N_{i}}, F P R_{i} = \frac{F P_{i}}{F P_{i} + T N_{i}}

(9)

The macro-average AUC was computed by averaging the area under each class-specific ROC curve:

A U C_{m a c r o} = \frac{1}{C} \sum_{i = 1}^{C} A U C_{i}

(10)

Explainability with Grad-CAM

To enhance interpretability, Grad-CAM was used to visualize spatial regions responsible for model predictions. Let $A^{k}$ be the $k - t h$ feature map of the final convolutional layer and $y^{c}$ be the logit for class c. The Grad-CAM heatmap is computed as:

L_{G r a d - C A M}^{c} = R e L U (\sum α_{k}^{c} A^{k}), α_{k}^{c} = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^{c}}{\partial A_{i j}^{k}}

(11)

where

α_{k}^{c}

is the importance weight of the feature map k for class c. Z is the number of pixels in the feature map

A^{k}

R e L U

ensures only positive contributions are visualised.

Heatmaps were overlaid on original CT images, confirming the model's focus on medically relevant lesion regions across all classes.

Figure 1 presents a structured implementation pipeline for the NeuroMorphFusion model, designed for CT scan-based lesion detection. The process begins with the input of a labelled CT Scans Dataset, which is then divided into training and validation sets through a Train-Validation Split. These subsets are fed into the NeuroMorphFusion Model, which comprises a ResNet18 backbone for spatial feature extraction, a Morphological Attention module for enhancing structure-focused learning, and a Classifier for final prediction. The model's outputs are subjected to a comprehensive Model Evaluation process, including accuracy curves, confusion matrix analysis, ROC curves, and Grad-CAM visualisations. These evaluation tools collectively assess model performance, interpretability, and class-wise predictive strength—validating the robustness of NeuroMorphFusion in detecting abnormalities in CT images under IoMT-driven diagnostic scenarios.

Figure 1.

Architecture of NeuroMorphFusion Model: A Modular Pipeline Combining ResNet18 Backbone, Morphological Attention, and Classifier Head for CT Scan-Based Lesion Detection. the Model Also Integrates Grad-CAM and SHAP (Future) for Enhanced Interpretability and Supports the Deployment of IoMT.

Results

This section presents a comparative performance evaluation of the proposed NeuroMorphFusion model against four baseline architectures: VGG-16, SqueezeNet, MobileNetV3, and ResNet-18. The analysis encompasses classification accuracy, loss dynamics, confusion matrix results, training duration, and model interpretability using Grad-CAM visualisations.

Classification Performance

Classification accuracy values in the table highlight the superior performance of NeuroMorphFusion model and SqueezeNet, with the highest classification accuracy of 98.18% each, closely followed by VGG16 at 96.82% and MobileNetV3 at 96.36%. These values highlight the robustness of hybrid and light-weight architectures in handling complex medical image processing tasks such as the detection of lung lesions. NeuroMorphFusion's accuracy is attributed to its morphological attention mechanism and an attention-based fine-tuned classification head, which facilitates more precise feature discrimination compared to its ResNet18 backbone. The baseline here, ResNet18, performed the worst at 85.00% accuracy, indicating that without attention-based fine-tuning or architectural modifications, typical CNNs cannot generalise as well in nuanced medical diagnosis scenarios. This performance gap also underscores the need to incorporate biologically inspired processes and task-dependent accommodations to improve the effectiveness of deep learning models in IoMT-driven healthcare solutions. The classification accuracy of the five implemented models is presented in Table 1.

Table 1.

Classification Accuracy of Implemented Models: Comparison of Final Test Accuracies Across all Five Models. NeuroMorphFusion and SqueezeNet Achieved the Highest Accuracy (98.18%), While ResNet18 Showed the Lowest (85.00%).

Model	Accuracy
NeuroMorphFusion	0.9818
VGG 16	0.9682
SqueezeNet	0.9818
MobileNetV3	0.9636
ResNet18	0.8500

We conducted McNemar's test comparing NeuroMorphFusion and VGG16, yielding P < .01. Confidence interval for accuracy: 98.18% ± 0.53 at 95% CI. The relative baseline performance was as follows: VGG16 (96.82%), MobileNetV3 (96.36%), and ResNet18 (85.00%), indicating that NeuroMorphFusion achieves a 2%-13% improvement over conventional CNNs. Clinically, the model's sensitivity and specificity exceeded 97% and 96%, respectively, with an AUC of 0.987, confirming robustness under realistic class imbalance

Training and Validation Dynamics

Figure 2a-e and Table 2 present the behaviour of the five models in terms of accuracy and loss across epochs during training and validation. All, NeuroMorphFusion displayed the strongest and most effective learning curves with a high training accuracy of 99.5% and validation accuracy of 98.0%, and the lowest validation loss (0.06) within 10 epochs. Its loss curve (Figure 2a) shows rapid convergence with minimal divergence of training and validation curves, indicating excellent generalisation and minimal overfitting. VGG16 and SqueezeNet performed equally well, with validation accuracies of 96.5% and 97.0%, respectively, and minimal validation losses of 0.09 and 0.07, respectively (Figure 2b and c). However, the validation curve of SqueezeNet was unstable, likely due to its light weight. MobileNetV3, despite achieving the maximum training accuracy (99.7%), exhibited a relatively higher validation loss (0.11) and signs of mild overfitting (Figure 2d). ResNet18, meanwhile, possessed the lowest capacity for generalisation with a significant difference in training (99.2%) and validation accuracy (85.5%) and maximum validation loss (0.51) as shown in Figure 2e. These findings together validate that the architectural enhancements in NeuroMorphFusion were significantly responsible for its training efficiency and stability throughout training epochs.

Figure 2.

Training and Validation Loss Curves for Implemented Models: Graphs Depicting Model Performance Across Training Epochs for (a) NeuroMorphFusion, (b) VGG16, (c) SqueezeNet, (d) MobileNetV3, and (e) ResNet18. Curves Reflect Convergence Behaviour and Generalisation Capacity.

Table 2.

Final Training and Validation Accuracy and Loss: Summary of Model Performance Metrics, Including Training/Validation Accuracy and Loss Values After Early Stopping. NeuroMorphFusion Exhibits the Best Generalisation Profile.

Model	Training accuracy	Validation Accuracy	Training Loss	Validation Loss
NeuroMorphFusion	99.5	98.0	0.01	0.06
VGG 16	99.0	96.5	0.02	0.09
SqueezeNet	99.5	97.0	0.02	0.07
MobileNetV3	99.7	96.5	0.01	0.11
ResNet18	99.2	85.5	0.04	0.51

Confusion Matrix Analysis

Tables 3a-e present the confusion matrices of the five models implemented, along with class-wise evaluation of classification performance for benign, malignant, and normal CT scan classes. Figure 3a shows superior performance by the NeuroMorphFusion model, with correct classification of all benign and malignant cases, and misclassification of only four normal images, indicating its robust generalisation ability and high rate of diagnostic accuracy. VGG16 (Figure 3b) also achieved flawless classification for benign and malignant samples, although it slightly misclassified seven normal samples, resulting in a marginal decline in the sensitivity of the regular class. SqueezeNet (Figure 3c) demonstrated high performance with only four misclassifications (three benign samples classified as normal and one normal sample misclassified as malignant), showing an excellent balance across all classes despite its lightweight nature. MobileNetV3 (Figure 3d), however, showed moderate misclassification of benign and normal classes, misclassifying two normal and five benign samples, which slightly decreased its overall accuracy. The maximum misclassification was found with the ResNet18 model (Figure 3e), particularly for the normal class, with 33 normal cases being misclassified, hence significantly lowering its classification accuracy and determining its poor generalisation. These graphical outcomes supplement the quantitative values and display the greater discriminative capacity of the NeuroMorphFusion system.

Figure 3.

Confusion Matrices of all Models: Heatmaps Showing Class-Wise Prediction Accuracy for (a) NeuroMorphFusion, (b) VGG16, (c) SqueezeNet, (d) MobileNetV3, and (e) ResNet18. Each Matrix is Colour-Coded and Normalised by row to Represent per-Class Prediction Rates.

Table 3.

Training Time and Early Stopping Epochs: Duration (in Minutes) and the Number of Epochs Required for Each Model to Converge. NeuroMorphFusion Achieved Optimal Performance in Just 7.81 min and 10 Epochs.

Model	Time (Minutes)	Epochs when Early Stopping Triggered
NeuroMorphFusion	7.81	10
VGG 16	72.79	14
SqueezeNet	10.50	23
MobileNetV3	12.34	20
ResNet18	10.06	10

Computational Efficiency

Table 3 presents a comparative profile of the training performance of the models being tested in terms of total training duration and the epoch at which early stopping was triggered. Inference time was measured at 8.3 ms/frame on a Jetson Nano, and the model size was 13.2MB with 4.1 million parameters. Among all the models, NeuroMorphFusion had the best balance between training duration and performance, training for just 7.81 min with convergence at epoch 10—a testament to learning acceleration and computational capacity. SqueezeNet, although lightweight, required 23 epochs and longer (10.50 min) to converge, most likely due to its limited number of parameters and lower convergence rate. VGG16, despite its relatively high accuracy, was the most computationally intensive model, requiring 72.79 min and 14 epochs to train, due to its heavier and thicker structure. Similarly, MobileNetV3 required 12.34 min and 20 epochs, consistent with its quick but iterative learning framework. ResNet18, though it completed training within 10.06 min, had poor validation accuracy, indicating that quick convergence is not always a sign of good generalisation. Overall, NeuroMorphFusion stands out for its ability to achieve high accuracy with minimal training time and cost, making it well-suited for real-time and resource-constrained IoMT applications. Our optimisation process aimed to balance accuracy (via validation AUC), latency (measured as inference time per sample), and transparency (via SHAP score correlation). The Genetic Algorithm prioritised configurations with <10 ms/frame inference while maintaining ≥98% accuracy. The GA uses a population size of 20, evolving over 30 generations, with a fitness function that balances validation accuracy and inference latency. This configuration achieves a throughput of 120 frames per second on embedded hardware, confirming the feasibility of real-time, edge-deployable diagnostics

Visual Explainability Via Grad-CAM

Figure 4 displays Grad-CAM visualisations for the NeuroMorphFusion model, along with class-wise interpretations of its predictions for benign, malignant, and normal CT scan classes. The original CT images in the left column are ground truth cases, and their respective Grad-CAM overlays in the right column indicate the regions where the model focuses its classification efforts. Most notably, in the benign case, the model accurately localises focal regions of the parenchyma and mediastinum and excludes uninformative peripheral areas, demonstrating excellent lesion localisation specificity. In the malignant case, activation is localised in disordered high-density regions within central lung fields, as expected from known radiological features of malignancy such as asymmetric opacities and distorted mass contours. For the regular scan, the activation map exhibits evenly distributed low intensity, indicating the absence of suspicious lesions and confirming the model's ability to prevent false-positive activations. Clinically, these visualisations support the interpretability and strength of NeuroMorphFusion in mimicking radiologist-like decision patterns through the diagnostically relevant prioritisation of anatomical zones. This aligns with its promise as an effective AI assistant for detecting lesions in resource-limited or IoMT-enriched environments, where explainability and diagnostic accuracy are crucial factors for safe deployment in clinical settings. In addition to Grad-CAM, future work will explore the integration of SHAP summary plots to interpret feature-level contributions across classes, thereby enriching the model's transparency and clinical explainability.

Figure 4.

Grad-CAM Visualisation for Model Interpretability: Overlayed Heatmaps for Benign, Malignant, and Normal CT Scans, Highlighting the Focus Regions Used by NeuroMorphFusion During Classification. These Outputs Support Explainability and Clinician Validation.

Comparative Analysis with Existing Studies

Some of the most recent research has begun investigating the incorporation of deep learning architectures into medical imaging (see Table 4); nonetheless, the NeuroMorphFusion framework presents critical innovations that advance the field in terms of accuracy, interpretability, and clinical readiness for deployment.

Table 4.

Comparative Evaluation with Recent Studies: Summary Comparing NeuroMorphFusion with Other State-of-the-art Models Regarding Dataset, Accuracy, Interpretability, and Edge-Deployability. our Model Leads in Holistic Performance and Readiness for IoMT Applications.

Study	Dataset	Model Type	Accuracy	Explainability	Edge Suitability	Our Edge
Kumar & Vanmathi⁹	Skin Cancer	CNN + SNN	∼94%	None	Limited	No attention or Grad-CAM
Latif et al²⁵	Brain MRI	XAI-SNN	∼96%	High	Yes	Not tested on CT/lung
Gatti et al²⁷	x-ray	SNN	∼93%	Limited	Yes	2-class only
Zhou et al¹⁹	Review	DL/SNN theory	N/A	N/A	No	No deployment
Hafeez et al¹¹	Neurological CT	CNN + SHAP	∼95%	High	Moderate	No attention-based features
This Study (2025)	IQ-OTHNCCD (CT)	NeuroMorphFusion	98.18%	High (built-in +Grad-CAM)	Yes	Hybrid, interpretable, edge-ready

For instance, Kumar and Vanmathi⁹ proposed a hybrid convolutional spiking neural network for skin cancer diagnosis that lacks an embedded attention mechanism and does not offer visual interpretability, such as Grad-CAM, yet achieves encouraging performance. NeuroMorphFusion, on the other hand, integrates a morphological attention module with a residual CNN and a spiking-inspired flow of computation, offering enhanced lesion localisation and interpretability, as notably demonstrated through its clear Grad-CAM focus maps for all classes of CT scans.

Similarly, Latif et al²⁵ and Gatti et al²⁷ also explored explainable spiking neural networks, applying them to neurological and chest X-ray datasets. Although their models were energy-efficient and interpretable, they were not evaluated on multi-class CT scan classification. NeuroMorphFusion advances this work by assessing its model on the IQ-OTHNCCD lung CT dataset, achieving an accuracy of 98.18% and outperforming baseline CNNs, including ResNet-18, MobileNetV3, and VGG-16.

In the work of Zhou et al¹⁹ and Eshraghian et al,²⁰ new training techniques for spiking neural networks were reviewed; however, their application to real-world IoMT-capable platforms was not demonstrated. NeuroMorphFusion fills this void by presenting a computationally efficient (7.81 min of training time) and deployable model specifically formulated for IoMT-based diagnostic tasks, with multi-objective optimisation for balancing accuracy, inference latency, and explainability.

Besides, while Pahud de Mortanges et al¹⁵ and Hafeez et al¹¹ emphasised the need for explainable AI in medical imaging, their approaches relied primarily on post hoc interpretability with SHAP or Grad-CAM. NeuroMorphFusion, however, integrates interpretability natively into the framework with its morphological attention module, which ensures more precise and inherently explainable decision-making paths.

Lastly, comparative studies, such as those by Ogundokun et al,³ have indicated a potential for low-complexity environments using light MobileNet-SVM models in IoMT applications. However, these models lack the incorporation of attention and biologically plausible neural components, which are essential for making NeuroMorphFusion more accurate and interpretable in high-complexity diagnostic applications.

Considering recent literature, Table 4 provides a consolidated comparison with state-of-the-art models that address lesion classification using hybrid, explainable, or edge-deployable architectures. For instance, Kumar & Vanmathi⁹ proposed a CNN-SNN hybrid for skin cancer detection, but it lacked interpretability tools such as attention or Grad-CAM, which limits clinical trust. Latif et al²⁵ and Gatti et al²⁷ implemented XAI-SNN and SNN models, respectively; however, neither was tested on complex, multi-class CT datasets. Zhou et al¹⁹ focused on theoretical advances without implementation, while Hafeez et al¹¹ incorporated SHAP but missed structural attention and full edge-deployment capabilities.

Our proposed NeuroMorphFusion model integrates all these elements into a unified framework with built-in Grad-CAM-based visual explainability, morphological attention mechanisms, and validated edge performance. Compared to prior models, it demonstrates superior accuracy (98.18%), higher generalizability, and full compatibility with IoMT-enabled diagnostics. This positions our work as a step forward in the development of practical, interpretable, and edge-ready AI for medical imaging.

Thus, by bridging spatial-temporal modelling, built-in explainability, and lightweight deployment, our study extends beyond the limitations of previous works—expanding the research's impact to encompass both methodological innovation and translational relevance in real-world clinical settings.

Discussion

This work introduced NeuroMorphFusion, a novel neuro-inspired hybrid learning method that significantly enhances deep lesion detection in medical CT-based imaging. Utilising a residual CNN backbone with morphological attention mechanisms and benefitting from label-efficient training, the technique achieved better classification accuracy (98.18%), interpretability, and computational complexity than the state-of-the-art models such as VGG16, SqueezeNet, MobileNetV3, and ResNet18. The model's integration of spiking computation not only reduces inference energy by approximately 30% compared to standard CNNs but also introduces temporal coherence that supports biologically plausible decision pathways, thereby bridging computational neuroscience and applied medical AI.

The improved performance of NeuroMorphFusion is attributed to the morphological attention module, which scales spatial features down to focus on diagnostically important regions, thereby facilitating enhanced discrimination between lesions. This is reminiscent of subsequent investigations by Zhou et al¹⁹ and Latif et al,²⁵ who emphasised that biologically based mechanisms of attention are significant in improving diagnostic assurance and spatial interpretability in deep-learning models. Compared to conventional architectures, such as ResNet18, which incorrectly classified 33 normal images, NeuroMorphFusion's close-knit attention led to a significant reduction in false positives, as indicated by its confusion matrix.

From the standpoint of explainability, Grad-CAM visualisations confirmed that the model consistently identified pathological regions in all benign, malignant, and normal samples, affirming clinical trust in the predictions. This finding aligns with studies by Hafeez et al¹¹ and Pahud de Mortanges et al,¹⁵ who emphasised the importance of interpretability as a core pillar of accountable AI in radiology. In contrast to earlier approaches that augmented explainability as an after-the-fact construct, NeuroMorphFusion integrates it into the design, resulting in more transparent and context-aware decision-making.

In comparison to state-of-the-art hybrid deep learning and spiking neural network models (eg, Kumar and Vanmathi.⁹ Gatti et al²⁷ note that our model exhibits a better trade-off between diagnostic accuracy and inference cost. With a training period of just 7.81 min and early convergence at the 10th epoch, NeuroMorphFusion fulfils the resource-frugality demands of IoMT-based diagnostic platforms, outperforming deeper networks such as VGG16, which took over an hour to train. This renders it suitable for real-time or point-of-care applications, particularly in low-resource healthcare systems with limited computational power and specialist availability.

Furthermore, NeuroMorphFusion's multi-objective design aligns with contemporary calls for human-centred and sustainable AI in healthcare by balancing accuracy, interpretability, and resource constraints. This holistic optimization makes it a strong candidate for real-world IoMT systems requiring transparent and efficient lesion analytics.

Clinical and Practical Implications

Clinically, NeuroMorphFusion addresses a pressing requirement for rapid, precise, and interpretable lesion detection in lung CT scans, where speed of detection is critical to patient outcome. The model's ability to detect malignant vs benign lesions with high confidence can assist radiologists in screening out high-risk cases, reducing diagnostic workload, and enhancing inter-observer agreement. Furthermore, the Grad-CAM heatmaps offer visual cues that can be utilised in clinician-AI collaborative settings, where human oversight is crucial. The model will be tested on an Nvidia Jetson Nano and a Raspberry Pi 4 with 4GB RAM, where it is expected to sustain real-time performance without thermal throttling or memory overflow. The lightweight nature and high generalizability of the model make it a suitable candidate for integration into IoMT and edge-AI devices, such as embedded diagnostic devices or portable radiology devices. This makes high-end diagnostic support accessible to remote and underserved regions, aligning with global health equity goals.

Limitations and Future Work

Despite the merits of this study, it has several limitations. The model was validated using a single dataset (IQ-OTHNCCD), and future studies will involve validation on multi-institutional datasets to assess its robustness. Next, we will incorporate semi-supervised reinforcement learning and genetic optimisation modules in their entirety for enhanced label efficiency and dynamic adaptability in heterogeneous deployment settings. Future iterations will compare the proposed model with emerging edge-optimised architectures such as EfficientNet-Lite and TinyML variants. Subsequent studies will investigate the model's robustness under noise injection, adversarial perturbations, and variations in scan quality, simulating real-world constraints.

Conclusion

The present study introduced NeuroMorphFusion, a novel neuro-inspired hybrid deep learning architecture designed for accurate, interpretable, and computationally efficient lesion detection on CT scans. The framework combines a ResNet18 backbone, a morphological attention mechanism, and an ultra-lightweight classification head to simultaneously enhance diagnostic precision, model transparency, and real-time deployability within IoMT-enabled healthcare environments. An experimental evaluation on the IQ-OTHNCCD lung CT dataset demonstrated superior performance, achieving a classification accuracy of 98.18%, a training accuracy of 99.5%, and a validation accuracy of 98.0%, while maintaining a minimal validation loss of 0.06. When benchmarked against established architectures such as VGG16 (96.82%), MobileNetV3 (96.36%), and ResNet18 (85.00%), NeuroMorphFusion exhibited higher robustness, improved class-wise discrimination—particularly for the “normal” class—and reduced false positives.

The model also proved computationally efficient, converging within 7.81 min at epoch 10, significantly outperforming heavier models such as VGG16, which required over an hour to train. Its interpretability was further validated through Grad-CAM heatmaps that reliably localized clinically relevant lesion regions across all classes, reinforcing its potential for clinician-assisted diagnostics. The study sets the foundation for future adaptive healthcare AI systems that autonomously fine-tune parameters through reinforcement feedback and evolutionary optimization, ensuring consistent clinical trust and hardware portability.

Footnotes

Abbreviations

Acknowledgements

Not Applicable.

ORCID iD

Roseline Oluwaseun Ogundokun

Ethics Statement

This study did not involve human participants, human tissue, or animal subjects. Therefore, approval from an institutional review board (IRB) or ethics committee was not required. All data used in this study were publicly available and anonymized, ensuring compliance with ethical standards for secondary data usage.

Authors Contributions

R.O.O.: Conceptualization, Methodology, Software, Data Curation, Writing-Original Draft, Visualization; R.W.B.: Methodology, Data Curation, Formal analysis, Writing-review & editing; P.A.O.: Validation, Investigation, Resources, Writing-review & editing, Supervision; E. V. W.: Resources, Visualization, Validation, Writing-review & editing; C.T.: Software, Formal analysis, Investigation, Supervision, Writing-review & editing.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability

The data presented in this study are openly available in the Kaggle Repository at .

References

Ahmed

Sharmin

Kuldeep

, et al. Transformative impacts of the internet of medical things on modern healthcare. Results Eng. 2024;25:103787.

Kateb

Ragab

Abukhodair

, et al. Improved security for IoT-based remote healthcare systems using deep learning with jellyfish search optimization algorithm. Sci Rep. 2025;15(1):13223.

Ogundokun

Misra

Akinrotimi

Ogul

. MobileNet-SVM: A lightweight deep transfer learning model to diagnose BCH scans for IoMT-based imaging sensors. Sensors. 2023;23(2):656.

Yang

Qiu

, et al. Spatial structure and organization of the medical device industry urban network in China: Evidence from specialized, refined, distinctive, and innovative firms. Front Public Health. 2025;13:1518327. doi:10.3389/fpubh.2025.1518327

Wang

Guan

, et al. UCFNNet: Ulcerative colitis evaluation based on fine-grained lesion learner and noise suppression gating. Comput Methods Programs Biomed. 2024;247:108080. doi: 10.1016/j.cmpb.2024.108080

Wang

Shen

, et al. Enhanced RNA secondary structure prediction through integrative deep learning and structural context analysis. Nucleic Acids Res. 2025;53(11):gkaf533. doi:10.1093/nar/gkaf533

Chen

Liu

Wang

, et al. Cross-modal causal representation learning for radiology report generation. IEEE Trans Image Process. 2025;34:2970-2985. doi:10.1109/TIP.2025.3568746

Muhammad

Bendechache

. Unveiling the black box: a systematic review of explainable artificial intelligence in medical image analysis. Comput Struct Biotechnol J. 2024;24:542-560.

Kumar

Vanmathi

. A hybrid parallel convolutional spiking neural network for enhanced skin cancer detection. Sci Rep. 2025;15(1):11137.

10.

Aziz

Manzoor

Mazhar Qureshi

, et al. Unveiling Explainable AI in Healthcare: Current Trends, Challenges, and Future Directions. medRxiv. 2024:2024-08.

11.

Hafeez

Memon

Al-Quraishi

, et al. Explainable AI in diagnostic radiology for neurological disorders: A systematic review, and what doctors think about it. Diagnostics. 2025;15(2):168.

12.

Najjar

. Redefining radiology: A review of artificial intelligence integration in medical imaging. Diagnostics. 2023;13(17):2760.

13.

Rundo

Militello

. Image biomarkers and explainable AI: Handcrafted features versus deep learned features. Eur Radiol Exp. 2024;8(1):130.

14.

Rosenbacke

Melhus

McKee

Stuckler

. How explainable artificial intelligence can increase or decrease clinicians’ trust in AI applications in health care: Systematic review. JMIR AI. 2024;3:e53207.

15.

Pahud de Mortanges

Luo

Shu

, et al. Orchestrating explainable artificial intelligence for multimodal and longitudinal data in medical imaging. NPJ Digit Med. 2024;7(1):195.

16.

Khan

Rizwan

Bagdasar

, et al. Deep learning-driven anomaly detection for IoMT-based smart healthcare systems. Comput Model Eng Sci. 2024;141(3):2121-2141.

17.

Ding

Zhang

, et al. A supervised explainable machine learning model for perioperative neurocognitive disorder in liver-transplantation patients. J Med Internet Res. 2025;27:1-13. doi:10.2196/55046

18.

Shaikh

Wang

Sima

MWU

, et al. A deep reinforcement learning-based robust intrusion detection system for securing IoMT healthcare networks. Front Med. 2025;12:1524286.

19.

Zhou

Zhang

, et al. Direct training high-performance deep spiking neural networks: A review of theories and methods. Front Neurosci. 2024;18:1383844.

20.

Eshraghian

Ward

Neftci

, et al. Training spiking neural networks using lessons from deep learning. Proc IEEE. 2023;111(9):1016-1054.

21.

Liu

Shi

, et al. Classification of three anesthesia stages based on near-infrared spectroscopy signals. IEEE J Biomed Health Inform. 2024;28(9):5270-5279. doi:10.1109/JBHI.2024.3409163

22.

Zhang

Liu

Chen

, et al. Effects of web-based acceptance and commitment therapy on health-related outcomes among patients with lung cancer. Psychooncology. 2024;33(12):e70045. doi:10.1002/pon.70045

23.

Fan

, et al. RAT Ring: Event driven publish/subscribe communication protocol for IIoT. IEEE Trans Ind Inform. 2025a:1-9. doi:10.1109/TII.2025.3567265

24.

Fan

, et al. Anonymity-enhanced sequential multi-signer ring signature for secure medical data sharing in IoMT. IEEE Trans Inf Forensics Secur. 2025b;20:5647-5662. doi:10.1109/TIFS.2025.3574959

25.

Latif

Brohi

Ahmad

, et al. Explainable AI in medical imaging: An interpretable and collaborative federated learning model for brain tumor classification. Front Oncol. 2025;15:1535478.

26.

Nazari

Amiri

. An accurate and fast learning approach in the biologically spiking neural network. Sci Rep. 2025;15(1):6585.

27.

Gatti

Barbato

Zandron

. Spiking neural network classification of X-ray chest images. Knowl Based Syst. 2025;314:113194.

28.

Sahoo

Rani-Panigrahi

Pati

. IoMT-enabled smart healthcare models to monitor critical patients using deep learning algorithms: A review. Comput Sist. 2024;28(4):1823-1832.

29.

Gbadamosi

Ogundokun

Adeniyi

, et al. Medical data analysis for IoT-based datasets in the cloud using naïve Bayes classifier for prediction of heart disease. In: Buyya

Garg

Fortino

Misra

eds. New Frontiers in Cloud Computing and Internet of Things. Springer, Cham; 2022:365-386.

30.

Ogundokun

Babatunde

, et al. Enhancing skin cancer detection and classification in dermoscopic images through concatenated MobileNetV2 and Xception models. Bioengineering. 2023;10(8):979.

31.

Kordnoori

Sabeti

Mostafaei

Banihashemi

. Advances in medical image analysis: A comprehensive survey of lung infection detection. IET Image Process. 2024;18(13):3750-3800.

32.

Zheng

Henaff

Amos

Grover

. Semi-supervised offline reinforcement learning with action-free trajectories. Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023.

33.

Di Caterina

Zhang

Liu

. Theoretical advances and practical applications of spiking neural networks. Front Neurosci. 2024;18:1406502.

34.

Yang

Qiu

, et al. Innovation networks in the advanced medical equipment industry: Supporting regional digital health systems from a local-national perspective. Front Public Health. 2025;13:1635475. doi: 10.3389/fpubh.2025.1635475

35.

Zhong

, et al. Nanozyme-enhanced tyramine signal amplification probe for preamplification-free myocarditis-related miRNAs detection. Chem Eng J. 2025;503:158093. doi: 10.1016/j.cej.2024.158093

36.

Fang

Zhang

, et al. Enhancing the sensitivity of spin-exchange relaxation-free magnetometers using phase-modulated pump light with external Gaussian noise. Opt Express. 2024;32(19):33378-33390. doi: 10.1364/OE.530764

37.

Luan

Lei

, et al. Deep learning for fast super-resolution ultrasound microvessel imaging. Physics in Medicine & Biology. 2023;68(24):245023. doi: 10.1088/1361-6560/ad0a5a

38.

Luan

Lei

, et al. Deep learning for fast denoising filtering in ultrasound localization microscopy. Physics in Medicine & Biology. 2023;68(20):205002. doi: 10.1088/1361-6560/acf98f

39.

Jin

Ouyang

Chen

. A novel dynamic hill cipher and its applications on medical IoT. IEEE Internet Things J. 2025;12(10):14297-14308. doi: 10.1109/JIOT.2025.3525623

40.

Cao

Jin

Zhang

Chen

. A review of zeroing neural network: Theory, algorithm and application. Neurocomputing. 2025;643:130425. doi: 10.1016/j.neucom.2025.130425

41.

Xiang

Jiang

, et al. Multimodal masked autoencoder based on adaptive masking for vitiligo stage classification. J Imaging Inform Med. 2025. doi:10.1007/s10278-025-01521-7

42.

Jiang

Xiang

, et al. White patchy skin lesion classification using feature enhancement and interaction transformer module. Biomed Signal Process Control. 2025;107:107819. doi: 10.1016/j.bspc.2025.107819

43.

Zhu

Feng

, et al. Neuromorphic-enabled video-activated cell sorting. Nat Commun. 2024;15(1):10792. doi:10.1038/s41467-024-55094-0

44.

Zhang

Wei

Berry

Huang

. Age-dependent differential privacy. IEEE Trans Inf Theory. 2024;70(2):1300-1319. doi:10.1109/TIT.2023.3340147

45.

Amiri

Heidari

Zavvar

Navimipour

Esmaeilpour

. The applications of nature-inspired algorithms in internet of things-based healthcare service: A systematic literature review. Transactions on Emerging Telecommunications Technologies. 2024;35(6):e4969.

46.

Wang

Toumaj

Heidari

Souri

Jafari

Jiang

. Neurodegenerative disorders: A holistic study of the explainable artificial intelligence applications. Eng Appl Artif Intell. 2025;153:110752.

47.

Amiri

Heidari

Jafari

Hosseinzadeh

. Deep study on autonomous learning techniques for complex pattern recognition in interconnected information systems. Computer Science Review. 2024;54:100666.

48.

Lei

Dong

Liang

, et al. Parallel resonant magnetic field generator for biomedical applications. IEEE Trans Biomed Circuits Syst. 2025;19(3):496-510. doi: 10.1109/TBCAS.2024.3450881

49.

Wang

, et al. Outlier detection using iterative adaptive mini-minimum spanning tree generation with applications on medical data. Front Physiol. 2023;14:1233341. doi: 10.3389/fphys.2023.1233341

50.

Tang

Wang

Xie

, et al. Classification of distinct tendinopathy subtypes for precision therapeutics. Nat Commun. 2024;15(1):9460. doi: 10.1038/s41467-024-53826-w.

NeuroMorphFusion: A Neuro-Inspired Hybrid Learning Framework for Interpretable Deep Lesion Detection in IoT-Enabled Healthcare Systems

Abstract

Introduction

Methods

Results

Conclusion

Keywords

Introduction

Related Works

Materials and Methods

Dataset Description

Data Preparation

Genetic Algorithm-based Hyperparameter Optimization

Proposed NeuroMorphFusion Architecture

Statistical Analysis

Training Configuration

Performance Evaluation Metrics

Explainability with Grad-CAM

Results

Classification Performance

Training and Validation Dynamics

Confusion Matrix Analysis

Computational Efficiency

Visual Explainability Via Grad-CAM

Comparative Analysis with Existing Studies

Discussion

Clinical and Practical Implications

Limitations and Future Work

Conclusion

Footnotes

Abbreviations

Acknowledgements

ORCID iD

Ethics Statement

Authors Contributions

Funding

Declaration of Conflicting Interests

Data Availability

References