Deep Hybrid CNN–BiLSTM–Attention Model for EEG Classification Using Wavelet Features

Abstract

Background and Purpose

Electroencephalography (EEG) is a popular non-invasive method for studying brain dynamics because of its excellent temporal resolution. However, the non-stationarity, inter-subject variability and class imbalance of EEG data make it difficult to automatically discriminate between brain states that correspond to different cognitive or sensory conditions. This study aims to classify EEG recordings into discrete brain states recorded before and during auditory (mantra) stimulation, using a deep hybrid architecture that combines convolutional neural networks (CNNs), bidirectional long short-term memory (BiLSTM) networks and an attention mechanism to improve discriminative learning from wavelet-based time–frequency features.

Methods

EEG data were recorded from experienced practitioners (mean age: 37 ± 6 years; mean practice: 5 years) in two experimental conditions: (a) at rest before the auditory stimulus and (b) while listening to mantras. Time–frequency representations of each segment were produced using wavelet transforms and fed into a hybrid model that combined convolutional, recurrent and attention layers. To promote stable convergence, adaptive learning rate scheduling and early stopping were used during model optimisation.

Results

The baseline models performed moderately: CNN (76.92%), long short-term memory (LSTM) (75.30%), CNN+LSTM (84.62%) and CNN+BiLSTM (88.65%). The proposed CNN–BiLSTM–Attention model achieved an independent test accuracy of 99.46%, substantially outperforming all baselines. Receiver operating characteristic (ROC) analysis, which produced an area under the curve (AUC) near 1.0, confirmed its high discriminative capability.

Conclusion

The proposed framework’s ability to distinguish between resting and mantra-listening EEG states demonstrates that combining convolutional, recurrent and attention mechanisms greatly improves spatial–temporal feature learning. These results indicate the model’s robustness and its potential use in neurophysiological monitoring and real-time cognitive state detection.

Keywords

Convolutional neural network (CNN), long short-term memory (LSTM), mantra meditation, OM, wavelet

Introduction

A popular method for tracking brain activity is electroencephalography (EEG), which is non-invasive, portable and has a high temporal resolution.¹ It has found use in clinical diagnosis, affective computing, brain–computer interfaces (BCIs) and cognitive neuroscience.^{2, 3} Nevertheless, accurate classification from limited data remains difficult because EEG signals are intrinsically non-stationary, extremely sensitive to noise and show substantial inter-subject variability.⁴ Conventional EEG classification methods apply classifiers such as support vector machine (SVM) and k-nearest neighbours (k-NN) to manually constructed features, including power spectral density, entropy and wavelet coefficients.⁵

Although these techniques work effectively under controlled circumstances, their limited capacity to capture intricate spatial–temporal patterns frequently results in poor generalisation. Additionally, they perform poorly when the dataset is unbalanced.⁶ In EEG analysis, deep learning techniques have demonstrated great promise⁷; however, they require large amounts of data. Convolutional neural networks (CNNs) are good at modelling spatial correlations across electrode channels, while recurrent neural networks (RNNs), especially long short-term memory (LSTM) and its bidirectional variant (BiLSTM), capture the temporal dynamics of brain activity.^{8, 9}

According to recent research,^{10, 11} hybrid CNN–RNN architectures can improve classification by utilising both temporal and spatial representations. EEG-based models have been further improved by attention mechanisms, which adaptively highlight important temporal or spectral features.¹² The wavelet transform has also proved effective in decomposing non-stationary EEG into useful time–frequency representations, frequently surpassing Fourier-based methods in clinical and cognitive EEG tasks.¹³ Beyond conventional EEG classification, a growing body of research has examined how auditory stimuli, especially mantras and meditation sounds, affect brain activity.

Research on the sacred syllable ‘OM’ has reported deep relaxation states, improved brain synchronisation and increased alpha and theta oscillations.^{14–16} These results imply that distinctive spatial–temporal patterns in EEG data captured during mantra-based auditory stimulation can be used for classification. This article presents a hybrid CNN–BiLSTM–Attention model for EEG classification based on wavelet features. In contrast to previous efforts, this method integrates convolutional, recurrent and attention mechanisms to enable robust feature learning across spatial, temporal and discriminative dimensions. The model was evaluated on EEG signals recorded before and during mantra stimulation, achieving a test accuracy of 99.46%, substantially outperforming baseline models and highlighting the potential of deep hybrid frameworks for meditation-related EEG research.

Despite significant advances, several research gaps remain in EEG-based deep learning. Much current research uses only CNN or only RNN architectures, which limits the joint modelling of spatial and temporal relationships. Moreover, EEG classification frameworks rarely incorporate attention mechanisms, which can improve discriminative feature selection. Furthermore, despite their ability to handle non-stationary EEG inputs, wavelet-based representations are often neglected in hybrid deep networks, especially for paradigms related to cognition or meditation, such as mantra listening. These gaps motivate the creation of an architecture that can combine spatial, temporal and discriminative information for reliable EEG classification in the face of sparse and unbalanced data. The main contributions of this study are as follows:

A deep hybrid CNN–BiLSTM–Attention model is proposed to simultaneously extract spatial, temporal and discriminative representations from EEG signals.

Wavelet-based time–frequency features are employed as model inputs to enhance interpretability and effectively handle signal non-stationarity.

On mantra-related EEG data, the proposed model outperforms baseline architectures such as CNN, LSTM, CNN+LSTM and CNN+BiLSTM in classification performance.

The study provides empirical evidence that attention-enhanced hybrid architectures can improve the decoding of brain states during auditory meditation tasks.

The rest of the article is structured as follows: Section 2 reviews the literature. Section 3 describes the proposed CNN–BiLSTM–Attention architecture and the training methodology. Section 4 presents the experimental results, including comparisons with baseline models. Section 5 discusses the findings, and Section 6 concludes with the key findings and suggestions for further research.

Literature Survey

Early EEG classification studies combined handcrafted features, such as power spectral density and entropy, with machine learning classifiers.^{17, 18} Although effective on simple tasks, these methods were limited on difficult tasks because they could not capture higher-level representations. Since the advent of deep learning, CNNs have shown success in capturing spatial interdependence across EEG channels.¹⁹ LSTM and BiLSTM networks, in turn, are designed to mitigate the vanishing-gradient problem that affects plain RNNs on lengthy sequences, making them well suited to modelling temporal dependencies.²⁰

To increase classification accuracy in EEG tasks, hybrid CNN–RNN architectures that combine spatial and temporal learning have been developed. More recently, attention mechanisms have been added to improve discriminative power by concentrating on the most pertinent EEG features.²¹ Because EEG is non-stationary, wavelet transforms offer an effective method for time–frequency decomposition. According to a number of studies, wavelet-based representations are preferred over Fourier approaches in classification.^{22–25}

Despite these developments, little research has combined wavelet-transformed EEG features with CNN, BiLSTM and attention. At the same time, interest has grown in how auditory stimuli, especially mantras and meditative sounds, affect brain dynamics. According to neurophysiological research,^{26, 27} reciting the sacred syllable ‘OM’ causes notable changes in EEG activity, such as increased alpha and theta power and improved synchronisation between cortical areas. These oscillatory changes are associated with reduced stress responses, attentional modulation and relaxation.

Furthermore, the significance of limbic and temporal lobe structures in producing altered states of consciousness has been emphasised by EEG investigations on auditory stimuli processing and mantra meditation.²⁸ EEG recordings made during auditory or contemplative exercises have also been subjected to machine learning and deep learning techniques.²⁷ For instance, it was shown that CNN-based models could accurately distinguish EEG states brought on by listening to music.²⁹

One study³⁰ demonstrated that RNNs could consistently discriminate between different brain responses to mantra repetition. These results support the capacity of deep neural architectures to reveal subtle neuronal patterns connected to auditory stimuli. However, the existing literature remains fragmented: traditional handcrafted methods struggle to generalise, while deep learning models often overlook the combined importance of spatial, temporal and discriminative feature extraction. Additionally, very few studies have examined EEG classification under meditative auditory stimuli using advanced hybrid deep learning.

Moreover, artefacts from muscle activity, eye blinks and other physiological or ambient noise sources frequently contaminate EEG recordings. These artefacts distort neuronal signals and degrade classification accuracy.³¹ Removing artefacts effectively is therefore an essential pre-processing step. Although conventional techniques such as adaptive filtering and independent component analysis (ICA) are frequently employed, more sophisticated strategies have recently been developed. Wavelet-based filtering combined with meta-heuristic optimisation has been shown to suppress muscular artefacts efficiently, while deep CNNs have been used successfully for automated identification and removal of different EEG artefacts.³² These developments emphasise the importance of strong denoising pipelines, especially in EEG investigations of cognition and meditation, where subtle oscillatory variations need to be preserved.

EEG analysis continues to benefit greatly from both deep learning and conventional machine learning techniques. Algorithms such as support vector machines, random forests and k-nearest neighbours have been widely employed using handcrafted spectral, temporal and entropy-based features. By using sophisticated signal decomposition methods such as variational mode decomposition, which facilitates the effective extraction of frequency-domain information for classification tasks, recent studies^{33, 34} have improved these frameworks.

By proposing a hybrid CNN–BiLSTM–Attention architecture trained on wavelet-transformed EEG data, the present study addresses this gap. The proposed model provides a strong foundation for EEG analysis in meditation research by integrating spatial, temporal and attention-based mechanisms to attain state-of-the-art performance in differentiating EEG states before and during mantra-based auditory stimuli.

Methods

Data Acquisition

Eleven experienced mantra practitioners provided EEG recordings. All participants were female, with a mean age of 37 ± 6 years and an average of 5 years of practice, and were recruited from a group that regularly chanted mantras. Candidates were screened before participation to exclude anyone with a documented history of neurological or psychiatric conditions. Informed consent was obtained before any experiment began. The data were captured with the Emotiv Flex 32-channel EEG device (Emotiv Inc., USA; 128 Hz sampling rate; electrode placement based on the 10–20 system). Recording took place in a silent room, and electrode impedance was kept below 10 kΩ throughout. The experiment comprised two conditions. In the resting state (before the stimulus), participants were instructed to sit comfortably, close their eyes and relax while EEG was recorded for one minute. In the stimulus condition (mantra chanting state), which followed the resting state, EEG was recorded for two minutes while the participants listened to the mantra ‘Om’ through an audio device. Each subject completed three trials per condition. Each file contained a multi-channel EEG matrix (channels × samples); all signals were sampled at 128 Hz and saved in .mat format.

Data Pre-processing

The pre-processing pipeline was designed to convert raw EEG data into representations suitable for deep learning. Before feature extraction and classification, the raw EEG data passed through the following steps:

Band-pass filtering: Signals were filtered between 0.5 and 45 Hz to remove slow drifts and high-frequency noise:

x_{bp}(t) = \sum_{k=0}^{N} b_{k} \, x(t-k) - \sum_{m=1}^{M} a_{m} \, x_{bp}(t-m)

(1)

where x_bp(t) is the band-pass filtered signal, x(t) is the raw EEG signal, b_k and a_m are the feedforward and feedback coefficients of the infinite impulse response (IIR) filter, and N and M are the corresponding filter orders.

Notch filtering: To remove 50 Hz powerline noise, a digital notch filter was applied:

H(z) = \frac{1 - 2 \cos\left(\frac{2 \pi f_{0}}{f_{s}}\right) z^{-1} + z^{-2}}{1 - 2 r \cos\left(\frac{2 \pi f_{0}}{f_{s}}\right) z^{-1} + r^{2} z^{-2}}

(2)

where H(z) is the notch filter transfer function, f₀ = 50 Hz is the notch frequency, f_s = 128 Hz is the sampling rate and r = 0.95 is the pole radius controlling the notch bandwidth.
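As a minimal sketch of how Eqs. (1) and (2) translate into code, the snippet below builds the notch coefficients from f₀ = 50 Hz, f_s = 128 Hz and r = 0.95 and applies them with the direct-form difference equation. It uses only NumPy; the helper names (`notch_coefficients`, `iir_filter`, `gain`) and the two-tone test signal are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def notch_coefficients(f0=50.0, fs=128.0, r=0.95):
    """Return (b, a) for the second-order notch of Eq. (2)."""
    c = np.cos(2.0 * np.pi * f0 / fs)
    b = np.array([1.0, -2.0 * c, 1.0])         # zeros on the unit circle at +/- f0
    a = np.array([1.0, -2.0 * r * c, r ** 2])  # poles at radius r, same angle
    return b, a

def iir_filter(b, a, x):
    """Direct-form difference equation of Eq. (1); a[0] is assumed to be 1."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        acc = 0.0
        for k in range(len(b)):          # feedforward terms b_k * x(t - k)
            if t - k >= 0:
                acc += b[k] * x[t - k]
        for m in range(1, len(a)):       # feedback terms a_m * y(t - m)
            if t - m >= 0:
                acc -= a[m] * y[t - m]
        y[t] = acc
    return y

def gain(b, a, f, fs=128.0):
    """Magnitude of H(z) evaluated at z = exp(j * 2 * pi * f / fs)."""
    z_inv = np.exp(-2j * np.pi * f / fs)
    num = sum(bk * z_inv ** k for k, bk in enumerate(b))
    den = sum(am * z_inv ** m for m, am in enumerate(a))
    return abs(num / den)

fs = 128.0
b, a = notch_coefficients(f0=50.0, fs=fs, r=0.95)

# Hypothetical test signal: a 10 Hz tone plus 50 Hz interference, 5 s long.
t = np.arange(0.0, 5.0, 1.0 / fs)
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 50 * t)
y = iir_filter(b, a, x)

gain_50 = gain(b, a, 50.0, fs)   # gain at the notch frequency
gain_10 = gain(b, a, 10.0, fs)   # gain inside the EEG band
tail_rms = float(np.sqrt(np.mean(y[-256:] ** 2)))  # after the transient decays
```

Because Eq. (2) places its zeros exactly on the unit circle at ±f₀, the gain at 50 Hz is zero up to rounding error. The 10 Hz component passes with a gain slightly above one, since Eq. (2) carries no normalising constant.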

ICA: The observed EEG matrix X was modelled as a linear mixture of statistically independent sources:

X = A S

(3)

where A is the mixing matrix and S contains the independent source components. An unmixing matrix W was estimated such that:

S = W X

(4)

Components corresponding to ocular and muscle artefacts were visually identified and removed.

Segmentation: The continuous EEG signal was divided into non-overlapping epochs:

x_{i} (t) = x (t + i T_{s}), t \in [0, T_{e}]

(5)

where x_i(t) denotes the ith EEG epoch, T_e is the epoch duration (window length) and T_s is the step size; for non-overlapping epochs, T_s = T_e.

Baseline correction: For each epoch, the mean baseline was subtracted to remove DC offset:

x_{\mathrm{corr}}(t) = x(t) - \mu, \quad \mu = \frac{1}{N} \sum_{t=1}^{N} x(t)

(6)

where x_corr(t) is the baseline-corrected signal, μ is the mean amplitude of the segment and N is the number of samples in the epoch.

Re-referencing: EEG signals were re-referenced using the common average reference (CAR) method:

x_{i}^{\mathrm{CAR}}(t) = x_{i}(t) - \frac{1}{M} \sum_{j=1}^{M} x_{j}(t)

(7)

where x_i^CAR(t) is the re-referenced signal of channel i and M is the total number of EEG channels.
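Baseline correction (Eq. 6) and common average referencing (Eq. 7) are both mean subtractions along different axes, which the following NumPy sketch illustrates. The epoch layout (channels × samples), the function names and the synthetic data are assumptions made for the example only.

```python
import numpy as np

def baseline_correct(epoch):
    """Eq. (6): subtract each channel's mean over the epoch.

    epoch has shape (channels, samples).
    """
    return epoch - epoch.mean(axis=1, keepdims=True)

def common_average_reference(epoch):
    """Eq. (7): subtract the mean across channels at every time sample."""
    return epoch - epoch.mean(axis=0, keepdims=True)

# Synthetic 32-channel, 5 s epoch at 128 Hz with a different DC offset per channel.
rng = np.random.default_rng(0)
epoch = rng.standard_normal((32, 640)) + rng.uniform(-10, 10, size=(32, 1))

corrected = baseline_correct(epoch)
referenced = common_average_reference(corrected)
```

After the first step every channel has zero mean over time; after the second, the channels sum to zero at every sample.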

Segmentation into Fixed Windows

Since EEG is a continuous signal, it was divided into short temporal segments to capture non-stationary dynamics. A window length of 5 seconds (640 samples) was used. The choice of a 5-second window was based on previous studies.^{35, 36} Non-overlapping windows were extracted, producing multiple trials per recording. Each segment retained the full set of EEG channels.
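The windowing step can be sketched with a reshape, assuming the recording is held as a (channels × samples) NumPy array. Dropping any trailing samples that do not fill a whole window is an assumption of this sketch; the text does not say how remainders were handled.

```python
import numpy as np

def segment(recording, fs=128, window_s=5.0):
    """Split (channels, samples) into non-overlapping windows.

    Returns an array of shape (n_windows, channels, window_samples).
    Trailing samples that do not fill a window are discarded (assumption).
    """
    win = int(round(fs * window_s))            # 5 s * 128 Hz = 640 samples
    n_windows = recording.shape[1] // win
    trimmed = recording[:, : n_windows * win]
    stacked = trimmed.reshape(recording.shape[0], n_windows, win)
    return stacked.transpose(1, 0, 2)

# Synthetic one-minute, 32-channel recording (60 s * 128 Hz = 7680 samples).
rng = np.random.default_rng(0)
recording = rng.standard_normal((32, 60 * 128))
windows = segment(recording)
```

A one-minute recording of exactly 7680 samples therefore yields 12 windows, and a two-minute recording of 15360 samples yields 24.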

Wavelet Decomposition

The discrete wavelet transform (DWT) with the Daubechies-4 (db4) wavelet and five levels of decomposition was used to extract time–frequency information. Previous EEG research^{37, 38} has frequently adopted db4 because it resembles EEG waveform morphology and preserves the energy distribution across frequency bands. Each channel segment was decomposed into five sets of detail coefficients and one set of approximation coefficients, capturing oscillations in progressively lower frequency bands. Because the coefficient arrays differ in length between levels, all of them were zero-padded to a common length. The outcome for each segment was a 3D tensor of shape (channels, padded_length, levels + 1). The overall system flow is shown in Figure 1.

Figure 1.

Overview of the Proposed Pipeline Used in the Present Study.
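The coefficient lengths produced by the five-level db4 decomposition, and hence the padded length, can be worked out with simple arithmetic. The sketch below assumes a non-periodized boundary mode (such as the PyWavelets default), for which each level maps n samples to floor((n + L − 1) / 2) coefficients with filter length L = 8 for db4. The library the authors used is not stated, and the all-ones arrays are placeholders for real coefficients.

```python
import numpy as np

def dwt_lengths(n_samples, filter_len=8, levels=5):
    """Lengths of [cA_L, cD_L, ..., cD_1] for a non-periodized DWT."""
    detail = []
    n = n_samples
    for _ in range(levels):
        n = (n + filter_len - 1) // 2   # output length of one DWT level
        detail.append(n)
    # The approximation at the deepest level has the same length as cD_L.
    return [detail[-1]] + detail[::-1]

def pad_and_stack(coeffs):
    """Zero-pad every coefficient array to the longest one and stack them."""
    target = max(len(c) for c in coeffs)
    padded = [np.pad(c, (0, target - len(c))) for c in coeffs]
    return np.stack(padded, axis=-1)    # shape (padded_length, levels + 1)

lengths = dwt_lengths(640)                   # one channel of a 5 s window
coeffs = [np.ones(n) for n in lengths]       # placeholder coefficient arrays
stacked = pad_and_stack(coeffs)
```

For a 640-sample window this gives lengths of 26, 26, 46, 86, 165 and 323, so every array is padded to 323 values, which is consistent with the 323 time steps and 6 sub-bands reported in the Data Preparation section.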

Data Storage and Indexing

Each wavelet-decomposed window was stored as a NumPy .npy file. A structured comma-separated values (CSV) index file contained the following metadata: condition, subject, trial, channel count, decomposition levels and window length. This index was later used to load data efficiently for model training.

Data Preparation

The wavelet-transformed EEG data were arranged into structured NumPy arrays with dimensions (32 × 6 × 323), corresponding to 32 channels, 6 frequency sub-bands and 323 time steps, respectively. The metadata indexed the signals by experimental condition (before and during). To ensure robust evaluation, the dataset was divided into 70% training, 15% validation and 15% testing using stratified sampling to maintain class proportions. After splitting, the dataset comprised 862 training samples, 185 validation samples and 185 test samples.

Handling Class Imbalance

The dataset was moderately imbalanced, with 64.53% of samples in the during condition and 35.47% in the before condition. To avoid model bias in favour of the majority class, class weights were computed using the compute_class_weight function. The resulting weights were 1.408 for before and 0.775 for during. These weights were applied to the loss during training to increase the penalty for misclassifying the minority class.
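The reported weights are consistent with the "balanced" heuristic, w_c = n_samples / (n_classes × n_c), which is what scikit-learn's compute_class_weight returns when class_weight="balanced". The dependency-free sketch below reproduces the arithmetic. The per-class training counts (306 before, 556 during) are not reported in the text; they are an assumption inferred from the stated proportions and the 862 training samples.

```python
from collections import Counter

def balanced_class_weights(labels):
    """w_c = n_samples / (n_classes * count_c) for every class c."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}

# Assumed training-set composition (not reported by the authors).
labels = ["before"] * 306 + ["during"] * 556
weights = balanced_class_weights(labels)
# weights["before"] = 862 / (2 * 306), weights["during"] = 862 / (2 * 556)
```

Under these assumed counts the weights come out at roughly 1.408 and 0.775, so an error on a before sample costs about 1.8 times as much as an error on a during sample.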

Hybrid CNN–BiLSTM–Attention Model

A model that combines a CNN, a BiLSTM and an attention layer was created to capture both temporal and spatial relationships in EEG signals. The architecture consists of the following modules:

CNN: A 3D-CNN branch extracted spatial and spectral features across channels and frequency sub-bands. Two convolutional layers were employed: (i) Conv3D with 32 filters (kernel: 3 × 3 × 1) followed by batch normalisation and ReLU activation; (ii) Conv3D with 64 filters (kernel: 3 × 3 × 1) followed by batch normalisation and dropout.

BiLSTM: The CNN output was then reshaped and passed through a BiLSTM layer with 128 units. BiLSTMs capture temporal dependencies in both the forward and backward directions, which is important for EEG signals.

Attention mechanism: An attention layer was applied on the outputs of BiLSTM to assign higher weights to the most informative time steps, enabling the model to focus on salient temporal patterns.

Classifier: A fully connected dense layer with softmax activation produced the final class probabilities (before vs. during).
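The attention step can be illustrated with a small NumPy sketch that scores each BiLSTM time step, normalises the scores with a softmax and returns the weighted sum. The text does not specify the scoring function, so the additive (tanh) form used here is an assumption. The sequence length T = 20, the attention width A = 64 and the random matrices standing in for learned weights are also assumptions, as is D = 256, which presumes that the two 128-unit directions are concatenated.

```python
import numpy as np

def attention_pool(h, w, b, v):
    """Additive attention over time steps (assumed form, not the authors').

    h: (T, D) sequence of BiLSTM outputs
    w: (D, A), b: (A,), v: (A,) parameters that would normally be learned
    Returns the context vector (D,) and the attention weights (T,).
    """
    scores = np.tanh(h @ w + b) @ v          # one scalar score per time step
    scores = scores - scores.max()           # stabilise the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum()        # softmax over time
    context = weights @ h                    # weighted sum of time steps
    return context, weights

rng = np.random.default_rng(0)
T, D, A = 20, 256, 64                        # hypothetical sizes
h = rng.standard_normal((T, D))
w = rng.standard_normal((D, A)) * 0.05
b = np.zeros(A)
v = rng.standard_normal(A)

context, weights = attention_pool(h, w, b, v)
```

The context vector is what a dense softmax classifier would receive; time steps with larger weights contribute more to it.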

Training Strategy

The model was trained with the Adam optimiser (initial learning rate of 1 × 10⁻³), sparse categorical cross-entropy loss and a batch size of 16. Three measures were used to improve convergence and prevent overfitting: early stopping monitored the validation loss with a patience of 8 epochs, ReduceLROnPlateau halved the learning rate after 4 stagnant epochs, and dropout layers (rate = 0.3) were placed after the CNN and BiLSTM layers.
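How the two patience counters interact can be seen in a framework-free simulation. This is a simplified sketch of the logic only: it ignores options such as minimum improvement thresholds and cooldown periods that real callback implementations offer, and the validation-loss sequence is invented for illustration.

```python
def simulate_schedule(val_losses, lr=1e-3, es_patience=8, lr_patience=4, factor=0.5):
    """Return (stop_epoch, learning rate used in each epoch).

    Simplified logic: both counters reset whenever the validation loss
    improves on the best value seen so far.
    """
    best = float("inf")
    es_wait = 0      # epochs without improvement, for early stopping
    lr_wait = 0      # epochs without improvement, for the LR scheduler
    lrs = []
    for epoch, loss in enumerate(val_losses, start=1):
        lrs.append(lr)                  # rate in effect during this epoch
        if loss < best:
            best = loss
            es_wait = 0
            lr_wait = 0
            continue
        es_wait += 1
        lr_wait += 1
        if lr_wait >= lr_patience:
            lr *= factor                # halve the rate for the next epoch
            lr_wait = 0
        if es_wait >= es_patience:
            return epoch, lrs           # stop training here
    return len(val_losses), lrs

# Invented history: improvement for 3 epochs, then a plateau.
val_losses = [1.0, 0.5, 0.3] + [0.4] * 17
stop_epoch, lrs = simulate_schedule(val_losses)
```

With this history the rate is halved after epoch 7 (four stagnant epochs) and training stops at epoch 11 (eight stagnant epochs), so the scheduler gets one full chance to help before early stopping ends the run.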

Evaluation Metrics

The performance of the proposed model was assessed on the independent test set using accuracy, the confusion matrix and the receiver operating characteristic area under the curve (ROC-AUC; see Figure 3). Accuracy and loss curves from the training history were also examined to track convergence and generalisation. Additionally, the baseline models in Table 1 were evaluated to provide a fair comparison; the training conditions were identical for every model.
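For readers less familiar with ROC-AUC, the metric equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The pure-Python sketch below computes it by direct pairwise comparison; the four labels and scores are a toy example and are not data from this study.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: 1 = during, 0 = before; scores are predicted probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)   # 3 of the 4 pairs are ordered correctly
```

An AUC near 1.0, as reported for the proposed model, means that almost every during sample is scored above almost every before sample.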

Table 1.

Performance Comparison of Baseline Models with Classification Metrics.

Model	Precision	Recall	F1-score	Accuracy (%)
CNN	0.75	0.72	0.66	72.00
LSTM	0.71	0.71	0.71	71.00
CNN + LSTM	0.80	0.79	0.78	79.00
CNN + BiLSTM	0.85	0.83	0.82	83.00
Proposed model	0.99	1.00	0.99	99.46

Results

The proposed CNN–BiLSTM–Attention model (see Figure 2 for the architecture) was trained for a maximum of 20 epochs with adaptive learning rate scheduling and early stopping. Figure 3 shows the training and validation performance across epochs. The model converged rapidly and improved steadily: during the first epochs the validation loss fell from 1.0424 to 0.0949, while the validation accuracy rose from 64.32% to 96.76% by the fourth epoch. From epoch 5 onwards the network was near saturation, with validation accuracy fluctuating only slightly above 97%. The learning rate reduction triggered at epoch 14 further stabilised training and led to smooth convergence. At the end of training, the model had a validation loss of 0.0479 and a validation accuracy of 99.46%. Its accuracy of 99.46% on the independent test set indicates that it generalised effectively to unseen data. The confusion matrix in Figure 3(c) reveals a high level of separability between the before and during conditions. With an AUC near 1.0, the ROC curve in Figure 3(a) provides additional evidence of the proposed framework’s discriminative power. Overall, the findings show that the hybrid CNN–BiLSTM–Attention architecture achieves strong classification performance even in the presence of class imbalance by efficiently capturing spatial, temporal and discriminative characteristics from wavelet-transformed EEG signals.

Figure 2.

Overview of the Proposed Model Architecture.

Figure 3.

Performance Evaluation of the Proposed Hybrid CNN–BiLSTM–Attention Model.

Before assessing the proposed CNN–BiLSTM–Attention architecture with class imbalance handling, a number of baseline tests were carried out to establish reference performance. Table 1 compares the proposed model with the baseline and other hybrid models, and Table 2 compares the proposed work with recent literature.

Table 2.

Summary of Contemporary Work Related to the Proposed Work Where OM is Used as Stimuli.

Reference	Accuracy	Models Used	Subjects
¹⁵	70%	SVM and delta band	23
¹⁶	Not reported	Cortical areas activation	20
²⁸	Not reported	Neurohemodynamic analysis	12
²⁷	Not reported	Galvanic skin response	20
²⁹	Not reported	Feature analysis	22

Note: SVM: Support vector machine.

The standalone CNN and LSTM models performed moderately, with accuracies of 72.00% and 71.00%, respectively. When the CNN was combined with recurrent layers (CNN+LSTM and CNN+BiLSTM), performance increased to 79.00% and 83.00%, respectively. However, with a test accuracy of 99.46%, the proposed CNN–BiLSTM–Attention model performed markedly better than all baselines. Even in the presence of class imbalance, this improvement demonstrates how well the hybrid architecture captures spatial, temporal and discriminative representations of EEG signals. In summary, each added component raised accuracy: from roughly 71–72% for CNN or LSTM alone, to 79.00% for CNN+LSTM, to 83.00% for CNN+BiLSTM and, with the addition of the attention mechanism and class imbalance handling, to 99.46%.

Discussion

The comparative analysis (see Tables 1 and 2) emphasises how important it is to combine attention-based, recurrent and convolutional mechanisms for reliable EEG classification. With class imbalance handling, the CNN–BiLSTM–Attention hybrid model obtained a test accuracy of 99.46%, as opposed to 72.00% for CNN, 71.00% for LSTM, 79.00% for CNN+LSTM and 83.00% for CNN+BiLSTM. Two key elements are responsible for this improvement: (a) the synergistic combination of spatial feature extraction (CNN), temporal modelling (BiLSTM) and discriminative focusing (attention); and (b) the explicit treatment of class imbalance, which supported stable learning for the under-represented condition.

These results align with recent developments in deep learning for EEG analysis. CNN-based architectures, such as EEGNet, have shown strong generalisation in motor imagery and event-related potential (ERP) research,³⁹ whereas recurrent models, especially LSTM and BiLSTM, have shown promise in capturing temporal dependencies of brain activity.³⁰ By simultaneously modelling spatial and temporal dynamics, hybrid CNN–RNN models have been shown to improve performance.⁴⁰ The integration of attention mechanisms in EEG meditation stage classification has further underscored the value of selectively weighting salient features.⁴¹ The current findings support this line of evidence by demonstrating that substantial gains in discriminative power can be achieved by applying attention on top of CNN–BiLSTM networks.

The use of wavelet-based time–frequency features is a further vital component. Since EEG signals are by nature non-stationary, brief oscillatory fluctuations are frequently missed by Fourier-based spectral approaches. Wavelet transforms have been shown to give better time–frequency resolution, boosting classification accuracy across a number of cognitive and clinical EEG applications.⁴² The results presented here support the usefulness of wavelet-based pre-processing for representing EEG features.

The setting of mantra-based auditory stimulation adds further relevance. Previous neurophysiological investigations have shown that chanting the syllable ‘OM’ raises alpha and theta power, improves cortical synchronisation and modulates limbic and prefrontal circuits.^{43, 44} The current results extend these observations by showing that deep hybrid models with wavelet features can achieve nearly flawless classification, suggesting that the neural signatures of mantra-induced states can be captured efficiently by contemporary architectures. Taken together, the data indicate that auditory stimuli such as OM chanting produce consistent and discriminative brain responses, making them well suited to deep learning-based classification. The findings advance both the methodology of EEG analysis and the neuroscientific understanding of the brain dynamics associated with mantra.

Despite the excellent classification accuracy shown in this work, some uncertainties and limitations should be noted. Even after pre-processing and artefact rejection, EEG recordings remain susceptible to noise, motion artefacts and inter-subject variability, which could alter the derived features. Similarly, the deep learning models employed here rely on stochastic optimisation, hyperparameter tuning and parameter initialisation, all of which can affect generalisation and convergence. Because the study was restricted to a moderate sample size and a particular experimental paradigm using OM chanting, more extensive validation across a range of demographics and stimuli is required.

To improve robustness and interpretability, future research should examine cross-subject transfer learning, multimodal fusion (such as EEG–fNIRS or EEG–galvanic skin response (GSR)) and adaptive and explainable architectures. Sensitivity analysis could be used to further quantify the effects of noise and parameter changes, strengthening the case for the validity of EEG-based classification of meditation states.

Ablation Study Observations

Table 3 summarises the ablation study of the proposed model. The internal ablation study offers quantitative evidence of which components of the proposed hybrid architecture are functionally necessary. Most significantly, performance declined drastically when the attention module was removed, with the test accuracy falling from 0.9946 (full model) to 0.7459 and the Matthews correlation coefficient (MCC) falling from 98.82% to 41.90%. This result demonstrates that the selective temporal weighting strategy is essential for efficiently aggregating features across the BiLSTM output sequence and attaining high discriminative power. By contrast, the dilated convolution component is not critical for accuracy: removing it lowered accuracy only slightly (from 0.9946 to 0.9892) but increased training time unexpectedly, from 777.92 s to 1,013.00 s. The full model therefore combines near-perfect classification performance (accuracy ∼ 0.995) with the shortest training time of all variants except ‘No spatial mixing’ (708.64 s). This suggests that the specific dilation rate used contributed only marginally to feature extraction for this wavelet representation. Collectively, the results demonstrate that the temporal sequence-to-feature mapping controlled by the attention mechanism and the synergistically combined CNN feature extraction are essential structural prerequisites for optimising the model’s capacity to distinguish between the EEG states recorded in this particular dataset.

Table 3.

Ablation Study Results Summary.

Configuration	Acc	F1_w	ROC	MCC_%	Kappa_%	TT (s)
Full model	0.9946	0.9946	1.0000	98.82	98.81	777.92
No dilated convolution	0.9892	0.9892	0.9997	97.63	97.63	1,013.00
No attention	0.7459	0.7368	0.7941	41.90	41.15	1,271.13
No spatial mixing	0.9676	0.9673	0.9990	92.89	92.78	708.64
No laplacian regularisation	0.9838	0.9838	0.9988	96.44	96.43	1,210.14

Note: Acc = accuracy; F1_w = weighted F1-score; ROC = ROC-AUC; MCC = Matthews correlation coefficient; TT = training time. Sample sizes: n_train = 862, n_val = 185, n_test = 185.
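The MCC and Cohen's kappa values in Table 3 can be related to a confusion matrix with the short sketch below. The confusion matrix itself is hypothetical: an accuracy of 99.46% on 185 test samples corresponds to a single error (184/185), but the class split (66 before, 119 during) and the direction of that error are assumptions made here for illustration, with during treated as the positive class.

```python
import math

def mcc_and_kappa(tp, fn, fp, tn):
    """Matthews correlation coefficient and Cohen's kappa for a 2x2 matrix."""
    n = tp + fn + fp + tn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    p_observed = (tp + tn) / n
    # Chance agreement from the row and column marginals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (p_observed - p_chance) / (1.0 - p_chance)
    return mcc, kappa

# Hypothetical test-set outcome: one before sample predicted as during.
tp, fn, fp, tn = 119, 0, 1, 65
accuracy = (tp + tn) / (tp + fn + fp + tn)
mcc, kappa = mcc_and_kappa(tp, fn, fp, tn)
```

Under these assumptions both statistics come out at about 0.988, in line with the full-model row of Table 3. Because the two metrics correct for class proportions, they fall much further than accuracy when the attention module is removed.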

Conclusion

Using wavelet-based features, a CNN–BiLSTM–Attention hybrid model was created for EEG classification in the presence of mantra stimuli. With a test accuracy of 99.46%, the framework significantly outperformed the hybrid, LSTM and traditional CNN baselines. Attaining state-of-the-art performance required the explicit treatment of class imbalance in conjunction with the integration of spatial, temporal and attention-based methods. The results demonstrate that OM chanting and associated mantra practices result in consistent, objectively quantifiable changes in EEG dynamics from a neurocognitive standpoint. From a computational standpoint, the study shows that a strong and broadly applicable method for non-stationary EEG signals may be obtained by fusing deep hybrid architectures with wavelet-based time–frequency analysis. This approach should be expanded in future studies to include real-time BCI applications, cross-subject generalisation investigations and other meditation modalities. The findings suggest that deep learning techniques can significantly advance applied cognitive monitoring as well as EEG-based meditation research.

Footnotes

Acknowledgment

The final version has been authorised by all of the authors. The authors thank the Department of Information Technology for their support in conducting research.

Authors’ Contributions

Tony Bayan: Formal analysis, conceptualisation, writing–original draft, data curation, editing, analysis.

Daisy Das: Formal analysis, conceptualisation, writing–original draft, data curation, editing.

Nabamita Deb: Conceptualisation, editing, supervision.

Statement of Ethics

The Gauhati University Ethics Committee approved the research presented in the article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The authors received no financial support for the research, authorship and/or publication of this article.

Data Availability

Data may be requested from the corresponding author.

Patient Consent

All subjects provided signed informed consent before EEG recording.
