Abstract
Emotion recognition based on EEG signals is a critical component of human-machine collaborative environments and psychiatric health diagnosis. However, EEG patterns vary across subjects because of factors such as user fatigue, differing electrode placements, and varying impedances. This variability makes the performance of EEG-based emotion recognition highly subject-specific, so that time-consuming individual calibration sessions are required to adapt a system to each new subject. Recently, domain adaptation (DA) strategies have achieved a great deal of success in inter-subject adaptation. However, most of them can only adapt one subject to another, which limits their applicability in real-world scenarios. To alleviate this issue, this paper proposes a novel unsupervised DA strategy called Multi-Subject Subspace Alignment (MSSA), which combines subspace alignment and multi-subject information in a unified framework to build personalized models without user-specific labeled data. Experiments on the public SEED EEG dataset verify the effectiveness of MSSA and its superiority over other state-of-the-art methods in multi-subject scenarios.
Introduction
Emotion recognition is a critical component in the design of man-machine interaction and of clinical applications for mental disorders. So far, research on emotion recognition has mainly focused on facial expression, speech, and certain physiological signals [1, 2]; of these, methods based on EEG signals have attracted great interest due to their objectivity and sensitivity to emotional reactivity.
Various feature extraction and machine learning methods have been applied with great success to EEG-based emotion recognition [3, 4]. However, distinguishing EEG-based emotional characteristics across subjects remains difficult because of individual differences and the high non-stationarity of EEG signals [5, 6, 7]. In other words, the performance of a classifier trained well on Subject 1 usually degrades on Subject 2 under the experimental setup shown in Fig. 1a and b. In recent years, domain adaptation (DA) algorithms, which have a strong ability to minimize domain discrepancy, have been widely studied for cross-subject classification problems [8].
An example demonstrating the domain adaptation algorithm. (a) Data distribution of Subject 1. (b) Data distribution of Subject 2. (c) The transformed data after domain adaptation. The aim of domain adaptation is to reduce the discrepancy between the distributions and to make classifiers robust to both Subjects 1 and 2.
Overview of multi-subject subspace alignment.
In general, the domain with labeled data, which is used for training, is called the source domain, and the domain where the target task is conducted, which has a different distribution, is called the target domain. As shown in Fig. 1c, assuming that the data distributions of the source and target domains differ, domain adaptation mainly focuses on learning features common to the source and target domains and on training new classifiers, such as a support vector machine (SVM) or logistic regression (LR), that are appropriate for both domains [9, 10, 11, 12].
However, the domain adaptation strategies mentioned above can in fact only adapt one person to another. In reality, an EEG dataset contains many more individuals, and simply pooling all the subjects together to compose the source domain can make the distribution of the training data inconsistent. If only one subject in the EEG dataset is used for training, the prediction for the new subject can be considered a single task. Such a single task wastes the extra information provided by the other subjects in the available dataset.
To handle these issues, a new DA strategy named multi-subject subspace alignment (MSSA) is proposed in this work. The proposed method assumes that the training set contains N participants with associated labeled training samples; each participant composes a single prediction task with the new participant who needs to be tested. Instead of learning each task independently or pooling all the subjects together directly, all the information from these N subjects is utilized to help train a classifier for the new target subject, as shown in Fig. 2. Owing to the increased sample size for the prediction task, the prediction performance may be improved.
In this study, the differential entropy (DE) feature [13], which has been proven effective in EEG-based emotion recognition, was used to represent the EEG pattern. DE features are extracted in the frequency domain from the band power of each channel and form the input of the MSSA model. In MSSA, a domain adaptation method known as subspace alignment [9] was utilized to transform the source subject into the target subspace, moving the source and target subspace features closer together. Then, a set of N classifiers is learnt separately, parameterized by the vectors
Our main contributions are twofold:
Subspace alignment is utilized to directly align the source and target subspaces, and to improve the consistency of training and test EEG data. The proposed method takes advantage of all the information of the
This section introduces the multi-subject subspace alignment (MSSA) algorithm, which can be used to build personalized models without user-specific labeled data and takes advantage of the information from all the subjects in the training set. Given a set of EEG trials from a new subject, the problem to be addressed is to estimate the corresponding emotional state for each sample by using the N subjects in the training set.
As shown in Fig. 2, MSSA is accomplished in two steps. In the first step, the normalized DE feature vector extracted from the training subject is aligned to the test subject using a subspace alignment strategy. If all of the
Feature extraction and normalization
In this paper, the differential entropy (DE) feature, which has been proven effective in EEG-based emotion recognition [13], is utilized as the input of the proposed MSSA model. The DE feature rests on the hypothesis that segments of the EEG signal follow a Gaussian distribution after band-pass filtering in the five frequency bands:
where the segment of the EEG signal
Then, in order to normalize the DE features, the Min-Max strategy is utilized in this paper. The DE features from the source and target sets are denoted as
In this paper,
Assume the training set contains
According to the principal component analysis (PCA) theory,
where
Obviously, when
Then, samples in the subject
The gradient-descent (GD) optimization approach is employed to solve this learning problem.
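The alignment step described earlier in this section can be illustrated with a minimal Python sketch that follows the general subspace alignment recipe of [9]: a PCA basis is computed for the source subject and for the target subject, an alignment matrix is formed from the two bases, and both data sets are projected into the aligned coordinates. The function names (`pca_basis`, `subspace_align`), the per-domain mean-centering, and the subspace dimension `d` are assumptions made for illustration only; this is not the authors' implementation.

```python
import numpy as np

def pca_basis(X, d):
    """Return the top-d principal directions of X (samples x features) as columns."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are right singular vectors, sorted by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T                      # shape: features x d

def subspace_align(Xs, Xt, d):
    """Project source and target data into aligned d-dimensional coordinates."""
    Ps = pca_basis(Xs, d)                # source subspace basis
    Pt = pca_basis(Xt, d)                # target subspace basis
    M = Ps.T @ Pt                        # d x d alignment matrix
    # Assumption: each domain is centered with its own mean before projection.
    Zs = (Xs - Xs.mean(axis=0)) @ Ps @ M # source mapped toward the target subspace
    Zt = (Xt - Xt.mean(axis=0)) @ Pt     # target in its own subspace
    return Zs, Zt
```

When the source and target data coincide, the two bases are identical, the alignment matrix reduces to the identity, and the two projections agree; this gives a quick sanity check of the sketch. A classifier would then be trained on `Zs` with the source labels and applied to `Zt`.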
In order to train a target classifier whose parameters are closer to those of the subjects that are more similar to the target subject, a new DA strategy named multi-subject subspace alignment is proposed in this paper. Firstly, a simple strategy is utilized to measure the distance between source subject
Since the distance
where
This proves that
Thus our regularization method finds a trade-off between the conventional model and closeness to the average of the
Experimental protocol for emotion recognition based on EEG signal for one subject.
Datasets and experimental setup
In this paper, the performance of the proposed MSSA was evaluated using the emotion EEG dataset known as SEED [14] (
As shown in Fig. 3, each participant in SEED watched 15 emotional film clips, each lasting about 4 minutes. In addition, there was a five-second warning before each clip, and each participant had 45 seconds to assess their emotional reactions for feedback. The SEED dataset contains a down-sampled (200 Hz), preprocessed (band-pass filtered from 0.3 Hz to 50 Hz) and segmented (1 s without overlap) version of the EEG data in MATLAB format (.mat files), and there are 3300 signal segments in each channel per experiment. As there were 62 channels in total, the features extracted from each EEG signal segment have a total dimension of 310 when the standard five frequency bands are used.
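To make the 62 x 5 = 310 feature dimension concrete, the following Python sketch computes a DE vector for one 1-s segment and applies Min-Max normalization. It relies on the standard closed form of the differential entropy of a Gaussian signal with variance sigma^2, namely 0.5 * log(2 * pi * e * sigma^2). Several elements are assumptions for illustration: the band edges in `BANDS` (the exact edges are not stated in this section), the crude FFT mask used in place of a proper band-pass filter, and all function names.

```python
import numpy as np

# Band edges in Hz are an assumption for illustration; the exact edges
# used in the paper are not given in this section.
BANDS = {"delta": (1, 3), "theta": (4, 7), "alpha": (8, 13),
         "beta": (14, 30), "gamma": (31, 50)}

def de_gaussian(x):
    """Differential entropy of a signal assumed Gaussian: 0.5*log(2*pi*e*var)."""
    return 0.5 * np.log(2 * np.pi * np.e * np.var(x, axis=-1))

def band_limit(segment, fs, lo, hi):
    """Crude FFT-mask band-pass (a stand-in for a proper band-pass filter)."""
    spec = np.fft.rfft(segment, axis=-1)
    freqs = np.fft.rfftfreq(segment.shape[-1], d=1.0 / fs)
    spec[..., (freqs < lo) | (freqs > hi)] = 0   # zero bins outside the band
    return np.fft.irfft(spec, n=segment.shape[-1], axis=-1)

def de_features(segment, fs=200, bands=BANDS):
    """Map a (channels, samples) segment to a channels*len(bands) DE vector."""
    per_band = [de_gaussian(band_limit(segment, fs, lo, hi))
                for lo, hi in bands.values()]
    return np.stack(per_band, axis=1).ravel()

def min_max(F):
    """Scale each feature column of F (samples x features) to [0, 1]."""
    lo, hi = F.min(axis=0), F.max(axis=0)
    # Constant columns are left at 0 instead of dividing by zero.
    return (F - lo) / np.where(hi > lo, hi - lo, 1.0)
```

For a segment of shape (62, 200), that is, 62 channels sampled at 200 Hz for 1 s, `de_features` returns a vector of length 310. Stacking these vectors for all segments of a subject and passing the result to `min_max` gives normalized features of the kind used as model input.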
Comparison results of each subject on the SEED dataset (Accuracy in %)
The mean results of all of the subjects from the same session on the SEED dataset.
In this study, SVM [15] and LR [16] without any DA strategy were used as the baselines. In addition, the proposed approach (MSSA) was compared with three state-of-the-art DA methods: auto-encoder (AE) [10], transfer component analysis (TCA) [12] and transfer joint matching (TJM) [11]. All results were obtained using leave-one-subject-out cross-validation. For AE, TCA, and TJM, it is impracticable to use all the signal segments in the SEED dataset as training data because of memory and time limits. Therefore, 20 samples were selected randomly from each trial, giving 300 samples per subject, and 4200 samples were ultimately obtained from the 14 source subjects as training data. Moreover, the sample selection procedure was repeated five times to avoid bias in the data.
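The evaluation protocol above can be sketched in Python as follows: leave-one-subject-out splits over the 15 subjects, with 20 randomly chosen samples per trial for each source subject. The helper names (`subsample_subject`, `loso_splits`) and the layout of the per-sample trial labels are hypothetical, chosen only to illustrate the sample counts quoted in the text.

```python
import numpy as np

def subsample_subject(trial_ids, per_trial, rng):
    """Pick `per_trial` random sample indices from every trial of one subject."""
    picked = []
    for t in np.unique(trial_ids):
        idx = np.flatnonzero(trial_ids == t)          # samples belonging to trial t
        picked.append(rng.choice(idx, size=per_trial, replace=False))
    return np.concatenate(picked)

def loso_splits(n_subjects):
    """Yield (target, sources) pairs for leave-one-subject-out evaluation."""
    for target in range(n_subjects):
        yield target, [s for s in range(n_subjects) if s != target]
```

With 15 trials per subject, `subsample_subject(trial_ids, 20, rng)` returns 300 indices, and concatenating the selections of the 14 source subjects in each split gives the 4200 training samples mentioned above. Repeating the selection with different random seeds corresponds to the five repetitions of the sampling procedure.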
Since each person undergoes three sessions in the SEED dataset, we verify our MSSA method on all three experiments. Table 1 shows the average classification accuracies of the three experiments for each subject. For a more detailed illustration, the mean accuracies of the 15 subjects for each experimental session are also provided in Fig. 4.
Evaluation of the statistical significance between the performance of MSSA and other methods
Without a domain adaptation strategy, the two standard classifiers SVM and LR, which serve as the baselines for comparison, achieved average classification accuracies of only 57.70% and 57.98%, respectively. With the benefit of a deep structure that can learn domain-invariant features from both the training and test domains, the AE method achieves a mean accuracy of 61.46%, which is slightly better than the baseline methods. TCA and TJM, two commonly used DA methods, show a marked improvement, reaching 75.39% and 76.13%, respectively. Both of these methods try to minimize the distribution divergence using a maximum mean discrepancy (MMD) constraint. However, because they do not take the differences among individuals into account, TCA and TJM might not be the best choices for EEG-based emotion recognition. By contrast, our MSSA achieves a mean accuracy of 79.61%. This result suggests that a strategy aimed at reducing the multi-subject discrepancy is effective in emotion recognition based on EEG signals. Furthermore, as shown in Table 1, the proposed MSSA method yields improvements on most subjects, which indicates that MSSA performs stably. Figure 4 shows the mean results of all of the subjects from the same session on the SEED dataset; here, too, our MSSA method achieves the highest accuracy. Moreover, the mean accuracy of 79.61% achieved by our MSSA method is higher than the 76.31% reported for the transfer learning strategy in [17], which was also evaluated on SEED. The statistical significance of the difference between the proposed MSSA approach and the other algorithms is evaluated using Student's t-test in Table 2. It can be seen that MSSA is significantly better than the methods with very low
In this work, the MSSA strategy is proposed to make better use of multi-subject information in EEG datasets. In MSSA, a subspace alignment strategy is utilized to reduce the marginal distribution discrepancy between each source subject and the target subject. Following this, a distance metric is utilized to measure the discrepancy between the transformed source and target data, so that the subjects that are more similar to the target subject can be selected. As a result, the parameters of the target classifier are learnt to be closer to those of the subjects that are more similar to the target subject, which improves the classification accuracy. Experiments on the SEED dataset verify the effectiveness of MSSA and its superiority over other state-of-the-art methods in multi-subject scenarios.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant No. 61401117, Grant No. 61402129 and Grant No. 61301012). Furthermore, Ou Bai was sponsored by the National Science Foundation (USA) (CNS-1552163).
Conflict of interest
None to report.
