Abstract
Introduction:
Emotion recognition plays a crucial role in our social interactions and overall well-being. The present cross-sectional study aimed to develop and validate the Emotion Laden Sentences Toolbox for Emotion Recognition (ELSTER), which uses emotion-laden sentences as stimuli to assess individuals’ ability to perceive and identify emotions conveyed through written language.
Methods:
In Phase I, a comprehensive set of emotion-laden sentences in English was validated by 25 qualified mental health professionals (MHPs; eight males and 17 females). In Phase II, the sentences that received high interrater agreement in Phase I were selected and a Hindi version of the same sentences was developed. The English and Hindi database was then validated among 50 healthy individuals (30 males and 20 females).
Results:
Among MHPs, the percentage hit rate for all emotions after exclusion of contempt was 84.3%, with a mean kappa for emotional expression of 0.67. The percentage hit rate of all emotion-laden sentences across the database was 81.43% among healthy lay individuals. The mean hit rate percentage for English sentences was similar to that for Hindi sentences, with a mean kappa for emotional expression of 0.63 for the combined English and Hindi sentences.
Conclusion:
The ELSTER database would be useful in the Indian context for researching textual emotion recognition. It has been validated among a group of experts as well as healthy lay individuals and was found to have high inter-rater reliability.
Key Message:
Social communication is dependent upon the ability of individuals to interpret the emotional expression of others. The recognition of emotional expressions is based on both verbal and non-verbal stimuli. While there are databases existing for emotion recognition of non-verbal stimuli, none exist purely based on textual form of emotion expression. Emotion Laden Sentences Toolbox for Emotion Recognition (ELSTER) database would be useful in the context of researching emotion recognition for English and Hindi reading populations.
Emotions are increasingly being studied in domains such as psychology, neurology, human-computer interaction, and linguistics. They play a profound role in human life. Emotions have aided our ancestors in adaptation and survival; they are a powerful means of communication; and they have a considerable influence on our cognitive functions, such as attention, memory, and decision-making. 1 Being able to recognize these emotions helps increase self-awareness and build stronger connections and more meaningful interactions with others. It also assists individuals in efficiently managing their own emotions. Therefore, emotion recognition is a valuable skill to possess.
When discussing emotion recognition, it is critical to mention the theory of mind (ToM). ToM involves the capacity to comprehend the thoughts, beliefs, intentions, and desires of other individuals, as well as to acknowledge that these mental states can vary from one’s own. 2 The conventional understanding of ToM is that it can be separated into two systems: cognitive and affective. Affective ToM refers to understanding the other person’s emotions, feelings, and affective states. Cognitive ToM, on the other hand, indicates an understanding of the other person’s cognitive states, beliefs, and intentions. 3 While a few researchers have stated that the affective component of ToM declines with age,4, 5 others have opposed this notion, suggesting that age has no effect on affective ToM.6, 7
Failure to recognize emotions can impact an individual’s social and emotional functioning. Understanding and interpreting social cues, such as others’ emotional states and facial expressions, can be particularly difficult for individuals with autism,8, 9 schizophrenia,10–13 depression,14, 15 and social anxiety disorder.16, 17
Several tasks are frequently used for assessing emotion recognition, each of which concentrates on a distinct type of emotional signal. While most of these measures are based on recognizing emotions in non-verbal stimuli, particularly facial expressions, 18 there are a few databases that evaluate other forms of non-verbal expressions. The Reading the Mind in the Eyes Test (RMET) evaluates the capacity to attribute emotions by looking at photos of people’s eyes. 19 The Geneva Emotion Recognition Test (GERT) assesses the ability to identify emotions using a variety of modalities, such as body language, voice cues, and facial expressions. 20 The Body Action and Posture (BAP) test evaluates videos or pictures of body language and identifies the emotions being conveyed. 21 The Montreal Affective Voices (MAV) database assesses emotion recognition from vocal cues. 22 The Emotion Attribution Task (EAT) asks participants to assign emotional states after listening to written stories of emotional situations read aloud by the researcher. 23 Although assessments that use verbal stimuli do exist, none of them focus on reading and comprehension.
Furthermore, previous studies in the field of emotion research have primarily focused on two key dimensions: valence and arousal. Arousal pertains to the degree of activation within the appetitive or defensive motivation system that is triggered by an emotion. On the other hand, valence refers to the classification of emotions as either positive or negative. 24 Regardless of their emotional valence, it has been noted that words associated with emotions undergo faster processing compared to neutral words. This occurrence is commonly known as the ‘emotion effect’.25, 26
The emotion word type was neglected in a large proportion of earlier studies on emotion word processing.27, 28 Recently, there has been a greater focus on comparing emotion-label words with emotion-laden words. Emotion-laden words, such as ‘successful’ and ‘failed’, elicit an individual’s emotions through the word’s implications, as opposed to emotion-label words, such as ‘upset’ or ‘happy’, which directly explain or describe one’s affective states.28, 29 Compared to emotion-laden words, emotion-label words have a stronger priming effect and faster response times. 30 It was also observed that emotion-label words had a processing advantage over emotion-laden words regardless of whether a first or second language was used. 31 In everyday usage, however, emotion-laden words or sentences are commonly used to attribute emotions. In the ever-growing world of social media, such words or sentences may have a larger implication for recognizing emotional states.
There can be several advantages to assessing emotion recognition using textual stimuli. It will allow a standardised presentation of emotions and multiple people can be assessed at the same time. The use of language may also eliminate any kind of visual bias. Furthermore, using textual stimuli will help us tap into language-based processing of emotion recognition, which has not been studied much. Thus, the development of a text-based emotion recognition tool will provide an alternative perspective on how individuals recognize and identify emotions.
The primary objective of this study is to create a comprehensive assessment database that measures the ability to recognize emotions based on written/textual stimuli. Specifically, the focus is on evaluating individuals’ proficiency in understanding and perceiving emotions conveyed through emotion-laden sentences. By developing such a tool, researchers aim to provide a reliable and valid means of assessing individuals’ competence in recognizing and interpreting emotions expressed through written language. The tool will enable researchers and practitioners to gain insights into individuals’ emotional perception abilities and potentially identify any deficits or variations in emotion recognition skills.
Methodology
Development of the Database
The data presented in the current article on the development of the Emotion Laden Sentences Toolbox for Emotion Recognition (ELSTER) database are part of a larger study comprising the development of a larger toolbox for varying facets of emotion recognition.18, 32 This cross-sectional study was conducted from March 2019 to August 2021 at the Department of Psychiatry, All India Institute of Medical Sciences, New Delhi, India. The Institute Ethics Committee approved the study. Based on the previous literature, eight emotion expressions were selected: neutral, happiness, anger, sadness, disgust, contempt, fear, and surprise. 18
The emotion-laden sentences were initially selected in English from freely available online books, novels, and other textual literature. Three researchers selected ten sentences depicting each emotional state, keeping each sentence to a maximum of 40 words in proper grammatical format. Thereafter, two other researchers re-assessed the sentences and reduced the number to six sentences for each emotion, keeping each sentence to a maximum of 25 words in proper grammatical format. Thus, a total of 48 emotion-laden sentences were developed.
Validation of the Database
Participants
In Phase I, the database was validated by 25 qualified mental health professionals (MHPs) from the Department of Psychiatry of a tertiary care hospital (eight males and 17 females). All raters had normal or corrected-to-normal vision and volunteered to participate without compensation. All participants were proficient in English and held at least an undergraduate degree.
In Phase II, the sentences that received high interrater agreement in Phase I were selected, and a Hindi version of the same sentences was developed by three researchers independently. Thereafter, two other researchers re-assessed the Hindi sentences, and a final Hindi set was reached by consensus. The English and Hindi database was then validated among 50 healthy individuals presenting as caregivers to patients attending the psychiatry outpatient clinic of a tertiary care hospital (30 males and 20 females). All participants were educated at least up to higher secondary level and consented to participate in the study.
Procedure
Raters were presented with randomised sentences of different emotions on a 27-inch computer screen. They were seated 50–70 cm away from the screen and proceeded at their own pace in the presence of a researcher (NPS or NK). For each sentence, raters were asked to choose the expression that best fit the emotion depicted. In Phase I, there were nine response categories: the eight emotions (happiness, anger, sadness, neutral, surprise, fear, contempt, and disgust) and ‘others’. In Phase II, the ‘others’ and ‘contempt’ response categories were removed based on Phase I findings.
Analysis
The emotion expression recognition of the sentences was evaluated by hit rate percentage with standard deviation (SD), where the proportion of raters who agree with the intended expression was calculated for each emotion. Initially, for each sentence, hit rates were calculated for emotion recognition. Then, five sentences of each emotion with the highest hit rates were selected and the mean hit rate for the emotion was calculated from these sentence sets. In addition, we also calculated Fleiss’ kappa, which is a chance-corrected measure of agreement between the intended expression and the raters’ labels. The agreement interpretation for Fleiss’ kappa was taken as: <0: poor, 0.0–0.20: slight, 0.21–0.40: fair, 0.41–0.60: moderate, 0.61–0.80: substantial, and 0.81–1.0: almost perfect. 33
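As an illustration of the agreement statistic described above, the following is a minimal sketch of Fleiss’ kappa for a set of sentences that were each labelled by the same number of raters, together with the interpretation bands listed in the text. The function names (`fleiss_kappa`, `interpret_kappa`) and the toy ratings are hypothetical and are not part of the study's analysis pipeline, which is not described in the article.

```python
from collections import Counter

def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for `ratings`: one list of labels per item,
    with the same number of raters for every item."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item must be rated by the same number of raters")

    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean proportion of agreeing rater pairs per item.
    per_item = [
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    proportions = [sum(c[k] for c in counts) / (n_items * n_raters) for k in categories]
    p_expected = sum(p * p for p in proportions)

    return (p_observed - p_expected) / (1 - p_expected)

def interpret_kappa(kappa):
    """Map a kappa value to the agreement bands quoted in the text."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"

# Toy data (made up): four sentences, three raters, two categories.
toy = [["a", "a", "a"], ["a", "a", "b"], ["b", "b", "b"], ["a", "b", "b"]]
toy_kappa = fleiss_kappa(toy, ["a", "b"])
```

With the bands above, `interpret_kappa(0.67)` falls in the substantial range and `interpret_kappa(0.34)` in the fair range.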
Results
Phase I
A total of 48 emotion-laden sentences were presented to MHPs. For each sentence, we calculated how many participants identified the correct emotion. The overall mean hit rate percentage of all emotions across the database was 77.92% (SD = 25.11). The percentage hit rates for each emotion are depicted in Table 1. The hit rate for contempt was very low—36% (SD = 20.70); hence, the contempt emotion was removed from further analysis. The new mean hit rate percentage of all the emotions after the exclusion of contempt was 84.3% (SD = 8.67). We calculated the hit rates for each sentence and finally selected five sentences with the highest hit rates for each emotion to be included in the database—ELSTER. The hit rate of each selected sentence for individual emotion is provided in supplementary Table S1.
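The hit-rate calculation and sentence selection described above can be sketched as follows. This is only an illustration: the sentence identifiers and ratings are made up, the helper names (`hit_rate`, `top_k`) are hypothetical, and the use of the sample standard deviation is an assumption, since the article does not state which SD formula was used.

```python
from statistics import mean, stdev

def hit_rate(labels, intended):
    """Percentage of raters whose label matches the intended emotion."""
    return 100.0 * sum(label == intended for label in labels) / len(labels)

def top_k(rates_by_sentence, k=5):
    """Return the k (sentence_id, hit_rate) pairs with the highest hit rates."""
    ranked = sorted(rates_by_sentence.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]

# Made-up ratings for three hypothetical "fear" sentences, four raters each.
ratings = {
    "fear_01": ["fear", "fear", "fear", "fear"],
    "fear_02": ["fear", "surprise", "fear", "fear"],
    "fear_03": ["sadness", "fear", "anger", "fear"],
}

# Step 1: hit rate per sentence.
rates = {sid: hit_rate(labels, "fear") for sid, labels in ratings.items()}

# Step 2: keep the best sentences (the study kept five per emotion; k=2 here).
selected = top_k(rates, k=2)

# Step 3: mean and SD of the hit rate over the selected sentences.
emotion_mean = mean(rate for _, rate in selected)
emotion_sd = stdev(rate for _, rate in selected)  # sample SD, an assumption
```

In the study the same steps would run over 25 raters and six candidate sentences per emotion, with five sentences retained.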
Table 1. Hit Rate for Emotion Recognition in Phase I and Phase II.
ELS, emotion-laden sentences; MHPs, mental health professionals; SD, standard deviation.
There was substantial agreement (Fleiss’ kappa = 0.67) between the raters (n = 25) for overall emotion recognition of the sentences. The agreement for individual emotions was moderate to substantial (0.59–0.78), except for contempt (kappa = 0.34), as shown in Table 2. The kappa for recognition of sadness sentences indicated almost perfect agreement.
Table 2. Agreement on Individual Emotion Categories in Phase I with MHPs.
MHPs, mental health professionals.
Phase II
After removing sentences having low hit rates (less than 75%) for each emotion and removing contempt-expressing sentences, the remaining 35 emotion-laden sentences (five sentences for each of the seven emotions—happiness, anger, sadness, neutral, surprise, fear and disgust) were presented to healthy lay individuals.
The overall mean hit rate percentage of all emotion-laden sentences across the database was 81.43% (SD = 10.38). The mean hit rate percentage for English sentences was 84.28% (SD = 13.83) and for Hindi sentences was 80.32% (SD = 11.46). The percentage hit rates for each emotion are depicted in Table 1. The hit rate of each selected sentence for individual emotion is provided in supplementary Table S2.
There was substantial agreement (Fleiss Kappa = 0.63) between the raters (n = 50) for overall emotion recognition for the combined English and Hindi sentences. The agreement for individual emotion was also found to be moderate to substantial (0.56–0.68) as shown in Table 3.
Table 3. Agreement on Individual Emotion Categories in Phase II.
CI, confidence interval; ELS, emotion-laden sentences. All kappa values were significant at p < .001.
The standard error for English, Hindi, and combined sentences was 0.018, 0.007, and 0.005, respectively.
Among the participants, the proportions who rated English emotion-laden sentences (n = 14) and Hindi emotion-laden sentences (n = 36) were similar across genders (males vs. females: 30 vs. 20; chi-square = 0.15, p = .70). The mean age was also comparable between genders (males: 36.93 ± 11.98 years; females: 34.25 ± 12.64 years; t = 0.76, p = .45).
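The age comparison above can be checked from the reported summary statistics alone. The sketch below assumes a pooled-variance (Student's) independent-samples t-test, which the article does not state explicitly; the function name `pooled_t` is illustrative.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic from summary statistics,
    assuming equal variances (pooled-variance Student's t-test)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Reported values: males 36.93 ± 11.98 (n = 30), females 34.25 ± 12.64 (n = 20).
t_stat, df = pooled_t(36.93, 11.98, 30, 34.25, 12.64, 20)
```

Under this assumption the statistic rounds to 0.76 with 48 degrees of freedom, in line with the reported value.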
When assessing inter-rater reliability individually for language-dependent performance, there was a substantial agreement for both English and Hindi sentences for overall emotion recognition (Fleiss Kappa = 0.68 & 0.61, respectively). The agreement for individual emotion was also found to be moderate to substantial for both English (0.53–0.80) and Hindi (0.57–0.65) language emotion-laden sentences (Table 3).
Discussion
This study proposes an assessment toolbox that measures textual emotion recognition using emotion-laden sentences. The current database contains emotion-laden sentences in English and Hindi. A total of 35 emotion-laden sentences were finalised, five for each of the seven emotions (happiness, anger, sadness, neutral, surprise, fear and disgust). Fleiss’ kappa and hit rates were calculated to evaluate the tool’s reliability and validity. These sentences were validated initially by MHPs in Phase I and subsequently by healthy individuals in Phase II, for the purpose of standardisation.
Like Ekman’s research, the initial phase of the study included the emotion of ‘contempt’. 34 However, as in Ekman’s work, this category, along with ‘others’, was eliminated due to low accuracy rates in Phase I. Similarly, our previous study of facial emotion recognition using the All India Institute of Medical Sciences (AIIMS) Facial Toolbox for Emotion Recognition (AFTER) database 18 found that contempt had poor hit rates among the study population. The same was true for the Dynamic Emotion Laden Inventory of Videos for Emotion Recognition (DELIVER) database, which assessed emotion recognition through videos of emotion expressions. 32 Thus, it appears that contempt may not be an easily identifiable emotion, particularly for the Indian population. This may be true for other cultures too, as most other emotion databases also do not include contempt.
Phase I MHPs had a higher average hit rate than Phase II healthy lay individuals for most emotion categories; lay individuals scored a higher hit rate than MHPs only in the fear and neutral categories. This may suggest a potential influence of professional training on emotion recognition abilities for most emotions.
Among the healthy individuals, there was substantial agreement for overall emotion recognition for the combined set as well as for the English ELS and Hindi ELS individually. There was generally good agreement among raters in categorising emotions, with moderate to substantial kappa values observed across emotion categories. This indicates reliable and consistent categorization of emotions within each category across different conditions. The kappa values were all statistically significant, indicating agreement beyond chance. Except for the disgust and surprise emotions, the hit rates were higher for English ELS than for Hindi ELS. This may indicate that language does play a role in emotion recognition.
Numerous studies suggest that language processing differs between a native language and a second language.35–37 Affective processing has been observed to differ in those who acquired a second language later in life and use it as a foreign language. The second language appears to be processed semantically rather than affectively. This distinction is thought to be responsible for the reduction of framing biases observed in the processing of the second language. 38
In the current study, gender was not found to have any notable impact on emotion recognition. Many studies have underlined the gender advantage in decoding emotions with females showing better recognition over males.39, 40 However, research does not always confirm the maintenance of gender differences across the age span suggesting larger differences for teenagers and young adults.39, 41 Studies have reported that females seem to show significantly higher scores in anger recognition and also seem to show more ability than males to recognize paired emotions. Studies in affective valence processing have shown sex-specificity with females displaying hypersensitivity to negative valence. 42 But these studies were conducted using facial expression databases, thus limiting their generalizability for emotion recognition across different stimuli.
As previously mentioned, several tests and assessments are available to evaluate emotion recognition. Emotion recognition from face tasks essentially assesses the capacity to discriminate facial expressions. The tasks that use auditory stimuli test prosodic information discrimination. The proposed ELSTER database uses emotion-laden sentences, which can provide a higher level of linguistic complexity. Furthermore, ELSTER corresponds to real-world communication situations. Moreover, the database is bilingual, which provides a more comprehensive assessment, particularly for the Hindi reading population. No such database is currently available.
The Emotion Attribution Task (EAT) is not immune to bias. Even though the stimuli used were written language, the task required the examiner to read the stories out loud to the participants. 23 This defeats the point of using written language. Furthermore, it does not guarantee a standard and consistent presentation of stimuli each time, as studies suggest that auditory prosody also plays a role in the recognition or appreciation of emotion. ELSTER avoids these biases by having participants rely on their own cognitive and language processing skills.
Using emotion-laden words instead of emotion-label words also has major advantages. Recognition of emotion using emotion-laden sentences would require conceptual knowledge of the emotions as well as an understanding of semantics, whereas emotion-label sentences may be easier to identify or recall because emotion-label words would be a part of explicit memory. Consistent with this, researchers suggest that emotion-label words are recognised faster than emotion-laden words. 30
As human-machine interaction becomes more common, research in the field of natural language processing (NLP) is growing. Affective computing refers to techniques for detecting, recognizing, and predicting human emotions with the goal of adapting computational systems to these states. 43 The toolkit EmoTxt has been shown to achieve high accuracy in emotion recognition from text. 44 Colnerič & Demšar 45 used a collection of tweets to explore the use of deep learning for emotion detection on Twitter. Mohammad & Kiritchenko 46 studied emotion-word hashtags in tweets, using them as emotion labels to generate a large lexicon of word-emotion associations. The authors also suggested using the lexicon to improve emotion classification accuracy in a different, non-tweet domain. Traditional machine learning can only learn from a fixed-size vector of features, and thus features are commonly built upon bag-of-words. Even though computational techniques such as deep learning have yielded considerable performance improvements for a variety of tasks in NLP, naive network architectures struggle with the task of emotion recognition. 47
To the best of our knowledge, no existing database provides reliable and validated textual stimuli for emotion recognition. Both the English and Hindi versions of the ELSTER database have high hit rates and good inter-rater reliability among a cohort of experts as well as healthy individuals. The emotion-laden sentences from this database can be used freely for emotion research.
The development of the ELSTER database utilizing emotion-laden sentences represents a significant advancement in the field of emotion research. The database provides a standardized and psychometrically sound instrument for assessing individuals’ ability to recognize and interpret emotions conveyed through written language. The database has significant prospects in both research and clinical contexts. It offers valuable utility in evaluating emotion recognition abilities among clinical populations, including individuals diagnosed with autism spectrum disorders or mood disorders. Moreover, it can contribute to comprehending emotional processing and its correlation with social functioning and interpersonal communication. Future research should focus on further validation and refinement of ELSTER, as well as exploring its utility in various contexts and populations.
Conclusion
The ELSTER database would be useful in the context of conducting research in the field of textual emotion recognition. ELSTER has been validated in a cohort of experts and healthy lay individuals. Based on the hit rates and the good inter-rater reliability, it might be concluded that the ELSTER database offers a valid set of affective stimuli for recognising emotions.
Supplemental Material
Supplemental material for this article is available online.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The study is sponsored through the Cognitive Science Research Initiative (CSRI) grant of Department of Science and Technology (DST) (D.O. No. DST/CSRI/2017/186 dated April 12, 2018) to Dr Rohit Verma.
References
