FS-DSN: A feature segmentation-based dual-stream network for robust facial expression recognition in uncontrolled environments

Abstract

Facial expression recognition in uncontrolled environments is hampered by poor image quality, uneven lighting, facial occlusions, and head pose variations. To address occlusions and pose variations in particular, this paper introduces the Feature Segmentation-Based Dual-Stream Network (FS-DSN). The network consists of four components: a feature pre-extraction module, a feature segmentation module, a global feature extraction module, and a local feature extraction module. The pre-extraction module extracts mid-level features from facial expression images, which are then segmented into three regions: left eye, right eye, and mouth. The global feature extraction module uses the full set of features to extract global expression features, while the local feature extraction module focuses on the segmented regions. This dual-stream approach captures both broad and subtle expression changes, enhancing the semantic interpretation of facial expressions. Experiments show that FS-DSN performs robustly, achieving accuracies of 88.82%, 60.09%, 78.33%, and 74.17% on the RAF-DB, SFEW 2.0, FED-RO, and FER-2013 datasets, respectively.
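The data flow described in the abstract (one shared mid-level feature map, a global stream over the whole map, and a local stream over three segmented regions) can be illustrated with a minimal sketch. This is not the authors' implementation: the 8x8 single-channel feature map, the region boxes in `REGIONS`, and the use of mean pooling in place of the learned global and local extraction modules are all assumptions made for illustration.

```python
"""Data-flow sketch of a dual-stream (global + local) feature pipeline."""

# Hypothetical region boxes (row_start, row_end, col_start, col_end) on an
# 8x8 mid-level feature map. The real FS-DSN region layout is not given in
# the abstract, so these coordinates are illustrative only.
REGIONS = {
    "left_eye": (1, 4, 0, 4),
    "right_eye": (1, 4, 4, 8),
    "mouth": (5, 8, 2, 6),
}


def crop(feature_map, box):
    """Feature segmentation: cut one rectangular region out of the map."""
    r0, r1, c0, c1 = box
    return [row[c0:c1] for row in feature_map[r0:r1]]


def mean_pool(feature_map):
    """Stand-in for a learned extraction module: average all activations."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def dual_stream_descriptor(feature_map, regions=REGIONS):
    """Concatenate one global feature with one local feature per region."""
    global_feature = [mean_pool(feature_map)]           # global stream
    local_features = [mean_pool(crop(feature_map, box))  # local stream
                      for box in regions.values()]
    return global_feature + local_features


# Toy "mid-level feature map": every activation equals its row index.
feature_map = [[float(r)] * 8 for r in range(8)]
descriptor = dual_stream_descriptor(feature_map)
print(descriptor)  # [3.5, 2.0, 2.0, 6.0]
```

The output has one global entry followed by one entry per region (left eye, right eye, mouth). In a trained network, each pooled value would be replaced by a learned feature vector before the two streams are fused for classification.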

Keywords

Uncontrolled environments, facial expression recognition, dual-stream network, feature segmentation

