Abstract
In uncontrolled environments, facial expression recognition must cope with poor image quality, uneven lighting, facial occlusions, and head pose variations. To address occlusions and head pose variations, this paper introduces the Feature Segmentation-Based Dual-Stream Network (FS-DSN). The network consists of four components: a feature pre-extraction module, a feature segmentation module, a global feature extraction module, and a local feature extraction module. The pre-extraction module extracts mid-level features from facial expression images, and the segmentation module divides these features into three regions: left eye, right eye, and mouth. The global feature extraction module uses the full set of features to extract global expression features, while the local feature extraction module focuses on the segmented regions. This dual-stream approach captures both broad and subtle expression changes, enhancing the semantic interpretation of facial expressions. In experiments, FS-DSN achieves accuracies of 88.82%, 60.09%, 78.33%, and 74.17% on the RAF-DB, SFEW 2.0, FED-RO, and FER-2013 datasets, respectively.
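The data flow through the four components can be illustrated with a minimal sketch. This is not the authors' implementation: the region boxes in `REGIONS`, the use of average pooling in place of the learned global and local extraction modules, and the fusion by concatenation are all assumptions made for illustration, since the abstract does not specify how regions are located or how the two streams are combined. The names `crop`, `pool`, and `dual_stream_descriptor` are hypothetical.

```python
# Hypothetical sketch of the dual-stream data flow; not the authors' code.
from statistics import fmean

# Assumed fractional boxes (top, bottom, left, right) on the feature map.
# The abstract does not say how the three regions are located.
REGIONS = {
    "left_eye": (0.0, 0.5, 0.0, 0.5),
    "right_eye": (0.0, 0.5, 0.5, 1.0),
    "mouth": (0.5, 1.0, 0.25, 0.75),
}


def crop(fmap, box):
    """Cut one region out of a C x H x W feature map (nested lists)."""
    top, bottom, left, right = box
    h, w = len(fmap[0]), len(fmap[0][0])
    r0, r1 = int(top * h), int(bottom * h)
    c0, c1 = int(left * w), int(right * w)
    return [[row[c0:c1] for row in channel[r0:r1]] for channel in fmap]


def pool(fmap):
    """Average each channel; a stand-in for a learned extraction module."""
    return [fmean(v for row in channel for v in row) for channel in fmap]


def dual_stream_descriptor(fmap):
    """Concatenate global features with the three local region features."""
    descriptor = pool(fmap)            # global stream: full feature map
    for box in REGIONS.values():       # local stream: one crop per region
        descriptor += pool(crop(fmap, box))
    return descriptor


# Toy "mid-level" feature map with 2 channels of size 4 x 4.
fmap = [
    [[float(r * 4 + c) for c in range(4)] for r in range(4)],
    [[1.0] * 4 for _ in range(4)],
]
print(dual_stream_descriptor(fmap))
```

In this sketch the descriptor holds one pooled vector for the whole map followed by one per region, so a feature map with C channels yields 4C values. A real network would replace `pool` with trained convolutional branches and feed the fused descriptor to a classifier.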
