Abstract
Current attention mechanisms are limited in extracting key facial expression features, and convolutional neural networks give insufficient consideration to the fusion of feature information across the receptive field, which lowers facial expression recognition accuracy. To address these problems, this paper proposes a facial expression recognition network based on a wide attention (WA) and multi-scale fusion (MF) mechanism, named WAMF. WA extracts the background information of facial expression images while focusing on texture information, thereby achieving better feature extraction. The MF mechanism is added at the connection points between layers in ResNet: features extracted by each upper layer are fused using convolutional kernels of different sizes and fed into the lower layer. Finally, a viewpoint-invariant Capsule Net receives the feature maps and serves as the classification network. The proposed WAMF model was evaluated on two publicly available datasets, CK+ and JAFFE, achieving recognition rates of 98.98% and 98.46%, respectively.
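The MF idea described above, fusing responses from convolutional kernels of several sizes before passing features to the next layer, can be sketched in plain NumPy. This is an illustrative toy only: the kernel sizes (1, 3, 5), the averaging kernels, and the sum-then-average fusion rule are assumptions for demonstration, not the paper's learned parameters or exact fusion operator.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 'same'-padded 2D convolution of a single-channel map x
    with an odd-sized kernel k (zero padding)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    H, W = x.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def multi_scale_fuse(x, kernel_sizes=(1, 3, 5)):
    """Fuse outputs of different-sized kernels over the same feature map,
    mimicking the combination of several receptive-field sizes.
    Uses uniform averaging kernels as placeholders; in the actual
    network these would be learned convolution weights."""
    fused = np.zeros_like(x, dtype=float)
    for ks in kernel_sizes:
        k = np.ones((ks, ks)) / (ks * ks)  # placeholder kernel
        fused += conv2d_same(x, k)
    return fused / len(kernel_sizes)

feat = np.arange(25, dtype=float).reshape(5, 5)
fused = multi_scale_fuse(feat)
print(fused.shape)  # fused map keeps the spatial size of the input
```

The key property illustrated is that each fused value aggregates context from 1x1, 3x3, and 5x5 neighborhoods at once, so fine texture and broader background information enter the lower layer together.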
