OFMT-net: Advancing micro-expression recognition through optical flow and multi-task network with self-attention

Abstract

Micro-expressions are difficult to recognize because they are brief and involve only subtle facial movement, yet they carry rich, genuine psychological information and therefore have research value in fields such as criminal investigation and teaching. To address limited facial expression dynamics, suboptimal feature extraction, and susceptibility to overfitting, we propose OFMT-Net, a micro-expression recognition method based on optical flow and a multi-task convolutional neural network. The method takes the optical flow computed from the onset frame to the apex frame as input. A shared-parameter network extracts features and feeds them into a dual-tower network for emotion recognition and Action Unit (AU) recognition. The dual-tower network incorporates a self-attention mechanism for classification and is trained with a dual weighted loss function. The method exploits the information carried by facial action units and uses the implicit data augmentation effect of the multi-task framework to improve recognition accuracy and reduce dependence on training samples. Cross-validation results on the joint dataset show that the model achieves an accuracy of 79.89%, an unweighted average recall (UAR) of 75.05%, and an unweighted F1 score (UF1) of 75.08%, surpassing many mainstream models. The code is publicly available at https://github.com/WenyuanLi001/OFMT-Net
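The three reported metrics weight the classes differently: accuracy counts every sample equally, whereas the unweighted average recall and unweighted F1 score average per-class values so that rare emotion classes count as much as common ones. The following is a minimal sketch of these standard definitions, not the authors' evaluation code; the function name `unweighted_scores` and the class labels in the usage note are hypothetical and chosen only for illustration.

```python
from collections import Counter


def unweighted_scores(y_true, y_pred):
    """Return (accuracy, UAR, UF1) for two equal-length label sequences.

    UAR is the mean of per-class recall and UF1 is the mean of per-class F1,
    both averaged over the classes that occur in y_true (an assumption of
    this sketch; the paper's exact evaluation protocol is not given here).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fn[truth] += 1   # the true class was missed
            fp[pred] += 1    # the predicted class was wrongly assigned

    classes = sorted(set(y_true))
    # Every class in y_true has tp + fn >= 1, so the divisions are safe.
    recalls = [tp[c] / (tp[c] + fn[c]) for c in classes]
    f1s = [2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) for c in classes]

    accuracy = sum(tp.values()) / len(y_true)
    uar = sum(recalls) / len(classes)
    uf1 = sum(f1s) / len(classes)
    return accuracy, uar, uf1
```

For example, with true labels `["neg", "neg", "neg", "neg", "pos", "sur"]` and predictions `["neg", "neg", "neg", "pos", "pos", "neg"]`, accuracy is 4/6, but the completely missed `"sur"` class pulls the unweighted average recall down to 7/12, which is why class-balanced metrics are reported alongside accuracy.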

Keywords

Micro-expression recognition, optical flow, multi-task learning, self-attention

