Abstract
Emotion recognition from facial expressions has attracted much interest over the last few decades. In the literature, the common approach to facial emotion recognition (FER) consists of these steps: image pre-processing, face detection, facial feature extraction, and facial expression classification (recognition). We have developed a method for FER that differs fundamentally from this common approach. Our method is based on the dimensional model of emotions and on the kriging predictor of a Fractional Brownian Vector Field. The classification problem related to the recognition of facial emotions is formulated and solved. The relationships among emotions are estimated by expert psychologists, who place the emotions as points on a plane. The goal is to estimate the position of a new picture's emotion on the plane by kriging and to determine which emotion identified by the psychologists is the closest one. Seven basic emotions (Joy, Sadness, Surprise, Disgust, Anger, Fear, and Neutral) have been chosen. A classification accuracy of approximately 50% over the seven classes is obtained when the decision is made on the basis of the closest basic emotion. It has been ascertained that the kriging predictor is suitable for facial emotion recognition in the case of small sets of pictures. More sophisticated classification strategies may increase the accuracy when grouping of the basic emotions is applied.
Keywords
Introduction
Recently, a fast growth of emotion recognition research has been observed in various types of communication such as text (Shivhare and Khethawat, 2012; Calvo and Kim, 2013; Ramalingam et al., 2018), speech (Tamulevičius et al., 2017, 2019; Sailunaz et al., 2018), body gestures (Stathopoulou and Tsihrintzis, 2011; Metcalfe et al., 2019), and facial expressions (Revina and Emmanuel, 2018; Ko, 2018; Shao and Qian, 2019; Sharma et al., 2019).
Facial expressions are one of the most important means of interpersonal communication, since a facial expression says a lot without speaking. Therefore, research on facial emotions has received much attention in recent decades in applications in the perceptual and cognitive sciences (Purificación and Pablo, 2019). Facial emotion recognition (FER) is widely used in distinct areas such as: neurology (Adolphs and Anderson, 2018; Metcalfe et al., 2019), clinical psychology (Su et al., 2017), artificial intelligence (Ranade et al., 2018), intelligent security (Wang and Fang, 2008), robotics manufacturing (Weiguo et al., 2004), behavioural sciences (Vorontsova and Labunskaya, 2020), multimedia (Mariappan et al., 2012), educational software (Ferdig and Mishra, 2004; Filella et al., 2016), etc.
In the literature, the common approach to facial emotion recognition consists of these steps: image pre-processing (noise reduction, normalization), face detection, facial feature extraction, and facial expression classification (recognition). Numerous FER techniques have been developed by using different methods in these steps (Bhardwaj and Dixit, 2016; Deshmukh et al., 2017; Ko, 2018; Revina and Emmanuel, 2018; Shao and Qian, 2019; Sharma et al., 2019). The reported recognition accuracy of this approach varies from approximately 48% to 98% (Deshmukh et al., 2017; Revina and Emmanuel, 2018; Shao and Qian, 2019; Nonis et al., 2019; Sharma et al., 2019). However, the common approach has some drawbacks (Shao and Qian, 2019): a) recognition accuracy is highly dependent on the methods used and the data set analysed; b) the methods are often complicated, because of many unknown parameters and/or long computation time.
Recently, deep-learning-based algorithms have been employed for feature extraction, classification, and recognition tasks. Convolutional neural networks and recurrent neural networks have been applied in many studies, including object recognition, face recognition, and facial emotion recognition. However, deep-learning-based techniques are applicable only when large data sets are available (Nonis et al., 2019). A brief review of conventional FER approaches as well as deep-learning-based FER methods is presented in Ko (2018). It is shown that the average recognition accuracy of six conventional FER approaches is 63.2% and the average recognition accuracy of six deep-learning-based FER approaches is 72.65%, i.e. deep-learning-based approaches outperform conventional approaches. In Gan et al. (2019), a novel FER framework via convolutional neural networks with soft labels that associate multiple emotions to each expression image is proposed. Investigations are made on the FER-2013 (35 887 face images) (Goodfellow et al., 2013), SFEW (1766 images) (Dhall et al., 2015) and RAF (15 339 images) (Li et al., 2017) databases, and the proposed method achieves accuracy of 73.73%, 55.73% and 86.31%, respectively.
In this paper, we focus on emotion recognition by facial expression. We have developed an approach, based on the two-dimensional model of emotions as well as using the kriging predictor of Fractional Brownian Vector Field (Motion) (FBVF). The classification problem, related to the recognition of facial emotions, is formulated and solved. The relationship of different emotions is estimated by expert psychologists by putting different emotions as the points on the plane. The kriging predictor allows us to get an estimate of a new picture emotion on the plane. Then, we determine which emotion, identified by psychologists, is the closest one. Seven emotions (Joy, Sadness, Surprise, Disgust, Anger, Fear, and Neutral) have been chosen for recognition.
The advantage of our method is that it is focused on small data sets. In the literature, seven basic emotions (e.g. Joy, Sadness, Surprise, Disgust, Anger, Fear, and Neutral) are usually used. However, sometimes specific emotions are measured. In this case, classical databases with basic emotions cannot be used for training a classifier. If little data is available for the study and other databases cannot be adapted, methods such as CNN will not give good accuracy. In such cases, the kriging method has an advantage. Our approach can be easily extended to other emotions.
Computational Models of Emotions
Emotions can be expressed in a variety of ways, such as facial expressions and gestures, speech, and written text. There are two models for recognizing emotions: the categorical model and the dimensional one. In the first model, emotions are described with a discrete number of classes (affective adjectives); in the second model, emotions are characterized by several perpendicular axes, i.e. by defining where they lie in a two-, three- or higher-dimensional space (Grekow, 2018). These models are reviewed in Sreeja and Mahalakshmi (2017) and Grekow (2018).
There are many attempts in the literature to visualize similarities of emotions. This allows them to be compared not only qualitatively but also quantitatively. Such visualizations, namely the quantitative correspondence of emotions to points on the 2D plane, are reviewed below. We rely on this in the proposed new method of recognizing and classifying facial emotions.
Categorical Models of Emotions
Emotions are recognized with the help of words that denote emotions or class tags (Sreeja and Mahalakshmi, 2017). The categorical model uses either some basic emotion classes (Ekman, 1992; Johnson-Laird and Oatley, 1989; Grekow, 2018) or domain-specific expressive classes (Sreeja and Mahalakshmi, 2017). Different sets of emotions may be required for different fields. For instance, in the area of instruction and education (D’mello and Graesser, 2007), five classes (Boredom, Confusion, Joy, Flow, and Frustration) are proposed to describe the affective states of students.

Hevner’s adjectives arranged into 8 groups (Hevner, 1936).
Regarding categorical models of emotions, the literature offers many concepts about the number of classes and the grouping methods. Hevner was one of the first researchers who focused on finding and grouping terms pertaining to emotions (Hevner, 1936). She created a list of 66 adjectives arranged into eight groups distributed on a circle (Fig. 1). Adjectives inside a group are close to each other, and opposite groups on the circle are the furthest apart in emotion. Farnsworth (1954) and Schubert (2003) modified Hevner’s model by decreasing the number of adjectives to 50 and 46, respectively, and grouping them into nine groups. Recently, many researchers have been using the concept of six basic emotions (Happiness, Sadness, Anger, Fear, Disgust, and Surprise) presented by Ekman (1992, 1999), which was developed for facial expression. Ekman described features that enabled differentiating the six basic emotions. Johnson-Laird and Oatley (1989) indicated a smaller group of basic emotions: Happiness, Sadness, Anger, Fear, and Disgust. In Hu and Downie (2007), five mood clusters were used for song classification. In Hu et al. (2008) and elsewhere, a deficiency of this categorical model was indicated: a semantic overlap among the five clusters was noticed, because some clusters were quite similar. In Grekow (2018), a set of four basic emotions (Happy, Angry, Sad, and Relaxed), corresponding to the four quarters of Russell’s model (Russell, 1980), was used for the analysis of music recordings using the categorical model. More categories of emotions, used by various researchers, are indicated in Sreeja and Mahalakshmi (2017).
The main disadvantage of the categorical model is that it has poorer resolution by using categories than the dimensional model. The number of emotions and their shades met in various types of communication is much richer than the limited number of categories of emotions in the model. The smaller the number of groups in the categorical model, the greater the simplification of the description of emotions (Grekow, 2018).
Emotions can be defined according to one or more dimensions. For example, Wilhelm Max Wundt, the father of modern psychology, proposed to describe emotions by three dimensions: pleasurable versus unpleasurable, arousing versus subduing, and strain versus relaxation (Wundt, 1897).
In the dimensional model, emotions are identified according to their location in a space with a small number of emotional dimensions. In this way, the human emotion is represented as a point on an emotion space (Grekow, 2018). Since all emotions can be understood as changing values of the emotional dimensions, the dimensional model, in contrast to the categorical one, enables us to analyse the larger number of emotions and their shades. Commonly emotions are defined in a two (valence and arousal) or three (valence, arousal, and power/dominance) dimensional space. The valence dimension (emotional pleasantness) describes the positivity or negativity of an emotion and ranges from unpleasant feelings to a pleasant feeling (sense of happiness). The arousal dimension (physiological activation) denotes the level of excitement that the emotion depicts, and it ranges from Sleepiness or Boredom to high Excitement. The dominance (power, influence) dimension represents a sense of control or freedom to act. For example, while Fear and Anger are unpleasant emotions, Anger is a dominant emotion, and Fear is a submissive one (Mehrabian, 1980, 1996; Grekow, 2018).
The two-dimensional models such as the Russell’s circumplex model (Russell, 1980) (Section 2.2.1), Thayer’s model (Thayer, 1989) (Section 2.2.2), the vector model (Bradley et al., 1992) (Section 2.2.3), the Positive Affect – Negative Affect (PANA) model (Watson and Tellegen, 1985; Watson et al., 1999) (Section 2.2.4), Whissell’s model (Whissell, 1989) (Section 2.2.5), and Plutchik’s wheel of emotions (Plutchik and Kellerman, 1980; Plutchik, 2001) (Section 2.2.6) are the most prevalent in emotion research. Among the three-dimensional models, Plutchik’s cone-shaped model (Plutchik and Kellerman, 1980; Plutchik, 2001) (Section 2.2.6), the Pleasure–Arousal–Dominance (PAD) model (Mehrabian and Russell, 1974) (Section 2.2.7), and Lövheim cube of emotion (Lövheim, 2011) (Section 2.2.8) are the most dominant and commonly used in emotion recognition field. Researchers have noticed that, in particular cases, two or three dimensions cannot adequately describe human emotions. Consequently, four or more dimensions are necessary to identify affective states. The number of dimensions, required to represent emotions, depends on the problem the researcher is solving (Fontaine et al., 2007; Cambria et al., 2012). The Hourglass Model (Cambria et al., 2012) (Section 2.2.9) is an interesting combination of the categorical and four-dimensional models.
The description of emotions by using dimensions has some advantages. Dimensions ensure a unique identification and a wide range of the emotion concepts. It is possible to identify fine emotion concepts (shades of an emotion) that differ only to a small extent. Thus, a dimensional model of emotions is a useful representation capturing all relevant emotions and providing a means for measuring the similarity between emotional states (Sreeja and Mahalakshmi, 2017). The categorical model is more general and simplified in describing emotions, and the dimensional model is more detailed and able to detect shades of emotions (Grekow, 2018).
Russell’s Circumplex Model

Russell’s circumplex model (Russell, 1980).
The first two-dimensional model was developed by Russell (1980) and is known as the Russell’s circumplex model (the circumplex model of affect) (Fig. 2). Russell identified two main dimensions of an emotion: arousal (physiological activation) and valence (emotional pleasantness). Arousal can be treated as high or low and valence may be positive or negative.
The circumplex model is formed by dividing a plane by two perpendicular axes. Valence represents the horizontal axis (negative values to the left, positive ones to the right) and arousal represents the vertical axis (low values at the bottom, high ones at the top). Emotions are mapped as points in a circumplex shape. The centre of this circle represents a neutral value of valence and a medium level of arousal, i.e. the centre point depicts a neutral emotional state. In this model, all emotions can be represented as points at any values of valence and arousal or at a neutral value of one or both of these dimensions.
The four basic categories of emotions can be highlighted regarding the quarters of Russell’s model as follows: 1) Happy – high valence, high arousal (top-right), 2) Angry – low valence, high arousal (top-left), 3) Sad – low valence, low arousal (bottom-left), 4) Relaxed – high valence, low arousal (bottom-right) (Wilson et al., 2016; Grekow, 2018).
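The quadrant assignment above can be sketched in a few lines of Python. This is only an illustration: the function name `russell_quadrant`, the centring of both axes at zero, and the tie-breaking rule for points lying exactly on an axis are assumptions made here, not part of Russell's model or of the cited works.

```python
def russell_quadrant(valence: float, arousal: float) -> str:
    """Map a (valence, arousal) point to a quadrant category.

    Assumption: both axes are centred at 0, so the origin is the neutral
    state. Points lying exactly on an axis are assigned to the
    non-negative side, which is an arbitrary tie-breaking choice.
    """
    if valence == 0 and arousal == 0:
        return "Neutral"
    if arousal >= 0:
        # Upper half-plane: high arousal.
        return "Happy" if valence >= 0 else "Angry"
    # Lower half-plane: low arousal.
    return "Relaxed" if valence >= 0 else "Sad"


if __name__ == "__main__":
    for point in [(0.5, 0.5), (-0.5, 0.5), (-0.5, -0.5), (0.5, -0.5)]:
        print(point, russell_quadrant(*point))
```

Such a coarse four-way split discards the position inside each quadrant, which is exactly the information the dimensional model is meant to keep.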
Thayer’s model (Thayer, 1989) is a modification of Russell’s circumplex model. Thayer proposed to describe emotions by two separate arousal dimensions: energetic arousal and tense arousal, also named energy and stress, respectively. Valence is supposed to be a varying combination of these two dimensions. For example, in Thayer’s model, Satisfaction and Tenderness are located in the low energy – low stress part; Astonishment and Surprise in the high energy – low stress part; Anger and Fear in the high energy – high stress part; and Depression and Sadness in the low energy – high stress part. Figure 3 presents a visual comparison of Russell’s circumplex model and Thayer’s model.

Schematic diagram of the two-dimensional models of emotions with common basic emotion categories overlaid (Eerola and Vuoskoski, 2011).
The vector model of emotion (Bradley et al., 1992) holds that emotions are structured in terms of valence and arousal, but they are not continuously related or evenly distributed along these dimensions (Wilson et al., 2016). This model assumes that there is an underlying dimension of arousal and a binary choice of valence that determines a direction in which a particular emotion lies. Thus, two vectors are obtained. Both of them start at zero arousal and neutral valence and proceed as straight lines, one in a positive, and one in a negative valence direction (Rubin and Talarico, 2009). Figure 4 exhibits the Russell’s circumplex (left) and vector (right) models assuming valence is varying in the interval

Instantiations of the Russell’s circumplex (left) and vector (right) two-dimensional models (Wilson et al., 2016).
The Positive Affect – Negative Affect (also known as Positive Activation – Negative Activation) (PANA) model (Watson and Tellegen, 1985; Watson et al., 1999) characterizes emotions at the most general level. Figure 5 accurately generalizes the relations among the affective states. Terms of affect within the same octant are highly positively correlated, meanwhile, the ones in adjacent octants are moderately positively correlated. Terms 90° apart are substantially unrelated to one another, whereas those 180° apart are opposite in meaning and highly negatively correlated.

The basic two-factor structure of affect (Watson and Tellegen, 1985).
Figure 5 schematically depicts the two-dimensional (two-factor) affective spaces. In the basic two-factor space, the axes are displayed as solid lines. The horizontal and vertical axes represent Negative Affect and Positive Affect, respectively. The first factor, Positive Affect (PA), represents the extent (from low to high) to which a person shows enthusiasm in life. The second factor, Negative Affect (NA), is the extent to which a person is feeling upset or unpleasantly aroused. At first sight, the terms Positive Affect and Negative Affect can be perceived as opposite ones, i.e. negatively correlated. However, they are independent and uncorrelated dimensions. We can notice from Fig. 5 that many affective states are not pure markers of either Positive or Negative Affect as these concepts are described above. For instance, the Pleasantness includes terms representing a mixture of high Positive Affect and low Negative Affect, and Unpleasantness contains emotions between high Negative Affect and low Positive Affect. Terms denoting Strong Engagement have moderately high values of both factors PA and NA, whereas emotions representing Disengagement reflect low values of each dimension PA and NA. Thus, Fig. 5 also depicts an alternative rotational scheme that is indicated by the dotted lines. The first factor (dimension) represents the Pleasantness-Unpleasantness (valence), while the second factor (dimension) represents Strong Engagement-Disengagement (arousal).
Thus, the PANA model is commonly understood as a 45-degree rotation of Russell’s circumplex model: it is a circle, and the dimensions of valence and arousal lie at a 45-degree rotation relative to the PANA model axes NA and PA, respectively (Watson and Tellegen, 1985). In Rubin and Talarico (2009), it is noticed that the PANA model is more similar to the vector model than to the circumplex one. The similarity between the PANA and vector models is explained as follows. In the vector model, low arousal emotions are more likely to be neutral and high arousal ones are differentiated by their valence. Most affective states cluster in the high Positive Affect and high Negative Affect octants (Watson and Tellegen, 1985; Watson et al., 1999). This corresponds to the prediction of the vector model, i.e. an absence of high arousal and neutral valence emotions. In conclusion, the PANA model, like the vector model, can be employed when exploring emotions with high levels of activation (Rubin and Talarico, 2009).
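The 45-degree relation between the two coordinate systems can be illustrated with a short Python sketch. The sign convention is an assumption chosen to agree with the description above (Pleasantness as high PA with low NA, Strong Engagement as high values of both factors); the function names are hypothetical.

```python
import math

_C = math.sqrt(0.5)  # cos(45 degrees) = sin(45 degrees)


def valence_arousal_to_pana(valence: float, arousal: float) -> tuple[float, float]:
    """Rotate (valence, arousal) by 45 degrees into (PA, NA).

    Assumed convention: PA grows with both valence and arousal,
    NA grows with arousal and falls with valence.
    """
    pa = _C * (valence + arousal)
    na = _C * (arousal - valence)
    return pa, na


def pana_to_valence_arousal(pa: float, na: float) -> tuple[float, float]:
    """Inverse rotation: recover (valence, arousal) from (PA, NA)."""
    valence = _C * (pa - na)
    arousal = _C * (pa + na)
    return valence, arousal


if __name__ == "__main__":
    # Pure pleasantness: positive PA, negative NA.
    print(valence_arousal_to_pana(1.0, 0.0))
    # Strong engagement: equal, positive PA and NA.
    print(valence_arousal_to_pana(0.0, 1.0))
```

Because the map is a rotation, distances between emotions are the same in both coordinate systems; only the interpretation of the axes changes.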
Similarly to the Russell’s circumplex model, Whissell represents emotions in a two-dimensional continuous space, the dimensions of which are evaluation and activation (Whissell, 1989). The evaluation dimension is a measure of human feelings, from negative to positive. The activation dimension measures whether a human is less or more likely to take some action under the emotional state, from passive to active. Whissell has made up the Dictionary of Affect in Language by assigning a pair of values to each of the approximately 9000 words with affective connotations. Figure 6 depicts the position of some of these words in the two-dimensional circular space (Cambria et al., 2012).

The two-dimensional representation of emotions by the Whissell’s model (Cambria et al., 2012).
In 1980, Robert Plutchik created a wheel of emotions seeking to illustrate different emotions and their relationship. He proposed a two-dimensional wheel model and a three-dimensional cone-shaped model (Plutchik and Kellerman, 1980; Plutchik, 2001).
To construct the wheel of emotions, Plutchik used eight primary emotions arranged as bipolar pairs (Joy versus Sadness, Anger versus Fear, Trust versus Disgust, and Surprise versus Anticipation), as well as eight advanced, derivative emotions (Optimism, Love, Submission, Awe, Disapproval, Remorse, Contempt, and Aggressiveness), each composed of two basic ones. This circumplex two-dimensional model combines the idea of an emotion circle with a colour wheel. With the help of colours, primary emotions are presented at different intensities (for instance, Joy can be expressed as Ecstasy or Serenity) and can be mixed with one another to form different emotions; for example, Love is a mixture of Joy and Trust. Emotions obtained from two basic emotions are shown in blank spaces. In this two-dimensional model, the vertical dimension represents intensity and the radial dimension represents degrees of similarity among the emotions (Cambria et al., 2012). The three-dimensional model depicts relations between emotions as follows: the cone’s vertical dimension represents intensity, and the circle represents degrees of similarity among the emotions (Maupome and Isyutina, 2013). Both models are shown in Fig. 7.

Plutchik’s two-dimensional wheel of emotions and the cone-shaped model, three-dimensional wheel of emotions, demonstrating relationships between basic and derivative emotions (Maupome and Isyutina, 2013).
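The structure of the wheel can be written down as a small data structure. The sketch below is illustrative only: the text above states the four opposite pairs and the composition Love = Joy + Trust, while the remaining pairings follow the usual presentation of Plutchik's wheel and should be treated as an assumption here. All names (`WHEEL`, `DYAD_NAMES`, `opposite`, `primary_dyads`) are hypothetical.

```python
# Primary emotions in their order around the wheel (assumed ordering).
WHEEL = ["Joy", "Trust", "Fear", "Surprise",
         "Sadness", "Disgust", "Anger", "Anticipation"]

# Derivative emotion formed by WHEEL[i] and its clockwise neighbour WHEEL[i + 1].
DYAD_NAMES = ["Love", "Submission", "Awe", "Disapproval",
              "Remorse", "Contempt", "Aggressiveness", "Optimism"]


def opposite(emotion: str) -> str:
    """Return the emotion lying diametrically opposite on the wheel."""
    i = WHEEL.index(emotion)
    return WHEEL[(i + 4) % len(WHEEL)]


def primary_dyads() -> dict[str, tuple[str, str]]:
    """Map each derivative emotion to the two adjacent primary emotions."""
    n = len(WHEEL)
    return {DYAD_NAMES[i]: (WHEEL[i], WHEEL[(i + 1) % n]) for i in range(n)}


if __name__ == "__main__":
    print(opposite("Joy"))
    print(primary_dyads()["Love"])
```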
Mehrabian and Russell’s Pleasure-Arousal-Dominance (PAD) model (Mehrabian and Russell, 1974) was developed to describe and measure a human emotional reaction to the environment. This model identifies emotions by using three dimensions: pleasure, arousal, and dominance. Pleasure represents positive (pleasant) and negative (unpleasant) emotions, i.e. this dimension measures how pleasant an emotion is. For example, Joy is a pleasant emotion, and Sadness is an unpleasant one. Arousal shows a level of energy and stimulation, i.e. it measures the intensity of an emotion. For instance, Joy, Serenity, and Ecstasy are all pleasant emotions; however, Ecstasy has a higher arousal state and Serenity a lower one in comparison with Joy. Dominance represents a sense of control or freedom to act. For example, while Fear and Anger are both unpleasant emotions, Anger is a much more dominant emotion than Fear (Mehrabian, 1980, 1996; Grekow, 2018). The PAD model is similar to Russell’s model, since two of its dimensions, arousal and pleasure (which resembles valence), are shared. The models differ in the third dimension, dominance, which is used to capture whether a human feels in control of the state or not (Sreeja and Mahalakshmi, 2017).
Lövheim Cube of Emotion
In 2011, Lövheim revealed that the monoamines such as serotonin, dopamine and noradrenaline greatly influence human mood, emotion and behaviour. He proposed a three-dimensional model for monoamine neurotransmitters and emotions. In this model, the monoamine systems are represented as orthogonal axes and the eight basic emotions, labelled according to Silvan Tomkins, are placed in the eight corners of a cube. According to Lövheim model, for instance, Joy is produced by the combination of high serotonin, high dopamine and low noradrenaline (Fig. 8). As neither the serotonin nor the dopamine axis is identical to the valence dimension, the cube seems somewhat rotated in comparison to aforementioned models. This model may help perceive human emotions, psychiatric illness and the effects of psychotropic drugs (Lövheim, 2011).

Lövheim cube of emotion (Lövheim, 2011).
Cambria et al. (2012) proposed a biologically inspired and psychologically motivated emotion categorization model that combines categorical and dimensional approaches. The model represents emotions both through labels and through four affective dimensions (Cambria et al., 2012). This model, also called the Hourglass of Emotions, reinterprets Plutchik’s model (Plutchik, 2001) by organizing primary emotions (Joy, Sadness, Anger, Fear, Trust, Disgust, Surprise, Anticipation) around four independent but concomitant affective dimensions such as pleasantness, attention, sensitivity, and aptitude, whose different levels of activation make up the total emotional state of the mind.
These dimensions measure how much: the user is amused by interaction modalities (pleasantness), the user is interested in interaction contents (attention), the user is comfortable with interaction dynamics (sensitivity), and the user is confident in interaction benefits (aptitude). Each dimension is characterized by six levels of activation (measuring the strength of an emotion). These levels are also labelled as a set of 24 emotions (Plutchik, 2001). Therefore, the model specifies the affective information associated with the text both in a dimensional and in a discrete form. The model has an hourglass shape because emotions are represented according to their strength (from strongly positive to null to strongly negative) (Fig. 9).

The 3D model and the net of the hourglass of emotions (Cambria et al., 2012).
In our research, the two-dimensional circumplex space model of emotions (Fig. 10), based on Russell’s model (Russell, 1980) and Scherer’s structure of the semantic space for emotions (Scherer, 2005) as well as employing numerical proximities of human emotions (Gobron et al., 2010), is used for facial emotion recognition. Figure 10 is taken from Paltoglou and Thelwall (2013). How it was obtained is described below. A set of emotions is visualized on a 2D plane, giving a particular place to each emotion.

The two-dimensional circumplex space model of emotions. Upper-case notation denotes the terms used by Russell, lower-case notation denotes the terms used by Scherer. Figure is taken from Paltoglou and Thelwall (2013).
Figure 10 illustrates the alternative two-dimensional structures of the semantic space for emotions. In Scherer (2005), a number of frequently used and theoretically interesting emotion categories were arranged in a two-dimensional space that is formed by goal conduciveness versus goal obstructiveness on the one hand and high versus low control/power on the other. Scherer used Russell’s circumplex model, which arranges emotions around a circle in the two-dimensional valence – arousal space. In Fig. 10, upper-case notation denotes the terms used by Russell (1980). Onto this representation, Scherer superimposed the two-dimensional structure based on similarity ratings of 80 German emotion terms (lower-case terms, translated to English). The exact location of the terms (emotions) in the two-dimensional space is indicated by the plus (+) sign. It was noticed that this simple superposition yielded a remarkably good fit (Scherer, 2005).
In Fig. 10, every emotion is represented as a point that has two coordinates: valence and arousal. The coordinates of the mapped emotions (values of valence and arousal) are taken from Gobron et al. (2010) and are given in Paltoglou and Thelwall (2013). The valence parameter is determined by using the four parameters (two lexical, two language), derived from the data mining model that is based on a very large database (4.2 million samples). The arousal parameter is based on the intensity of the vocabulary. The valence and arousal values were generated from lexical and language classifiers and the probabilistic emotion generator (the Poisson distribution is used). A statistically good correlation with James Russell’s circumplex model of emotion was obtained. The control mechanism was based on Ekman’s Facial Action Coding System (FACS) action units (Ekman and Friesen, 1978).
Russell’s circumplex model is widely used in various areas of emotion recognition. Gobron et al. transformed lexical and language parameters, extracted from a database, into coherent intensities of valence and arousal, i.e. the parameters of Russell’s circumplex model. Paltoglou and Thelwall (2013) applied these values of valence and arousal to emotion recognition from segments of written text in blog posts. We have decided to use this two-dimensional model of emotions (Fig. 10) and the derived emotion coordinates for facial emotion recognition. To our knowledge, this has not been done before.
Recently, Fractional Brownian Vector Field (Motion) (FBVF) has been very popular among mathematicians and physicists (Yancong and Ruidong, 2011; Tan et al., 2015). The created model for FER is based on modelling valence and arousal dimensions in Russell’s model by the two-dimensional FBVF. Hereinafter, these dimensions are called coordinates as well.
A stochastic model of facial emotions in pictures should incorporate the uncertainty about quantities at unobserved points and quantify the uncertainty associated with the kriging estimator. Namely, the emotion in each facial picture is considered as a realization of FBVF
which for every point in the variables space
Thus, assume, the set
The degree d is a parameter of FBVF as well, which can be estimated from the observed data. The maximum likelihood estimate
Novelty of our method is as follows: 1) We evaluate the Hurst parameter d by the maximum likelihood method; 2) We use a posteriori expectations and covariance matrix for kriging prediction of emotion model dimensions (coordinates); 3) We apply kriging predictor to FER in pictures.
Assume one has to predict the value of response vector surface Z at some point
This prediction is stochastic, its uncertainty is described by the conditional variance:
Regarding the kriging model, the recent novelty is the introduction of
In this paper, the kriging predictor has been employed for emotion recognition from facial expression and explored experimentally because the kriging predictor performs simple calculations and has only one unknown parameter d, as well as because this method works very well with small data sets.
Warsaw set of emotional facial expression pictures (WSEFEP) (Olszanowski et al., 2015) has been used in the experiments. This set contains 210 high-quality pictures (photos) of 30 individuals (14 men and 16 women). They display six basic emotions (Joy, Sadness, Surprise, Disgust, Anger, Fear) and Neutral display. Examples of each basic emotion displayed by one woman are shown in Fig. 11.
The original size of these pictures was
Each picture has been digitized, i.e. a data point consists of colour parameters of pixels, and, therefore, it is of very large dimensionality. The number of pictures (data points) is

Examples of each basic emotion displayed by one woman (original pictures).

Examples of each basic emotion displayed by one woman (cropped and resized pictures).
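The digitization step, in which each picture becomes one high-dimensional data point made of pixel colour values, can be sketched as follows. This is a minimal illustration that assumes each picture is already loaded as a height × width × 3 array of RGB values; the function names and the tiny 4 × 5 picture size are hypothetical and unrelated to the actual picture size used in the experiments.

```python
import numpy as np


def picture_to_data_point(image) -> np.ndarray:
    """Flatten one RGB picture (H x W x 3) into a single feature vector."""
    image = np.asarray(image, dtype=float)
    return image.reshape(-1)


def pictures_to_matrix(images) -> np.ndarray:
    """Stack N equally sized pictures into an N x (H * W * 3) data matrix."""
    return np.stack([picture_to_data_point(im) for im in images])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three synthetic 4 x 5 RGB pictures stand in for real photographs.
    pictures = rng.integers(0, 256, size=(3, 4, 5, 3))
    X = pictures_to_matrix(pictures)
    print(X.shape)  # one row per picture, one column per colour value
```

Even modest picture sizes give very long vectors, which is why the number of features greatly exceeds the number of pictures in this setting.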
Before presenting the kriging algorithm, some mathematical notations are introduced below. Suppose that the analysed data set
Since the two-dimensional circumplex space model of emotions (Fig. 10) is used for facial emotion recognition in the investigations, every emotion is represented as a point that has two coordinates: valence and arousal. The coordinates of the seven basic emotions (values of valence and arousal) are taken from Gobron et al. (2010) and are given in Paltoglou and Thelwall (2013). These coordinates are presented in Table 1.
The valence and arousal coordinates of seven basic emotions in the two-dimensional circumplex emotion space.
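Once every basic emotion is a point on the valence – arousal plane, deciding which basic emotion is closest to a predicted point reduces to a nearest-neighbour search. The sketch below illustrates this decision rule only. The coordinates in `PLACEHOLDER_COORDS` are invented placeholders, not the values of Table 1, and the function name `closest_emotion` is hypothetical.

```python
import math

# PLACEHOLDER (valence, arousal) pairs for illustration only.
# These are NOT the coordinates from Table 1.
PLACEHOLDER_COORDS = {
    "Joy": (0.9, 0.2),
    "Sadness": (-0.8, -0.4),
    "Surprise": (0.2, 0.9),
    "Disgust": (-0.7, 0.3),
    "Anger": (-0.5, 0.8),
    "Fear": (-0.2, 0.6),
    "Neutral": (0.0, 0.0),
}


def closest_emotion(point, coords=PLACEHOLDER_COORDS) -> str:
    """Return the emotion whose reference point is nearest (Euclidean) to `point`."""
    return min(coords, key=lambda name: math.dist(point, coords[name]))


if __name__ == "__main__":
    predicted = (0.7, 0.1)  # a hypothetical predicted (valence, arousal) pair
    print(closest_emotion(predicted))
```

With real coordinates, emotions that lie close together on the plane are the ones most easily confused by this rule.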
As a picture emotion is known in advance, each data point
The kriging predictor algorithm is as follows:
The Euclidean distance matrix D between all the data points
This matrix is normalized by dividing each element by the largest one.
Denote the Hurst parameter by d, where d is a real number,
Elements of the normalized distance matrix D are raised to the power of (
The kriging prediction of a new (testing) picture emotion is made by using a posteriori expectation:
Here,
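As a rough illustration of the steps listed above, the following Python sketch implements a generic ordinary-kriging predictor with a power variogram. Several details are assumptions rather than the formulas of this paper: the exponent is taken to be 2d with 0 < d < 1, the weights are obtained from the standard ordinary kriging system (weights constrained to sum to one), and the function name `kriging_predict` and its arguments are hypothetical.

```python
import numpy as np


def kriging_predict(X, Y, x_new, d):
    """Ordinary-kriging prediction with a power variogram h**(2*d).

    X: (N, n) training points; Y: (N, m) known targets, e.g. (valence, arousal);
    x_new: (n,) test point; d: Hurst-type parameter, assumed to satisfy 0 < d < 1.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    N = X.shape[0]

    # Euclidean distances between training points and from them to the new point
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    a = np.linalg.norm(X - x_new, axis=1)

    # Normalize by the largest training distance, then raise to the power 2d (assumed)
    scale = D.max()
    A = (D / scale) ** (2.0 * d)
    a = (a / scale) ** (2.0 * d)

    # Ordinary kriging system: [A 1; 1^T 0] [w; lam] = [a; 1]
    K = np.zeros((N + 1, N + 1))
    K[:N, :N] = A
    K[:N, N] = 1.0
    K[N, :N] = 1.0
    rhs = np.append(a, 1.0)
    w = np.linalg.solve(K, rhs)[:N]

    # Prediction is the weighted combination of the known targets
    return w @ Y
```

In this form the predictor interpolates exactly: evaluating it at a training point returns that point's known target. The pairwise-distance step above builds an N × N × n array, which is convenient for a sketch but may need a more memory-efficient computation for high-dimensional picture data.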
The kriging predictor algorithm has only one unknown parameter d. The first investigation is performed seeking to find the optimal value of d. At first, the maximum likelihood (ML) function of picture emotion features
In the next step, values of the ML function f are calculated for various values of the parameter d, i.e.

Dependence of the maximum likelihood function f on the parameter d.
The first investigation is pursued in order to recognize an emotion of a particular picture and evaluate the result obtained, as well as to verify that the optimal value
In fact, we have a problem of classification into seven classes. Let the analysed picture data set
The efficiency of the classifier is estimated after running all N experiments, each with a different ith picture held out for testing (N runs). Since the true picture emotions are known in advance, it is possible to find out how many picture emotions from the whole picture set (
Figure 15 illustrates the dependence of the picture emotion classification accuracy (CA) (%) on the parameter d, as

The basic emotions, depicted in the analysed model of emotions. The coordinates of points are given in Table 1.

The dependence of the picture emotion classification accuracy on the parameter d.
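The leave-one-out evaluation described above (N runs, each holding out one picture and assigning it to the closest basic emotion) can be sketched as follows. This is only a minimal sketch: the coordinates in `BASIC_EMOTIONS` are placeholders and not the values of Table 1, only three emotions are listed for brevity, and `predict` stands for any hypothetical function that returns a predicted (valence, arousal) pair, such as a kriging predictor.

```python
import numpy as np

# Placeholder coordinates for illustration only: NOT the values from Table 1.
BASIC_EMOTIONS = {
    "Joy": (0.8, 0.5),
    "Sadness": (-0.7, -0.4),
    "Neutral": (0.0, 0.0),
}


def closest_emotion(point, emotions=BASIC_EMOTIONS):
    """Return the name of the basic emotion nearest to a (valence, arousal) point."""
    names = list(emotions)
    coords = np.array([emotions[k] for k in names], dtype=float)
    dists = np.linalg.norm(coords - np.asarray(point, dtype=float), axis=1)
    return names[int(np.argmin(dists))]


def leave_one_out_accuracy(X, labels, predict, emotions=BASIC_EMOTIONS):
    """Classification accuracy (%) over N runs, each holding out one picture.

    predict(X_train, Y_train, x_test) must return a (valence, arousal) estimate.
    """
    X = np.asarray(X, dtype=float)
    # Each training picture is given the coordinates of its known emotion
    Y = np.array([emotions[l] for l in labels], dtype=float)
    hits = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        y_hat = predict(X[keep], Y[keep], X[i])
        hits += closest_emotion(y_hat, emotions) == labels[i]
    return 100.0 * hits / len(X)
```

Repeating this computation over a grid of values of d would give a curve of classification accuracy against d of the kind shown in the figure.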
Figure 16 shows the mapping of predicted coordinates (valence and arousal) of all the 210 picture emotions in the two-dimensional circumplex space. Joy is clearly predicted most precisely, whereas the remaining emotions overlap quite strongly.

The mapping of predicted coordinates of all the 210 picture emotions in the two-dimensional circumplex space.
For a deeper analysis of this classification, a confusion matrix of the seven basic emotions is given in Table 2. The highest true positive rates were observed for Joy (80%), Neutral (76.7%), and Disgust (60%). The most frequent misclassifications (the numbers are written in red) were observed for Anger (56.7% of pictures with the Anger emotion were classified as Disgust), Fear (36.7% as Surprise), Sadness (36.7% as Neutral), and Surprise (33.3% as Fear).
Confusion matrix of the seven basic emotions.
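For readers who want to reproduce such a table from their own predictions, a confusion matrix in percent can be computed as in the sketch below. The function name `confusion_matrix_percent` is hypothetical, and the layout (rows for true emotions, columns for predicted emotions) is an assumption about how the table is organized. Since the data set contains 210 pictures of seven emotions, each row is based on 30 pictures, so a value such as 56.7% corresponds to 17 of 30 pictures.

```python
import numpy as np

EMOTIONS = ["Joy", "Sadness", "Surprise", "Disgust", "Anger", "Fear", "Neutral"]


def confusion_matrix_percent(true_labels, predicted_labels, classes=EMOTIONS):
    """Row-normalized confusion matrix: entry (i, j) is the percentage of
    pictures of true class i that were assigned to class j."""
    index = {name: k for k, name in enumerate(classes)}
    counts = np.zeros((len(classes), len(classes)))
    for t, p in zip(true_labels, predicted_labels):
        counts[index[t], index[p]] += 1
    row_totals = counts.sum(axis=1, keepdims=True)
    # Rows with no pictures stay at zero instead of dividing by zero
    return 100.0 * np.divide(
        counts, row_totals, out=np.zeros_like(counts), where=row_totals > 0
    )
```

The diagonal of the returned matrix holds the per-emotion true positive rates, and the off-diagonal entries show which emotions are confused with each other.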
The second investigation is similar to the first one because the ith picture emotion (
Facial emotion recognition (FER) is an important topic in computer vision and artificial intelligence. We have developed a method for FER based on the dimensional model of emotions and on the kriging predictor of the Fractional Brownian Vector Field. The classification problem related to the recognition of facial emotions is formulated and solved. We use the knowledge of expert psychologists about the similarity of various emotions, expressed as points on the plane. The goal is to obtain an estimate of the emotion of a new picture on the plane by kriging and to determine which emotion, identified by psychologists, is the closest one. Seven basic emotions (Joy, Sadness, Surprise, Disgust, Anger, Fear, and Neutral) have been chosen. The experimental exploration has shown that the best classification accuracy corresponds to the optimal value of the Hurst parameter estimated by the maximum likelihood method. A classification accuracy of approximately 50% has been obtained for the seven classes when the decision is based on the closest basic emotion. It has been ascertained that the kriging predictor is suitable for facial emotion recognition in the case of small sets of pictures. More sophisticated classification strategies may increase the accuracy when grouping of the basic emotions is applied.
