Abstract
In order to facilitate effective crime prevention and to issue timely warnings for the sake of public security, it is important to pinpoint the accurate positions of particular pedestrians in crowded areas. Face recognition is the most popular method for detecting and tracking pedestrian movement. During the face recognition process, feature classification ability and reliability are determined by the feature extraction method. The primary challenge for researchers is to obtain a stable result while the targeted face is subject to varying conditions, particularly in illumination. To address this issue, we propose a novel pedestrian detection algorithm for multisource face images, built on a face recognition algorithm that applies conjugate orthonormalized partial least-squares regression analysis under a complex lighting environment. Statistical learning theory is a branch of machine learning that is especially applicable to small samples. Building on the theoretical principles used to solve small-sample statistical problems, we integrate conjugate orthonormalized partial least-squares regression with a revised support vector machine algorithm to solve the facial recognition problem. The experimental results show that our algorithm achieves better performance than other state-of-the-art methodologies, both numerically and visually.
Keywords
Introduction
Owing to the rapid development of the economy, an increasing number of people like to participate in various activities in public areas. Therefore, the issue of public security is attracting increasing attention. In order to achieve effective crime prevention and to facilitate timely warnings to the public when there are security risks, pedestrian detection and tracking systems are widely used in crowded areas, such as airports, railway stations, shopping centers, and football stadiums. 1 Existing pedestrian detection methods can be categorized into two groups: one group is characterized by handcrafted features obtained through visual inspection techniques and algorithms such as local binary patterns (LBP) 2 and the histogram of oriented gradients (HOG); 3 the other group is characterized by data-driven features, which use models learned from data to distinguish pedestrians, such as the convolutional neural network (CNN) 4 and the deep belief network (DBN). 5 Face recognition is the most important technique in the implementation of pedestrian surveillance and tracking; it helps find particular pedestrians of interest in crowded areas. In recent years, face recognition research has made great progress.6–9 Face recognition extracts, and then uses, personal features from a facial image to identify the person. A simple automatic face recognition system consists of the following four stages: (1) face image standardization, which performs feature point calibration and image cropping; (2) face detection, which determines whether a human face is present in a variety of scenarios and locates its position; (3) face representation, which describes the detected face with a set of extracted features; and (4) face recognition, which matches that representation against a database of known faces to obtain the relevant identity information.
In face recognition, the factors to consider in determining the feature extraction method are the feature classification ability, algorithmic complexity, and its feasibility. The extracted features have a decisive impact on the final classification results, while the upper limit of the resolution that the classifier can achieve determines the maximum degree of differentiation between different kinds of characteristics. Regarding the feature extraction modes for faces, we can summarize the existing methodologies as follows:10–19
Nonlinear projection feature extraction. The kernel method is an effective tool for transforming low-dimensional point sets into a high-dimensional space, in order to attain linear separability. It also has some weaknesses: the geometric meaning is not clear, so samples cannot be explicitly mapped into distribution patterns; furthermore, there are no well-defined criteria for selecting the kernel function, which instead relies on experience to set parameters, making the method unsuitable for large training samples.
Linear projection feature extraction. The subspace method based on linear projection is the most influential approach in the feature extraction stage of facial recognition. Through the corresponding algebraic method, it extracts the original sample distribution information, or a low-dimensional representation of the basic classification information contained within an image matrix or vector, and thus completes the process of face recognition.
Feature extraction based on prior knowledge. The feature extraction method based on prior knowledge mainly utilizes the shape of the face and characteristics such as the distances between the various facial organs to help classify facial features. This is one of the earliest, most traditional, and most effective methods in use.
Even with the same target object, the image acquisition system may capture entirely different images due to various factors. In particular, lighting conditions play an important role. The appearance of the face under different lighting conditions is shown in Figure 1. Given a certain device or sensor, it is difficult for the human eye to distinguish whether two images from that sensor represent the same object under different lighting conditions or two entirely different objects. Therefore, in the design of an automatic target recognition system, the elimination or suppression of the effect of light is a challenging task.

Demonstration of facial appearances under different lighting conditions.
In this situation, traditional algorithms deal with the challenge through the following approaches. (1) Gray-level histogram equalization is applied to a single face sample image. Histogram equalization redistributes the gray levels of an image so that their distribution becomes more even; as a result, the details of the image become more clearly visible. However, after histogram equalization, the illumination in the facial sample has no correlation with the light in the images used to train the system. (2) Gray-level normalization based on the mean and variance of the training samples is then applied. For a face to be recognizable, the mean and variance of its image's gray-level distribution must be consistent with those of the training samples; only then can accurate identification be achieved by matching the brightness of the probe image to that of the training samples in the library. However, in this method, the image content itself may be distorted by the normalization of the pixels' gray-level distribution; a method that matches the content of corresponding pixels of the same image after compensating for illumination is better. Pedestrian detection is a crucial task in crowded areas. Even though intensive studies on human detection have been carried out for decades, most existing algorithms are not appropriate for pedestrian detection under complex illumination conditions. In this study, to deal with these illumination challenges, we propose a novel face recognition algorithm based on conjugate orthonormalized partial least-squares regression analysis under a complex lighting environment.
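As a concrete illustration of the two preprocessing approaches above, the following NumPy sketch implements gray-level histogram equalization (approach (1)) and mean/variance gray normalization (approach (2)); the function names and the clipping convention are our own, not taken from the paper.

```python
import numpy as np

def equalize_histogram(img):
    """Gray-level histogram equalization for an 8-bit image (approach (1))."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())   # map the CDF onto [0, 1]
    lut = np.round(cdf * 255).astype(np.uint8)          # gray-level lookup table
    return lut[img]

def match_mean_variance(img, ref_mean, ref_std):
    """Gray normalization (approach (2)): match the probe image's gray-level
    mean and standard deviation to those of the training samples."""
    g = img.astype(np.float64)
    out = (g - g.mean()) / (g.std() + 1e-8) * ref_std + ref_mean
    return np.clip(out, 0, 255).astype(np.uint8)
```

In practice `ref_mean` and `ref_std` are computed once over the training gallery, so every probe image is brought onto the same brightness scale before matching.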
The experimental results indicate that the facial recognition rate of the proposed method is better than that of other methods.
Illumination modeling
The nonparametric kernel density model is one of the most common background models in use. It provides an effective means of describing the distribution of the background sample space and overcomes the traditional problems of parameter estimation. The difficulty is its dependency on a priori knowledge of the background: often there are not enough background samples to satisfy this premise, which increases the amount of computation at the probability density estimation stage. The key to an accurate kernel density estimate is a method that balances the detection accuracy of the algorithm against its computational efficiency.
Consider an image sequence with basic variation in illumination. Its observed brightness values change over time and together constitute a time series
where
According to the update mechanism of the Gaussian mixture model (GMM), the weight of a Gaussian that matches the scene keeps increasing, while the weights of Gaussians that do not match the observed values keep decreasing. In this case, we conclude that creating a new Gaussian distribution takes longer than maintaining an old one; therefore, deleting a Gaussian distribution is not reasonable when the weight of the Gaussian model is the sole deciding factor. However, taking other factors into account greatly increases the time complexity of the algorithm. The Gaussian probability density function can be expressed as follows

eta(x; mu, Sigma) = (2*pi)^(-d/2) * |Sigma|^(-1/2) * exp(-(x - mu)^T Sigma^(-1) (x - mu) / 2)

where mu is the mean, Sigma is the covariance matrix, and d is the dimension of x.
We may conclude that, when a subset is sampled from the original sample set with a suitable subset capacity and window width, the probability density of the subset can be close to the original probability density, independently of the capacity of the original sample set. Based on this, if the key background information is selected, we can completely remove the original samples from the background estimation calculation, yielding a low-cost, high-precision approximate estimation. When color features are not considered, the Gaussian kernel density estimate of the probability density over the gray-level histogram can be expressed as follows

p(x) = (1/n) * sum_{i=1..n} (1 / (h * sqrt(2*pi))) * exp(-(x - x_i)^2 / (2*h^2))

where h represents the bandwidth and x_1, ..., x_n are the sampled background gray values.
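A minimal sketch of such a Gaussian kernel density background test follows; the probability threshold used to declare a pixel "background" is an illustrative assumption, not a value from the paper.

```python
import numpy as np

def kde_probability(x, samples, h):
    """Gaussian kernel density estimate p(x) from background samples
    with bandwidth h (the formula above)."""
    samples = np.asarray(samples, dtype=np.float64)
    z = (x - samples) / h
    return np.mean(np.exp(-0.5 * z ** 2) / (h * np.sqrt(2 * np.pi)))

def is_background(x, samples, h, threshold=1e-3):
    """Classify a gray value as background if its estimated density is high enough."""
    return kde_probability(x, samples, h) > threshold
```

Because only a sampled subset of the background history is kept in `samples`, the cost of each evaluation is independent of the full original sample capacity, which is the point made above.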
Facial feature extraction paradigms
Gabor feature extraction
A Gabor filter is sensitive to orientation, and its spatially local features are insensitive to posture; these are local rather than global features. 20 Extracting the characteristics of the training data can yield a large compressed dictionary, defined by
In the formula,
In order to improve the robustness of sparse representation in face recognition, we assume that the coding residual follows a random distribution. Suppose
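For illustration, a small Gabor filter bank of the kind typically used to build such feature dictionaries can be generated as follows; the kernel size, scales, and orientations are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

def gabor_kernel(size, sigma, theta, lam, psi=0.0, gamma=0.5):
    """Real part of a Gabor kernel: a Gaussian envelope modulating
    an oriented cosine carrier of wavelength lam at angle theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotate coordinates by theta
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + gamma ** 2 * yr ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xr / lam + psi)

# A small bank: 2 scales x 4 orientations, commonly used for face features.
bank = [gabor_kernel(15, sigma=s, theta=t, lam=2 * s)
        for s in (2.0, 4.0)
        for t in np.linspace(0, np.pi, 4, endpoint=False)]
```

Convolving a face image with each kernel in the bank and stacking the responses gives the overcomplete feature dictionary that the sparse coding step then operates on.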
Revised LBP feature
The LBP operator, which works on the pixels of the image, is described in this section. A 3 × 3 square of pixels is taken as a cell, and the central pixel value is taken as the baseline reference. Each of the eight surrounding pixels is compared with this reference: if its value is greater than or equal to the baseline, the LBP operator records a binary 1; otherwise it records a binary 0

LBP(x_c, y_c) = sum_{p=0..7} s(g_p - g_c) * 2^p,  with s(z) = 1 if z >= 0 and s(z) = 0 otherwise

where g_c is the gray value of the central pixel and g_p (p = 0, ..., 7) are the gray values of its eight neighbors.
LBP extraction has high integrity and few faults; however, it is sensitive to noise. Hence, this article adopts the uniform patterns of the block LBP of the facial image, described here. To improve recognition performance, only uniform LBP patterns, whose circular binary strings contain at most two 0-to-1 or 1-to-0 transitions, are considered. This reduces the sensitivity of the features to noise. Accordingly, we show the local pattern of the faces in Figure 2.

General illustration of the local pattern of the faces.
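The 3 × 3 LBP operator and the uniform-pattern test described above can be sketched in NumPy as follows; the function names are our own.

```python
import numpy as np

def lbp_3x3(img):
    """Basic 3x3 LBP code for each interior pixel of a grayscale image."""
    c = img[1:-1, 1:-1]                       # central pixels
    # 8 neighbours, clockwise from the top-left corner
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        n = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= ((n >= c).astype(np.uint8) << bit)   # s(g_p - g_c) * 2^p
    return code

def is_uniform(code):
    """Uniform pattern: at most two 0/1 transitions in the circular bit string."""
    bits = [(code >> i) & 1 for i in range(8)]
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    return transitions <= 2
```

In a block-LBP descriptor, each block's histogram keeps one bin per uniform code and lumps all non-uniform codes into a single bin, which is what makes the feature less noise-sensitive.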
HOG feature extraction
HOG is the histogram of oriented gradients; it is based on pixel points, computing the gradient magnitude and gradient direction at each pixel. 21 The HOG of the image is computed over blocks of 16 × 16 pixels; each block is divided into 2 × 2 cells, and a gradient direction histogram is calculated over the pixels within each cell. The histograms of all the cells in a block are concatenated in turn into the block histogram, and the block histograms are finally combined to obtain the histogram of the whole image. Accordingly, the gradient terms of the HOG calculation are as follows

G(x, y) = sqrt(G_x(x, y)^2 + G_y(x, y)^2),  theta(x, y) = arctan(G_y(x, y) / G_x(x, y))

where G_x and G_y are the horizontal and vertical gradients at pixel (x, y).
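A minimal NumPy sketch of the block-level HOG computation described above (16 × 16 block, 2 × 2 cells of 8 × 8 pixels); the 9-bin unsigned-orientation histogram and the L2 block normalization are common conventions assumed here rather than details from the paper.

```python
import numpy as np

def hog_block(img):
    """HOG descriptor for one 16x16 block: 2x2 cells of 8x8 pixels,
    9 orientation bins per cell, L2-normalized over the block."""
    img = img.astype(np.float64)
    gy, gx = np.gradient(img)                      # vertical / horizontal gradients
    mag = np.hypot(gx, gy)                         # gradient magnitude G(x, y)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180     # unsigned orientation theta(x, y)
    hists = []
    for cy in range(2):
        for cx in range(2):
            m = mag[cy * 8:(cy + 1) * 8, cx * 8:(cx + 1) * 8]
            a = ang[cy * 8:(cy + 1) * 8, cx * 8:(cx + 1) * 8]
            h, _ = np.histogram(a, bins=9, range=(0, 180), weights=m)
            hists.append(h)
    v = np.concatenate(hists)                      # 4 cells x 9 bins = 36 values
    return v / (np.linalg.norm(v) + 1e-8)          # block normalization
```

Sliding this over every 16 × 16 block and concatenating the results yields the whole-image histogram mentioned above.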
Extension locality preserving projection
This projection is located in the high-dimensional Euclidean space
The traditional locality preserving projection (LPP) is an unsupervised learning method and fails to extract the discriminant information of the samples. Therefore, we define the enhanced locality preserving projection (ELPP) criterion function as
Local and global combined feature extraction method
Many scholars at home and abroad have shown great interest in manifold learning, beginning with the proposal of locally linear embedding. Manifold learning methods such as Laplacian eigenmaps and locality preserving projections can learn the low-dimensional manifold structure hidden in a high-dimensional data space. However, these methods try only to preserve in the low-dimensional subspace the local manifold structure of the high-dimensional data space, and they ignore the discriminant information in that space. 22
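The classical LPP step referred to above can be sketched as follows. This is the unsupervised baseline only, without the discriminant (ELPP) extension; the heat-kernel width heuristic and the small ridge term are our own assumptions for numerical stability.

```python
import numpy as np

def lpp(X, k_neighbors=5, dim=2):
    """Locality preserving projection (sketch). X: (n_features, n_samples).
    Returns a projection matrix whose columns span the low-dimensional subspace."""
    n = X.shape[1]
    d2 = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # pairwise squared distances
    t = np.median(d2[d2 > 0])                                # heat-kernel width (heuristic)
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(d2[i])[1:k_neighbors + 1]           # k nearest neighbours
        W[i, idx] = np.exp(-d2[i, idx] / t)
    W = np.maximum(W, W.T)                                   # symmetrize the affinity graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                                # graph Laplacian
    A = X @ L @ X.T
    B = X @ D @ X.T + 1e-6 * np.eye(X.shape[0])              # ridge term for stability
    vals, vecs = np.linalg.eig(np.linalg.solve(B, A))        # generalized eigenproblem
    order = np.argsort(vals.real)                            # smallest eigenvalues first
    return vecs[:, order[:dim]].real
```

The ELPP criterion modifies the affinity graph with class labels, but the eigenproblem machinery stays the same.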
In the later experiment in this study, we adopt this method as the feature extraction step, following the same approach of building the intrinsic relation graph
Conjugate orthonormalized partial least-squares regression analysis
Principles of partial least-squares regression analysis
Suppose that we have the dependent variable y and the general independent variable set
Accordingly, the question could be solved by
If the above conditions are met, then
In Figure 3, we demonstrate the use of the partial least-squares (PLS) regression model. After the prior modeling, we express the regression as follows
where the regression coefficient can be generally expressed as follows

Demonstration of the partial least-squares regression.
Accordingly, we build the least-squares regression model as follows
For that model,
When using the PLS method for face recognition, each face image in the training set is stacked column by column into a vector
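As a self-contained sketch of the PLS regression machinery described in this section, the following implements the textbook NIPALS algorithm; this is the standard form, not necessarily the exact conjugate orthonormalized variant used in the paper.

```python
import numpy as np

def pls_nipals(X, Y, n_components):
    """Partial least-squares regression via NIPALS.
    X: (n, p) predictors, Y: (n, q) responses.
    Returns B such that Y ~ (X - X.mean(0)) @ B + Y.mean(0)."""
    Xk, Yk = X - X.mean(axis=0), Y - Y.mean(axis=0)
    W, P, Q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ Yk[:, 0]
        w = w / np.linalg.norm(w)
        for _ in range(100):                      # inner power iteration
            t = Xk @ w
            q = Yk.T @ t / (t @ t)
            u = Yk @ q / (q @ q)
            w_new = Xk.T @ u
            w_new = w_new / np.linalg.norm(w_new)
            if np.linalg.norm(w_new - w) < 1e-12:
                w = w_new
                break
            w = w_new
        t = Xk @ w
        p = Xk.T @ t / (t @ t)
        q = Yk.T @ t / (t @ t)
        Xk = Xk - np.outer(t, p)                  # deflate X
        Yk = Yk - np.outer(t, q)                  # deflate Y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = (np.column_stack(m) for m in (W, P, Q))
    return W @ np.linalg.inv(P.T @ W) @ Q.T       # regression coefficients
```

With as many components as predictors, PLS coincides with ordinary least squares; with fewer components it performs the dimension reduction that makes it suitable for the stacked, very high-dimensional face vectors.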
The least-squares support vector machine
Starting from least-squares curve fitting, and retaining the good generalization performance of the standard support vector machine (SVM) objective function, a squared-error term is introduced to put forward the concept of the least-squares SVM. 23 This method uses a least-squares linear system as the loss function, in contrast to the classical SVM. Because all the constraints are equality constraints, the solution reduces to solving a set of linear equations; this fast solution provides a new train of thought.
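A minimal sketch of a binary LS-SVM with an RBF kernel, showing how the equality constraints reduce training to a single linear system rather than a quadratic program. We use the equivalent label-regression form of the KKT system; the kernel and regularization parameters are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between row-sample matrices A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_train(X, y, gamma=100.0, sigma=1.5):
    """Train a binary LS-SVM (labels +/-1). The KKT conditions with
    equality constraints form one linear system in (b, alpha)."""
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                        # constraint row: sum of alphas
    A[1:, 0] = 1.0                        # bias column
    A[1:, 1:] = K + np.eye(n) / gamma     # kernel plus ridge term 1/gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]                # alpha, b

def lssvm_predict(X_train, alpha, b, sigma, X_test):
    """Decision function sign(sum_i alpha_i K(x_i, x) + b)."""
    return np.sign(rbf_kernel(X_test, X_train, sigma) @ alpha + b)
```

Note the trade-off mentioned above: training is one `solve` call, but the solution is no longer sparse, since every training sample receives a nonzero alpha.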
Statistical learning theory systematically studies the relationship between the empirical risk and the actual risk for various classes of basic function sets, namely, the generalization error bounds. Investigating the two-class classification problem leads to the following conclusion: for the set of indicator functions, with a certain probability, the bound between the empirical risk and the actual risk holds as shown in formula (21)
The SVM scheme has good classification performance, but it can only separate two classes of samples; practical applications often require multi-class classification, and therefore we extend it via (1) the "binary tree" method, (2) the "one against many" method, and (3) the directed acyclic graph (DAG) method.
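The "one against many" strategy in point (2) can be sketched as follows. Purely for brevity, a regularized least-squares scorer stands in here for each binary SVM; the wrapper logic (one ±1 problem per class, predict by the highest score) is the same whichever binary learner is plugged in.

```python
import numpy as np

def ovr_train(X, y, n_classes, lam=1e-3):
    """'One against many': train one binary +/-1 scorer per class.
    A ridge least-squares classifier stands in for the binary SVM."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])    # append a bias term
    W = []
    for c in range(n_classes):
        t = np.where(y == c, 1.0, -1.0)              # class c against the rest
        w = np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ t)
        W.append(w)
    return np.column_stack(W)

def ovr_predict(X, W):
    """Predict the class whose binary scorer responds most strongly."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.argmax(Xb @ W, axis=1)
```

The binary tree and DAG methods differ only in how the binary decisions are arranged: a tree splits the label set recursively, while a DAG eliminates one class per comparison.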
With the background as discussed previously, we combine the PLS regression with the SVM scheme to provide a better model. The optimization problem can be represented as follows
The restriction condition can be expressed as follows
For the solution of the optimization problem, we convert equations (22) and (23) into the Lagrange function as follows
With the integration being completed, we propose the PLS regression combined function that can be expressed as
For this target, the training procedure can be expressed as
Illumination adjustment and influence elimination
To deal with lighting, there are two basic ideas: remove the lighting or accept the lighting. Rather than accepting the conditions of extremely bright light, we use the difference of Gaussians (DoG) face image expansion method to solve the problem of face recognition under extreme lighting conditions
Luminance component
We must now consider how to estimate the illumination under severe changes in gray level. In order to evaluate the relative intensity of the light entering the eye and the resulting visual features, this article puts forward a logarithmic form of the conduction function as follows
The DoG acts like a band-pass filter: it suppresses the low-frequency illumination component of the facial image while retaining its structural information. Although DoG filtering removes some detail from the image itself, it makes recognition more stable under extreme light. In Figure 4, we illustrate the appearance of the face under different conditions of illumination.

Facial appearance under different illumination.
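The logarithmic transform and DoG filtering described above can be sketched in NumPy as follows; the two sigma values and the kernel radius are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def dog_preprocess(img, sigma1=1.0, sigma2=2.0):
    """Illumination-robust preprocessing sketch: logarithmic transform
    followed by a difference-of-Gaussians (DoG) band-pass filter."""
    f = np.log1p(img.astype(np.float64))        # compress the lighting dynamic range

    def blur(a, sigma):
        # Separable Gaussian blur built from a 1-D kernel (pure NumPy).
        r = int(3 * sigma)
        x = np.arange(-r, r + 1)
        k = np.exp(-x ** 2 / (2 * sigma ** 2))
        k /= k.sum()
        conv = lambda m: np.convolve(m, k, mode='same')
        return np.apply_along_axis(conv, 0, np.apply_along_axis(conv, 1, a))

    # Narrow minus wide Gaussian: removes slow illumination drift
    # (low frequencies) while keeping edge-scale facial structure.
    return blur(f, sigma1) - blur(f, sigma2)
```

On a uniformly lit region the output is near zero, which is exactly the band-pass behavior: only gray-level variation at the facial-feature scale survives.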
Experiment and results
In order to illustrate the effectiveness of the proposed method, we experimentally simulate the pedestrian detection algorithms with numerical analysis. The sample databases used for calibration were obtained from four public datasets: AT&T, Georgia, University of California, San Diego (UCSD), and MoBo. A total of 20 different facial images from each database were used in the experiment to evaluate the effectiveness of detection under different illumination conditions. In Figure 5, we show the sample database used for the experiment. In Figure 6, we show the systematic implementation of the proposed recognition algorithm; it can be seen that our algorithm recognizes the faces well within the red labeled regions. In Table 1, we evaluate the proposed algorithm under complex illumination conditions on the AT&T dataset; the result indicates that our algorithm performs with satisfactory robustness and efficiency, achieving a mean recognition rate of 90.9%. In Table 2, we show the comparison of simulation performance. We conclude that our method outperforms the principal component analysis (PCA), SVM, neural network (NN), sparse representation-based classification (SRC), GMM, and LBP methods; the average recognition rate of our method is 92.7%.

Sample database used for our experiment.

Systematic implementation of the proposed recognition algorithm.
Simulation results of the proposed algorithm from the AT&T database.
Comparison of simulation performance between our method and other algorithms.
UCSD: University of California, San Diego; PCA: principal component analysis; SVM: support vector machine; NN: neural networks; SRC: sparse representation-based classification; GMM: Gaussian mixture model; LBP: local binary patterns.
Conclusion
In order to prevent crime and warn the public of security risks in crowded areas, effective surveillance and tracking methods that can find particular individuals assume great importance. Face recognition is widely used in airports, shopping centers, and football stadiums, and with the rapid development of computer science techniques, it has been an active research area. In this study, we propose a novel face recognition algorithm based on conjugate orthonormalized partial least-squares regression analysis, which is suitable for the detection of pedestrians under complex lighting environments. The experimental simulation shows that our algorithm achieves a better recognition rate under complex illumination conditions than other state-of-the-art approaches; the average recognition rate of the proposed method reached 92.7%. Pedestrian detection is widely used in crowded areas to find particular individuals, such as criminals, and it is expected that the detection of humans via facial recognition will also be used extensively in intelligent transportation systems. In future research, we plan to use mathematical optimization to enhance the conjugate orthonormalized partial least-squares regression analysis and obtain a better model for more in-depth analysis.
Footnotes
Handling Editor: Daming Zhou
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Science and Technology Project of Minjiang University (MYK18039), Fujian College’s Research Base of Humanities and Social Science for Internet Innovation Research Center, Minjiang University (IIRC20180103), and the Natural Science Foundation of Fujian Province (2016J01757).
