
Abstract

In order to enhance the performance of image recognition, a sparsity augmented probabilistic collaborative representation based classification (SA-ProCRC) method is presented. The proposed method obtains the dense coefficient through ProCRC, then augments the dense coefficient with a sparse one, and the sparse coefficient is attained by the orthogonal matching pursuit (OMP) algorithm. In contrast to conventional methods which require explicit computation of the reconstruction residuals for each class, the proposed method employs the augmented coefficient and the label matrix of the training samples to classify the test sample. Experimental results indicate that the proposed method can achieve promising results for face and scene images. The source code of our proposed SA-ProCRC is accessible at https://github.com/yinhefeng/SAProCRC

Keywords

Image recognition, probabilistic collaborative representation based classification, sparse representation, sparsity augmented

Introduction

Image recognition remains one of the hottest topics in the communities of computer vision and pattern recognition. During the past decade, sparse representation has been successfully applied in various domains. In face recognition, the pioneering work is the sparse representation based classification (SRC).¹ Concretely, SRC employs all the training samples as a dictionary, and a test sample is sparsely coded over the dictionary, then the classification is performed by checking which class yields the least reconstruction error. SRC can achieve promising recognition results even when the test samples are occluded or corrupted. To further promote the robustness of SRC, Wang et al.² proposed a correntropy matching pursuit (CMP) method for robust sparse representation based recognition. CMP can adaptively assign small weights on severely corrupted entries of data and large weights on clean ones, thus reducing the effect of large noise. Wu and Ding³ presented a gradient direction-based hierarchical adaptive sparse and low-rank algorithm to tackle the real-world occluded face recognition problem. Gao et al.⁴ developed a robust and discriminative low-rank representation method by exploiting the low-rankness of both the data representation and each occlusion-induced error image simultaneously. Keinert et al.⁵ designed a group sparse representation-based method for face recognition which introduces a non-convex sparsity-inducing penalty and a robust non-convex loss function.

Apart from classifier design, feature extraction is also a crucial stage in image recognition. The most classic subspace learning-based approaches are principal component analysis and linear discriminant analysis. Motivated by the recent development of sparse representation, Qiao et al.⁶ presented a dimensionality reduction technique called sparsity preserving projections. To make SRC efficiently deal with high-dimensional data, Cui et al.⁷ proposed an integrated optimization algorithm to implement feature extraction, dictionary learning, and classification simultaneously. To tackle the corrupted data, Xie et al.⁸ explored a dimensionality reduction method termed low-rank sparse preserving projections by combining the manifold learning and low-rank sparse representation.

Recently, sparse representation has been applied to a wide range of tasks. Zhang et al.⁹ developed a structural sparse representation model for visual tracking. Liu et al.¹⁰ introduced the convolutional sparse representation into image fusion. Guo et al.¹¹ proposed a sparse and dense hybrid representation-based target detector for hyperspectral imagery (HSI).

Another critical issue in sparse representation is how to solve the $\ell_{1}$-norm constraint problem. Zhang et al.¹² presented a survey of sparse representation algorithms and found that Homotopy and augmented Lagrange multiplier (ALM) can achieve better recognition performance and have relatively lower computational cost.

Akhtar et al.¹³ revealed that sparseness explicitly contributes to improved classification, and proposed sparsity augmented collaborative representation based classification (SA-CRC), which employs both dense and sparse collaborative representations to recognize a test sample. However, collaborative representation based classification (CRC)¹⁴ utilizes all the training samples to represent the input test sample, which neglects the relationship between the test sample and each of the multiple classes. To overcome this drawback of SA-CRC, we first obtain a dense representation by probabilistic collaborative representation based classification (ProCRC),¹⁵ and then augment it with a sparse representation to promote the sparsity of the ProCRC coefficient. Moreover, unlike conventional representation based classification methods that use the class-wise reconstruction error for classification, we utilize the label matrix of the training data and the augmented coefficient of a test sample for the final classification. The proposed method is termed sparsity augmented probabilistic collaborative representation based classification (SA-ProCRC). In summary, our contributions are as follows:

We promote the sparsity of ProCRC by augmenting the representation of ProCRC with a sparse representation.

We employ an efficient classification rule to recognize the test sample, in which the explicit computation of residuals class by class is avoided.

Experimental results on diverse datasets validate the efficacy of our proposed method.

Related work

Suppose there are n training samples from C classes, and denote the training data matrix by $X = [X_{1}, X_{2}, \dots, X_{C}] = [x_{1}, x_{2}, \dots, x_{n}] \in \mathbb{R}^{m \times n}$, where $X_{i}$ is the data matrix of the i-th class ($i = 1, 2, \dots, C$). The i-th class has $n_{i}$ training samples, so that $\sum_{i = 1}^{C} n_{i} = n$, and m is the dimensionality of the vectorized samples.

Sparse representation based classification

In SRC,¹ a test sample $y \in \mathbb{R}^{m}$ is first represented as a sparse linear combination of all the training data, and classification is then performed by checking which class leads to the least reconstruction error. The objective function of SRC is formulated as

$$\min_{\alpha} \|\alpha\|_{1} \quad \text{s.t.} \quad \|y - X\alpha\|_{2}^{2} \leq \varepsilon \tag{1}$$

where $\varepsilon$ is a given error tolerance. Once the coefficient vector $\alpha$ of y is obtained, the test sample y is classified according to the following formulation

$$\text{identity}(y) = \arg\min_{i} \|y - X_{i}\alpha_{i}\|_{2} \tag{2}$$

where $\alpha_{i}$ is the coefficient vector that corresponds to the i-th class.
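The decision rule in equation (2) can be illustrated with a minimal NumPy sketch. It assumes the coefficient vector has already been computed by some sparse solver; the function names and the toy dictionary below are illustrative and not part of the original work.

```python
import numpy as np

def classwise_residuals(X, labels, y, alpha):
    """Return the class ids and ||y - X_i alpha_i||_2 for every class i."""
    classes = np.unique(labels)
    residuals = np.array([
        np.linalg.norm(y - X[:, labels == c] @ alpha[labels == c])
        for c in classes
    ])
    return classes, residuals

def src_identity(X, labels, y, alpha):
    """Equation (2): pick the class with the smallest reconstruction error."""
    classes, residuals = classwise_residuals(X, labels, y, alpha)
    return int(classes[np.argmin(residuals)])

# Toy dictionary: three orthonormal atoms, two from class 0 and one from class 1.
X = np.eye(3)
labels = np.array([0, 0, 1])
y = np.array([0.9, 0.1, 0.2])
alpha = np.array([0.9, 0.1, 0.2])  # assumed coefficients (here y = X @ alpha)

classes, residuals = classwise_residuals(X, labels, y, alpha)
predicted = src_identity(X, labels, y, alpha)
print(residuals, predicted)
```

In this toy case class 0 explains most of y, so its residual is the smaller one and the sample is assigned to class 0.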

Collaborative representation based classification

SRC and its extensions have achieved encouraging results in a variety of pattern classification tasks. However, Zhang et al.¹⁴ argued that it is the collaborative representation mechanism rather than the $\ell_{1}$-norm sparsity that makes SRC powerful for classification. They therefore presented the CRC algorithm, which replaces the $\ell_{1}$-norm in SRC with the $\ell_{2}$-norm constraint. The objective function of CRC is formulated as follows

$$\min_{\alpha} \|y - X\alpha\|_{2}^{2} + \lambda\|\alpha\|_{2}^{2} \tag{3}$$

CRC has the following closed-form solution

$$\alpha = (X^{T}X + \lambda I)^{-1}X^{T}y \tag{4}$$

where I is the identity matrix. Let $P = (X^{T}X + \lambda I)^{-1}X^{T}$; one can see that P is determined by the training data matrix X and the parameter λ alone. Therefore, once the training data are given, P can be pre-computed, which makes CRC very efficient. CRC employs the following regularized residual for classification

$$\text{identity}(y) = \arg\min_{i} \frac{\|y - X_{i}\alpha_{i}\|_{2}}{\|\alpha_{i}\|_{2}} \tag{5}$$
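The efficiency argument for CRC can be sketched in NumPy as follows: the projection matrix P is computed once, and every test sample is then coded with a single matrix product. This is an illustrative sketch and not the publicly available CRC code; the function names, the small constant eps guarding against division by zero, and the toy data are assumptions.

```python
import numpy as np

def crc_projection(X, lam):
    """P = (X^T X + lam I)^{-1} X^T from equation (4); depends only on X and lam."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T)

def crc_classify(X, labels, P, Y, eps=1e-12):
    """Classify each column of Y with the regularized residual of equation (5)."""
    classes = np.unique(labels)
    A = P @ Y  # one coefficient vector per test column
    scores = np.empty((len(classes), Y.shape[1]))
    for r, c in enumerate(classes):
        mask = labels == c
        residual = np.linalg.norm(Y - X[:, mask] @ A[mask], axis=0)
        scores[r] = residual / (np.linalg.norm(A[mask], axis=0) + eps)
    return classes[np.argmin(scores, axis=0)]

# Toy dictionary: two atoms of class 0 and one atom of class 1.
X = np.eye(3)
labels = np.array([0, 0, 1])
P = crc_projection(X, lam=0.001)  # pre-computed once for all test samples

# Two test samples stored as columns.
Y = np.array([[0.9, 0.1],
              [0.1, 0.0],
              [0.2, 1.0]])
predictions = crc_classify(X, labels, P, Y)
print(predictions)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical-stability choice of this sketch; the result is the same matrix P.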

Probabilistic CRC

Inspired by the work of probabilistic subspace approaches, Cai et al.¹⁵ explored the classification mechanism of CRC from a probabilistic perspective and developed a probabilistic collaborative representation based classifier (ProCRC). The objective function of ProCRC is formulated as

$$\min_{\check{\alpha}} \|y - X\check{\alpha}\|_{2}^{2} + \lambda\|\check{\alpha}\|_{2}^{2} + \frac{\gamma}{C}\sum_{i = 1}^{C}\|X\check{\alpha} - X_{i}\check{\alpha}_{i}\|_{2}^{2} \tag{6}$$

where λ and γ are two balancing parameters. One can see that ProCRC reduces to CRC when γ = 0. Suppose $X_{i}^{'}$ is a matrix of the same size as X that contains only the samples from the i-th class, namely $X_{i}^{'} = [0, \dots, X_{i}, \dots, 0]$. Let $\bar{X}_{i}^{'} = X - X_{i}^{'}$. After some deductions, we can obtain the following closed-form solution to ProCRC

$$\check{\alpha} = Ty \tag{7}$$

where

$$T = \left(X^{T}X + \frac{\gamma}{C}\sum_{i = 1}^{C}(\bar{X}_{i}^{'})^{T}\bar{X}_{i}^{'} + \lambda I\right)^{-1}X^{T}$$

and I is the identity matrix.
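The deductions leading to equation (7) can be sketched as follows. This is a reconstruction from equation (6) using the definitions above, not a derivation quoted from the original ProCRC paper; the symbol J for the objective is introduced here for convenience.

```latex
\begin{align*}
% The zero-padded matrix X_i' selects the class-i part of the coefficient vector:
X_{i}\check{\alpha}_{i} &= X_{i}^{'}\check{\alpha}
\quad\Longrightarrow\quad
X\check{\alpha} - X_{i}\check{\alpha}_{i} = (X - X_{i}^{'})\check{\alpha} = \bar{X}_{i}^{'}\check{\alpha} \\
% Objective (6) rewritten as a quadratic function of the full coefficient vector:
J(\check{\alpha}) &= \|y - X\check{\alpha}\|_{2}^{2} + \lambda\|\check{\alpha}\|_{2}^{2}
 + \frac{\gamma}{C}\sum_{i=1}^{C}\|\bar{X}_{i}^{'}\check{\alpha}\|_{2}^{2} \\
% Setting the gradient to zero:
\tfrac{1}{2}\nabla J(\check{\alpha}) &= -X^{T}(y - X\check{\alpha}) + \lambda\check{\alpha}
 + \frac{\gamma}{C}\sum_{i=1}^{C}(\bar{X}_{i}^{'})^{T}\bar{X}_{i}^{'}\check{\alpha} = 0 \\
% Collecting the terms in the coefficient vector gives the closed form:
\Big(X^{T}X + \frac{\gamma}{C}\sum_{i=1}^{C}(\bar{X}_{i}^{'})^{T}\bar{X}_{i}^{'} + \lambda I\Big)\check{\alpha} &= X^{T}y
\quad\Longrightarrow\quad \check{\alpha} = Ty
\end{align*}
```

For λ > 0 the matrix in parentheses is positive definite, so the inverse exists and the stationary point is the unique minimizer of the convex objective.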

SA-ProCRC

In our proposed SA-ProCRC, the dense representation of ProCRC is augmented by a sparse representation computed by OMP,¹⁶ and the optimization problem for sparse representation is given by

$$\min_{\hat{\alpha}} \|y - X\hat{\alpha}\|_{2} \quad \text{s.t.} \quad \|\hat{\alpha}\|_{0} \leq k \tag{8}$$

where k is the sparsity level.

The augmented coefficient $\mathring{\alpha}$ can be obtained according to the following formulation

$$\mathring{\alpha} = \frac{\hat{\alpha} + \check{\alpha}}{\|\hat{\alpha} + \check{\alpha}\|_{2}} \tag{9}$$

where $\hat{\alpha}$ is the sparse coefficient computed by OMP, and $\check{\alpha}$ is the coefficient obtained by ProCRC.

Let $L = [l_{1}, l_{2}, \dots, l_{n}] \in \mathbb{R}^{C \times n}$ be the label matrix of the training data, where $l_{j} = [0, 0, \dots, 1, \dots, 0, 0]^{T} \in \mathbb{R}^{C \times 1}$ denotes the label vector of the j-th training sample, whose only non-zero entry marks the class of that sample. For the i-th class, the i-th row of L therefore has $n_{i}$ non-zero elements, located at the indices associated with the columns of $X_{i}$. Recall that $X_{i}$ is the subset of dictionary atoms belonging to the i-th class. Consequently, the i-th entry of the vector $q = L\mathring{\alpha}$ is the sum of the coefficients in $\mathring{\alpha}$ that correspond to the atoms in $X_{i}$, and q is referred to as the class score vector. The test sample is assigned to the class with the largest score.
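The role of the label matrix can be made concrete with a small pure-Python sketch: multiplying L by a coefficient vector sums the coefficients class by class, so no per-class reconstruction is needed. The helper names and the six-atom example are hypothetical and chosen only for illustration.

```python
def label_matrix(labels, num_classes):
    """Build the C x n one-hot label matrix L as a list of rows."""
    L = [[0.0] * len(labels) for _ in range(num_classes)]
    for j, c in enumerate(labels):
        L[c][j] = 1.0
    return L

def class_scores(L, alpha):
    """q = L alpha: entry i sums the coefficients of the atoms in class i."""
    return [sum(l_ij * a_j for l_ij, a_j in zip(row, alpha)) for row in L]

def predict(L, alpha):
    """Return the index of the class with the largest score."""
    q = class_scores(L, alpha)
    return max(range(len(q)), key=q.__getitem__)

# Six training atoms, two per class; alpha stands in for an augmented coefficient.
labels = [0, 0, 1, 1, 2, 2]
alpha = [0.1, -0.05, 0.6, 0.3, 0.05, 0.0]

L = label_matrix(labels, 3)
q = class_scores(L, alpha)
print(q, predict(L, alpha))
```

Here the two class-1 atoms carry most of the coefficient mass, so class 1 receives the largest score.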

Our proposed SA-ProCRC proceeds as follows. First, the dense coefficient and the sparse coefficient are obtained by solving equations (6) and (8), respectively. Second, the dense coefficient is augmented by the sparse coefficient. Finally, the test sample is recognized according to the augmented coefficient vector and the label matrix of the training data. Algorithm 1 presents our proposed scheme.

Algorithm 1. SA-ProCRC

Input: Training data matrix $X = [X_{1}, X_{2}, \dots, X_{C}] \in \mathbb{R}^{m \times n}$ and label matrix L, test data $y \in \mathbb{R}^{m}$, parameters λ and γ for ProCRC, sparsity level k for OMP.

Output: $\text{label}(y) = \arg\max_{i} (q_{i})$

1. Compute the coefficient $\check{\alpha}$ of ProCRC by using equation (7)

2. Obtain the sparse coefficient $\hat{\alpha}$ by solving equation (8) with OMP

3. Compute the augmented coefficient $\mathring{\alpha} = \frac{\hat{\alpha} + \check{\alpha}}{\|\hat{\alpha} + \check{\alpha}\|_{2}}$

4. Compute $q = L\mathring{\alpha}$
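The four steps of Algorithm 1 can be sketched end to end in NumPy. This is a minimal illustration under stated assumptions and not the authors' MATLAB implementation: the OMP routine is a plain greedy version written for this sketch, the default values of lam and gamma are placeholders, and the synthetic data (three classes built around orthogonal prototypes) serve only as a sanity check.

```python
import numpy as np

def procrc_projection(X, labels, lam, gamma):
    """T from equation (7); labels[j] is the class index of column j of X."""
    n = X.shape[1]
    classes = np.unique(labels)
    S = np.zeros((n, n))
    for c in classes:
        Xbar = X.copy()
        Xbar[:, labels == c] = 0.0  # X - X'_i: zero out the class-c columns
        S += Xbar.T @ Xbar
    A = X.T @ X + (gamma / len(classes)) * S + lam * np.eye(n)
    return np.linalg.solve(A, X.T)

def omp(X, y, k, tol=1e-10):
    """Greedy OMP for equation (8); assumes roughly unit-norm columns."""
    alpha = np.zeros(X.shape[1])
    support, coef = [], np.zeros(0)
    residual = y.astype(float).copy()
    for _ in range(k):
        j = int(np.argmax(np.abs(X.T @ residual)))
        if j in support:  # no new atom improves the fit
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ coef
        if np.linalg.norm(residual) <= tol:
            break
    if support:
        alpha[support] = coef
    return alpha

def sa_procrc_classify(X, labels, y, lam=1e-3, gamma=1e-3, k=50):
    """Steps 1-4 of Algorithm 1; lam and gamma defaults are placeholders."""
    dense = procrc_projection(X, labels, lam, gamma) @ y   # step 1
    sparse = omp(X, y, k)                                  # step 2
    aug = dense + sparse
    aug /= np.linalg.norm(aug)                             # step 3, equation (9)
    classes = np.unique(labels)
    L = (labels[None, :] == classes[:, None]).astype(float)  # C x n label matrix
    q = L @ aug                                            # step 4
    return int(classes[np.argmax(q)]), q

# Synthetic check: 3 classes, 5 noisy unit-norm atoms each, in 20 dimensions.
rng = np.random.default_rng(0)
m, per_class, C = 20, 5, 3
labels = np.repeat(np.arange(C), per_class)
X = np.eye(m)[:, labels] + 0.05 * rng.standard_normal((m, C * per_class))
X /= np.linalg.norm(X, axis=0)
y = 0.5 * (X[:, 5] + X[:, 6])  # midpoint of two class-1 training atoms

pred, q = sa_procrc_classify(X, labels, y, k=3)
print(pred, q)
```

In a real experiment T would be computed once and reused for all test samples, in the same way as P for CRC; it is recomputed inside the function here only to keep the sketch short.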

Analysis of SA-ProCRC

In this section, we present some experimental results on the Extended Yale B database to illustrate the effectiveness of SA-ProCRC. The Extended Yale B database contains 38 individuals, with about 64 images per individual. We randomly select 20 images per subject as the training data; therefore, the dictionary contains 760 atoms. We select a test image that belongs to the first subject, and the sparse coefficients and the corresponding residual for each class are plotted in Figures 1 and 2. It can be seen from Figure 1 that the coefficients belonging to the first class are prominent. From Figure 2, we can clearly see that the first class has the least residual, which indicates that the test sample is correctly classified by SRC. Figure 3 shows the coefficients derived by ProCRC, which are rather dense. Figure 4 presents the residual of ProCRC; the 26th class has the least residual, so the test sample is wrongly assigned to the 26th class. The coefficients obtained by SA-ProCRC are shown in Figure 5, where the coefficients from the first class are dominant. Figure 6 plots the score of SA-ProCRC for each class; the first class delivers the largest value. As a result, the test sample is assigned to the first class by SA-ProCRC. These results show that the dense representation of ProCRC may lead to misclassification, and that augmenting the dense representation with a sparse one can alleviate it. This example illustrates the advantage of our proposed SA-ProCRC.

Figure 1.

Coefficients obtained by SRC.

Figure 2.

The residual of SRC for each class; the first class has the least residual.

Figure 3.

Coefficients computed by ProCRC.

Figure 4.

The residual of ProCRC for each class; the 26th class has the minimal residual.

Figure 5.

Coefficients obtained by SA-ProCRC.

Figure 6.

The score of SA-ProCRC for each class; the first class has the largest value.

Experiments

In this section, we conduct experiments on four benchmark datasets: the Yale database, the Extended Yale B database, the AR database, and the Scene 15 dataset. The details of these datasets are listed in Table 1. We compare the proposed method with state-of-the-art representation based classification methods and several dictionary learning approaches, namely SRC,¹ CRC,¹⁴ ProCRC,¹⁵ discriminative K-SVD (D-KSVD),¹⁷ label consistent K-SVD (LC-KSVD),¹⁸ Fisher discrimination dictionary learning (FDDL),¹⁹ dictionary learning based on commonalities and particularities (COPAR),²⁰ joint discriminative Bayesian dictionary and classifier learning (JBDC),²¹ and SA-CRC.¹³ For SRC, we solve the problem in equation (1) as in Wright et al.¹ For CRC, LC-KSVD, FDDL, COPAR, JBDC, and SA-CRC, we use the publicly available codes. We adapted the code of LC-KSVD to implement D-KSVD. For SA-CRC and our proposed SA-ProCRC, OMP is utilized to obtain the sparse representation. We use the same sparsity level (k = 50) as in SA-CRC.¹³ All experiments are run with MATLAB R2019a under Windows 10 on a PC equipped with a 3.60 GHz CPU and 16 GB of RAM.

Table 1.

Details of datasets used in our experiments.

Dataset	# Sample	# Class	# Feature
Yale	165	15	576
EYaleB	2414	38	504
AR	2600	100	540
Scene 15	4485	15	3000

Note: The columns from left to right are the names of datasets, total number of samples, number of classes, and the dimensionality of features.

Experiments on the Yale database

The Yale database contains 165 images of 15 subjects, with 11 images per subject. These images have illumination and expression variations; Figure 7 shows some example images from this database. All the images are resized to 24 × 24 pixels, leading to a 576-dimensional vector. In our experiments, six images per subject are randomly selected for training and the rest are used for testing. The error tolerance ε of SRC is 0.05, and the balancing parameter λ of CRC is 0.001. The sparsity level and number of atoms for D-KSVD and LC-KSVD are 30 and 60, respectively. The sparsity level k and λ of SA-CRC are set to 50 and 0.002, respectively. Experimental results are summarized in Table 2, in which the best result is highlighted in bold. It can be observed that SA-ProCRC achieves the highest recognition accuracy, with a 17% relative reduction in the error rate of ProCRC and a 12% relative reduction in that of SA-CRC.

Figure 7.

Example images from the Yale database.

Table 2.

Recognition accuracy on the Yale database.

Methods	Accuracy (%)
SRC	95.06 ± 3.32
CRC	94.53 ± 2.97
ProCRC	95.33 ± 2.82
D-KSVD	94.26 ± 2.88
LC-KSVD	94.53 ± 0.03
FDDL	95.73 ± 3.00
COPAR	91.33 ± 4.23
JBDC	94.93 ± 2.72
SA-CRC	95.60 ± 2.59
SA-ProCRC	96.13 ± 2.84

SRC: sparse representation based classification; CRC: collaborative representation based classification; ProCRC: probabilistic collaborative representation based classification; SA-CRC: sparsity augmented collaborative representation based classification; SA-ProCRC: sparsity augmented probabilistic collaborative representation based classification. Bold value signifies the best recognition accuracy.
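The relative error-rate reductions quoted for the Yale database can be checked directly from the mean accuracies in Table 2. The short sketch below assumes that "reduction in the error rate" means the relative decrease of (100 − accuracy); the function name is illustrative.

```python
def relative_error_reduction(acc_baseline, acc_new):
    """Relative decrease (in %) of the error rate when moving to acc_new."""
    err_baseline = 100.0 - acc_baseline
    err_new = 100.0 - acc_new
    return 100.0 * (err_baseline - err_new) / err_baseline

# Mean accuracies taken from Table 2.
vs_procrc = relative_error_reduction(95.33, 96.13)  # ProCRC -> SA-ProCRC
vs_sacrc = relative_error_reduction(95.60, 96.13)   # SA-CRC -> SA-ProCRC
print(f"{vs_procrc:.1f}% vs ProCRC, {vs_sacrc:.1f}% vs SA-CRC")
```

Both values round to the 17% and 12% stated in the text.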

Experiments on the Extended Yale B database

The Extended Yale B face database is composed of 2414 images of 38 individuals. Each individual has 59–64 images taken under different illumination conditions; example images from this dataset are shown in Figure 8. In our experiments, each 192 × 168 image is projected onto a 504-dimensional space via random projection. Twenty images per person are selected for training and the remaining images are used for testing. We use an error tolerance of 0.05 for SRC and the regularization parameter λ = 0.001 for CRC. The sparsity level and number of atoms for D-KSVD and LC-KSVD are 50 and 400, respectively. The sparsity level k and λ of SA-CRC are set to 50 and 0.005, respectively. Table 3 lists the recognition accuracy of the compared methods. It can be seen that our proposed SA-ProCRC achieves the highest mean accuracy among the compared approaches.
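The random projection step can be sketched as follows. The text does not specify how the projection matrix is generated, so this sketch assumes a Gaussian matrix with unit-norm rows, which is one common choice; the function name, the seed, and the reduced toy sizes (used to keep the example light) are assumptions. For the setting described above, in_dim would be 192 × 168 and out_dim would be 504.

```python
import numpy as np

def random_projection_matrix(out_dim, in_dim, seed=0):
    """Gaussian random matrix with unit-norm rows (an assumed construction)."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((out_dim, in_dim))
    R /= np.linalg.norm(R, axis=1, keepdims=True)
    return R

# Toy sizes: four stand-in "images" of 48 x 42 pixels projected to 63 dimensions.
height, width, out_dim = 48, 42, 63
rng = np.random.default_rng(1)
images = rng.random((4, height, width))

R = random_projection_matrix(out_dim, height * width)
features = images.reshape(len(images), -1) @ R.T  # one feature row per image
print(features.shape)
```

The same matrix R must be applied to both training and test images so that all samples live in the same feature space.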

Figure 8.

Example images from the Extended Yale B database.

Table 3.

Recognition accuracy on the Extended Yale B database.

Methods	Accuracy (%)
SRC	93.18 ± 0.55
CRC	94.77 ± 0.48
ProCRC	94.82 ± 0.49
D-KSVD	90.79 ± 0.51
LC-KSVD	91.48 ± 0.69
FDDL	92.32 ± 0.68
COPAR	90.81 ± 0.55
JBDC	94.74 ± 0.83
SA-CRC	95.52 ± 0.73
SA-ProCRC	95.64 ± 0.78

Experiments on the AR database

The AR database has more than 4000 face images of 126 subjects with variations in facial expression, illumination conditions, and occlusions; Figure 9 shows example images from this database. We use a subset of 2600 images of 50 male and 50 female subjects from the database. Each 165 × 120 face image is projected onto a 540-dimensional vector by random projection. Ten images per person are randomly selected for training and the remaining images are used for testing. The error tolerance of SRC is 0.05, and the balancing parameter of CRC is 0.0014. The sparsity level and number of atoms for D-KSVD and LC-KSVD are 50 and 600, respectively. The sparsity level k and λ of SA-CRC are set to 50 and 0.002, respectively. Experimental results are shown in Table 4. The best classification result is achieved by our proposed SA-ProCRC, with a 23% relative reduction in the error rate of ProCRC.

Figure 9.

Example images from the AR database.

Table 4.

Recognition accuracy on the AR database.

Methods	Accuracy (%)
SRC	91.25 ± 1.17
CRC	92.04 ± 0.83
ProCRC	93.03 ± 0.64
D-KSVD	90.31 ± 1.13
LC-KSVD	89.31 ± 1.27
FDDL	91.01 ± 0.99
COPAR	89.06 ± 1.54
JBDC	90.97 ± 0.79
SA-CRC	93.74 ± 0.84
SA-ProCRC	94.67 ± 0.66

Experiments on the Scene 15 dataset

This dataset contains 15 natural scene categories covering a wide range of indoor and outdoor scenes, such as bedroom, office, and mountain; example images from this dataset are shown in Figure 10. For a fair comparison, we employ the 3000-dimensional scale invariant feature transform (SIFT)-based features used in LC-KSVD.¹⁸ We randomly select 50 images per category as training data and use the rest for testing. The error tolerance of SRC is 1e-6, and the balancing parameter of CRC is 1. Fifty atoms are used for D-KSVD and LC-KSVD. The sparsity level k and λ of SA-CRC are set to 50 and 1, respectively. The recognition accuracy of the different approaches on this dataset is presented in Table 5. Again, SA-ProCRC achieves the highest accuracy among the compared methods.

Figure 10.

Example images from the Scene 15 dataset.

Table 5.

Recognition accuracy on the Scene 15 dataset.

Methods	Accuracy (%)
SRC	95.41 ± 0.13
CRC	96.15 ± 0.33
ProCRC	96.56 ± 0.35
D-KSVD	95.12 ± 0.18
LC-KSVD	96.37 ± 0.28
FDDL	94.08 ± 0.43
COPAR	96.02 ± 0.28
JBDC	97.36 ± 0.32
SA-CRC	97.18 ± 0.25
SA-ProCRC	97.56 ± 0.20

Conclusions

It has been argued that it is the collaborative representation mechanism rather than the sparsity constraint that makes SRC powerful for pattern classification. As a result, sparsity is ignored to some extent in CRC and its extensions. To address this problem, we present an SA-ProCRC method to promote the sparsity in ProCRC. The proposed SA-ProCRC is computationally efficient because ProCRC has a closed-form solution. Meanwhile, the discriminative information contained in the resulting sparse coefficient can be exploited in SA-ProCRC. In essence, SA-ProCRC is a classifier, so it can be applied to other pattern classification tasks. In our future work, we will evaluate SA-ProCRC with deep features and develop new representation based classification algorithms.

Footnotes

Acknowledgements

The authors would like to thank Prof. Naveed Akhtar for providing the source code of SA-CRC at

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (Grant No. 61672265).

ORCID iDs

Xiao-Yun Cai

He-Feng Yin

References

1. Wright, Yang, Ganesh, et al. Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 2009; 31: 210–227.

2. Wang, Tang. Correntropy matching pursuit with application to robust digit and face recognition. IEEE Trans Cybern 2016; 47: 1354–1366.

3. Wu, Ding. Occluded face recognition using low-rank regression with generalized gradient direction. Pattern Recognit 2018; 80: 256–268.

4. Gao, Yang, Jing, et al. Learning robust and discriminative low-rank representations for face recognition with occlusion. Pattern Recognit 2017; 66: 129–143.

5. Keinert, Lazzaro, Morigi. A robust group-sparse representation variational method with applications to face recognition. IEEE Trans Image Process 2019; 28: 2785–2798.

6. Qiao, Chen, Tan. Sparsity preserving projections with applications to face recognition. Pattern Recognit 2010; 43: 331–341.

7. Cui, Jiang, Lai, et al. An integrated optimisation algorithm for feature extraction, dictionary learning and classification. Neurocomputing 2018; 275: 2740–2751.

8. Xie, Yin, et al. Low-rank sparse preserving projections for dimensionality reduction. IEEE Trans Image Process 2018; 27: 5261–5274.

9. Zhang, Yang. Robust structural sparse tracking. IEEE Trans Pattern Anal Mach Intell 2018; 41: 473–486.

10. Liu, Chen, Ward, et al. Image fusion with convolutional sparse representation. IEEE Signal Process Lett 2016; 23: 1882–1886.

11. Guo, Luo, Zhang, et al. Target detection in hyperspectral imagery via sparse and dense hybrid representation. IEEE Geosci Remote Sensing Lett 2019; 17: 716–720.

12. Zhang, Yang, et al. A survey of sparse representation: algorithms and applications. IEEE Access 2015; 3: 490–530.

13. Akhtar, Shafait, Mian. Efficient classification with sparsity augmented collaborative representation. Pattern Recognit 2017; 65: 136–145.

14. Zhang L, Yang M and Feng X. Sparse representation or collaborative representation: which helps face recognition? In: CVPR, Colorado Springs, CO, 20–25 June 2011, pp.471–478.

15. Cai S, Zhang L, Zuo W, et al. A probabilistic collaborative representation based approach for pattern classification. In: CVPR, Las Vegas, NV, 27–30 June 2016, pp.2950–2959.

16. Tropp, Gilbert. Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans Inform Theory 2007; 53: 4655–4666.

17. Zhang Q and Li B. Discriminative K-SVD for dictionary learning in face recognition. In: CVPR, San Francisco, CA, 13–18 June 2010, pp.2691–2698.

18. Jiang Z, Lin Z and Davis LS. Learning a discriminative dictionary for sparse coding via label consistent K-SVD. In: CVPR, Colorado Springs, CO, 20–25 June 2011, pp.1697–1704.

19. Yang M, Zhang L, Feng X, et al. Fisher discrimination dictionary learning for sparse representation. In: CVPR, Colorado Springs, CO, 20–25 June 2011, pp.543–550.

20. Kong S and Wang D. A dictionary learning approach for classification: separating the particularity and the commonality. In: ECCV, Florence, Italy, 7–13 October 2012, pp.186–199.

21. Akhtar N, Mian A and Porikli F. Joint discriminative Bayesian dictionary and classifier learning. In: CVPR, Honolulu, HI, 21–26 July 2017, pp.1193–1202.

A sparsity augmented probabilistic collaborative representation based classification method
