Discriminative non-negative representation based classifier for image recognition

Abstract

During the past decade, representation based classification methods have received considerable attention in the pattern recognition community. The recently proposed non-negative representation based classifier achieved superb recognition results in diverse pattern classification tasks. Unfortunately, it does not fully exploit the discriminative information of the training data, which undermines its classification performance in practical applications. To address this problem, we introduce a decorrelation regularizer into the formulation of the non-negative representation based classifier and propose a discriminative non-negative representation based classifier for pattern classification. The decorrelation regularizer reduces the correlation between the representation results of different classes, thus promoting competition among them. Experimental results on benchmark datasets validate the efficacy of the proposed classifier, which can outperform some state-of-the-art deep learning based methods. The source code of the proposed discriminative non-negative representation based classifier is accessible at https://github.com/yinhefeng/DNRC.

Keywords

Image recognition, non-negative representation, representation based classification, decorrelation regularizer

Introduction

Recent years have witnessed the success of representation based classification methods (RBCM) in a variety of classification tasks. In face recognition, the most influential work is sparse representation based classification (SRC).¹ SRC treats all the training samples as a dictionary and sparsely codes a test sample over this dictionary; classification is then performed by checking which class yields the least reconstruction error. Naseem et al.² proposed a linear regression classification (LRC) algorithm, which represents the test sample as a linear combination of class-specific training samples. In essence, LRC is the nearest subspace classifier (NSC)³ applied to down-sampled features. Due to the principle of $ℓ_{1}$-minimization, the coefficients obtained by SRC may vary considerably even for similar test samples. Therefore, SRC tends to lose locality information. Yu et al.⁴ argued that locality is more essential than sparsity. Inspired by this finding, Lu et al.⁵ presented a weighted SRC (WSRC) method, which imposes locality on the $ℓ_{1}$ regularization. Similarly, Wang et al.⁶ proposed a locality-constrained linear coding (LLC) scheme. Unlike the $ℓ_{1}$-minimization used in WSRC, LLC employs an $ℓ_{2}$-norm based regularizer and thus has a closed-form solution. In SRC, all the training samples are used to construct the over-complete dictionary; however, not all of them contribute positively to representing the test sample. Xu et al.⁷ developed a two-phase test sample representation (TPTSR) method, which determines a certain number of nearest neighbors for the test sample. Zhang et al.⁸ argued that it is the collaborative representation mechanism rather than the $ℓ_{1}$-norm sparsity that makes SRC successful for face classification, and they presented a collaborative representation based classification (CRC) algorithm. Similar to the idea of WSRC, Timofte and Van Gool⁹ designed a weighted CRC (WCRC) method.
To fully exploit multiple collaborative representations, Chi and Porikli¹⁰ proposed a collaborative representation optimized classifier (CROC), which strikes a balance between NSC and CRC. SRC does not consider the label information of the training data. To alleviate this problem, Lai and Jiang¹¹ developed a class-wise sparse representation (CSR) method, which minimizes the number of training classes used in representing the test sample. Because the $ℓ_{1}$-norm minimization in SRC is computationally expensive, Xu et al.¹² presented an efficient discriminative sparse representation method via $ℓ_{2}$ regularization. Although Zhang et al.⁸ offered a geometric interpretation of CRC, the principle behind CRC remains unintuitive. Cai et al.¹³ analyzed the classification mechanism of CRC from a probabilistic perspective and proposed a probabilistic CRC (ProCRC).

Xu et al.¹⁴ pointed out that there exist negative elements in the coefficients obtained by SRC or CRC, which may result in the misclassification of the test sample. Motivated by non-negative matrix factorization (NMF)¹⁵, Xu et al.¹⁴ proposed a non-negative representation based classifier (NRC) which imposes a non-negative constraint on the coding vector. Extensive experiments on diverse classification tasks demonstrate the superiority of NRC over many existing RBCM, including SRC, CRC and ProCRC. Nevertheless, NRC ignores the discriminative information of training data which limits its classification performance. To tackle this problem, we propose a discriminative NRC (DNRC) which incorporates a decorrelation regularizer into the formulation of NRC. The decorrelation regularizer can reduce the correlation of representation results of different classes, thus promoting the competition among them. Competition means that when training samples of a class have a great contribution in representing the test sample, other classes have relatively less contribution.

Our contributions are summarized as follows:

A discriminative non-negative representation based classifier (DNRC) is presented by introducing a decorrelation regularizer into the formulation of NRC.

The alternating direction method of multipliers (ADMM)¹⁶ is employed to efficiently solve the optimization problem of DNRC.

Experiments on diverse standard datasets indicate that our proposed DNRC outperforms conventional RBCM as well as some deep learning based methods on face databases, a human action dataset and fine-grained visual datasets.

Related work

Suppose we have $n$ training samples belonging to $K$ classes, and the training data matrix is denoted by $X = [X_{1}, X_{2}, \dots, X_{K}] = [x_{1}, x_{2}, \dots, x_{n}] \in R^{d \times n}$, where $X_{i}$ is the data matrix of the $i$-th class ($i = 1, 2, \dots, K$) and $d$ is the dimensionality of the vectorized samples. The $i$-th class has $n_{i}$ training samples, so that $\sum_{i = 1}^{K} n_{i} = n$.

Sparse representation based classification

In SRC,¹ a test sample $y \in R^{d}$ is first expressed as a sparse linear superposition of all the training data, then the classification is performed by checking which class leads to the least reconstruction error. The objective function of SRC is formulated as

$$\min_{c} ∥c∥_{1}, \quad \text{s.t.} \quad ∥y - X c∥_{2}^{2} \leq ε \tag{1}$$

where $ε$ is a given error tolerance. When we obtain the coefficient vector $c$ of $y$, the test sample $y$ is classified according to the following formulation:

$$\text{identity}(y) = \arg\min_{i} ∥y - X_{i} c_{i}∥_{2} \tag{2}$$

where $c_{i}$ is the coefficient vector that corresponds to the $i$-th class.

Collaborative representation based classification

SRC and its variants have achieved encouraging results in diverse pattern classification tasks. Nevertheless, Zhang et al.⁸ argued that it is the collaborative representation mechanism rather than the $ℓ_{1}$-norm sparsity that makes SRC powerful for classification. They presented the CRC algorithm, which replaces the $ℓ_{1}$-norm in SRC with an $ℓ_{2}$-norm regularizer. The objective function of CRC is formulated as follows:

$$\min_{c} ∥y - X c∥_{2}^{2} + λ ∥c∥_{2}^{2} \tag{3}$$

CRC has the following closed-form solution:

$$c = (X^{T} X + λ I)^{-1} X^{T} y \tag{4}$$

where $I$ is the identity matrix. Letting $P = (X^{T} X + λ I)^{-1} X^{T}$, we can see that $P$ is only determined by the training data matrix $X$. Therefore, when given all the training data, $P$ can be pre-computed, which makes CRC very efficient. CRC employs the following regularized residual for classification:

$$\text{identity}(y) = \arg\min_{i} \frac{∥y - X_{i} c_{i}∥_{2}}{∥c_{i}∥_{2}} \tag{5}$$
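The pre-computation of $P$ and the regularized residual rule can be illustrated with a minimal NumPy sketch. The function names `crc_projection` and `crc_classify`, the toy data, the value of `lam` and the small constant added to the denominator are assumptions made for illustration; they are not taken from the CRC authors' implementation.

```python
import numpy as np


def crc_projection(X, lam):
    """Precompute P = (X^T X + lam * I)^{-1} X^T from equation (4)."""
    n = X.shape[1]
    # Solving the linear system avoids forming an explicit inverse.
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T)


def crc_classify(P, X, labels, y):
    """Code y with the precomputed P, then apply the rule in equation (5)."""
    c = P @ y
    classes = np.unique(labels)
    scores = []
    for k in classes:
        mask = labels == k
        residual = np.linalg.norm(y - X[:, mask] @ c[mask])
        # The small constant guards against division by zero (added safeguard).
        scores.append(residual / (np.linalg.norm(c[mask]) + 1e-12))
    return classes[int(np.argmin(scores))]


# Toy data (hypothetical): two classes, two training samples each, in R^3.
X = np.array([[1.0, 0.9, 0.0, 0.1],
              [0.0, 0.1, 1.0, 0.9],
              [0.1, 0.0, 0.1, 0.0]])
labels = np.array([0, 0, 1, 1])
P = crc_projection(X, lam=0.01)   # computed once, reused for every test sample
y = np.array([0.95, 0.05, 0.05])  # the mean of the two class-0 samples
predicted = crc_classify(P, X, labels, y)
```

Because `P` depends only on the training data, coding a new test sample costs a single matrix-vector product.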

Non-negative representation based classification

SRC and CRC have become two representative methods in RBCM. However, the coding vector of conventional RBCM may contain negative entries, whereas the test sample is better expressed by homogeneous samples with non-negative representation coefficients. Moreover, Lee and Seung¹⁵ pointed out that it is unsuitable to approximate the test sample by allowing the training samples to cancel each other out with complex additions and subtractions. Therefore, Xu et al.¹⁴ proposed the following non-negative representation model by imposing the non-negative constraint on the coding vector:

$$\min_{c} ∥y - X c∥_{2}^{2}, \quad \text{s.t.} \quad c \geq 0 \tag{6}$$

Similar to SRC, NRC employs the class specific residual to classify the test sample, that is,

$$\text{identity}(y) = \arg\min_{i} ∥y - X_{i} c_{i}∥_{2}$$

Discriminative non-negative representation based classifier

In this section, first we introduce the formulation of our proposed DNRC, then we present its optimization procedures. Finally, we give the classification scheme of DNRC.

Proposed model

According to the mechanism of RBCM, $X_{i} c_{i}$ denotes the representation of the test sample obtained from the training samples of the $i$-th class, and $(X_{i} c_{i})^{T} (X_{j} c_{j})$ measures the correlation between the representation results of the $i$-th and $j$-th classes in expressing the test sample. Minimizing $(X_{i} c_{i})^{T} (X_{j} c_{j})$ drives the representation results of the $i$-th and $j$-th classes toward low correlation, which encourages the representation results of distinct classes to be discriminative.

By minimizing $\sum_{i = 1}^{K} \sum_{j = 1}^{K} (X_{i} c_{i})^{T} (X_{j} c_{j})$, the decorrelation effect for different classes can be achieved. Moreover, we have $\sum_{i = 1}^{K} \sum_{j = 1}^{K} (X_{i} c_{i})^{T} (X_{j} c_{j}) = ∥ X c ∥_{2}^{2}$. To promote the competition between different classes by decorrelating their representation results in NRC, we incorporate $∥ X c ∥_{2}^{2}$ into the formulation of NRC and propose the following DNRC model:

$$\min_{c} ∥y - X c∥_{2}^{2} + λ ∥X c∥_{2}^{2}, \quad \text{s.t.} \quad c \geq 0 \tag{7}$$

where $λ > 0$ is a balancing parameter. The first term in equation (7) is the reconstruction error term and the second term is the decorrelation term. One can see that when $λ = 0$, DNRC degenerates to NRC. Consequently, NRC can be viewed as a special case of DNRC.
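The identity between the double sum and $∥ X c ∥_{2}^{2}$ used in this subsection follows from the class-wise partition of $X$ and $c$. A short derivation, using only the notation introduced above, is sketched here; the final split into same-class and cross-class terms is added for illustration.

```latex
\begin{aligned}
X c &= \sum_{i=1}^{K} X_{i} c_{i}, \\
\| X c \|_{2}^{2}
  &= \Big( \sum_{i=1}^{K} X_{i} c_{i} \Big)^{T} \Big( \sum_{j=1}^{K} X_{j} c_{j} \Big)
   = \sum_{i=1}^{K} \sum_{j=1}^{K} (X_{i} c_{i})^{T} (X_{j} c_{j}) \\
  &= \sum_{i=1}^{K} \| X_{i} c_{i} \|_{2}^{2}
   + \sum_{i \neq j} (X_{i} c_{i})^{T} (X_{j} c_{j}).
\end{aligned}
```

The second sum collects the cross-class correlation terms, while the first sum shows that the double sum also contains the same-class terms ($i = j$).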

Optimization

We adopt an alternating strategy to solve the DNRC model. By introducing an auxiliary variable $z$, equation (7) can be reformulated as

$$\min_{c, z} ∥y - X c∥_{2}^{2} + λ ∥X c∥_{2}^{2}, \quad \text{s.t.} \quad z = c, \; z \geq 0 \tag{8}$$

Equation (8) can be solved by the ADMM¹⁶ algorithm, and the augmented Lagrangian function of equation (8) is

$$\begin{aligned} L(c, z, δ, μ) = {} & ∥y - X c∥_{2}^{2} + λ ∥X c∥_{2}^{2} \\ & + ⟨δ, z - c⟩ + \frac{μ}{2} ∥z - c∥_{2}^{2} \end{aligned} \tag{9}$$

where $δ$ is the Lagrange multiplier and $μ > 0$ is a penalty parameter. Equation (9) can be optimized iteratively by updating $c$ and $z$ one at a time. The detailed updating procedures are presented as follows.

Update $c$: Fix the other variables and update $c$ by solving the following problem:

$$\min_{c} ∥y - X c∥_{2}^{2} + λ ∥X c∥_{2}^{2} + \frac{μ}{2} ∥z_{t} - c + \frac{δ_{t}}{μ}∥_{2}^{2} \tag{10}$$

Setting the partial derivative of equation (10) with respect to $c$ to zero, we can obtain the following closed-form solution to $c$:

$$c_{t + 1} = \left[(1 + λ) X^{T} X + \frac{μ}{2} I\right]^{-1} \left[X^{T} y + \frac{μ z_{t} + δ_{t}}{2}\right] \tag{11}$$
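For readers who want the intermediate steps, the closed-form update in equation (11) can be recovered from equation (10) as follows. This is a sketch of the standard least-squares calculation and uses only symbols defined above.

```latex
\begin{aligned}
-2 X^{T} (y - X c) + 2 \lambda X^{T} X c
  - \mu \Big( z_{t} - c + \frac{\delta_{t}}{\mu} \Big) &= 0 \\
\Longrightarrow \quad
\big[ 2 (1 + \lambda) X^{T} X + \mu I \big] c
  &= 2 X^{T} y + \mu z_{t} + \delta_{t} \\
\Longrightarrow \quad
c_{t+1} &= \Big[ (1 + \lambda) X^{T} X + \frac{\mu}{2} I \Big]^{-1}
           \Big[ X^{T} y + \frac{\mu z_{t} + \delta_{t}}{2} \Big].
\end{aligned}
```

Since $(1 + λ) X^{T} X$ is positive semi-definite and $μ > 0$, the matrix being inverted is positive definite, so the inverse exists. The matrix also does not depend on the iteration index $t$, so it can be formed (or factorized) once before the iterations start.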

Update $z$: To update $z$, we fix variables other than $z$ and solve the following problem accordingly:

$$\min_{z} ∥z - (c_{t + 1} - \frac{δ_{t}}{μ})∥_{2}^{2}, \quad \text{s.t.} \quad z \geq 0 \tag{12}$$

The solution to $z$ is given by

$$z_{t + 1} = \max (0, c_{t + 1} - \frac{δ_{t}}{μ}) \tag{13}$$

where the max operator is applied element-wise.

Update $δ$: The Lagrange multiplier $δ$ is updated according to the following formulation:

$$δ_{t + 1} = δ_{t} + μ (z_{t + 1} - c_{t + 1}) \tag{14}$$

The detailed procedures of solving equation (7) are summarized in Algorithm 1.

Algorithm 1

Solve equation (7) via ADMM.

Input: Test sample $y$, training data matrix $X$, balancing parameter $λ$, tolerance $tol > 0$, penalty parameter $μ > 0$ and the maximum iteration number $T$.

1: Initialize $z_{0} = c_{0} = δ_{0} = 0$;
2: while not converged do
3: Update $c$ by equation (11);
4: Update $z$ by equation (13);
5: Update $δ$ by equation (14);
6: end while

Output: Coding vectors $z$ and $c$.
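The iterations of Algorithm 1 can be sketched in NumPy as below. This is an illustrative sketch rather than the authors' released code: the function name `dnrc_admm`, the default values of `mu`, `tol` and `max_iter`, and the stopping rule (small $∥z - c∥_{2}$ together with a small change in $z$) are assumptions, since the text does not state the convergence criterion.

```python
import numpy as np


def dnrc_admm(X, y, lam, mu=1.0, tol=1e-6, max_iter=500):
    """Sketch of Algorithm 1: solve equation (7) via ADMM.

    The defaults for mu, tol and max_iter and the stopping rule are
    assumptions for illustration; the source does not specify them.
    """
    n = X.shape[1]
    c = np.zeros(n)
    z = np.zeros(n)
    delta = np.zeros(n)
    # The system matrix of equation (11) is constant, so build it once.
    A = (1.0 + lam) * (X.T @ X) + 0.5 * mu * np.eye(n)
    Xty = X.T @ y
    for _ in range(max_iter):
        z_old = z
        c = np.linalg.solve(A, Xty + 0.5 * (mu * z + delta))  # equation (11)
        z = np.maximum(0.0, c - delta / mu)                   # equation (13)
        delta = delta + mu * (z - c)                          # equation (14)
        # Assumed criterion: small primal residual and small change in z.
        if np.linalg.norm(z - c) < tol and np.linalg.norm(z - z_old) < tol:
            break
    return z, c


# Toy data (hypothetical): two classes, two training samples each, in R^3.
X = np.array([[1.0, 0.9, 0.0, 0.1],
              [0.0, 0.1, 1.0, 0.9],
              [0.1, 0.0, 0.1, 0.0]])
y = np.array([0.95, 0.05, 0.05])
z, c = dnrc_admm(X, y, lam=0.1)
```

The vector `z` satisfies the non-negativity constraint exactly at every iteration, whereas `c` satisfies it only in the limit, which is one reason to use `z` when exact non-negativity matters.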

Classification

For the test sample $y \in R^{d}$, we first obtain its coding vector $c$ over the entire training data $X$; the test sample is then classified into the class that yields the least residual, i.e., $\text{identity}(y) = \arg\min_{i} ∥y - X_{i} c_{i}∥_{2}$, where $c_{i}$ is the coding vector that belongs to the $i$-th class. The complete process of our proposed DNRC is summarized in Algorithm 2.

Algorithm 2

Our proposed discriminative non-negative representation based classifier (DNRC) algorithm.

Input: Training data matrix $X = [X_{1}, X_{2}, \dots, X_{K}] \in R^{d \times n}$, test data $y \in R^{d}$ and balancing parameter $λ$.

1: Normalize the columns of $X$ and $y$ to have unit $ℓ_{2}$ norm;
2: Obtain the coding vector $c$ of $y$ over $X$ by solving the DNRC model in (7);
3: Compute the class-specific residuals $r_{i} = ∥y - X_{i} c_{i}∥_{2}$;

Output: $label(y) = \arg\min_{i} (r_{i})$
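Steps 1 and 3 of Algorithm 2 and its output rule can be sketched as follows. The coding vector is passed in as a given quantity, so step 2 is left to any solver of model (7). The helper names `unit_l2_columns` and `class_residuals`, the toy data and the hand-picked coding vector are hypothetical and serve only to illustrate the residual rule.

```python
import numpy as np


def unit_l2_columns(M):
    """Step 1: scale every column of M to unit l2 norm."""
    norms = np.linalg.norm(M, axis=0, keepdims=True)
    return M / np.maximum(norms, 1e-12)  # guard against all-zero columns


def class_residuals(X, y, c, labels):
    """Step 3: r_i = ||y - X_i c_i||_2 for every class i."""
    classes = np.unique(labels)
    r = np.array([np.linalg.norm(y - X[:, labels == k] @ c[labels == k])
                  for k in classes])
    return classes, r


# Toy data (hypothetical): two classes, two training samples each, in R^2.
X = unit_l2_columns(np.array([[2.0, 1.8, 0.0, 0.2],
                              [0.0, 0.2, 2.0, 1.8]]))
y = np.array([1.0, 0.1])
y = y / np.linalg.norm(y)
labels = np.array([0, 0, 1, 1])

c = np.array([0.6, 0.4, 0.0, 0.0])   # hypothetical output of step 2
classes, r = class_residuals(X, y, c, labels)
label = classes[int(np.argmin(r))]   # output rule: smallest residual wins
```

Keeping the residual computation separate from the solver makes it straightforward to swap in a different coding model while reusing the same decision rule.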

Rationale of DNRC

Our proposed DNRC model (7) introduces a decorrelation regularizer into the formulation of NRC. As mentioned earlier, the decorrelation regularizer $∥ X c ∥_{2}^{2}$ can be rewritten as $\sum_{i = 1}^{K} \sum_{j = 1}^{K} (X_{i} c_{i})^{T} (X_{j} c_{j})$ , which is actually the pairwise class competition term. Therefore, DNRC takes the non-negative constraint and pairwise class competition into consideration simultaneously. Compared with (6), the second term in (7) impacts all pairs of classes, which promotes the competition among them and further enhances the discrimination of representation. Since the test sample is represented as a weighted sum (i.e., a linear superposition) of the training samples of all classes, each class has a contribution to expressing the test sample. Competition means that when training samples of one class have a great contribution to representing the test sample, other classes have relatively less contribution.

Since DNRC is closely related to NRC, we make a comparison between the two approaches. On the AR database, seven non-occluded images per subject in Session 1 are used for training, and one face image in Session 2 from the 98th class is used for testing. Figures 1 and 2 show the representation results (i.e., $X_{i} c_{i}$) of all classes produced by NRC and DNRC, respectively. As can be seen in Figure 1, the representation result of the correct class is not dominant. In contrast, from Figure 2, we can observe that DNRC significantly suppresses the irrelevant classes and achieves a more discriminative representation, which is beneficial for classification.

Figure 1.

Representation results of all classes of non-negative representation based classifier (NRC).

Figure 2.

Representation results of all classes of discriminative non-negative representation based classifier (DNRC).

Experiments

In this section, we assess the classification performance of DNRC on diverse benchmark datasets: two face databases (the AR¹⁷ and Extended Yale B¹⁸ databases), the Stanford 40 Actions dataset¹⁹ for action recognition, and three fine-grained object datasets (the Oxford 102 Flowers dataset,²⁰ the Aircraft dataset²¹ and the Cars dataset²²). The details of these datasets are summarized in Table 1. We compare the classification accuracy of DNRC with NSC,³ linear SVM, SRC,¹ CRC,⁸ CROC,¹⁰ ProCRC¹³ and NRC.¹⁴ In addition, on the Aircraft and Cars datasets, we also compare DNRC with some state-of-the-art deep methods, such as Symbiotic,²³ FV-FGC²⁴ and B-CNN.²⁵ The parameter of our method, that is, $λ$ in equation (7), is tuned via fivefold cross validation over the candidate set $\{1 \times 10^{-3}, 1 \times 10^{-2}, 0.1, 1\}$. Recognition accuracy, defined as the ratio of correctly classified samples to the total number of test samples and reported as a percentage, is used as the evaluation metric.

Table 1.

Details of datasets used in our experiments. The columns from left to right are the names of datasets, total number of samples, number of classes, number of training samples, and number of test samples.

Dataset	# Sample	# Class	# Training	# Test
AR	1400	100	700	700
EYaleB	2414	38	1216	1198
Stanford 40	9532	40	4000	5532
Oxford 102	8189	102	2040	6149
Aircraft	10000	100	6667	3333
Cars	16185	196	8144	8041

Experiments on the AR database

The AR database¹⁷ contains more than 4000 color images of 126 subjects (70 men and 56 women). These images have variations in facial expressions, illumination conditions and occlusions; example images from this database are shown in Figure 3. Following the experimental settings of Xu et al.,¹⁴ we use a subset with only illumination and expression changes that contains 50 male subjects and 50 female subjects from the AR database. For each subject, seven images from Session 1 are used as training samples, and the other seven images from Session 2 as test samples. All the images are first cropped to $60 \times 43$ pixels and projected to a subspace of dimensions 54, 120 and 300 by PCA. Experimental results are summarized in Table 2; the balancing parameter $λ$ of DNRC under dimensions 54, 120 and 300 is set to 0.1, 0.1 and 1, respectively. We can observe that our proposed DNRC achieves the highest accuracy under all three reduced dimensions, tying with CRC at dimension 300.
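The PCA projection step mentioned above can be sketched with a singular value decomposition. This is a generic sketch, not the preprocessing code used in the experiments: the function names `fit_pca` and `project`, the random toy data and the choice to estimate the mean and principal directions from the training samples only are assumptions.

```python
import numpy as np


def fit_pca(X_train, k):
    """Return the training mean and the top-k principal directions.

    X_train holds one vectorized sample per column (shape d x n),
    matching the layout of X in the text.
    """
    mean = X_train.mean(axis=1, keepdims=True)
    # Left singular vectors of the centered data are the principal directions.
    U, _, _ = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, U[:, :k]


def project(samples, mean, W):
    """Project d-dimensional columns onto the k-dimensional PCA subspace."""
    return W.T @ (samples - mean)


rng = np.random.default_rng(0)
X_train = rng.standard_normal((20, 12))   # toy stand-in: d = 20, n = 12
X_test = rng.standard_normal((20, 5))
mean, W = fit_pca(X_train, k=4)
Z_train = project(X_train, mean, W)
Z_test = project(X_test, mean, W)         # test data reuse the training mean
```

The projected columns of `Z_train` and `Z_test` then play the role of the training matrix and test samples in the classifiers described earlier.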

Figure 3.

Example images from the AR database.

Table 2.

Recognition accuracy (%) of competing approaches on the AR database.

Dim.	54	120	300
NSC³	70.2	75.1	76.1
SVM	77.8	83.1	89.8
SRC¹	82.4	89.8	93.0
CRC⁸	80.3	90.1	93.8
CROC¹⁰	81.9	89.2	93.1
ProCRC¹³	81.2	89.0	93.1
NRC¹⁴	85.2	91.3	93.3
DNRC	85.4	91.4	93.8

SRC: sparse representation based classification; NSC: nearest subspace classifier; SVM: support vector machine; CRC: collaborative representation based classification; CROC: collaborative representation optimized classifier; DNRC: discriminative non-negative representation based classifier; ProCRC: probabilistic collaborative representation based classifier; NRC: non-negative representation based classifier.

To assess the statistical significance of the differences between our proposed DNRC and the other methods, we conduct McNemar’s test (a non-parametric significance test)²⁶,²⁷ on the results shown in Table 2. The significance level is set to 0.05, which means that the performance difference between two methods is statistically significant if the estimated $p$-value is lower than 0.05. Table 3 lists the $p$-values between DNRC and the other methods. From Table 3, one can see that the performance differences between DNRC and NSC, SVM, SRC, CROC and ProCRC are statistically significant in all cases. The difference between DNRC and CRC is not statistically significant at dimension 300, and the difference between DNRC and NRC is not statistically significant at dimensions 54 and 120; nevertheless, the recognition accuracy of DNRC is never lower than that of CRC or NRC. The above experimental results support the effectiveness of our proposed DNRC.
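A paired significance test of this kind can be sketched as follows. The text does not say which variant of McNemar's test was used, so the two-sided exact (binomial) form below is an assumption; the function name `mcnemar_exact` and the toy outcomes are hypothetical and unrelated to the values in Table 3.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar test on paired per-sample outcomes.

    correct_a[i] and correct_b[i] say whether methods A and B classified
    test sample i correctly. Only discordant pairs enter the statistic.
    """
    b = sum(1 for a, o in zip(correct_a, correct_b) if a and not o)
    c = sum(1 for a, o in zip(correct_a, correct_b) if o and not a)
    n = b + c
    if n == 0:
        return 1.0  # the two methods never disagree
    k = min(b, c)
    # Binomial(n, 0.5) tail probability, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Toy outcomes for 12 test samples (hypothetical, not from the paper).
a_ok = [True] * 10 + [False] * 2
b_ok = [True] * 2 + [False] * 8 + [True] * 2
p_value = mcnemar_exact(a_ok, b_ok)
```

Only the samples on which the two methods disagree influence the result, which is why two methods with similar accuracy can still differ significantly, or not, depending on how their errors overlap.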

Table 3.

$p$-values between DNRC and the other methods on the AR database. * indicates that the difference between the two methods is statistically significant at the 0.05 significance level.

Dim.	54	120	300
NSC³	$9.09 \times 10^{-19}$*	$1.78 \times 10^{-24}$*	$1.44 \times 10^{-28}$*
SVM	0.0169*	$8.58 \times 10^{-11}$*	$1.53 \times 10^{-4}$*
SRC¹	0.0169*	0.0376*	0.0295*
CRC⁸	$4.33 \times 10^{-7}$*	0.0487*	1
CROC¹⁰	$3.46 \times 10^{-4}$*	0.0090*	0.0449*
ProCRC¹³	$4.40 \times 10^{-4}$*	0.0039*	0.0316*
NRC¹⁴	0.5	0.5	0.0266*

SRC: sparse representation based classification; NSC: nearest subspace classifier; SVM: support vector machine; CRC: collaborative representation based classification; CROC: collaborative representation optimized classifier; ProCRC: probabilistic collaborative representation based classifier; NRC: non-negative representation based classifier.

Experiments on the Extended Yale B database

The Extended Yale B database¹⁸ consists of 2414 face images from 38 individuals, each having 59–64 images. These images have illumination variations; example images from this database are shown in Figure 4. The original images are of $192 \times 168$ pixels. In our experiments, all the images are resized to $54 \times 48$ pixels. For each subject, 32 images are randomly chosen for training and the remaining images are used for testing. The resized images are projected to a subspace of dimensions 84, 150 and 300 by PCA. Table 4 lists the classification accuracy of the competing approaches; the balancing parameter $λ$ of DNRC under dimensions 84, 150 and 300 is set to 1, 1 and 0.1, respectively. It can be seen that DNRC achieves the highest accuracy in all dimensionalities, tying with NRC at dimension 300.

Figure 4.

Example images from the Extended Yale B database.

Table 4.

Recognition accuracy (%) of competing approaches on the Extended Yale B database.

Dim.	84	150	300
NSC³	91.2	95.3	96.6
SVM	93.4	95.8	96.9
SRC¹	95.5	96.9	97.7
CRC⁸	95.0	96.3	97.8
CROC¹⁰	95.5	97.1	98.2
ProCRC¹³	93.4	95.3	96.2
NRC¹⁴	96.7	97.2	98.4
DNRC	96.8	97.4	98.4

Experiments on the Stanford 40 Actions dataset

The Stanford 40 Actions dataset¹⁹ contains 40 different classes of human actions, for example, brushing teeth, cleaning the floor, reading a book and throwing a Frisbee; example images from this dataset are shown in Figure 5. It contains 9532 images in total, with 180–300 images per action. Following the training-test split scheme of Yao et al.,¹⁹ we randomly select 100 images per class as the training images and employ the remaining images as the testing set. Features are extracted by using the pre-trained VGG19 network,²⁸ and the dimension of the feature for each image is 4096. Experimental results are shown in Table 5; the balancing parameter $λ$ of DNRC is $1 \times 10^{-3}$. One can see that DNRC achieves higher accuracy than previous RBCM, such as SRC, CRC and NRC. This indicates that by introducing the decorrelation regularizer, discriminative information of the training data can be exploited. As a result, our proposed DNRC is more effective than previous RBCM.

Figure 5.

Example images from the Stanford 40 Actions dataset.

Table 5.

Recognition accuracy (%) of competing approaches on the Stanford 40 Actions dataset.

Methods	SVM	NSC³	SRC¹	CRC⁸
Accuracy	79.0	74.7	78.7	78.2
Methods	CROC¹⁰	ProCRC¹³	NRC¹⁴	DNRC
Accuracy	79.2	80.9	81.1	81.3

Experiments on the Oxford 102 Flowers dataset

The Oxford 102 Flowers dataset²⁰ includes 8189 images of 102 different flowers. Each flower class has over 40 images. These flower images are captured under diverse lighting conditions, flower poses and image scales; example images from this dataset are shown in Figure 6. Features are extracted by employing the pre-trained VGG19 network. Table 6 summarizes the classification accuracy of the comparison methods; the balancing parameter $λ$ of DNRC is 0.1. It can be seen that DNRC achieves better performance than the other competing methods.

Figure 6.

Example images from the Oxford 102 Flowers dataset.

Table 6.

Recognition accuracy (%) of competing approaches on the Oxford 102 Flowers dataset.

Methods	SVM	NSC³	SRC¹	CRC⁸
Accuracy	90.9	90.1	93.2	93.0
Methods	CROC¹⁰	ProCRC¹³	NRC¹⁴	DNRC
Accuracy	93.1	94.8	95.3	95.5

Experiments on the Aircraft dataset

The Aircraft dataset²¹ includes 10,000 images of 100 different aircraft model variants, with 100 images for each class. These aircraft vary in appearance, scale and design structure, making this dataset very challenging for visual recognition. Example images from this dataset are shown in Figure 7. We adopt the training-testing split protocol provided by Maji et al.²¹ to design our experiments, and features are extracted via a pre-trained VGG-16 network.²⁸ We also compare with Symbiotic,²³ FV-FGC²⁴ and B-CNN.²⁵ Recognition accuracy of the competing approaches is presented in Table 7; the balancing parameter $λ$ of DNRC is $1 \times 10^{-3}$. We can observe that our proposed DNRC achieves higher accuracy than NRC and B-CNN. This demonstrates that DNRC can outperform not only traditional RBCM such as SRC, CRC and NRC, but also the CNN based approaches.

Figure 7.

Example images from the Aircraft dataset.

Table 7.

Recognition accuracy (%) of competing approaches on the Aircraft dataset.

Methods	NSC³	SRC¹	CRC⁸	CROC¹⁰	ProCRC¹³
Accuracy	85.5	86.1	86.7	86.9	86.8
Methods	Symbiotic²³	FV-FGC²⁴	B-CNN²⁵	NRC¹⁴	DNRC
Accuracy	72.5	80.7	84.1	87.3	87.4

SRC: sparse representation based classification; NSC: nearest subspace classifier; CRC: collaborative representation based classification; CROC: collaborative representation optimized classifier; DNRC: discriminative non-negative representation based classifier; ProCRC: probabilistic collaborative representation based classifier; FV-FGC: fisher vector for fine-grained classification; B-CNN: bilinear CNN; NRC: non-negative representation based classifier.

Experiments on the Cars dataset

The Cars dataset²² has 16,185 images of 196 car classes. Each car class contains about 80 images with different scales and heavily cluttered backgrounds, making this dataset very challenging for visual recognition. Example images from this dataset are shown in Figure 8. We use the same split scheme provided by Krause et al.,²² in which 8144 images are employed as the training samples and the other 8041 images are employed as the testing samples. Features are extracted via a pre-trained VGG-16 network. We also compare with Symbiotic,²³ fisher vector for fine-grained classification (FV-FGC)²⁴ and bilinear CNN (B-CNN).²⁵ Recognition accuracy of the competing approaches on this dataset is recorded in Table 8; the balancing parameter $λ$ of DNRC is $1 \times 10^{-3}$. Again, our proposed DNRC performs the best among all the competing methods, improving on B-CNN and NRC by 0.3 and 0.2 percentage points, respectively.

Figure 8.

Example images from the Cars dataset.

Table 8.

Recognition accuracy (%) of competing approaches on the Cars dataset.

Methods	NSC³	SRC¹	CRC⁸	CROC¹⁰	ProCRC¹³
Accuracy	88.3	89.2	90.0	90.3	90.1
Methods	Symbiotic²³	FV-FGC²⁴	B-CNN²⁵	NRC¹⁴	DNRC
Accuracy	78.0	82.7	90.6	90.7	90.9

Parameter analysis

In this subsection, we investigate how the balancing parameter $λ$ influences the classification performance of DNRC. Experiments are conducted on the AR database, and the dimensionality of PCA is 300. Figure 9 shows the variation of recognition accuracy with $λ$. It can be seen that the classification performance of our proposed DNRC is stable over quite a wide range of $λ$.

Figure 9.

Classification accuracy (%) of DNRC with varying parameter $λ$ on the AR database.

Conclusions

In this paper, we designed a DNRC for pattern classification. By incorporating the decorrelation regularizer into the formulation of NRC, training samples of different classes are enforced to compete in representing the test sample. Therefore, the coefficient vector obtained by DNRC contains more discriminative information than that of NRC. Our proposed DNRC is solved elegantly via the ADMM algorithm. Experimental results on face databases, human action dataset and fine-grained datasets demonstrate that DNRC is superior to NRC and traditional RBCM, and it also outperforms some deep learning based approaches.

DNRC is an improvement of NRC, and both are shallow models. In our future work, we will compare DNRC with graph convolutional networks and design a deep version of DNRC.

Footnotes

Acknowledgements

The authors thank the editor and the anonymous reviewers for their constructive and valuable comments and suggestions, and thank Dr. Jun Xu for providing the source code of NRC at .

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Key Project of Jiangsu Vocational College of Information Technology (grant no. JSITKY201905), project of high-level specialty group construction of higher vocational education of Jiangsu province (grant no. [2021] 1), general project fund for natural science research of colleges and universities in Jiangsu province (grant no. 18KJD510011), project of high-level backbone specialty construction of Jiangsu province (grant no. [2017] 17), and in part by the National Natural Science Foundation of China (grant no. 61672263).

ORCID iDs

Kai-Jun Hu

He-Feng Yin

References

1. Wright, Yang, Ganesh, et al. Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 2009; 31: 210–227.
2. Naseem, Togneri, Bennamoun. Linear regression for face recognition. IEEE Trans Pattern Anal Mach Intell 2010; 32: 2106–2112.
3. Lee, Kriegman. Acquiring linear subspaces for face recognition under variable lighting. IEEE Trans Pattern Anal Mach Intell 2005; 27: 684–698.
4. Yu, Zhang, Gong. Nonlinear learning using local coordinate coding. In: NeurIPS, Vancouver, Canada, 6–12 December 2009, pp.2223–2231.
5. Lu, Min, Gui, et al. Face recognition via weighted sparse representation. J Visual Commun Image Represent 2013; 24: 111–116.
6. Wang, Yang, et al. Locality-constrained linear coding for image classification. In: Conference on computer vision and pattern recognition (CVPR), San Francisco, California, 13–18 June 2010, pp.3360–3367.
7. Xu, Zhang, Yang, et al. A two-phase test sample sparse representation method for use with face recognition. IEEE Trans Circuits Syst Video Technol (TCSVT) 2011; 21: 1255–1262.
8. Zhang, Yang, Feng. Sparse representation or collaborative representation: Which helps face recognition? In: International conference on computer vision (ICCV), Barcelona, Spain, 6–13 November 2011, pp.471–478.
9. Timofte, Van Gool. Weighted collaborative representation and classification of images. In: ICPR, Tsukuba, Japan, 11–15 November 2012, pp.1606–1610.
10. Chi, Porikli. Classification and boosting with multiple collaborative representations. IEEE Trans Pattern Anal Mach Intell 2013; 36: 1519–1531.
11. Lai, Jiang. Classwise sparse and collaborative patch representation for face recognition. IEEE Trans Image Process 2016; 25: 3261–3272.
12. Xu, Zhong, Yang, et al. A new discriminative sparse representation method for robust face recognition via L2 regularization. IEEE Trans Neural Netw Learn Syst 2016; 28: 2233–2242.
13. Cai, Zhang, Zuo, et al. A probabilistic collaborative representation based approach for pattern classification. In: Conference on computer vision and pattern recognition (CVPR), Las Vegas, USA, 27–30 June 2016, pp.2950–2959.
14. Xu, Zhang, et al. Sparse, collaborative, or nonnegative representation: Which helps pattern classification? Pattern Recognit 2019; 88: 679–688.
15. Lee, Seung. Learning the parts of objects by non-negative matrix factorization. Nature 1999; 401: 788–791.
16. Boyd, Parikh, Chu, et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 2011; 3: 1–122.
17. Martinez, Benavente. The AR face database, CVC Tech. Rep. no. 24, June, 1998.
18. Georghiades, Belhumeur, Kriegman. From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Trans Pattern Anal Mach Intell 2001; 23: 643–660.
19. Yao, Jiang, Khosla, et al. Human action recognition by learning bases of action attributes and parts. In: International conference on computer vision (ICCV), Barcelona, Spain, 6–13 November 2011, pp.1331–1338.
20. Nilsback, Zisserman. Automated flower classification over a large number of classes. In: Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), Bhubaneswar, India, 16–19 December 2008, pp.722–729.
21. Maji, Rahtu, Kannala, et al. Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151, 2013.
22. Krause, Stark, Deng, et al. 3D object representations for fine-grained categorization. In: International conference on computer vision workshop (ICCVW), Sydney, Australia, 1–8 December 2013, pp.554–561.
23. Chai, Lempitsky, Zisserman. Symbiotic segmentation and part localization for fine-grained categorization. In: International conference on computer vision (ICCV), Sydney, Australia, 1–8 December 2013, pp.321–328.
24. Gosselin, Murray, Jégou, et al. Revisiting the fisher vector for fine-grained classification. Pattern Recognit Lett 2014; 49: 92–98.
25. Lin, Roy Chowdhury, Maji. Bilinear CNN models for fine-grained visual recognition. In: International conference on computer vision (ICCV), Santiago, Chile, 7–13 December 2015, pp.1449–1457.
26. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput 1998; 10: 1895–1923.
27. Wen, Fang, et al. Low-rank representation with adaptive graph regularization. Neural Netw 2018; 108: 83–96.
28. Simonyan, Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
