Fabric defect detection based on deep-feature and low-rank decomposition

Abstract

Fabric defect detection plays an important role in controlling the quality of textile production. In this article, a novel fabric defect detection algorithm is proposed based on a multi-scale convolutional neural network and low-rank decomposition model. First, multi-scale convolutional neural network, which can extract the multi-scale deep feature of the image using multiple nonlinear transformations, is adopted to improve the characterization ability of fabric images with complex textures. The effective feature extraction makes the background lie in a low-rank subspace, and a sparse defect deviates from the low-rank subspace. Then, the low-rank decomposition model is constructed to decompose the feature matrix into the low-rank part (background) and the sparse part (salient defect). Finally, the saliency maps generated by the sparse matrix are segmented based on an improved optimal threshold to locate the fabric defect regions. Experimental results indicate that the feature extracted by the multi-scale convolutional neural network is more suitable for characterizing the fabric texture than the traditional hand-crafted feature extraction methods, such as histogram of oriented gradient, local binary pattern, and Gabor. The adopted low-rank decomposition model can effectively separate the defects from the background. Moreover, the proposed method is superior to state-of-the-art methods in terms of its adaptability and detection efficiency.

Keywords

Multi-scale convolutional neural network, fabric image, low-rank decomposition, defect detection

Introduction

Fabric defect detection is the key phase of textile quality control. Traditional fabric defect detection is mainly performed through a visual inspection of skilled workers. However, its reliability is restricted by vision fatigue and human errors. An automatic visual textile detection system based on machine learning can provide a promising solution that not only reduces high labor costs but also improves accuracy and efficiency of fabric defect detection.¹

Traditional machine-vision-based fabric defect detection methods can be categorized into two categories,^2,3 namely, non-feature extraction methods and feature extraction methods. Among the non-feature extraction methods, Gabor filtering is the most effective method.⁴ However, it has strict requirements on filter parameters in order to achieve good performance. Li and Zhang¹ proposed an embedded machine vision system using Gabor filters and pulse-coupled neural network to automatically identify defects of warp-knitted fabrics, which consisted of image enhancement implemented by Gabor filtering with optimal parameters to make the defects more obvious, and image segmentation achieved by a parameter adaptive pulse-coupled neural network layer by layer. Bo et al.⁵ proposed an unsupervised learning method of the training model without the image label, which uses k-singular value decomposition (k-SVD) method to learn the sparse dictionary from an image block, and then space pyramid pooling and orthogonal matching pursuit method to build hierarchical characteristics from a dictionary. However, this method would bring higher feature dimension and cannot be used in the visual task of large-scale database. Li et al.⁶ proposed an effective fabric detection method based on biological vision modeling (BVM), which simulated the mechanism of biological visual perception and applied a robust feature descriptor from the biological modeling of P ganglion cells to characterize fabric texture, and then low-rank representation adopted to model visual saliency. In addition, image segmentation techniques, according to the image’s color, texture, shape features, and so on, can partition an image into homogeneous regions.⁷ An image segmentation method based on Hough transform is used to detect the target contour directly by the global characteristics of image space and parameter space before and after image transformation. 
Mathematical morphology based on erosion, dilation, opening, and closing provides an effective approach to analyzing digital images. Morphological filters exploit geometric rather than analytic features of signals.⁸ The advantages of the morphological over linear filtering are direct geometric interpretations, simplicity, and efficiency in hardware implementation.

Feature extraction methods are often widely used, mainly to extract fabric images’ texture, color, shape, and spatial relationship characteristics. Based on the extracted feature, template matching,⁹ neighborhood information,¹⁰ Fourier transformation,^11,12 and wavelet decomposition^13,14 are adopted to localize the defect region. Due to their diversity, fabric texture and defect are difficult to efficiently characterize using one kind of feature; this reduces the adaptivity of the defect detection methods.¹⁵ Mak et al.¹⁶ detected fabric defects using previously trained Gabor wavelet networks and morphological elements with a linear structural element. However, this method fails to detect the defect in the fabric images with black and white patterns. Tsai et al.¹⁷ proposed a regularity measurement for defect detection in non-textured and homogeneously textured images using principal component analysis (PCA), which is an orthogonal transformation used to transform linearly and non-linearly the correlation of the source variables into a subspace in which the variables are not correlated. It is widely used in feature extraction and data compression, as well as typically utilized for data pre-processing in defect detection. Notably, the features of the scale invariant feature transform (SIFT) method¹⁸ were originally extracted at scale-space extrema and used for feature point matching. The SIFT method is invariant to scale, rotation, and shift. The features of the histogram of oriented gradient (HOG) method¹⁹ and later SIFT method²⁰ were densely computed by entire image pyramids. They all described the patch of an image in terms of image gradient histogram. 
On the basis of feature extraction, some methods are operated at the patch level instead of the pixel level, and each pixel is simply assigned the saliency value of its enclosing patch.²¹ Furthermore, all image patches are treated as independent data samples for classification or regression even when they are overlapping.

With the advances of artificial intelligence, the feature extraction is gradually integrated by a deep-learning algorithm in the training process. Deep neural networks (DNNs), especially convolutional neural networks (CNNs), can automatically extract and learn in-depth features of an image, which have been proved to be better than hand-crafted features extracted by carefully designed algorithms. CNN’s convolution operation significantly reduces the number of parameters in a trained model and improves the model’s efficiency, thereby avoiding complex feature selection and manual extraction.²² In addition, CNNs can directly process the original test images and generate the multi-layer feature of complex fabric texture images by multiple nonlinear transformations. Deep learning has achieved very good results in some tasks, mainly boosted by the feature learning performed, which allows the method to extract specific and adaptable visual features depending on the data. Hinton et al.²³ applied an unsupervised representation learning algorithm to help learning internal representations by providing a local training signal at each level of a hierarchy of features. Unsupervised representation learning algorithms can be adopted several times to learn different layers of a deep model. Several unsupervised representation learning algorithms retain many properties of artificial multi-layer neural networks, relying on the back-propagation algorithm to estimate stochastic gradients. Abid²⁴ adopted a polynomial interpolation and multilayer perceptron method to train a neural network to detect and locate regions of defects. Ren et al.²⁵ presented a generic deep-learning method that used a pre-trained network and transferred features to build classifier, and then convolved the trained classifier over input image to make pixel-wise prediction. Li and Yu²⁶ trained a DNN for deriving a saliency map from multi-scale features extracted using deep CNNs. 
Wang et al.²⁷ adopted a DNN to learn local patch features for each centered pixel. Weimer et al.²⁸ proposed a novel deep CNN architecture to detect defects, which took all types of defect-free and defective samples together as the input. Li et al.²⁹ proposed a discriminative representation for patterned fabric defect detection using Fisher criterion-based stacked denoising auto-encoders (FCSDA). Fabric images were divided into patches of the same size. Then, these fabric patches with defect-free and defective classes were used to train FCSDA. Finally, test patches were classified using FCSDA in defective and defect-free classes. Experimental results indicated that the FCSDA method could obtain the superior results on a complex Jacquard warp-knitted fabric.

Low-rank decomposition (LRD) techniques³⁰ can divide an image matrix into two parts: a low-rank matrix and a sparse matrix, where the low-rank matrix indicates a smooth background and the sparse matrix indicates the salient regions. They have been successfully used in a variety of applications, such as subspace segmentation,³¹ visual tracking,^32,33 image clustering,³⁴ and video background–foreground separation.^35,36 Shen and Wu³⁷ provided a unified framework for integrating high-level knowledge and low-level features, which is based on the assumption that an image can be represented as the sum of a low-rank background and sparse salient regions. Peng et al.³⁸ introduced a tree-structured sparsity-inducing norm regularization to provide a hierarchical characterization for saliency detection through low-rank and structured sparse matrix decomposition. Yang et al.³⁹ proposed a saliency detection method that constructs an affinity matrix and scores each node by its similarity to background and foreground cues through graph-based manifold ranking. Wang and Huang⁴⁰ proposed a salient object detection method using low-rank approximation and l_2,1-norm minimization, which is based on the underlying assumption that an image is a combination of low-rank background regions and sparse salient objects. Normal fabric images with complex textures have large visual redundancy, and the defects stand out from the background. An effective deep feature makes the background lie in a low-rank subspace, while the defect region deviates from it. Therefore, a novel fabric defect detection algorithm is proposed based on deep features and LRD. Deep learning is used to extract a CNN feature, and the LRD model is used to separate the background and the defect region.

Normal fabric images with a complex texture have large visual redundancy, and the defects are more salient in the complex texture background. Considering these characteristics, applying the model of LRD to fabric defect detection is considerably suitable. Therefore, we propose an effective defect detection method based on multi-scale convolutional neural network (MCNN) and LRD techniques. MCNN was adopted to extract the fabric feature, and LRD was utilized to separate the defect information from the background.

This article is structured as follows. Section “Proposed algorithm” focuses on the specific procedures of the proposed algorithm. In section “Experimental results and analysis,” we comprehensively present the experimental protocol and the obtained results, as well as their analysis. Finally, section “Conclusion” concludes the article and points out promising directions for future work.

Proposed algorithm

Although fabric images have numerous kinds of defects and complex textures, the defects are salient against the complex texture background. Therefore, it is of great value to study defect detection by combining a deep-learning method with the LRD technique. Existing deep-learning methods typically use the features of the last convolution layer to carry out target detection, which causes the loss of detailed texture information. Fabric image texture is relatively abundant, and low-level texture features are crucial for defect detection and recognition; therefore, a novel fabric defect detection algorithm is proposed based on MCNN and LRD, and its overall structure is illustrated in Figure 1. The multi-layer features are extracted and integrated by MCNN, preserving the semantic information of the high-level features and the detailed texture information of the low-level features. The LRD technique is then adopted to divide the generated feature matrix into the low-rank matrix that indicates the background and the sparse matrix that indicates the salient defect. Finally, the iterative optimal threshold segmentation algorithm is utilized to segment the saliency maps generated by the sparse matrix to locate the fabric defect area.

Figure 1.

Overall structure of the proposed algorithm.

Unsupervised pre-training

When the labeled data are insufficient, auxiliary supervised training can be adopted, and fine-tuning on the particular domain can also improve effectiveness. The current study adopts the sparse auto-encoder (SAE) to pre-train the filters of the CNN, so that the filters match the statistical characteristics of the dataset and start from better initial values. For each extracted patch, a dictionary that describes its characteristics is generated by the SAE, and a sparse matrix is then obtained through linear combination. In the network, the overall cost function is defined as

J_{sparse} (W, b) = J (W, b) + β \sum_{j = 1}^{s_{2}} KL (ρ ∥ {\hat{ρ}}_{j})

(1)

where the first term is the cost function of the traditional basic neural network, as shown in formula (2); the second is the sparse penalty term, as shown in formula (3); β is the weight parameter balancing the sparse penalty term and J(W, b); and s₂ is the number of neurons in the hidden layer

J (W, b) = [\frac{1}{m} \sum_{i = 1}^{m} (\frac{1}{2} {|| h_{w, b} (x^{(i)}) - x^{(i)} ||}^{2})] + \frac{θ}{2} \sum_{l = 1}^{n_{l} - 1} \sum_{i = 1}^{s_{l}} \sum_{j = 1}^{s_{l + 1}} (W_{j i}^{(l)})^{2}

(2)

where the first term is a mean-square-error term; h_w,b denotes the nonlinear transformation (the network output) with parameters w and b; and the second term is a weight attenuation term that prevents overfitting, with the weight attenuation parameter θ balancing the two terms. Relative entropy is generally employed to measure the disparity between two probability distributions. The relative entropy between two Bernoulli random variables with means ρ and ${\hat{ρ}}_{j}$ is given by

K L (ρ ∥ {\hat{ρ}}_{j}) = ρ \log \frac{ρ}{{\hat{ρ}}_{j}} + (1 - ρ) \log \frac{1 - ρ}{1 - {\hat{ρ}}_{j}}

(3)

where ρ denotes the sparsity parameter, typically a small value close to 0, and ${\hat{ρ}}_{j}$ represents the average activation of hidden neuron j. It can be seen from formula (3) that the sparsity constraint of the SAE neural network drives the average activation ${\hat{ρ}}_{j}$ toward the sparsity parameter ρ. In other words, most of the neurons are inhibited, and only a few of them are activated.
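Formulas (1) to (3) can be illustrated with a minimal NumPy sketch. It assumes a single hidden layer with sigmoid activations, which the text does not specify; the function names and the default values of `rho`, `beta`, and `theta` are illustrative choices rather than the settings used in this article.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def kl_bernoulli(rho, rho_hat):
    """Formula (3): relative entropy between Bernoulli(rho) and Bernoulli(rho_hat)."""
    return rho * np.log(rho / rho_hat) + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))

def sparse_cost(X, W1, b1, W2, b2, rho=0.05, beta=3.0, theta=1e-4):
    """Formula (1) for a one-hidden-layer auto-encoder (assumed architecture).

    X has shape (m, d): m training patches, each flattened to d values.
    """
    m = X.shape[0]
    hidden = sigmoid(X @ W1.T + b1)      # hidden activations, shape (m, s2)
    recon = sigmoid(hidden @ W2.T + b2)  # reconstruction h_{w,b}(x), shape (m, d)
    # Formula (2): mean reconstruction error plus weight attenuation.
    mse = np.sum(0.5 * np.sum((recon - X) ** 2, axis=1)) / m
    decay = 0.5 * theta * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    # Average activation of each hidden neuron over the training set.
    rho_hat = hidden.mean(axis=0)
    return mse + decay + beta * np.sum(kl_bernoulli(rho, rho_hat))
```

The penalty is zero when every average activation equals ρ and grows as the activations move away from it, which is what pushes most hidden neurons toward inactivity.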

After the overall cost function is obtained, the parameters are updated according to the following formulas

W^{(l)} = W^{(l)} - α \frac{\partial}{\partial W^{(l)}} J_{sparse} (W, b)

(4)

b^{(l)} = b^{(l)} - α \frac{\partial}{\partial b^{(l)}} J_{sparse} (W, b)

(5)

where α is the learning rate. The partial derivatives in formulas (4) and (5) are calculated by the back-propagation algorithm. The coding network is updated until the parameters converge, which yields the characteristic parameters W and b. Let m be the number of hidden nodes of layer l; W^(l) is then decomposed into m parameter sets. Each parameter set is a filter, which leads to a pre-trained filter set.

MCNN feature extraction and training

Fabric images have complex and diverse textures, and the MCNN model can learn a hierarchy of features from the raw image input by automatically updating the filters during training on massive amounts of training data. Therefore, the MCNN is adopted to extract deep features of fabric images. The architecture of the proposed MCNN is presented in Figure 2.

Figure 2.

Multi-scale CNN and traditional CNN.

A typical MCNN architecture consists of several nested convolutional layers and pooling layers followed by fully connected layers at the end. In the convolutional layer, the feature maps of the previous layer are convolved with the learned convolution kernels, and an activation function then acts on that value to form the output feature maps. For each block of fabric images, its output size after convolution can be expressed as

α_{m}^{l} = [\frac{α_{m}^{l - 1} - w_{m}^{l}}{P_{m}^{l} + 1}] + 1

(6)

α_{n}^{l} = [\frac{α_{n}^{l - 1} - w_{n}^{l}}{P_{n}^{l} + 1}] + 1

(7)

where l denotes the layer index, $(α_{m}, α_{n})$ is the input image size of each layer, and $(w_{m}, w_{n})$ is the convolution kernel size; each convolution kernel operates on the effective area of the input image. The ignoring parameter $(P_{m}, P_{n})$ specifies the number of image pixels ignored (skipped) by the convolution kernel between adjacent positions. The output node of a convolutional layer is expressed as

α_{n}^{l} = f (\sum_{\forall m} (α_{m}^{l - 1} \times w_{m, n}^{l}) + b_{n}^{l})

(8)

where $α_{n}^{l}$ is the nth feature map of the current layer and $α_{m}^{l - 1}$ is the mth feature map of the previous layer; $w_{m, n}^{l}$ is the convolution kernel from the mth feature map of the previous layer to the nth feature map of the current layer; f(x) = max(0, x) is the neuron activation function; and $b_{n}^{l}$ represents the neuronal bias. The output node of the pooling layer can be expressed as

α_{n}^{l} = f (w_{n}^{l} \times \frac{1}{s^{2}} \sum_{s \times s} α_{n}^{l - 1} + b_{n}^{l})

(9)

where s×s is the scale of the down-sampling template and $w_{n}^{l}$ is the weight of the template.
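The size rule of formulas (6) and (7) and the pooling step of formula (9) can be sketched in a few lines of NumPy. The sketch assumes that the square brackets in formulas (6) and (7) denote rounding down and that pooling windows do not overlap; the function names are illustrative.

```python
import numpy as np

def conv_output_size(in_size, kernel, skip):
    """Formulas (6)-(7), with the brackets read as floor (assumption)."""
    return (in_size - kernel) // (skip + 1) + 1

def relu(x):
    """Activation f(x) = max(0, x) used in formulas (8) and (9)."""
    return np.maximum(0, x)

def mean_pool(fmap, s, weight=1.0, bias=0.0):
    """Formula (9): weighted s-by-s average pooling followed by f."""
    h, w = fmap.shape
    h2, w2 = h // s, w // s
    # Split the map into non-overlapping s-by-s blocks and average each block.
    blocks = fmap[:h2 * s, :w2 * s].reshape(h2, s, w2, s)
    return relu(weight * blocks.mean(axis=(1, 3)) + bias)
```

For example, a 32-pixel side convolved with a 5-pixel kernel and no skipped pixels gives a 28-pixel side, and skipping one pixel between positions gives 14.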

The fully connected layers are the last part of the neural network. Every neuron in a fully connected layer is connected to all of the units of the previous layer. Therefore, the final output of the CNN fully connected layer can be expressed as

α_{n}^{out} = f (\sum_{\forall m} (α_{m}^{out - 1} \times w_{m, n}^{out}) + b_{n}^{out})

(10)

After combining all convolutional layers and pooling layers, the SAE pre-training and the supervised training of the MCNN can achieve the optimal weights along with improved training speed. We present the parameter optimization for MCNN in Algorithm 1.

Algorithm 1. Parameter optimization for multi-scale convolutional neural network

Require: Datasets X, learning rate a, epoch number N, batch size $n_{b}$, network structure M
Ensure: MCNN’s parameters $w, k, b, β$
Initialize network parameters $w, k, b, β$ randomly
for each $i \in [1, N]$ do
  for each batch data $X_{i}$ with size $n_{b}$ do
    for layer l := L to 1 do
      1. Compute sensitivities of layer l according to back propagation
      2. Compute the gradient of each parameter in layer l
      3. Update parameters in layer l in the stochastic gradient descent method with learning rate a
    end for
  end for
end for

An input image or patch is fed into each channel of the MCNN for training. After training, the features of each channel are represented as feature vectors. The two fully connected layers reduce the feature dimension. Although the feature dimension is lower, more texture information of the fabric images is retained for further detection and identification.

LRD model and optimal solution

A test fabric image can be divided into image blocks $\{B_{i}\}_{i = 1, \dots, N}$ with different sizes, where N is the number of image blocks. The characteristic representation $f_{i} \in \mathbb{R}^{D}$ of each block is generated based on the MCNN. Then, we combine all the feature vectors f_i into a feature matrix to represent the image, which can be expressed as $X = [f_{1}, f_{2}, \dots, f_{N}] \in \mathbb{R}^{D \times N}$ , where D is the dimension of each feature vector and f_i represents the feature vector of the ith block. The feature matrix X can be decomposed into a low-rank matrix (corresponding to the background) and a sparse matrix (corresponding to the defective regions) by utilizing the LRD model. In general, we require the rank of the low-rank matrix to be small and the sparse part to have few non-zero entries. It can be described as follows

\begin{array}{l} (L^{*}, S^{*}) = \underset{L, S}{\arg \min} (\mathrm{rank} (L) + λ {|| S ||}_{0}) \\ \text{s.t.} \; X = X L + S \end{array}

(11)

where rank(L) is the rank of matrix $L \in \mathbb{R}^{N \times N}$ and ${|| S ||}_{0}$ is the l₀-norm (number of non-zero elements) of matrix $S \in \mathbb{R}^{D \times N}$ . Here, the first term enforces the low rank of L; the second term enforces the sparsity of S; and the parameter $λ > 0$ balances the effects of the low rank and sparsity, which can be chosen according to the properties of the two norms or tuned empirically. Because the objective is non-smooth and non-convex, the above optimization problem is difficult to solve and has no closed-form solution. To alleviate the difficulty, we can replace rank(L) with ${|| L ||}_{*}$ and ${|| S ||}_{0}$ with ${|| S ||}_{2, 1}$ , resulting in the following convex optimization problem

\begin{array}{l} (L^{*}, S^{*}) = \underset{L, S}{\arg \min} ({|| L ||}_{*} + λ {|| S ||}_{2, 1}) \\ \text{s.t.} \; X = X L + S \end{array}

(12)

where ${|| L ||}_{*}$ is the nuclear norm of the low-rank matrix L and ${|| S ||}_{2, 1} = \sum_{j = 1}^{N} \sqrt{\sum_{i = 1}^{D} {(S_{i j})}^{2}}$ is the l_2,1-norm of the sparse matrix S. In this article, the augmented Lagrange multiplier (ALM) method is used to solve formula (12). Correspondingly, formula (12) is equivalently converted into the following augmented Lagrangian form

\begin{array}{l} \underset{L, S, J, Y_{1}, Y_{2}}{\arg \min} {|| J ||}_{*} + λ {|| S ||}_{2, 1} \\ + \mathrm{tr} [Y_{1}^{T} (X - X L - S)] + \mathrm{tr} [Y_{2}^{T} (L - J)] \\ + \frac{μ}{2} ({|| X - X L - S ||}_{F}^{2} + {|| L - J ||}_{F}^{2}) \\ \text{s.t.} \; X = X L + S, \; L = J \end{array}

(13)

where Y₁ and Y₂ are Lagrange multipliers and $μ > 0$ is a penalty parameter. For the above problem, we choose the inexact ALMs, which we outline in Algorithm 2.

Algorithm 2. Solving problem (12) by inexact ALMs

Input: Feature matrix X, parameter λ
Initialize: L = J = 0, S = 0, Y₁ = 0, Y₂ = 0, μ = 10⁻⁶, max_μ = 10¹⁰, ρ = 1.1, ε = 10⁻⁸
While not converged, do
  1. Fix the others and update J by $J = \arg \min \frac{1}{μ} {|| J ||}_{*} + \frac{1}{2} {|| J - (L + Y_{2} / μ) ||}_{F}^{2}$
  2. Fix the others and update L by $L = {(I + X^{T} X)}^{- 1} (X^{T} X - X^{T} S + J + (X^{T} Y_{1} - Y_{2}) / μ)$
  3. Fix the others and update S by $S = \arg \min \frac{λ}{μ} {|| S ||}_{2, 1} + \frac{1}{2} {|| S - (X - X L + Y_{1} / μ) ||}_{F}^{2}$
  4. Update the multipliers by $Y_{1} = Y_{1} + μ (X - X L - S)$ and $Y_{2} = Y_{2} + μ (L - J)$
  5. Update the parameter μ by $μ = \min (ρ μ, max_{μ})$
  6. Check the convergence conditions ${|| X - X L - S ||}_{\infty} < ε$ and ${|| L - J ||}_{\infty} < ε$
end while
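The iteration of Algorithm 2 can be sketched in NumPy as follows. The sketch assumes the standard closed-form solutions for steps 1 and 3, namely singular value thresholding for the nuclear-norm subproblem and column-wise shrinkage for the l_2,1-norm subproblem, which the text does not spell out. The helper names `svt`, `l21_shrink`, and `inexact_alm`, the iteration cap `max_iter`, and the use of the largest absolute entry as the convergence measure are illustrative assumptions.

```python
import numpy as np

def svt(A, tau):
    """Singular value thresholding: minimizer of tau*||J||_* + 0.5*||J - A||_F^2."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def l21_shrink(Q, tau):
    """Column-wise shrinkage: minimizer of tau*||S||_{2,1} + 0.5*||S - Q||_F^2."""
    norms = np.linalg.norm(Q, axis=0)
    scale = np.maximum(norms - tau, 0.0) / np.maximum(norms, 1e-12)
    return Q * scale

def inexact_alm(X, lam, mu=1e-6, max_mu=1e10, rho=1.1, eps=1e-8, max_iter=1000):
    """Sketch of Algorithm 2 for X = X L + S, with X of shape (D, N)."""
    D, N = X.shape
    L = np.zeros((N, N))
    J = np.zeros((N, N))
    S = np.zeros((D, N))
    Y1 = np.zeros((D, N))
    Y2 = np.zeros((N, N))
    XtX = X.T @ X
    inv = np.linalg.inv(np.eye(N) + XtX)  # constant across iterations
    for _ in range(max_iter):
        J = svt(L + Y2 / mu, 1.0 / mu)                          # step 1
        L = inv @ (XtX - X.T @ S + J + (X.T @ Y1 - Y2) / mu)    # step 2
        S = l21_shrink(X - X @ L + Y1 / mu, lam / mu)           # step 3
        R1 = X - X @ L - S
        R2 = L - J
        Y1 = Y1 + mu * R1                                       # step 4
        Y2 = Y2 + mu * R2
        mu = min(rho * mu, max_mu)                              # step 5
        if np.abs(R1).max() < eps and np.abs(R2).max() < eps:   # step 6
            break
    return L, S
```

The columns of the returned S with large norms correspond to image blocks that deviate from the low-rank background.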

Generation and segmentation of the saliency map

The generated feature matrix X can be decomposed into the low-rank matrix L corresponding to the background and the sparse matrix S corresponding to the defect by formula (11). Each column S_i of the sparse matrix S indicates the possibility that image block B_i contains a defect. In this article, the l_2,1-norm of S_i, which for a single column reduces to its l_2-norm, is used to denote the prominence of image block i

Sal (B_{i}) = {|| S_{i} ||}_{2, 1} = \sqrt{\sum_{j = 1}^{D} {(S_{j i})}^{2}}

(14)

If ${|| S_{i} ||}_{2, 1}$ is larger, the prominence of the image block B_i is larger, which indicates that the block is more likely to contain a defect. The prominence values of all image blocks constitute a saliency map SM. Noise reduction is then applied to SM to obtain the denoised map Ŝ, that is

\hat{S} = g * (SM \circ SM)

(15)

where g is a circular smoothing filter, “∘” is the Hadamard (element-wise) product operator, and “*” represents the convolution operation. The saliency map Ŝ is converted into a grayscale image G

G = \frac{\hat{S} - \min (\hat{S})}{\max (\hat{S}) - \min (\hat{S})} \times 255

(16)

Finally, the iterative optimal threshold segmentation algorithm⁴¹ is adopted to segment the saliency map generated by the sparse matrix. Thus, the defective region is located.
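The saliency pipeline of formulas (14) to (16) can be sketched as follows. Several points are assumptions made only for illustration: the blocks are taken to lie on a regular grid so that the block saliencies form a 2-D map, the circular filter g is a normalized disk, and the segmentation step uses the classical iterative threshold (the midpoint of the two class means, repeated until stable) as a stand-in for the algorithm of reference 41, whose details are not given here.

```python
import numpy as np

def block_saliency(S):
    """Formula (14): l2 norm of each column of S, one value per image block."""
    return np.linalg.norm(S, axis=0)

def disk_kernel(radius):
    """Normalized circular smoothing filter g (assumed shape)."""
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    k = (x ** 2 + y ** 2 <= radius ** 2).astype(float)
    return k / k.sum()

def smooth(img, kernel):
    """Same-size 2-D convolution with edge padding (kernel is symmetric)."""
    r = kernel.shape[0] // 2
    padded = np.pad(img, r, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def to_gray(s_hat):
    """Formula (16): rescale the saliency map to [0, 255]."""
    lo, hi = s_hat.min(), s_hat.max()
    if hi == lo:
        return np.zeros_like(s_hat, dtype=float)
    return (s_hat - lo) / (hi - lo) * 255.0

def iterative_threshold(gray, tol=0.5, max_iter=100):
    """Classical iterative threshold: midpoint of the two class means."""
    t = gray.mean()
    for _ in range(max_iter):
        low, high = gray[gray <= t], gray[gray > t]
        if low.size == 0 or high.size == 0:
            break
        new_t = 0.5 * (low.mean() + high.mean())
        if abs(new_t - t) < tol:
            return new_t
        t = new_t
    return t
```

A typical use reshapes `block_saliency(S)` to the block grid, computes `to_gray(smooth(sm * sm, disk_kernel(1)))`, and marks as defective the pixels above `iterative_threshold(gray)`.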

Experimental results and analysis

In the current study, numerous simulation experiments were conducted on the fabric images to evaluate the performance of the proposed algorithm. We selected several kinds of fabric defect images from two fabric image databases (including broken end, netting multiple, hole, thick bar, thin bar, dot-patterned, and star-patterned, etc.). The first is the fabric database of the University of Hong Kong, and the second is the TILDA fabric image database of the University of Freiburg. All of the algorithms were coded and executed in MATLAB R2016a, and all of the experiments were implemented on a 3.30 GHz, Intel (R) Core (TM) i3-2120 CPU PC.

To reduce the computation cost of the proposed algorithm, a whole fabric image was first down-sampled to 256 × 256 pixels and then fed into the network as the input in the experiments. The network was trained, and features were extracted for each image block by MCNN. The multi-layer features were fused to represent the fabric image, and the feature matrix was then composed. The feature matrix was decomposed into the low-rank matrix and the sparse matrix by the ALM method. Finally, the iterative optimal threshold segmentation algorithm was used to segment the saliency map generated by the sparse matrix. Thus, the defective area of the fabric was located.

To demonstrate the effectiveness of the adopted feature extraction method, we first compare the other four feature extraction methods with our adopted MCNN method: (1) Shen and Wu³⁷ used intensity, texture, and orientation features plus low-rank matrix recovery (LRR) to detect an object; (2) Li et al.¹⁹ adopted an HOG feature and an LRD algorithm to detect fabric defects; (3) Li et al.⁶ applied the hierarchical information processing of the biological visual system and established BVM to characterize all types of fabric textures, which can quickly locate the salient object; and (4) Zhang et al.⁴² proposed a Gabor feature and LRD to detect defects. Figures 3 and 4 indicate the detection results of these different feature extraction methods in a plain-texture fabric and patterned fabric defect detection. Figure 3 demonstrates the detection result of the plain-texture fabric, and Figure 4 illustrates the detection result of a patterned fabric. Figures 3(a) and 4(a) are the original fabric images; (b) to (f) are some saliency maps generated by Shen and Wu,³⁷ HOG + LRD,¹⁹ BVM + LRD,⁶ Gabor + LRD,⁴² and our method, respectively; (g) shows our segmentation maps. It can be clearly seen from the experimental results that the five feature extraction algorithms have good detection performance for plain and twill fabric, except that the defect area of the third column and the fourth column are not continuous. For the fabric defect images with a plain weave, the fabric defect area in Shen and Wu³⁷ and Zhang et al.⁴² can be highlighted in the visual effect. For the patterned fabric, the background texture is very complex, and the effect of saliency maps generated by the method of Shen and Wu³⁷ and Zhang et al.⁴² is not ideal. However, the detection performance of our method is significantly enhanced. 
The reason is that MCNN can be more effective in extracting the texture information of fabric images compared with other feature extraction algorithms, and the LRD model was integrated into the proposed algorithm. The fabric defects were accurately segmented, and the defect area was more prominent and was located effectively.

Figure 3.

Visual saliency maps of plain-weave fabric defect images: (a) original images, (b) Shen and Wu,³⁷ (c) Li et al.,¹⁹ (d) Li et al.,⁶ (e) Zhang et al.,⁴² (f) ours, and (g) segmentation.

Figure 4.

Visual saliency maps of patterned fabric defect images: (a) original images, (b) Shen and Wu,³⁷ (c) Li et al.,¹⁹ (d) Li et al.,⁶ (e) Zhang et al.,⁴² (f) ours, and (g) segmentation.

Figures 3 and 4 show the saliency maps generated by different feature extraction methods in plain-texture fabric and patterned fabric defect detection. The first column indicates the original images; the second column indicates the saliency maps generated by Shen and Wu³⁷; the third column indicates the saliency maps generated by the HOG feature with the LRD model; the fourth column is the saliency maps generated by the BVM with the LRD; the fifth column is the saliency maps generated by the Gabor feature with the LRD model; and the last two columns indicate the saliency maps and segmentation maps generated by our algorithm, respectively.

In the MCNN network, we added batch normalization⁴³ before each activation layer. The batch normalization stores the running average of the mean and standard deviation from its inputs. The stored mean is subtracted from each input of the layer, and division by the standard deviation is also performed. It has been demonstrated that by applying batch normalization, overfitting is decreased, and higher learning rates for training are achieved. In addition, we applied the padding strategy that pads zeros around the borders of the feature maps after each convolution leading to unchanged output shape in each channel. In addition, the pooling strategy adopted in all of the pooling layers is max-pooling, which is robust to small distortions. The output of the max-pooling creates a new set of image channels that are then fed through another layer of convolutions and nonlinearities. Finally, the last fully connected layer generates the output vectors, which are then stacked to a feature matrix. The parameters of the network, such as weights of the convolutional filters, are optimized through back propagation.
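The normalization step described above can be sketched as a small NumPy class. It is a simplified illustration rather than the layer used in the experiments: it omits the learnable scale and shift of the full batch normalization method, and the class name, the `momentum` value, and the small constant `eps` are assumptions.

```python
import numpy as np

class SimpleBatchNorm:
    """Keeps running averages of the input mean and standard deviation."""

    def __init__(self, num_features, momentum=0.9, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = np.zeros(num_features)
        self.running_std = np.ones(num_features)

    def __call__(self, x, training=True):
        # x has shape (batch, num_features).
        if training:
            mean, std = x.mean(axis=0), x.std(axis=0)
            # Update the stored statistics with an exponential moving average.
            self.running_mean = self.momentum * self.running_mean + (1 - self.momentum) * mean
            self.running_std = self.momentum * self.running_std + (1 - self.momentum) * std
        else:
            mean, std = self.running_mean, self.running_std
        # Subtract the mean and divide by the standard deviation.
        return (x - mean) / (std + self.eps)
```

During training the batch statistics are used and the running averages are updated; at test time the stored averages are applied instead.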

The selection of the stride and the convolutional kernel size is critical for training the network parameters and improving the performance of the MCNN. A larger stride reduces accuracy, although it improves the training speed. If the convolutional kernel is too small, the local features of the image cannot be extracted effectively. Conversely, if the convolutional kernel is too large, the computational complexity becomes far higher. The current study adopted multi-scale input images to increase the local invariant information, and combined different sampling intervals with different convolutional kernel sizes to obtain feature invariance, thus ensuring that the detailed information of the fabric image texture is not lost. We selected the cross-entropy function as the loss function of our proposed algorithm, and during the MCNN training process, stochastic gradient descent (SGD) with mini-batches of 50 samples was applied to update the weight parameters. We also incorporated momentum and learning rate decay into the SGD optimizer, and the updating rules of the weights in each iteration are as follows

v_{i + 1} = μ \cdot v_{i} - a \cdot \nabla g

(17)

w_{i + 1} = w_{i} + v_{i + 1}

(18)

a \leftarrow a \cdot \frac{1}{1 + d \cdot i}

(19)

where i is the iteration index, w is the weight vector, μ is the momentum coefficient, v is the current velocity vector, a is the learning rate, d is the decay parameter of the learning rate, and ∇g is the average of the gradients with respect to w over the mini-batch at each iteration. In our experiments, the momentum coefficient and the weight attenuation coefficient were set to 0.9 and 10⁻⁵, respectively. The learning rate was initialized to 0.1 and then halved as the number of iterations increased. In addition, dropout with a probability of 0.5 was applied to the last fully connected layers, which also helps to avoid overfitting. We initialized the bias to 0 in every layer and chose the ReLU activation function, following He et al.⁴⁴
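As a minimal sketch of the update rules in equations (17) to (19), the following code applies them to a toy quadratic objective. The function names and the toy objective are hypothetical and serve only to illustrate the arithmetic. The default value of `d` is a placeholder, since the text does not state which decay value was used in equation (19).

```python
def sgd_momentum_step(w, v, grad, lr, mu=0.9):
    """One update following Eqs. (17) and (18)."""
    v_next = [mu * vi - lr * gi for vi, gi in zip(v, grad)]  # Eq. (17)
    w_next = [wi + vi for wi, vi in zip(w, v_next)]          # Eq. (18)
    return w_next, v_next


def decay_learning_rate(lr, i, d=1e-5):
    """Eq. (19) as written: a <- a / (1 + d * i). The value of d is a placeholder."""
    return lr / (1.0 + d * i)


def minimize_quadratic(w0, steps=200, lr=0.1, mu=0.9, d=1e-5):
    """Toy usage: minimize g(w) = 0.5 * ||w||^2, whose gradient is w itself."""
    w, v = list(w0), [0.0] * len(w0)
    for i in range(steps):
        grad = list(w)  # gradient of the toy objective at the current weights
        w, v = sgd_momentum_step(w, v, grad, lr, mu)
        lr = decay_learning_rate(lr, i, d)
    return w


# Usage: starting from w = [1.0, -2.0], the iterates approach the minimizer at 0.
w_final = minimize_quadratic([1.0, -2.0])
```

In the actual training, `grad` would be the mini-batch average of the cross-entropy gradients rather than the gradient of this toy objective.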

An important parameter in neural network training is the number of iterations. If it is too small, the prediction error of the network remains high and the input defect images are not fully learned, which lowers the detection accuracy of the final multi-layer deep-learning algorithm. Conversely, a larger number of iterations gradually raises the accuracy rate but lengthens the computation time. Figure 5 illustrates the relationship between the number of iterations and both the loss function and the accuracy rate.

Figure 5.

Relationship between the iteration number and both the loss and the accuracy rates.

As can be seen in Figure 5, the accuracy rate improves as the number of iterations increases and then stabilizes. In the experiments, the number of iterations was set to 3600 as a compromise between training time and accuracy. When training from scratch, the performance did not appear to improve beyond 3600 iterations, so we stopped the back propagation there. Although the performance was much lower than that of the fine-tuned networks, the network still learned to predict fabric defects.

However, the accuracy rate is not a good measure of defect detection performance on imbalanced datasets. Therefore, in the current study, we adopted the means and standard deviations of the average precision, recall, F-measure, and mean absolute error (MAE).⁴⁵ The precision (also known as the positive predictive value, defined as TP/(TP + FP)) is the ratio of the correctly detected salient region to the total detected area; it reflects the accuracy of the detection and the ability of the system to reject false detections. The recall (also known as sensitivity or true positive rate, defined as TP/(TP + FN)) is the proportion of the true salient area that is correctly detected, which reflects the comprehensiveness of detection. Here, TP is the number of true positives, FP the number of false positives, and FN the number of false negatives. The higher the precision and recall values, the better the detection, but the two measures trade off against each other. For example, when the detected area is very large, the recall is high but the precision is low. To balance the two, the F-measure is adopted to combine them into a single criterion. It is defined as

F = \frac{(1 + β^{2}) P \times R}{β^{2} P + R}

(20)

where P and R are the precision and recall, respectively, and $β^{2} = 0.3$ is a constant chosen according to Yang et al.³⁹ Furthermore, MAE was also applied to evaluate the effectiveness of the saliency detection of the proposed algorithm. The results are illustrated in Figure 6.
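The evaluation measures above can be sketched as follows for flattened pixel lists. The function names are hypothetical. The text does not give a formula for MAE, so the sketch assumes the definition commonly used in saliency detection: the mean absolute difference between a saliency map scaled to [0, 1] and the binary ground-truth mask.

```python
def precision_recall_f(pred, truth, beta2=0.3):
    """Pixel-wise precision, recall, and F-measure (Eq. 20) for binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)       # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)   # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta2 * precision + recall
    f_measure = (1.0 + beta2) * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def mean_absolute_error(saliency, truth):
    """Assumed MAE: mean |saliency - truth| over all pixels."""
    return sum(abs(s - t) for s, t in zip(saliency, truth)) / len(truth)


# Usage on a four-pixel example: one true positive, one false positive,
# and one false negative give precision = recall = 0.5.
p, r, f = precision_recall_f([1, 1, 0, 0], [1, 0, 1, 0])
mae = mean_absolute_error([0.5, 0.0], [1, 0])
```

When precision and recall are equal, the F-measure takes the same value regardless of β², which is a quick sanity check on an implementation of equation (20).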

Figure 6.

Comparison of our results with four feature extraction methods.

As shown in Figure 6, the detection performance of our proposed algorithm is better than that of the other methods compared, and the multi-layer features of the MCNN describe fabric defects more effectively than the hand-crafted features.

Conclusion

Fabric defect detection is a key component of quality control in the textile industry. In the current study, we propose a novel fabric defect detection algorithm based on MCNNs and LRD. The proposed algorithm makes two contributions: (1) the traditional feature extraction methods are replaced by an MCNN to characterize the texture of a fabric defect image and (2) LRD is adopted to decompose the feature matrices into low-rank and sparse parts. The detection results are obtained by segmenting the saliency map generated by the sparse part. The experimental results demonstrate that the proposed method can accurately detect the defect regions in various fabric defect images, even for images with complex textures, and that it is superior to the state of the art.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (Grant Nos 61772576 and 61379113), the Key Natural Science Foundation of Henan Province (Grant No. 162300410338), Science and Technology Innovation Talent Project of the Education Department of Henan Province (Grant No. 17HASTIT019), Intelligent Image Analysis Processing and Machine Vision Innovation Team in Henan Province (Grant No. 2018091), and the Henan Science Fund for Distinguished Young Scholars (Grant No. 184100510002).

ORCID iD

Baorui Wang

References

1. Zhang. Automated vision system for fabric defect inspection using Gabor filters and PCNN. J Springerplus 2016; 5(1): 765.

2. Zhou, Semenovich, Sowmya, et al. Dictionary learning framework for fabric defect detection. J Text Inst 2014; 105(105): 223–234.

3. Zhu, et al. Fabric defect detection via small scale over-complete basis set. Text Res J 2014; 84(15): 1634–1649.

4. Jung, Kim. A unified spectral-domain approach for saliency detection and its application to automatic object segmentation. IEEE T Image Process 2012; 21(3): 1272–1283.

5. Ren, Fox. Unsupervised feature learning for RGB-D based object recognition. New York: Springer, 2013, pp. 387–402.

6. Gao, Liu, et al. Fabric defect detection based on biological vision modeling. IEEE Access 2018; 6: 27659–27670.

7. Cheng, Tian, Liu, et al. Image segmentation based on multi-region multi-scale local binary fitting and Kullback–Leibler divergence. J Signal Image Video Proc 2018(2): 1–9.

8. Xia, Jiang, et al. Warp-knitted fabric defect segmentation based on non-subsampled Contourlet transform. J Text Inst 2017; 108(2): 239–245.

9. Jiang, Hao FZ. Injured ticket detection and location based on improved template matching method. J Electr Design Eng 2018; 26: 175.

10. Wang. Woven fabric defects based on singular value decomposition. Shanghai, China: Donghua University, 2014.

11. Zhu, Pan, Gao. Fabric defect detection using characteristic spectrum of Fourier transform and correlation coefficient. J Comput Eng Appl 2014; 50(10): 866–873.

12. Guan, Gao, et al. Defect detection of plain weave based on visual saliency mechanism. J Text Res 2014; 35(4): 56–61.

13. Ngan HYT, Pang GKH, Yung, et al. Wavelet based methods on patterned fabric defect detection. J Pattern Recogn 2005; 38(4): 559–576.

14. Arivazhagan, Ganesan, Bama. Fault segmentation in fabric images using Gabor wavelet transform. J Mach Vis Appl 2006; 16(6): 356–363.

15. Chunlei, Zhang, Liu, et al. A novel fabric defect detection algorithm based on textural differential visual saliency model. J Shandong Univ 2014; 44(4): 1–8.

16. Mak, Peng, Yiu KFC. Fabric defect detection using morphological filters. J Image Vis Comput 2009; 27(10): 1585–1592.

17. Tsai, Chen, et al. A fast regularity measure for surface defect detection. J Mach Vis Appl 2012; 23(5): 869–886.

18. Lowe DG. Distinctive image features from scale-invariant keypoints. Int J Comput Vis 2004; 60(2): 91–110.

19. Gao, Liu, et al. Fabric defect detection algorithm based on histogram of oriented gradient and low-rank decomposition. J Text Res 2017; 38: 153–158.

20. Long, Zhuo, Jia-Feng, et al. A low-redundancy dense SIFT feature extraction algorithm. J Meas Control Technol 2017; 36(3): 20–23,27.

21. Deep contrast learning for salient object detection. Comput Vis Pattern Recogn 2016: 478–487.

22. Zhang. Application of deep convolutional neural network in computer vision. J Data Acquisit Process 2016; 31(1): 1–17.

23. Hinton, Osindero, Teh YW. A fast learning algorithm for deep belief nets. J Neural Comput 2006; 18: 1527–1554.

24. Abid. Texture defect detection by using polynomial interpolation and multilayer perceptron. J Eng Fiber Fab 2019; 14: 1–12.

25. Ren, Hung, Tan KC. A generic deep-learning-based approach for automated surface inspection. IEEE Trans Cybernetics 2018; 48(3): 929–940.

26. Visual saliency based on multiscale deep features. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), Boston, MA, 7–12 June 2015, pp. 5455–5463. New York: IEEE.

27. Wang, Ruan, et al. Deep networks for saliency detection via local estimation and global search. In: Proceedings of the IEEE conference on computer vision and pattern recognition, Boston, MA, 7–12 June 2015, pp. 3183–3192. New York: IEEE.

28. Weimer, Scholz-Reiter, Shpitalni. Design of deep convolutional neural network architectures for automated feature extraction in industrial inspection. CIRP Ann Manuf Technol 2016; 65(1): 417–420.

29. Zhao, Pan. Deformable patterned fabric defect detection with fisher criterion-based deep learning. IEEE T Autom Sci Eng 2017; 14(2): 1256–1264.

30. Candès, et al. Robust principal component analysis. J ACM 2011; 58(3): 11.

31. Yin, Cai, Gao. Robust face recognition via double low-rank matrix recovery for feature extraction. In: Proceedings of the 2013 IEEE international conference on image processing, Melbourne, VIC, Australia, 15–18 September 2013, pp. 3770–3774. New York: IEEE.

32. Wan, Qian, et al. Total variation regularization term-based low-rank and sparse matrix representation model for infrared moving target tracking. Remote Sens 2018; 10: 510.

33. Vaswani, Bouwmans, Javed, et al. Robust subspace learning: robust PCA, robust subspace tracking and robust subspace recovery. IEEE Signal Pr Mag 2018; 35(4): 32–55.

34. Zhang, Zhao. Low-rank matrix approximation with manifold regularization. IEEE T Pattern Anal Mach Intel 2013; 35(7): 1717–1729.

35. Bouwmans, Sobral, Javed, et al. Decomposition into low-rank plus additive matrices for background/foreground separation: a review for a comparative evaluation with a large-scale dataset. Comput Sci Rev 2017; 23: 1–71.

36. Xue, Cao. Motion saliency detection using low-rank and sparse decomposition. In: Proceedings of the IEEE international conference on acoustics, speech and signal processing (ICASSP), Kyoto, Japan, 25–30 March 2012.

37. Shen. A unified approach to salient object detection via low rank matrix recovery. In: Proceedings of the IEEE conference on computer vision and pattern recognition, Providence, RI, 16–21 June 2012, pp. 853–860. New York: IEEE.

38. Peng, et al. Salient object detection via low-rank and structured sparse matrix decomposition. In: Proceedings of the twenty-seventh AAAI conference on artificial intelligence, AAAI’13, Bellevue, Washington, DC, pp. 796–802. Reston, VA: AAAI Press.

39. Yang, Zhang, et al. Saliency detection via graph-based manifold ranking. Comput Vis Found 2013: 3166–3173.

40. Wang, Huang. Salient object detection with low-rank approximation and l2,1-norm minimization. J Image Vision Comput 2017; 57: 67–77.

41. Gao, Liu, et al. Defect detection for patterned fabric images based on GHOG and low-rank decomposition. IEEE Access 2017; 7: 83962–83973.

42. Zhang, Gao. Fabric defect detection algorithm based on Gabor filter and low-rank decomposition. In: Proceedings of the eighth international conference on digital image processing, Chengdu, China, 20–23 May 2016, pp. 1–6. Bellingham: International Society for Optics and Photonics.

43. Ioffe, Szegedy. Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv Preprint, arXiv:1502.03167v3.

44. Zhang, Ren, et al. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE international conference on computer vision, Santiago, Chile, 7–13 December 2015, pp. 1026–1034.

45. Yapi, Allili, Baaziz. Automatic fabric defect detection using learning-based local textural distributions in the contourlet domain. IEEE T Autom Sci Eng 2018; 15(3): 1014–1026.