Abstract
Deep convolutional neural networks (CNNs) have shown great success in single-class fabric defect detection. However, real-world fabric images generally contain several types of defects in one image. Accurately recognizing and classifying multi-class fabric defect images remains an unsolved problem because intersecting defects are complex and small defects are difficult to distinguish. To address these challenges, this study develops a methodology based on feature pyramid networks (FPN), a deep learning approach, to detect multi-class fabric defects. To evaluate the proposed detection model, we built a unique multi-class fabric defect database (DHU-MO1000), in which the multi-class defect images were generated by industrial monitors in a textile factory. We used this dataset as the benchmark for training and testing the FPN on multi-class defect detection. Furthermore, we conducted extensive experimental validation of various design choices. The experimental results show that the model outperformed existing multi-class object detection methods.
Introduction
In the modern fabric industry, fabric defect recognition and detection are important for quality control. In most textile mills, visual inspection by trained workers is still a critical element of the fabric defect detection process, but its efficiency and precision are low because of psychological factors, differences between fibers, and many other constraints. Because of these constraints, automated detection based on computers or machines has drawn considerable attention in recent decades.
Single-class fabric defect detection is a special case of multi-class detection. The multi-class object detection of fabric defect images is a more general and practical problem, since the majority of real-world fabric images contain multiple defects. As shown in Fig. 1, in single-label defect images the fabric defects are roughly aligned with the label. In multi-label defect images, however, even defects with the same label (e.g., brokenpick or felter in Fig. 1) differ in texture and size from those in single-label images. In addition, some defects are too small to be distinguished. Hence, the multi-class object detection of fabric defect images is more difficult than the single-class case.

Selected image examples of fabric defects. Foreground fabric defects in single-class images are usually roughly aligned (images in the first row). However, the assumption of fabric defect alignment is not valid for multi-class object images (images in the second and third rows).
At present, there are mainly five types of single-class fabric defect detection algorithms.
(1) Spectral-based methods (e.g., Fourier transform 7 and Gabor transform 10,14,19,33). These methods can accurately detect fabric defects by using the frequency-domain information of the image, but it is difficult for them to take both local and global image information into account. They also need a great deal of computation to ensure detection precision.
(2) Statistical-based algorithms, which are characterized by the extraction of the eigenvalues and spatial distribution of image gray values (e.g., morphology 32 and co-occurrence matrix 18). The advantage of these methods is their speed, because they use the gray-value characteristics of the image, but they are more susceptible to noise and external interference. In addition, statistical-based algorithms do not easily detect fabric defect features that are not obvious.
(3) Learning-based algorithms (e.g., support vector machines 15 and neural networks 8,13,16), which can further extract the feature information of the defective image. The disadvantage of these methods is that the dimension of the feature information is low; thus, the detection accuracy is difficult to improve.
(4) Structure-based algorithms, which use texture analysis. 1 These algorithms can achieve good detection results, but it is difficult to choose a suitable feature extractor for fabric images in which the defects are not distinct.
(5) Model-based algorithms, which construct stochastic models with random variables. The auto-regression model 2 belongs to this group. In this approach, the parameters can be adjusted to determine whether there are fabric defects. However, if the selected parameters are improper, the convergence rate is very slow.
Lately, convolutional neural networks (CNNs) 12 have shown promising performance in computer vision applications, such as image detection,28,31 image restoration, 9 crowd counting, 34 and paleo valley recognition. 11 In addition, CNNs have also achieved state-of-the-art performance in large-scale single-object image classification. 3 Building on CNNs, Girshick et al. designed the region-based convolutional neural network (RCNN), 5 which obtains candidate regions through a region selection approach. Then, they presented the Fast RCNN 4 and Faster RCNN 22 approaches for object detection. Furthermore, a top-down architecture with lateral connections was proposed for building high-level semantic feature maps at all scales, which is called the feature pyramid network (FPN). 17 CNNs have been widely applied in single-class fabric defect classification, 27 fabric pattern generation, 26 and detection. 28 For instance, Li et al. 16 proposed a Fisher criterion-based stacked denoising auto-encoder (AE) model for detecting deformable patterned textile defects. Mei et al. 20 presented an unsupervised learning-based automated approach with a multi-scale convolutional denoising auto-encoder network model, which synthesizes results from multiple pyramid levels and highlights defective regions through the reconstruction residual maps generated by the convolutional denoising auto-encoder networks. Mei et al. 21 also presented an unsupervised learning approach for automated defect inspection on homogeneous and nonregular textured surfaces.
Meanwhile, many methods 23,30 have also been proposed to address multi-object image detection. However, these approaches cannot detect small-size defects effectively. At present, recognizing and classifying multi-class fabric defect images remains unsolved, which makes it a pressing and crucial task for improving the quality of textile products.
In this study, based on the development of deep learning, the multi-class object detection of fabric defect images is studied. Due to the advantages of detecting small targets, the FPN network is used to identify fabric defects. The main contributions of this paper are presented as follows.
(1) A unique fabric database (DHU-MO1000) was created by collecting the fixation data from textile mills.
(2) An FPN was used for the multi-class detection of fabric defect images. To the best of our knowledge, this is the first work to apply the deep learning framework to multi-class fabric defect detection.
(3) Our experiments on DHU-MO1000 demonstrated that the model can obtain better recognition performance than the current state-of-the-art approaches.
The remainder of this paper is organized as follows. First, FPN is overviewed. Then, multi-class detection of fabric defect images based on FPN is introduced. Next, experimental results and feature analysis are provided, followed by the conclusion.
Overview of FPN
Girshick et al. further presented the FPN based on Faster RCNN. 22 The network consists of the FPN, a region proposal network (RPN), region of interest (ROI) pooling, and classification and regression.
FPN Architecture
The FPN architecture takes an image of arbitrary size as the input and outputs proportionally sized feature maps at multiple levels. The structure of FPN involves a bottom-up pathway, a top-down pathway, and lateral connections. The bottom-up pathway is the feed-forward computation, which computes a feature hierarchy consisting of feature maps at several scales. The top-down pathway obtains higher-resolution features by upsampling semantically stronger, but spatially coarser, feature maps from higher pyramid levels. These features are then enhanced with features from the bottom-up pathway via lateral connections. Each lateral connection merges feature maps of the same spatial size from the bottom-up pathway and the top-down pathway.
As shown in Fig. 2, the
building block constructs the top-down feature maps. With a spatially coarser
feature map, the block upsamples the spatial resolution by a factor of 2. Then,
the upsampled feature maps are integrated with the corresponding bottom-up map
by element-wise addition. This process is iterated until the finest resolution
is generated. Before the iteration process, the 1 x 1 convolutional layer is
attached to produce the coarsest resolution map. Finally, the 3 x 3 convolution
operation is appended on each merged map to generate the final feature map,
which can reduce the aliasing effect of upsampling. The final set of the feature
map is {

A building block illustrating the lateral connection and the top-down pathway, merged by addition.
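The merge step of the building block can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it works on a single-channel map stored as a list of rows, assumes nearest-neighbour upsampling (the text only states a factor of 2), and leaves out the 1 x 1 and 3 x 3 convolutions. The function names `upsample2x` and `merge_top_down` are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour upsampling of a 2D map (list of rows) by a factor of 2."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def merge_top_down(coarse, lateral):
    """Upsample the coarser top-down map and add the bottom-up map element-wise."""
    up = upsample2x(coarse)
    if len(up) != len(lateral) or len(up[0]) != len(lateral[0]):
        raise ValueError("lateral map must be exactly twice the size of the coarse map")
    return [[u + l for u, l in zip(up_row, lat_row)]
            for up_row, lat_row in zip(up, lateral)]


# Toy example: a 2 x 2 coarse map merged with a 4 x 4 bottom-up map.
coarse = [[1.0, 2.0],
          [3.0, 4.0]]
lateral = [[0.5] * 4 for _ in range(4)]
merged = merge_top_down(coarse, lateral)
```

Applying `merge_top_down` repeatedly, from the coarsest level down to the finest, mirrors the iteration described in the text.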
RPN
The RPN network takes an image as input and outputs a series of object proposals.
This process is modeled with a fully convolutional network. To generate region
proposals, the RPN network is slid over the convolutional feature map, which is
output by the last shared convolutional layer. This RPN network takes an

Region Proposal Network (RPN).
ROI Pooling Layer
The ROI operation uses max pooling to convert features in any valid area of
interest into small feature maps with a fixed spatial range of
After the ROI operation, the feature maps and region proposals can be collected
through the ROI pooling layer, which is characterized by the non-fixed size of
the feature maps. The output of ROI pooling layer is a vector, whose size is
Classification and Regression
In the FPN network, the model performs bounding-box regression in a
different manner from previous ROI-based methods. The feature information used
for regression is of the same spatial size on the feature maps. To account for
varying sizes, a set of
For image classification, the classification layers compute each proposal's class by using fully connected layers followed by a softmax. For the bounding box regression, Faster RCNN adopts the parameterizations of the four following coordinates (Eq. 2). 22
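As an illustration of this parameterization, the sketch below encodes a box relative to an anchor and decodes it again. It follows the form published for Faster RCNN, in which the centre offsets are normalized by the anchor size and the width and height are regressed in log space. The function names and the example boxes are hypothetical and are not taken from the paper.

```python
import math


def encode_box(box, anchor):
    """Regression targets (tx, ty, tw, th) for a box relative to an anchor.

    Both boxes are (x_centre, y_centre, width, height).
    """
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))


def decode_box(deltas, anchor):
    """Invert encode_box: recover (x_centre, y_centre, width, height)."""
    tx, ty, tw, th = deltas
    xa, ya, wa, ha = anchor
    return (xa + tx * wa, ya + ty * ha, wa * math.exp(tw), ha * math.exp(th))


# Illustrative values only.
anchor = (100.0, 100.0, 64.0, 64.0)
ground_truth = (108.0, 92.0, 128.0, 32.0)
targets = encode_box(ground_truth, anchor)
recovered = decode_box(targets, anchor)
```

Normalizing by the anchor size keeps the targets on a similar scale for small and large boxes, which is helpful when defects vary widely in size.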
FPN Net-Based Fabric Defect Detection Model
The method developed to identify multi-class defects comprises three key steps.
(1) The multi-class fabric defect dataset is collected from textile mills. The quality of the data largely affects the detection result of our algorithm. Hence, we preprocessed the raw data.
(2) FPN net is trained and validated using the preprocessed data.
(3) FPN net is tested on DHU-MO1000, and the classification performance is evaluated and presented.
The learning process of the proposed approach is described in detail later.
Data Preparation and Augmentation
In this study, the multi-class fabric defect dataset is collected from textile mills. Before the FPN model is trained on the dataset, the original data is preprocessed and augmented in two steps: segmentation of fabric defects and diversification of the images. The original defect images are 1280 x 1024 pixels and include stain, irregular texture, and fringe. The first step is to crop local image blocks, each of which has a size of 320 x 320 pixels. The second step is to rotate and translate the images so that the model can learn more invariant image features. The range of rotation is from 5° to 20°, and the range of translation is from 0 to 50 pixels. The original fabric defect images are shown in Fig. 4. The images with horizontal and vertical flips are shown in Fig. 5.

Examples of the original image.

Horizontal and vertical flips of the images.
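The two preprocessing steps can be sketched as follows. The text does not say how the 320 x 320 blocks are positioned or how the rotation and translation values are drawn, so the non-overlapping grid, the uniform sampling, and the names `tile_boxes` and `sample_augmentation` are assumptions made for illustration only.

```python
import random


def tile_boxes(width, height, tile):
    """Return (left, top, right, bottom) boxes for non-overlapping tile x tile crops."""
    return [(x, y, x + tile, y + tile)
            for y in range(0, height - tile + 1, tile)
            for x in range(0, width - tile + 1, tile)]


def sample_augmentation(rng):
    """Draw one rotation angle (degrees) and one translation (pixels) from the stated ranges."""
    return {
        "angle_deg": rng.uniform(5.0, 20.0),  # rotation range given in the text
        "shift_x": rng.randint(0, 50),        # translation range given in the text
        "shift_y": rng.randint(0, 50),
    }


# A 1280 x 1024 image yields a 4 x 3 grid of full blocks; the bottom
# 64 rows do not fill a block and are skipped under this assumption.
boxes = tile_boxes(1280, 1024, 320)
params = sample_augmentation(random.Random(0))  # fixed seed for repeatability
```

The sampled parameters would then be passed to whatever image library performs the actual rotation and shift.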
FPN Algorithm
In the FPN network, the two most basic operations are convolution and pooling. With a multi-class defect image as the input, the convolutional layer convolves the feature map of the previous layer to generate new feature maps (Eqs. 3 and 4).
The operation
In this study, the pre-trained ResNet50 6 model is used for image feature extraction. The FPN architecture can generate multi-scale feature representations for an image. The main role of the RPN is to generate region proposals. Then, ROI pooling converts a feature map of arbitrary size into a fixed-size feature map. The whole learning process is presented in Algorithm 1.
FPN Model for multi-class object fabric defects detection.
Learning the Proposed Networks
A certain number of multi-class fabric images are randomly sampled from the training sets for training. The testing sets are used to evaluate the performance of the proposed approach. The learning process of the FPN model is described below.
ResNet50 is pretrained on the ImageNet dataset to extract the features. The hyperparameters are initialized as follows. All new layers are randomly initialized by drawing weights from a zero-mean Gaussian distribution with a standard deviation of 0.01. Stochastic gradient descent with a batch size of one sample is used for 10,000 iterations, with a weight decay of 0.0004, a momentum of 0.9, and a gamma of 0.1. The learning rate is initially set to 0.0001. The size of ROI is set to 14. Our model is developed based on the deep learning library TensorFlow 1.2.0 and relevant third-party libraries. We conduct our experiments on a personal computer with 128 GB RAM and four Nvidia GeForce GTX 1080 GPUs.
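For reference, the stated settings can be gathered into one configuration. The sketch below is not the authors' training script. It reads gamma as the multiplicative factor of a step learning-rate decay, which is a common convention but is not confirmed by the text, and the step interval `lr_step` is a made-up value because the text does not give one.

```python
TRAIN_CONFIG = {
    "backbone": "resnet50",    # pretrained on ImageNet
    "init_std": 0.01,          # zero-mean Gaussian for new layers
    "batch_size": 1,
    "max_iterations": 10_000,
    "weight_decay": 0.0004,
    "momentum": 0.9,
    "gamma": 0.1,
    "base_lr": 0.0001,
    "roi_size": 14,
    "lr_step": 5_000,          # ASSUMPTION: not stated in the text
}


def learning_rate(iteration, cfg=TRAIN_CONFIG):
    """Step decay: multiply the base rate by gamma once every lr_step iterations."""
    return cfg["base_lr"] * cfg["gamma"] ** (iteration // cfg["lr_step"])
```

Under these assumptions the rate stays at 0.0001 for the first 5,000 iterations and drops by a factor of ten afterwards.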
Results and Discussion
To validate the effectiveness of the proposed approach, experiments are carried out on a self-developed dataset (DHU-MO1000). In addition, the detection performance is verified on single-class fabric defect images, using the single-class fabric defect image dataset (DHU-SL1000). 28 Comparison experiments with three state-of-the-art algorithms are also conducted.
This section includes three subsections, described as follows.
(1) The experiment setup is introduced, including an overview of the dataset and several quantitative indicators.
(2) The detection results are presented and evaluated, and the detection performance is compared with the state-of-the-art models.
(3) To further understand the learning process of the model, the features are visualized and analyzed.
Setup
The textile dataset DHU-MO1000 consists of approximately 1000 samples, including 950 defect images and 50 defect-free images. The dataset contains six categories: normal (defect-free), sundries, oilstains, brokenpick, felter, and brokenend. Some typical textile defect samples are shown in Fig. 4. The characteristics of the fabric defect image dataset are shown in Table I. Over 80% of these images belong to multiple classes simultaneously.
Characteristics of Multi-Class Fabric Defect Data
Several quantitative indicators are selected to evaluate the multi-class detection results, including accuracy, recall, average precision (AP), and mean AP (mAP) (Eq. 6).
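To make the AP and mAP indicators concrete, the sketch below computes AP for one class as the area under the precision-recall curve and mAP as the mean of the per-class AP values. The text does not state which AP variant is used, so the all-point interpolation here is an assumption. The step that decides whether a detection matches a ground-truth defect is outside this sketch, and the function names and toy numbers are hypothetical.

```python
def average_precision(scored_hits, num_ground_truth):
    """AP for one class.

    scored_hits: list of (confidence, is_true_positive) pairs, one per detection.
    num_ground_truth: number of annotated defects of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    tp = 0
    precisions, recalls = [], []
    for rank, (_, hit) in enumerate(ranked, start=1):
        tp += 1 if hit else 0
        precisions.append(tp / rank)
        recalls.append(tp / num_ground_truth)
    # Make precision non-increasing from left to right (running maximum from the right).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Accumulate precision over each increase in recall.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Toy data, not experimental results: three detections, two annotated defects.
toy_ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], 2)
toy_map = mean_average_precision({"class_a": 1.0, "class_b": 0.5})
```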
Detection Results
We evaluate our approach for multi-class fabric defects detection and compare its detection performance with other approaches. In Fig. 6, the detection results for some multi-class fabric defect images are presented. It can be seen from the figure that the FPN model can obtain effective performance in different fabric defect types. Meanwhile, we can see that the model can achieve competitive detection performance even for some very small defects, such as sundries.

Examples of multi-class fabric defect detection.
Table II shows the recall and AP of multi-class detection for the different defect types. The model obtains good results for the defect brokenpick but does not obtain effective results for felter. This difference may be caused by the properties and background of the fabric defect. The defect brokenpick is more pronounced relative to the background, whereas felter is more blurred, which makes it difficult to distinguish. The detection performance of the compared methods on DHU-MO1000 is quantified in Table III. The first approach is based on non-locally centralized sparse representation. 25
Experimental Results of Multi-Class Detection of Various Defect Types
Performance Comparison of Different Detection Approaches on Multi-Class Fabric Dataset
The second approach is Faster RCNN. 22 The mAP of Faster RCNN is 71.32%, which is better than that of the nonlocal sparse method. Our FPN model reaches 75.56%, an improvement of 4.24 percentage points over Faster RCNN. This experiment further demonstrates the superiority of the FPN model.
To verify the efficiency of the proposed model for multi-class fabric defect detection, we also evaluated the running time. The results were obtained on an Nvidia GeForce GTX 1080 GPU. Training the model took 357 min. Table IV shows the time consumed by the testing process. The average detection time is 0.5 s per image. These results suggest that the proposed method is suitable for multi-class fabric defect detection.
Time of Fabric Defect Detection
As noted earlier, single-class fabric defect detection is a special case of multi-class detection. We therefore also verify the performance of the algorithm on the single-class fabric dataset. 28 The detection results of the proposed method are compared with those of three existing state-of-the-art algorithms. The first is a fabric defect detection model that uses optimized filters to detect the defect images. 24 The second approach is based on non-locally centralized sparse representation. 25 The last method is a modified Faster RCNN. 28 As shown in Table V, the deep learning algorithms (Faster RCNN and FPN) achieve better performance than the traditional detection algorithms (Gabor Filter and Nonlocal Sparse). Moreover, the proposed FPN model has the best overall performance among the compared methods. These results indicate that our model achieves good performance not only in the multi-class case but also in single-class fabric image detection.
Performance Comparison of Various Detection Approaches on Single-Class Fabric Dataset
Feature Analysis
To further understand the learning process of the network, we visualize the extracted feature information of four convolution layers in Fig. 7.

Visualization of the four convolution layers. The first row is the original images, and the 2nd-5th rows represent respectively the feature maps of the 2nd-5th convolution layer.
As shown in Fig. 7, the different layers in the network capture different feature information. The model first learns low-level features, such as colors and edges. Then, the model learns more discriminative features, such as fabric texture and defects. The important feature information (defects) can be clearly seen in layer 5. Furthermore, by combining the different feature information from different layers, the FPN framework for building feature pyramids inside ConvNets can achieve good feature representations.
Conclusions
In this study, deep learning has been applied to multi-class fabric defect detection. The experimental results demonstrate that the FPN model suits the processing of fabric defect datasets and can achieve effective detection performance, considering the characteristics of multi-class defect images. In future work, the FPN model can be implemented on other state-of-the-art backbone networks, such as ResNet101 and ResNet152. These backbone networks may further improve the detection performance of our proposed approach.
