
Abstract

Real-time detection of fabric defects is a fairly critical part of industrial production. However, there are still some key issues to be solved in practical detection production, such as low detection speed and delays in traditional cloud detection. To address these issues, in this paper, a new detection network architecture, called YOLOV4-TinyS, is proposed. Firstly, the k-medoids clustering algorithm is used to improve the matching of anchor points and ground truths for datasets with great differences. Secondly, the residual structure is changed to reduce the complexity of the network structure, and a depth-separable convolution is used instead of partial convolution to improve the detection speed. Thirdly, the output feature layer is designed with shallow feature fusion to improve the location information extraction capability and use spatial attention and channel attention to improve the network efficiency. Finally, the whole network is trained and tested on four different datasets and extensive experiments show that the network has higher detection accuracy and faster detection speed compared to existing methods. Compared to the original network, YOLOV4-Tiny, the model complexity is reduced by 67.86% and the highest detection accuracy of 99.91% is achieved. Furthermore, the establishment of an efficient fabric inspection system and the validation of the method allows for the fast detection of fabric defects on conveyor belts. Thus, the proposed method has the potential to lay the foundation for the real-time detection of fabric defects and their application in industry.

Keywords

Deep learning, fabric defect detection, convolutional neural network, edge computing

Fabric is used not only in clothing, elastic bands and gloves, but also in other industries such as aerospace, military enterprises and health care.¹,² However, mechanical faults, defective yarns or oil stains on sewing machines can lead to product defects.¹ Therefore, effective inspection is necessary during the production process. For some brands, a defective product can have a negative impact and even weaken their influence. According to surveys, the price of a fabric can drop by 45–65% if it is defective.³

To date, more than 70 categories of fabric defects have been defined by the textile industry.⁴ Manual inspection is subject to several factors, such as boredom and human eye fatigue. If a factory uses manual inspection methods to detect defects, the quality of the inspection will be significantly reduced, thus affecting the price of the product. According to surveys, manual inspection can only detect 60–75% of defects.⁵ With the development of computer hardware, the cost of machine vision is gradually decreasing. As a result, most companies are starting to use computer vision methods, which not only saves a great deal of cost but also improves the accuracy of detection. At present, computer vision-based detection methods are divided into two main categories: traditional algorithms and deep learning algorithms.

Plain weave,⁶ twill⁷ and leather fabrics⁸ exist among those with a uniform textured structure. Researchers have proposed several methods based on these fabric characteristics, including automated statistical methods, frequency-domain methods and modeling methods. Chetverikov and Hanbury⁹ detected defects based on the regularity and local anisotropy of the texture structure. Experimental results showed that the approach was feasible and applicable. These methods typically use hand-designed features to suit the specific type of defect and therefore require a large number of parameters for different fabric types and types of defects. As can be seen, traditional machine learning methods have several limitations, including low accuracy, high manual design costs and low robustness.

In recent years, with the development of convolutional neural networks (CNNs), deep learning has been gradually applied to defect detection. Compared with traditional methods, deep learning methods can automatically extract effective features from the input image without the need to manually design complex features. Ali et al.¹⁰ used CNNs in wall crack detection, and showed better results than traditional methods. An effective defect classification model based on YOLO was proposed by Wei et al.¹¹ A fabric defect segmentation method was proposed by Huang et al.¹² A CNN with an adaptive learning rate was proposed by Tang et al.¹³ for signal fault diagnosis. The CNN-based algorithm improves the effectiveness of detection.¹⁴,¹⁵

In industrial production, there is a large amount of data that needs to be processed in real-time; therefore, the large computing resources of the cloud platform are used in industrial computing. The entire cloud computing model is shown in Figure 1. The data captured by the cameras is uploaded to the cloud and computed there. When a defect is detected, the system screens out the defective fabric images and sends the detection results to the user. The cloud has high information processing power and storage capacity. For this reason, the cloud computing model is widely used for data processing.

Figure 1.
Cloud computing method.

Cloud computing uses a great deal of resources for computing and storing data and it can process data efficiently. However, there are some disadvantages to cloud computing. During detection, if the broadband network is insufficient, data transfer will become very slow. In addition, a large amount of data to be transferred can cause a heavy load and delays in transmission. Finally, if a defect is detected, delays in information transmission on the internet can cause the sorting equipment to miss the defective fabric, resulting in incorrect sorting.

Edge computing has been adopted for fabric defect detection based on the above-mentioned disadvantages of cloud computing. Edge computing is an open platform that integrates computing, storage and other functions on a single device, facilitating proximity to the data source. The entire edge computing deployment is shown in Figure 2. If there is a defect in the fabric, the edge device can detect it in the first instance and send the information to the cloud for storage. Compared to detecting on a cloud platform, edge detection can effectively avoid latency and improve detection efficiency.

Figure 2.
Edge detection graph.

The low power consumption of edge devices means that lightweight networks are often used for real-time detection. However, most current networks sacrifice detection speed for higher detection accuracy. In industrial production, detection efficiency is compromised if the speed requirement is not met. As a result, many lightweight networks are being developed. A deep learning method for fabric defect detection on edge devices is proposed to meet industrial needs. In summary, the main contributions of this work are as follows.
1. The weight file generated by the proposed neural network model is only 7 MB in size, and the model provides both high accuracy and fast detection speed.

2. The attention module and depth-separable convolution are used to reduce network parameters and improve the accuracy of network detection. In addition, the k-medoids clustering algorithm is used to optimize the pre-defined anchor points to improve the detection of the network. Finally, multiple small residuals are used instead of large residual modules and some of the bloated structure is removed, while the location of the backbone output feature layers is changed to increase the variability of the detection network.

3. Edge devices are typically deployed in industry. We use four different types of datasets to test the accuracy and speed of the network on the JETSON TX2.

4. Based on the above approach, we propose a detection system based on edge devices and verify the feasibility of the approach.

The rest of the paper is organized as follows. In the second section, traditional defect detection methods, deep learning methods and the background of edge computing are described. In the third section, the network structure and the methods used are described. The fourth section describes the experimental metrics and hardware devices being used. The fifth section describes the experiments, mainly a comparison of the algorithmic models presented in the third section; the experiments are run on an edge device. In the sixth section, an industrial inspection site is mimicked to verify the feasibility of our method. In the final section, the work in this article is summarized.

In this section, we have summarized the necessity of fabric defect detection and the commonly used industrial inspection methods. By comparing them, we have found that deep learning-based defect detection methods can effectively improve detection efficiency and address detection challenges. Therefore, this paper proposes an object detection-based fabric defect detection method, and at the end of this section, we provide a summary of the contributions of this paper.

Related work

This section discusses three main aspects of fabric defect detection, namely traditional defect detection methods, deep learning methods and edge device detection.

Traditional methods

Traditional detection methods are divided into statistical, model-based and frequency-domain methods. Statistical-based methods were first used to detect fabric defects.

Statistical methods usually divide the image into image blocks. Textural features of standard textiles and features of defective areas are extracted by analyzing the distribution of grey values of fabric defective pixels. Commonly used methods include autocorrelation measures, co-occurrence matrices and variance averaging. However, the limitation of these methods is that it is difficult to distinguish between blurring of the average grey level and small defects on the fabric. Anitha and Radha¹⁶ used independent component analysis algorithms to extract the desired texture and structural information from the image. The structural information can be reduced by phase coherence and then the template image can be distinguished from the input image. However, the method is only applicable to some defects and is not ideal for detecting defects in complex textures.

The texture of fabric exhibits periodicity, which can be detected through transformations between time and space domains, or between frequency domains. Common methods based on the frequency spectrum are the Fourier transform (FT), wavelet transform (WT) and Gabor filter. The FT is often used to extract texture information for detection by suppressing high and low frequencies in an image. Researchers have used the space-after FT to solve this problem, but it can be challenging to find a suitable function. On the other hand, WT and Gabor wavelets leverage the spatial frequency domain to detect defects; an optimized Gabor filter generation method was proposed by Bodnarova et al.¹⁷ However, these methods may suffer from reduced efficiency due to their high redundancy and computational costs.

The model-based approach treats the texture characteristics of the fabric as a stochastic process. The main methods for defect detection using model statistical information are Gauss–Markov random field models and Gaussian mixture models.¹⁸ These methods are effective in detecting fabric defects, but they are not as effective in detecting small defects.¹⁹ This is because small defects are often confused with background noise and their small pixel area in the image is not easily detected accurately by statistical models. The above traditional methods for defect detection are summarized in Table 1.

Table 1.
Traditional methods of defect detection

Author | Proposed method | Results
Anitha and Radha¹⁶ | Independent component analysis and vector quantized principal component analysis based on Gabor wavelets | The overall success detection rate is 89.99%
Bodnarova et al.¹⁷ | An optimal Gabor wavelet filter-based approach | Defects are detected in all 35 sample images, with only six of them showing a small number of false alarms
Allili et al.¹⁸ | A new framework for contour-based statistical modeling is developed using a finite mixture of generalized Gaussian distributions | Better results have been produced than the state-of-the-art methods

Deep learning methods

The traditional detection method can be applied to most situations, but it is difficult to apply to situations with high computational complexity and small targets. Meanwhile, with improvements in computer hardware, deep learning-based methods have developed rapidly. Deep learning algorithms build a more complex network model. Their powerful model-fitting capabilities give them great research potential. CNN-based algorithms are widely used in the field of detection. Excellent networks such as AlexNet,²⁰ VGG,²¹ GoogLeNet²² and ResNet²³ have been proposed. Jeyaraj et al.²⁴ used a CNN to detect defects in six different fabric materials with an average accuracy of 96.55%, which demonstrates the effectiveness of the algorithm. However, the authors did not test complex textures. Liu et al.²⁵ proposed a fabric defect detection algorithm based on Faster-RCNN, but the detection speed did not meet industrial requirements. Miao et al.²⁶ proposed a two-stage model approach combining the traditional WT and deep learning. It has higher accuracy and is more suitable for industrial production than the traditional and single-stage methods. However, the speed of its cumbersome two-stage detection step needs to be improved. Katiyar et al.²⁷ used a lightweight network based on MobileNetV2 for defect detection, but its detection accuracy for small targets is low. Jing et al.²⁸ proposed an efficient CNN for fabric defect detection, where the lightweight MobileNetV2 is used as an encoder and the softmax layer is used to generate a segmentation mask. Experimental results show that the proposed method achieves state-of-the-art performance in terms of segmentation accuracy and detection speed. Wang et al.²⁹ proposed the attention-based EfficientNet algorithm to detect cancer cells. The method obtained good results. Qi et al.³⁰ used the improved YOLOV3-Tiny algorithm to test railway firmware and achieved good results.
However, with the advent of YOLOV4-Tiny, the algorithm needs to be updated. In 2020, Bochkovskiy et al.³¹ proposed the YOLOV4 algorithm. They introduced a new backbone structure, CSPDarknet53, and added new modules to the original network. Dlamini et al.³² used the YOLOV4 algorithm to develop a real-time machine vision system that is relatively fast and detects defects in functional textiles. Lim et al.³³ proposed a lightweight model based on YOLOv4-Tiny. The authors improved the detection accuracy of the network model by modifying the loss function. Zheng et al.³⁴ proposed an improved YOLOv5-based method for fabric defect detection, which uses the squeeze-and-excitation (SE) module to improve network performance. Experimental results show that the proposed model, SE-YOLOV5, improves accuracy and generalization capability. Zheng et al.³⁵ proposed an improved YOLOV7 defect detection algorithm. The clustering method was changed, and an attention mechanism and a loss function were added to improve the network detection accuracy. Experimental results showed that the proposed YOLOV7 model can effectively achieve accurate detection of small objects in complex backgrounds. Haleem et al.³⁶ created a fabric defect detection system using a deep learning algorithm to validate the new method’s feasibility and its potential for application in the textile industry. Overall, deep learning algorithms show high industrial promise, but they place high demands on computing resources. When hardware resources are insufficient, the detection efficiency is severely reduced. The above deep learning references are summarized in Table 2.

Table 2.
Deep learning methods of defect detection

Author | Proposed method | Results
Jeyaraj et al.²⁴ | A deep learning classification network is designed | The accuracy of the defect classification is tested on six different fabric materials and an average accuracy of 96.55% is obtained
Liu et al.²⁵ | An algorithm for fabric defect detection applied to the Faster-RCNN algorithm is proposed | The proposed method can locate the fabric defect region with higher accuracy compared with the state-of-art, and has better adaptability to all kinds of the fabric images
Miao et al.²⁶ | A method that combines continuous wavelet transform (CWT) with a convolutional neural network is proposed | The accuracy of the method was 96.94%, an improvement of almost 10% over the traditional method. Actual average detection time is only 2.4 seconds
Katiyar et al.²⁷ | A model successfully trained on Google Cloud ML Engine is proposed | MobileNet-SSD can automatically detect surface defects more frequently, more accurately and more precisely than traditional deep learning methods
Jing et al.²⁸ | A highly efficient convolutional neural network, Mobile-Unet, is proposed | The proposed method achieves state-of-the-art performance in terms of segmentation accuracy and detection speed
Lim et al.³³ | A lightweight object detection model based on the YOLOv4-Tiny framework is proposed | The best model achieves an average accuracy of 88.32% at 225.22 frames per second
Zheng et al.³⁴ | A YOLOv5 (SE-YOLOv5) based on the squeeze and excitation (SE) module is proposed | The proposed model SE-YOLOv5 improves the accuracy and generalization
Zheng et al.³⁵ | An improved YOLOV7 model is proposed | The average accuracy of the model is 93.8%, which is 7.6%, 3.7% and 4% higher than that of the Faster-RCNN model, YOLOV7 model and YOLOV5s model, respectively

Edge computing

Edge computing is a low-cost, easy-to-deploy approach to computing that is different from cloud computing. It is deployed in a distributed manner, close to the data source for easy transmission, and addresses the disadvantages of transmission costs and internet latency. As a result, various edge device and deep learning-based solutions have been proposed over the past few years. NVIDIA is a graphics processing unit (GPU) manufacturer, and some researchers have started to use its JETSON TX2 to provide solutions for various tasks. For example, Blanco-Filgueira et al.³⁷ proposed a real-time embedded system to address multi-target tracking and edge computing applications. It demonstrates the feasibility of embedded devices in terms of frame rate and power consumption. Goyal et al.³⁸ used deep learning algorithms to detect and localize foot ulcers in diabetic patients. They used the edge device TX2 for deployment. Hoang et al. (2019) proposed a TX2-based implementation of enhanced signpost detection, which not only facilitates deployment but also enables fast detection. Song et al.³⁹ ported the EfficientDet network to TX2 and used a lightweight network to achieve real-time defect detection. Haut et al.⁴⁰ used the low-power TX2 to classify hyperspectral images with a deep learning algorithm and achieved promising results. Therefore, TX2 will be used as a hardware platform to detect defects on fabric surfaces. The above edge computing references are summarized in Table 3.

Table 3.
Edge computing methods of defect detection

Author | Proposed method | Results
Blanco-Filgueira et al.³⁷ | An end-to-end solution for real-time deep learning-based multiple object tracking in an embedded and low-power IoT oriented platform is presented | 10 FPS video capture, using at any time only the available frame from the live camera without intermediate storage for delayed processing, and a total power consumption of only 12 W is achieved
Goyal et al.³⁸ | Deep learning method designed for real-time DFU localization and model performance evaluated on NVIDIA Jetson TX2 | Demonstrates the power of deep learning for DFU real-time localization
Song et al.³⁹ | A low-latency, low-power, easily scalable, edge computing-based automatic visual inspection system is proposed | Detection speeds of up to 22.7 frames per second (FPS) on the edge device Jetson TX2. This is a 2.5× reduction in response time compared to cloud-based methods, with the ability to detect industrial defects in real-time

DFU: diabetic foot ulcers; IoT: Internet of Things.

In this section, we have classified the commonly used fabric defect detection methods, including traditional image processing methods, deep learning and edge computing. We have compared and analyzed these methods, evaluating their advantages and disadvantages. Ultimately, we have chosen the combination of deep learning and edge computing as the approach for fabric defect detection.

Proposed method

The lightweight network YOLOV4-Tiny⁴¹ is improved for industrial applications to make it better suited to edge devices and to meet higher requirements. Depth-separable convolution and attention modules are added to the network structure to improve detection speed and accuracy. The clustering method is also changed to give more accurate anchors for datasets with large differences.

Network structure

The overall network structure is shown in Figure 3. The input image passes through two 3 × 3 convolutions, each followed by batch normalization (BN) and an activation function, which downsample it to 104 × 104 × 64 features. The shallow layers of the backbone network mainly contain the position information of the defective target. Therefore, the spatial attention module is used in the shallow layer of the network to enable the network to learn more location information. After two large residual blocks, a feature size of 26 × 26 is obtained, at which point the network has many redundant feature layers, and the channel attention module is used to weight the channels containing more semantic information about the defects, enabling the network to acquire more relevant semantic information.

Figure 3.
Network structure diagram. The input image undergoes multiple downsampling operations to obtain two different-sized feature layers for prediction. BN: batch normalization; ReLU: rectified linear unit.
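As a rough check of the feature sizes quoted above, the following sketch traces the spatial resolution through the backbone. The text does not state strides or padding, so the sketch assumes that each of the two initial 3 × 3 convolutions uses stride 2 with padding 1, and that each of the three large residual blocks halves the resolution with a 2 × 2, stride-2 pooling step. The helper name `conv_out` is illustrative only.

```python
def conv_out(size, kernel, stride, padding):
    """Output side length of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 416  # input resolution used in the paper
trace = [size]

# Two 3x3 convolutions; stride 2 and padding 1 are assumptions.
for _ in range(2):
    size = conv_out(size, kernel=3, stride=2, padding=1)
    trace.append(size)

# Three residual blocks, each assumed to end with 2x2, stride-2 pooling.
for _ in range(3):
    size = conv_out(size, kernel=2, stride=2, padding=0)
    trace.append(size)

print(trace)  # [416, 208, 104, 52, 26, 13]
```

Under these assumptions the trace reproduces the 104 × 104 feature size after the stem, the 26 × 26 size after two residual blocks and the 13 × 13 size of the last prediction layer.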

For the third large residual block, the original large residual structure has been modified, and two consecutive small residuals are used instead. At the same time, we changed the feature layer of the backbone output to the detection head to enrich the information in the different feature layers; the overall structure is shown in Figure 4. As the network goes deeper and the number of channels increases, each convolution requires large calculations, and we used this method to effectively reduce the network parameters.

Figure 4.
Third residual block modification structure diagram.

At the same time, the original network used the last two layers of the final residual block as the output feature layers; the overall structure is shown in Figure 5.

Figure 5.
Map of the change in the location of the output feature layer of the backbone network (original on the left, improved on the right).

After testing, we found that this method is not effective for detecting small targets: the layers that feed the 26 × 26 and 13 × 13 outputs are very close together, so the two outputs carry very similar feature information, which reduces the features obtained by the detection head. In addition, the 26 × 26 feature layer contains a great deal of semantic information in the deeper layers of the network, while the 13 × 13 feature layer is mainly responsible for detecting small targets, and the network needs more edge information to detect small targets. Therefore, the output layer of the second large residual block is used as the input layer of the 26 × 26 detection head to improve its sensitivity to location information, and the underlying feature layer is used as the 13 × 13 feature layer, which avoids the overlap of feature information in neighboring feature layers.

Anchor clustering method

This detection method requires candidate anchors to be set in advance; the predicted boxes are then obtained by adjusting these anchors. As a result, the clustering effect of the candidate anchors affects the detection results. For datasets with large differences, extreme data can bias the anchor points if a mean-based method is used, so to prevent the effect of extreme data, the k-medoids clustering method is used instead, which is more applicable to datasets with large differences.

The k-medoids method is a clustering algorithm that aims to divide a set of data points into k different clusters, such that the similarity between data points within each cluster is as high as possible, while the similarity between different clusters is as low as possible. The steps of the k-medoids algorithm are as follows.
1. Select k random points (data points from the dataset). These points are also known as “medoids.”

2. Assign all data points from the dataset to the nearest medoid using any distance formula, such as Euclidean distance or Manhattan distance.

3. Choose new medoids by computing the median of all data points in each cluster.

4. Repeat steps 2 and 3 until no data points change their cluster assignment between two iterations.

In summary, k-medoids is an effective clustering algorithm that can handle noisy and outlier data effectively, and is easy to implement and understand.
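The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The function name `k_medoids`, the fixed random seed and the sample points are all hypothetical. The update step uses the usual medoid definition (the cluster member with the smallest total distance to the other members), which is one reading of the "median" in the update step.

```python
import random


def manhattan(p, q):
    """Manhattan distance between two points of equal dimension."""
    return sum(abs(a - b) for a, b in zip(p, q))


def k_medoids(points, k, distance, seed=0, max_iter=100):
    """Cluster `points` into k groups; every medoid is an actual data point."""
    rng = random.Random(seed)
    medoids = rng.sample(points, k)  # random initial medoids
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest medoid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: distance(p, medoids[i]))
            clusters[nearest].append(p)
        # Update step: the new medoid is the most central cluster member.
        new_medoids = []
        for old, cluster in zip(medoids, clusters):
            if not cluster:  # keep the old medoid if a cluster is empty
                new_medoids.append(old)
                continue
            new_medoids.append(
                min(cluster, key=lambda c: sum(distance(c, q) for q in cluster))
            )
        # Unchanged medoids imply unchanged assignments, so stop here.
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, clusters


# Hypothetical (width, height) samples with one extreme value.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (10, 10), (10, 11), (11, 10), (50, 50)]
medoids, clusters = k_medoids(points, k=2, distance=manhattan)
print(medoids)
```

Because each medoid must be one of the samples, a single extreme box such as (50, 50) cannot drag a cluster center to a position where no real box exists, which is the property the text relies on.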

When the k-medoids algorithm is used, the intersection over union (IOU) between each sample and each of the k cluster centers is calculated, beginning with the initialization step. Equation (1) gives the distance used to find the cluster center closest to each sample, and the IOU is calculated as in Equation (2):
$d(box, centroid) = 1 - IOU(box, centroid)$
(1)
$IOU = \frac{{inter}_{area}}{{box}_{area} + {cluster}_{area} - {inter}_{area}}$
(2)
where ${box}_{area}$ represents the area of the bounding box, ${cluster}_{area}$ represents the area of the anchor box and ${inter}_{area}$ represents the overlap area. Figure 6 illustrates the calculation principle of IOU, where A represents the predicted bounding box and B represents the ground truth bounding box, and $IOU = \frac{A \cap B}{A \cup B}$.

Figure 6.
Intersection over union schematic.
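Equations (1) and (2) can be illustrated with a short Python sketch. It assumes that each box is described only by its width and height and that the two boxes share a corner, a common convention in anchor clustering that the text does not state. The function names `iou_wh` and `iou_distance` are hypothetical.

```python
def iou_wh(box, centroid):
    """IOU of two (width, height) boxes aligned at a common corner (Equation (2))."""
    inter_area = min(box[0], centroid[0]) * min(box[1], centroid[1])
    box_area = box[0] * box[1]
    cluster_area = centroid[0] * centroid[1]
    return inter_area / (box_area + cluster_area - inter_area)


def iou_distance(box, centroid):
    """Clustering distance d = 1 - IOU (Equation (1))."""
    return 1.0 - iou_wh(box, centroid)


# A 4x2 box against a 2x4 anchor: overlap 4, union 8 + 8 - 4 = 12.
print(iou_wh((4, 2), (2, 4)))
# Identical boxes overlap completely, so their distance is zero.
print(iou_distance((3, 5), (3, 5)))
```

Using this distance instead of the Euclidean distance makes the clustering depend on how well shapes overlap rather than on absolute box size.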

Depthwise separable convolution

By using k-medoids clustering, the network can obtain better matching anchor points. To reduce the number of parameters, depth-separable convolution is used in place of standard convolutions whose channel count is greater than 256, which effectively reduces the number of parameters in the network, as shown in Figure 7. A 12 × 12 pixel, three-channel feature layer (shaped as 12 × 12 × 3) is first subjected to a depthwise convolution: each of the three channels is convolved with its own 5 × 5 kernel, producing three 8 × 8 × 1 feature maps. This step performs an independent convolution for each channel, so it does not combine the feature information found at the same location in different channels. Therefore, a 1 × 1 × 3 pointwise convolution kernel is then used to combine the information from the different channels.

Figure 7.
Depthwise separable convolution. Firstly, a 5 × 5 convolution is applied to the 12 × 12 × 3 feature map. Then, a 1 × 1 × 3 convolution is performed on the feature map, resulting in an 8 × 8 × 1 feature layer.
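The shapes in Figure 7 and the parameter saving can be checked with a small pure-Python sketch. It assumes stride 1 and no padding, which is what the 12 → 8 size change with a 5 × 5 kernel implies. The all-ones inputs and weights, the function names and the 256 output channels in the parameter count are illustrative values, not taken from the paper.

```python
def conv2d_valid(image, kernel):
    """Single-channel 2D cross-correlation, stride 1, no padding."""
    k = len(kernel)
    out_size = len(image) - k + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(out_size)]
            for i in range(out_size)]


def depthwise_separable(channels, depth_kernels, point_weights):
    """Depthwise step (one kernel per channel), then a 1x1 pointwise step."""
    filtered = [conv2d_valid(ch, kern)
                for ch, kern in zip(channels, depth_kernels)]
    size = len(filtered[0])
    # The 1x1xC kernel mixes the channels at every spatial position.
    return [[sum(w * f[i][j] for w, f in zip(point_weights, filtered))
             for j in range(size)]
            for i in range(size)]


def standard_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k step plus a 1 x 1 pointwise step."""
    return k * k * c_in + c_in * c_out


# 12 x 12 x 3 input, three 5 x 5 depthwise kernels, one 1 x 1 x 3 pointwise kernel.
channels = [[[1.0] * 12 for _ in range(12)] for _ in range(3)]
depth_kernels = [[[1.0] * 5 for _ in range(5)] for _ in range(3)]
point_weights = [1.0, 1.0, 1.0]

out = depthwise_separable(channels, depth_kernels, point_weights)
print(len(out), len(out[0]))  # 8 8

# Hypothetical layer with 256 output channels.
print(standard_params(5, 3, 256), separable_params(5, 3, 256))  # 19200 843
```

The saving grows with the number of output channels, because the expensive k × k filtering is done once per input channel instead of once per input-output channel pair.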

Attention module

To compensate for the limited information extraction capability of the network, different attention modules are added to the network. For the location information available in the shallow network, the spatial attention module is used to extract the location information. As shown in Figure 8, the input features are described by two channels obtained through maximum pooling and average pooling, which are concatenated, and the weight parameters are then obtained by a convolution followed by a sigmoid.

Figure 8.
Spatial attention module. The input feature map undergoes pooling and convolution with a sigmoid activation function to generate a weight ratio for each pixel.
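A minimal pure-Python sketch of this module is given below. It assumes that pooling is taken across the channel axis, that the convolution uses zero padding so the weight map keeps the input size, and that the weights are applied by element-wise multiplication. The 3 × 3 kernel size, the kernel values and the function name `spatial_attention` are hypothetical, since the text does not specify them.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(features, kernel):
    """features: C x H x W nested lists; kernel: 2 x k x k weights with odd k."""
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    # Max and average pooling across channels give two H x W descriptor maps.
    max_map = [[max(features[c][i][j] for c in range(channels))
                for j in range(width)] for i in range(height)]
    avg_map = [[sum(features[c][i][j] for c in range(channels)) / channels
                for j in range(width)] for i in range(height)]
    pooled = [max_map, avg_map]
    k = len(kernel[0])
    pad = k // 2
    weights = []
    for i in range(height):
        row = []
        for j in range(width):
            total = 0.0
            for m in range(2):
                for a in range(k):
                    for b in range(k):
                        y, x = i + a - pad, j + b - pad
                        if 0 <= y < height and 0 <= x < width:  # zero padding
                            total += pooled[m][y][x] * kernel[m][a][b]
            row.append(sigmoid(total))  # one weight per pixel, in (0, 1)
        weights.append(row)
    # Every channel is rescaled by the same per-pixel weight map.
    scaled = [[[features[c][i][j] * weights[i][j] for j in range(width)]
               for i in range(height)] for c in range(channels)]
    return weights, scaled


# Hypothetical 2 x 4 x 4 feature map and a constant 2 x 3 x 3 kernel.
features = [[[float(i + j) for j in range(4)] for i in range(4)],
            [[float(i * j) for j in range(4)] for i in range(4)]]
kernel = [[[0.1] * 3 for _ in range(3)] for _ in range(2)]
weights, scaled = spatial_attention(features, kernel)
print(len(weights), len(weights[0]))  # 4 4
```

In a trained network the kernel values are learned, so pixels near likely defect locations receive weights closer to 1.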

Deeper network information has higher semantic features and, as the network deepens, the number of feature channels increases after each convolution. Channel attention directs the network toward the channels containing key information. As shown in Figure 9, the input feature information is max-pooled and average-pooled to obtain two 1 × 1 × C feature descriptors, which are sent to two convolutional layers. The first layer has C/r neurons, while the second layer has C neurons. These two layers share their parameters across both descriptors. From the two resulting feature layers, the sigmoid obtains the distribution of weights for the different channels.

Figure 9.
Channel attention. Channel attention generates a weight distribution for each channel. This weight distribution is used to amplify or attenuate each feature channel, selectively highlighting the most informative features and suppressing less important ones.
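The channel attention computation can be sketched as follows. The sketch takes r to be a channel-reduction ratio, places a ReLU between the two layers and sums the two branch outputs before the sigmoid. All three choices are assumptions, as the text does not spell them out. The weight matrices and the function names are hypothetical illustration values.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shared_mlp(vector, w1, w2):
    """Two-layer network C -> C/r -> C with an assumed ReLU in between."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vector))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(features, w1, w2):
    """features: C x H x W nested lists; returns one weight per channel."""
    # Global max and average pooling reduce each channel to a single value.
    max_desc = [max(max(row) for row in ch) for ch in features]
    avg_desc = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in features]
    # Both descriptors pass through the same network (shared parameters).
    max_out = shared_mlp(max_desc, w1, w2)
    avg_out = shared_mlp(avg_desc, w1, w2)
    # Summing the two branches before the sigmoid is an assumption.
    return [sigmoid(a + b) for a, b in zip(max_out, avg_out)]


# Hypothetical example with C = 4 channels and reduction ratio r = 2.
features = [[[1.0, 2.0], [3.0, 4.0]],
            [[0.0, 0.5], [0.5, 1.0]],
            [[2.0, 2.0], [2.0, 2.0]],
            [[4.0, 0.0], [0.0, 4.0]]]
w1 = [[0.5, -0.5, 0.5, -0.5],   # first layer: C/r = 2 neurons
      [0.25, 0.25, 0.25, 0.25]]
w2 = [[1.0, 0.0],               # second layer: C = 4 neurons
      [0.0, 1.0],
      [1.0, 1.0],
      [-1.0, 1.0]]
channel_weights = channel_attention(features, w1, w2)
print(len(channel_weights))  # 4
```

Each returned value lies between 0 and 1 and would be multiplied with its channel, amplifying informative channels and attenuating the others as described in the caption of Figure 9.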

In this section, we have introduced the proposed method, which includes modifying the network architecture to make it more suitable for different sizes of fabric defects. The k-medoids clustering method is used to adapt to different defect sizes, and depthwise separable convolutions are added to improve detection speed. Finally, different attention mechanisms are applied at various locations in the network to enhance detection accuracy.

Experimental work

In this section, we performed tests according to the proposed methodology. The local experiments use an Intel Gold 5118 CPU (2.30 GHz) with 128 GB of RAM and an NVIDIA GeForce 2080Ti GPU; the software is Windows 10 OS, PyTorch1 and CUDA 10.1. The edge device NVIDIA JETSON TX2 is used for testing. Its GPU uses the NVIDIA Pascal architecture and has 256 CUDA cores and 8 GB of RAM. The processor consists of a dual-core Denver2 processor and a Cortex-A57, and the device consumes only 7.5 W, which makes it suitable for edge computing.

Experimental datasets

In the experiments, four different datasets are used to test the network. The four datasets are camouflage fabric (CF), light blue striped fabric (LBF), dark red fabric (DRF) and grid fabric (GF), as shown in Figure 10. Each dataset is split into a training set and a testing set in an 8:2 ratio, and the test set is used to validate the effectiveness of the network training. Resizing the image before feeding it into the neural network is a common preprocessing step in deep learning. Neural networks typically require images of a certain size as input, and if the original image size does not meet these requirements, it needs to be adjusted accordingly. One common method is to rescale the image to the desired size. Therefore, images of the original size of 256 × 256 are standardized to a uniform size of 416 × 416; the composition of the datasets is shown in Table 4.

Figure 10.
The four fabrics: (a) camouflage fabric; (b) light blue striped fabric; (c) dark red fabric and (d) grid fabric.

Table 4.
Dataset composition

Dataset | Training set | Testing set | Total | Image size
CF | 360 | 40 | 400 | 256 × 256
LBF | 422 | 46 | 468 | 256 × 256
DRF | 378 | 42 | 420 | 256 × 256
GF | 338 | 36 | 374 | 256 × 256

CF: camouflage fabric; LBF: light blue striped fabric; DRF: dark red fabric; GF: grid fabric.

Evaluation metrics

The experimental metrics consist of two main parts, detection speed (FPS), and detection accuracy (MAP). Among them, MAP is evaluated by the following indicators:
$MAP = \frac{\sum_{C} {AP}_{C}}{N}$
(3)
where ${AP}_{C}$ represents the average precision (AP) of each category, $C$ represents the type of defect and $N$ is the number of defect classes. In this article, $N = 1$, and MAP is equivalent to AP because we define all defects on each type of fabric as one type of defect. According to the research, in practical production, it is only necessary to detect whether there is a defect in the fabric, so it is sufficient to define the defect type as whether it is defective. When calculating AP, a precision–recall curve is computed for each class, where the horizontal axis represents recall and the vertical axis represents precision. Typically, as recall increases, precision decreases, resulting in a downward convex curve. The relationship between the curves of AP and precision and recall is shown in Figure 11. Precision and recall are calculated as follows:
$Precision = \frac{TP}{TP + FP}$
(4)
$Recall = \frac{TP}{TP + FN}$
(5)
where $TP$ is an example that the classifier considers a positive sample and that is indeed a positive sample, $FP$ is an example that the classifier considers a positive sample but that is not actually a positive sample and $FN$ is an example that the classifier considers a negative sample but that is not actually a negative sample. The detection speed is calculated as follows:
$FPS = \frac{F_{totalFrame}}{F_{totalTime}}$
(6)
where $F_{totalFrame}$ represents the total number of frames processed during the entire detection time and $F_{totalTime}$ is the inference time in seconds.

Figure 11.
Relationship between AP, recall and precision.
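As a minimal sketch of Equations (3) to (6), the Python below computes precision, recall, AP, MAP and FPS. The function names and the toy detections are hypothetical, and the all-point interpolation of the precision–recall curve is an assumption, since the AP variant used in the experiments is not stated here.

```python
from typing import List, Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Equations (4) and (5): precision and recall from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def average_precision(scored_hits: List[Tuple[float, bool]], num_gt: int) -> float:
    """Area under the precision-recall curve for one class.

    scored_hits: (confidence, is_true_positive) for every detection.
    num_gt: number of ground-truth defects.
    All-point interpolation is an assumption, not taken from the paper.
    """
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Make precision non-increasing as recall grows (interpolation step).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class: List[float]) -> float:
    """Equation (3): MAP is the mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


def frames_per_second(total_frames: int, total_time_s: float) -> float:
    """Equation (6): frames processed divided by total inference time."""
    return total_frames / total_time_s


# Hypothetical toy data: 4 detections, 3 ground-truth defects.
detections = [(0.9, True), (0.8, False), (0.7, True), (0.6, True)]
ap = average_precision(detections, num_gt=3)
map_value = mean_average_precision([ap])  # N = 1, so MAP equals AP
```

With a single defect class, `mean_average_precision` simply returns the one AP value, which mirrors the statement that MAP is equivalent to AP when N = 1.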

The size of the weight file obtained from training the neural network model, reported as #Params (MB), is used as an indirect measure of the complexity of the algorithm.

In this section, we have introduced the dataset used in our paper. The entire dataset is captured in industrial production settings using the same equipment. We have also provided an overview of the relevant evaluation metrics.

Results and discussion

Ablation experiments

Anchor cluster analysis

The YOLO network requires the anchors to be set in advance. The quality of the anchors indirectly affects the detection result, so the clustering used to obtain them is crucial. Because there are large differences within the CF dataset, the k-medoids and k-means methods are compared. The results are shown in Table 5: the anchors obtained with k-medoids are on average 3.35 percentage points more accurate than those obtained with k-means (79.20% versus 75.85%).

Table 5.
Clustering method impact

Dataset	k-means (%)	k-medoids (%)
CF	73.82	76.14
LBF	83.71	86.55
DRF	72.40	78.00
GF	73.45	76.10
Mean ACC	75.85	79.20

ACC: accuracy; CF: camouflage fabric; LBF: light blue striped fabric; DRF: dark red fabric; GF: grid fabric.

The network is then trained on the CF dataset with the anchors obtained from each of the two clustering methods, and the detection accuracy after k-medoids clustering is found to be higher than that after k-means clustering. Because of the large data variation in the dataset, when the k-means algorithm is used to set the anchor parameters, the anchor sizes are strongly influenced by extreme points, which reduces the effectiveness of defect detection. The k-medoids algorithm avoids this problem. Therefore, our proposed clustering algorithm solves the problem of large data differences and improves the applicability of the algorithm. As shown in Table 6, the detection accuracy of the network is greatly improved, with the average MAP increasing from 89.59% to 93.90%. Our experiments show that k-medoids clustering provides better results for datasets with greater differences.

Table 6.
Comparison of detection results of two clustering methods

Clustering method	MAP (%)	FPS
k-means	89.59	21.20
k-medoids	93.90	21.20
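The effect of extreme points on the two clustering methods can be sketched as follows. This is an illustrative k-medoids routine, not the authors' implementation: the 1 − IoU distance between (width, height) pairs is a common choice for YOLO anchors and is assumed here, and all box sizes and function names are hypothetical.

```python
import random
from typing import List, Tuple

Box = Tuple[float, float]  # (width, height) of a labelled defect box


def iou_wh(a: Box, b: Box) -> float:
    """IoU of two boxes that share the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def centroid(members: List[Box]) -> Box:
    """k-means style centre: the arithmetic mean, pulled by outliers."""
    n = len(members)
    return (sum(m[0] for m in members) / n, sum(m[1] for m in members) / n)


def medoid(members: List[Box]) -> Box:
    """k-medoids style centre: the member closest to all other members."""
    return min(members, key=lambda c: sum(1 - iou_wh(c, m) for m in members))


def k_medoids(boxes: List[Box], k: int, iters: int = 50, seed: int = 0) -> List[Box]:
    rng = random.Random(seed)
    medoids = rng.sample(boxes, k)
    for _ in range(iters):
        # Assign every box to the medoid with which it has the highest IoU.
        clusters: List[List[Box]] = [[] for _ in medoids]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, medoids[i]))
            clusters[best].append(box)
        # Keep the old medoid if a cluster happens to be empty.
        updated = [medoid(c) if c else old for c, old in zip(clusters, medoids)]
        if updated == medoids:
            break
        medoids = updated
    return sorted(medoids)


# Hypothetical box sizes: two compact groups plus one extreme box.
boxes = [(10, 12), (11, 11), (12, 10), (40, 42), (42, 40), (41, 41), (200, 30)]
anchors = k_medoids(boxes, k=2)

large_group = [(40, 42), (42, 40), (41, 41), (200, 30)]
mean_centre = centroid(large_group)   # (80.75, 38.25), not a real box size
medoid_centre = medoid(large_group)   # (41, 41), an actual box in the data
```

In this toy group the single extreme box drags the mean width to 80.75, whereas the medoid stays at a box size that actually occurs in the data, which is the behaviour the text attributes to k-medoids.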

Backbone comparison experiments

Among lightweight models, the MobileNet and EfficientNet series are representative. To determine which backbone is best suited to the YOLO detection head, the original backbone is replaced in turn with MobileNetV2⁴¹ and EfficientDet-D0⁴², the most representative members of these series, giving two networks referred to as Mobile-YOLO and Efficiency-YOLO, respectively. For Mobile-YOLO, the 14th and 18th feature layers of MobileNetV2 are extracted as input to the YOLO detection head. For Efficiency-YOLO, the last feature layer and the fifth residual block of EfficientDet-D0 are extracted as input feature layers to the YOLO detection head. The detection results on the four datasets for the different backbone networks are shown in Figure 12.

Figure 12.
The detection results of the four datasets: (a) ground truth; (b) YOLOV4-Tiny; (c) Mobile-YOLO and (d) Efficiency-YOLO. CF: camouflage fabric; LBF: light blue striped fabric; DRF: dark red fabric; GF: grid fabric.

Mobile-YOLO does not detect the small scratches in the first image of GF, and its detection of small targets in general is not ideal. This is because its backbone is a linear network that does not collect information very effectively, so this network is abandoned. Both the original network and Efficiency-YOLO detect all of the defects with good results, but the backbone network of Efficiency-YOLO is very complex, resulting in a slow detection speed.

Table 7 shows that YOLOV4-Tiny has the highest average accuracy of the three backbone networks. Its detection speed reaches 21.20 FPS, which is much faster than that of Efficiency-YOLO (12.78 FPS). YOLOV4-Tiny therefore offers the best overall balance among the lightweight backbones, and it is chosen for the subsequent ablation experiments, which determine the effectiveness of the other modules.

Table 7.
The test results of the different backbones in the YOLO series

Dataset	Method	Backbone	#Params (MB)	MAP (%)	FPS
CF	YOLOV4-Tiny	CSPDarknet53-Tiny	22.40	93.25	21.20
CF	Mobile-YOLO	MobileNetV2	20.60	94.44	18.16
CF	Efficiency-YOLO	EfficientDet-D0	21.10	82.10	12.78
LBF	YOLOV4-Tiny	CSPDarknet53-Tiny	22.40	99.88	21.20
LBF	Mobile-YOLO	MobileNetV2	20.60	97.85	18.16
LBF	Efficiency-YOLO	EfficientDet-D0	21.10	97.82	12.78
DRF	YOLOV4-Tiny	CSPDarknet53-Tiny	22.40	90.96	21.20
DRF	Mobile-YOLO	MobileNetV2	20.60	85.15	18.16
DRF	Efficiency-YOLO	EfficientDet-D0	21.10	89.49	12.78
GF	YOLOV4-Tiny	CSPDarknet53-Tiny	22.40	91.55	21.20
GF	Mobile-YOLO	MobileNetV2	20.60	85.18	18.16
GF	Efficiency-YOLO	EfficientDet-D0	21.10	88.97	12.78

CF: camouflage fabric; LBF: light blue striped fabric; DRF: dark red fabric; GF: grid fabric.
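The claim that YOLOV4-Tiny has the highest average accuracy can be checked directly from the MAP column of Table 7. The short script below is only a worked calculation; the dictionary layout and variable names are hypothetical.

```python
# MAP (%) per dataset for each backbone, copied from Table 7.
table7_map = {
    "CSPDarknet53-Tiny": {"CF": 93.25, "LBF": 99.88, "DRF": 90.96, "GF": 91.55},
    "MobileNetV2": {"CF": 94.44, "LBF": 97.85, "DRF": 85.15, "GF": 85.18},
    "EfficientDet-D0": {"CF": 82.10, "LBF": 97.82, "DRF": 89.49, "GF": 88.97},
}

# Average MAP over the four datasets for each backbone.
mean_map = {
    backbone: sum(scores.values()) / len(scores)
    for backbone, scores in table7_map.items()
}

# Backbone with the highest average MAP.
best_backbone = max(mean_map, key=mean_map.get)
```

The averages come out at roughly 93.91% for CSPDarknet53-Tiny (the YOLOV4-Tiny backbone), 90.66% for MobileNetV2 and 89.60% for EfficientDet-D0, even though MobileNetV2 scores highest on the CF dataset alone.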

Module innovation experiments

To determine the effectiveness of each module, the modules described below are added to the network one at a time and the resulting accuracy and speed are recorded in Table 8, from which the best configuration is obtained.

Table 8.
Ablation experiment of different modules

Method	MAP (%)	FPS
YOLOV4-Tiny	93.25	21.20
YOLOV4-Tiny + 1	90.23	23.55
YOLOV4-Tiny + 1 + 2	88.74	24.62
YOLOV4-Tiny + 1 + 2 + SA	90.26	24.45
YOLOV4-Tiny + 1 + 2 + SA + CA	93.68	24.20

Note. CA: channel attention; SA: spatial attention.

(1) The position of the backbone output feature layer is changed and the last large residual structure is replaced with multiple small residual edges. As shown in Figure 4, the original network uses the last two feature layers of the backbone as input to the detection network. Because these two feature layers are so close together, they carry a great deal of overlapping information and the location information in the shallow layers is lost, so the second residual block is used as the shallow feature layer input instead. In addition, deep in the network, the amount of computation per convolution increases exponentially as the number of channels increases, so to reduce the computation time the last residual block is replaced with two small residual edges, which reduces the number of parameters.

(2) When the number of channels is greater than 256, normal convolution is replaced by depthwise separable convolution. Feature layers with a large number of channels require significant computation time for each convolution operation, and depthwise separable convolution effectively reduces the number of parameters involved.

(3) The spatial attention (SA) module is added. Spatial attention effectively increases the network's ability to extract spatial information.

(4) The channel attention (CA) module is added. As the network deepens, the number of channels increases exponentially after each convolution. To select the channels containing more defect information, channel attention uses a sigmoid function to assign weights, so that the network gives more weight to channels containing more useful information and obtains more semantic information about the defects.
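The weighting idea behind the two attention modules can be illustrated with a deliberately simplified sketch. It is an assumption-laden toy, not the modules used in the network: practical SA and CA modules contain learned layers, whereas here the weights come directly from average pooling followed by a sigmoid, and all names and values are hypothetical.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap: FeatureMap) -> FeatureMap:
    """Scale each channel by a sigmoid weight of its global average."""
    out = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        weight = sigmoid(sum(values) / len(values))  # one weight per channel
        out.append([[v * weight for v in row] for row in channel])
    return out


def spatial_attention(fmap: FeatureMap) -> FeatureMap:
    """Scale each pixel by a sigmoid weight of its average over channels."""
    n = len(fmap)
    rows, cols = len(fmap[0]), len(fmap[0][0])
    mask = [
        [sigmoid(sum(fmap[c][i][j] for c in range(n)) / n) for j in range(cols)]
        for i in range(rows)
    ]  # one weight per spatial position
    return [
        [[fmap[c][i][j] * mask[i][j] for j in range(cols)] for i in range(rows)]
        for c in range(n)
    ]


# Hypothetical 2-channel maps: one strongly active and one weak channel,
# then one strongly active and one weak spatial position.
by_channel = channel_attention([[[4.0, 4.0]], [[-4.0, -4.0]]])
by_position = spatial_attention([[[2.0, -2.0]], [[2.0, -2.0]]])
```

In both cases the output keeps the shape of the input; only the relative emphasis changes, with strongly responding channels or positions retained and weak ones suppressed.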

The CF dataset is used for the ablation experiment. After changing the position of the backbone output feature layer, modifying the residual structure and adding depthwise separable convolution, the network speed improves significantly, by about 14%. This is mainly due to the significant reduction in the number of parameters, which makes the network more suitable for detection on edge devices. However, the accuracy decreases. Adding the attention modules improves the network's extraction of positional and semantic information, and the change in position of the feature layer fed to the detection head increases the amount of information acquired by the network, which together restore the detection accuracy (93.68%, compared with 93.25% for the original network).
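The parameter saving from depthwise separable convolution follows from a simple count of weights. The sketch below uses the standard formulas, with bias terms ignored; the 3 × 3 kernel and the 512-channel layer are hypothetical values chosen only because they exceed the 256-channel threshold mentioned above.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise convolution plus a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 512 input channels, 512 output channels.
standard = standard_conv_params(3, 512, 512)
separable = depthwise_separable_params(3, 512, 512)
ratio = separable / standard  # equals 1 / c_out + 1 / k**2
```

For this layer the separable version needs about 11% of the weights of the standard convolution (266,752 versus 2,359,296), and the saving grows as the number of output channels increases.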

Network comparison experiments

The feasibility of our proposed method is verified by comparing it with other networks. The networks compared include the original network (YOLOV4-Tiny), MobileNetV2, EfficientDet-D0, YOLOV4, Faster-RCNN, YOLOV5S and YOLOV7, and the metrics tested are accuracy (MAP) and speed (FPS).

YOLOV5 and YOLOV4 are high-precision versions of the YOLO series object detection algorithms, which have greatly improved accuracy and speed compared to previous versions. In comparison, YOLOV5 uses more efficient network structures and data augmentation techniques, making it more advantageous in terms of both speed and accuracy than YOLOV4.

YOLOV7 is a new object detection algorithm that differs from previous YOLO series algorithms in that it adopts modular design and uses attention mechanisms to improve feature representation, which can result in more precise detection results.

MobileNetV2-SSD is an object detection algorithm designed for mobile devices, which uses the lightweight MobileNetV2 as the feature extractor and SSD algorithm for detection. It has the advantages of a lightweight model, fast speed and high accuracy, making it suitable for real-time detection applications on mobile devices.

Faster RCNN is a region proposal-based object detection algorithm that uses a region proposal network (RPN) to extract candidate boxes and region of interest (RoI) pooling for object classification and localization. Compared to the YOLO series algorithms, Faster RCNN is more accurate but slower, making it suitable for scenes that require high accuracy.

YOLOV4-Tiny is a lightweight version of YOLOV4 that maintains high detection accuracy while having a smaller model size and faster detection speed, making it suitable for embedded devices and real-time detection applications.

YOLOV4-TinyS is the fabric defect detection method that we propose for edge devices. It is based on YOLOV4-Tiny and is improved by optimizing the network structure, changing the location of the network output layer and changing the size of the residual blocks extracted from the backbone. In addition, different attention structures are added to improve the detection accuracy of the network. Compared to the original network, the detection speed is significantly improved.

The above networks are tested on the four datasets. The results are shown in Tables 9–12.

Table 9.
Test results of the camouflage fabric dataset

Methods	#Params (MB)	MAP (%)	FPS
YOLOV4-Tiny	22.4	93.25	21.20
Mobilenetv2	33.8	84.10	19.80
EfficientDet-D0	15.1	93.48	14.00
YOLOV4	245.0	95.73	3.35
Faster-RCNN	108.0	80.75	4.78
YOLOV5	13.7	92.48	11.67
YOLOV7	143.0	99.96	4.00
Ours	7.2	93.71	24.48

Table 10.
Test results of the light blue striped fabric dataset

Methods	#Params (MB)	MAP (%)	FPS
YOLOV4-Tiny	22.4	99.83	21.20
Mobilenetv2	33.8	95.45	19.80
EfficientDet-D0	15.1	99.05	14.00
YOLOV4	245.0	99.98	3.35
Faster-RCNN	108.0	100.00	4.78
YOLOV5	13.7	99.70	11.67
YOLOV7	143.0	100.00	4.00
Ours	7.2	99.91	24.48

Table 11.
Test results of the dark red fabric dataset

Methods	#Params (MB)	MAP (%)	FPS
YOLOV4-Tiny	22.4	90.93	21.20
Mobilenetv2	33.8	89.77	19.80
EfficientDet-D0	15.1	89.77	14.00
YOLOV4	245.0	100.00	3.35
Faster-RCNN	108.0	94.58	4.78
YOLOV5	13.7	99.61	11.67
YOLOV7	143.0	98.58	4.00
Ours	7.2	95.52	24.48

Table 12.
Test results of the grid fabric dataset

Methods	#Params (MB)	MAP (%)	FPS
YOLOV4-Tiny	22.4	91.57	21.20
Mobilenetv2	33.8	91.02	19.80
EfficientDet-D0	15.1	84.43	14.00
YOLOV4	245.0	97.66	3.35
Faster-RCNN	108.0	85.71	4.78
YOLOV5	13.7	96.10	11.67
YOLOV7	143.0	100.00	4.00
Ours	7.2	93.21	24.48

Our network performs well on all four datasets. Its accuracy is slightly lower than that of YOLOV4, mainly because YOLOV4 is a much larger network (245.0 MB compared with 7.2 MB), which gives it better detection ability. However, YOLOV4 reaches only 3.35 FPS, which is not suitable for edge devices, whereas our method runs at 24.48 FPS.

Our network is also faster than all of the other networks compared, while retaining a high degree of accuracy. The modules we have added reduce the network parameters and give a significant speed improvement without loss of accuracy relative to the original network, which makes our network better suited to edge devices and gives it an advantage among lightweight networks.
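The size and speed advantages can be quantified from the #Params and FPS columns of Tables 9 to 12, which are identical across the four datasets. The following is only a worked calculation on those values; the variable names are hypothetical.

```python
# Values copied from Tables 9 to 12 (the same for every dataset).
params_mb = {"YOLOV4-Tiny": 22.4, "Ours": 7.2}
fps = {"YOLOV4-Tiny": 21.20, "YOLOV4": 3.35, "Ours": 24.48}

# Relative reduction in weight-file size versus the original network.
size_reduction_pct = (
    (params_mb["YOLOV4-Tiny"] - params_mb["Ours"]) / params_mb["YOLOV4-Tiny"] * 100
)

# Relative gain in detection speed versus the original network.
speed_gain_pct = (fps["Ours"] / fps["YOLOV4-Tiny"] - 1) * 100

# How many times faster the proposed network is than YOLOV4.
speedup_vs_yolov4 = fps["Ours"] / fps["YOLOV4"]
```

The size reduction evaluates to 67.86%, which matches the figure quoted in the abstract, while the speed gain over YOLOV4-Tiny is about 15.5% and the speed is roughly 7.3 times that of YOLOV4.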

The defect detection results are shown in Figure 13. Mobilenetv2 cannot fully detect the defects in GF, mainly because this network structure is not sensitive to small targets. EfficientDet-D0 does not detect anything on the LBF dataset, because it does not learn the defect features well. Although YOLOV5 correctly detects the defective regions on the four datasets, its detection confidence needs to be improved. The MAP of Faster-RCNN varies considerably across the four datasets and its speed is not satisfactory. The detection speed of YOLOV7 is much lower than that of our algorithm and its detection confidence is very low for each dataset. Although both YOLOV4 and our method can detect all of the defects, the speed of YOLOV4 is far from satisfactory. In addition, our predicted boxes are much closer to the ground-truth boxes. Overall, our network has a great advantage in terms of speed while retaining high detection accuracy.

Figure 13.
The detection results of the comparison experiment of the four datasets, from (a) ground truth, (b) YOLOV4-Tiny, (c) Mobilenetv2, (d) EfficientDet-D0, (e) YOLOV4, (f) Faster-RCNN, (g) YOLOV5, (h) YOLOV7 and (i) ours. CF: camouflage fabric; LBF: light blue striped fabric; DRF: dark red fabric; GF: grid fabric.

At the same time, our network performs consistently well across these four datasets. This is mainly because it is better able to extract key texture information and is more sensitive to small targets. This balance between accuracy and speed makes our network practical.

The performance of the networks above is compared in Figure 14. Our method has the fastest speed and a high accuracy rate on all four datasets. In general, a lightweight network should maximize inference speed while maintaining accuracy. Our method does so and also shows good robustness, with little variation across the four datasets. In summary, our method maintains detection accuracy while increasing speed, which is in line with the industry's requirements.

Figure 14.
Performance comparison chart of different algorithms. CF: camouflage fabric; LBF: light blue striped fabric; DRF: dark red fabric; GF: grid fabric.

To verify that our network does not overfit during training, the loss curve of the validation set during training is shown in Figure 15. The network gradually stabilizes after 100 epochs. In the experiments, we choose the weight file from the point at which the loss has just stabilized.

Figure 15.
Convergence test of YOLOV4-TinyS on the dataset.
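One way to make "the point at which the loss has just stabilized" concrete is to look for the first window of epochs in which the validation loss barely changes. The rule below is a hypothetical criterion for illustration only; the window length, the tolerance and the loss values are assumptions and are not taken from the experiments.

```python
from typing import List, Optional


def first_stable_epoch(
    losses: List[float], window: int = 5, tol: float = 0.01
) -> Optional[int]:
    """Return the index of the first epoch of the earliest stable window.

    A window is called stable when the spread (max minus min) of the loss
    over `window` consecutive epochs is below `tol`. Returns None if the
    loss never stabilizes under this rule.
    """
    for end in range(window, len(losses) + 1):
        recent = losses[end - window:end]
        if max(recent) - min(recent) < tol:
            return end - window
    return None


# Hypothetical validation losses, one value per epoch.
val_losses = [1.0, 0.6, 0.4, 0.3, 0.25, 0.248, 0.247, 0.246, 0.246, 0.245]
chosen_epoch = first_stable_epoch(val_losses, window=3, tol=0.01)
```

Selecting an early stable checkpoint in this way avoids training far beyond convergence, which is the stated motivation for inspecting the validation loss curve.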

In this section, we conduct relevant experiments on the proposed method, including ablation experiments for each component and comparative experiments with related networks. Through experimental comparisons, we validate the effectiveness of our method. It achieves real-time detection on edge devices and exhibits a faster detection speed compared to other networks.

Application

A system that simulates industrial inspection has been built to test the new algorithm. A batch of images is first sent to the cloud for training. Once training is completed, the weight file is sent to the edge device and the conveyor belt is then started for online detection. As the flow chart in Figure 16 shows, our system trains a separate network for each dataset. When a different fabric needs to be detected, the weights must be retrained, which takes about 2 hours.

Figure 16.
Detection flow chart. The data captured by the camera is sent to the cloud, and the network is trained in the cloud to obtain weights. The trained network model is sent to the edge device for real-time detection, and finally the defective fabrics are sorted.

Experimental tools

A Jetson TX2 is used as the detection device. To simulate an industrial fabric inspection site, a 2.1 m long conveyor belt is used with a transmission speed of 11.4 m/min. A light-emitting diode (LED) ring light source provides the lighting, the camera is a Basler-acA2500-10gm area scan color industrial camera and the lens used to collect the pictures is a Basler C125-0618-5M 6 mm. A photograph of the detection system is shown in Figure 17.

Figure 17.
Fabric defect detection system. The whole system includes a light source, shooting part, detection part and display part.
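The belt parameters give a rough idea of the timing the detector has to meet. The calculation below is a sketch under one explicit assumption: that the 24.48 FPS reported in Tables 9 to 12 is also the rate achieved on the conveyor system, which is not stated here. The constant and variable names are hypothetical.

```python
BELT_LENGTH_M = 2.1           # conveyor belt length, from the text
BELT_SPEED_M_PER_MIN = 11.4   # transmission speed, from the text
FPS = 24.48                   # assumption: rate from Tables 9 to 12 holds here

belt_speed_m_per_s = BELT_SPEED_M_PER_MIN / 60.0

# Time for a point on the fabric to travel the full length of the belt.
transit_time_s = BELT_LENGTH_M / belt_speed_m_per_s

# Distance the fabric advances between two consecutive processed frames.
travel_per_frame_mm = belt_speed_m_per_s / FPS * 1000.0

# Number of frames processed while a point travels the full belt.
frames_per_transit = transit_time_s * FPS
```

Under this assumption the fabric advances a little under 8 mm between consecutive frames and a point on the fabric takes about 11 s to traverse the belt.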

Experimental method

Images of the defective fabric are collected by the Basler camera as the fabric moves along the conveyor belt. After training the network, we deploy the trained model on the TX2 for detection. We have built a user interface that enables real-time detection on both video streams and pictures (Figure 18). When a defect is detected, the defect category, defect confidence, prediction box coordinates and number of defects are recorded in a log. At the same time, the TX2 sends a signal to the steering gear, which uses the baffle to separate the defective fabric. After verification, our method can effectively detect common fabric defects.

Figure 18.
User interface.
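The logged fields listed above could be organized as in the following sketch. The record layout, field names, JSON encoding and the confidence threshold used to trigger the baffle are all hypothetical; the actual log format and sorting rule of the system are not specified here.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass
class Detection:
    category: str                    # defect category
    confidence: float                # defect confidence
    box: Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels


def build_log_entry(frame_id: int, detections: List[Detection]) -> str:
    """Serialize one frame's detections as a single JSON log line."""
    record = {
        "frame_id": frame_id,
        "num_defects": len(detections),
        "defects": [asdict(d) for d in detections],
    }
    return json.dumps(record)


def should_reject(detections: List[Detection], threshold: float = 0.5) -> bool:
    """Decide whether to signal the baffle (hypothetical threshold rule)."""
    return any(d.confidence >= threshold for d in detections)


# Hypothetical single detection on frame 7.
found = [Detection("defect", 0.93, (10, 20, 60, 80))]
log_line = build_log_entry(7, found)
reject = should_reject(found)
```

Keeping one self-describing record per frame makes it straightforward to count defects afterwards or to replay the sorting decisions.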

In this section, we construct a defect detection system, which includes a conveyor belt, edge devices and a GUI human-machine interface. Ultimately, through testing, our method can effectively detect fabric defects on the conveyor belt. This provides a foundation for practical industrial fabric defect detection and validates the effectiveness of our approach.

Conclusions

In this article, YOLOV4-TinyS, a fabric defect detection method suitable for edge devices, is proposed. We change the position of the feature output layer in the backbone network to improve sensitivity to location information and increase the difference in output feature layers to enrich the information obtained by the network. The bottom residual module is changed and depthwise separable convolution is used instead of normal convolution to reduce the number of network parameters and, finally, the attention module is used to improve the network’s ability to obtain defect information. Experiments show that the proposed network can effectively improve the ability to detect fabric defects and that it has superiority over other advanced methods. The proposed method is verified for feasibility by simulating an industrial inspection site, and it meets the real-time detection requirements on an edge device, a Jetson TX2. In the future, we will attempt to develop more defect detection methods for industrial scenarios to improve actual detection efficiency.

Footnotes

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by the Innovation Capability Support Program of Shaanxi (program no. 2021TD-29), in part by The Youth Innovation Team of Shaanxi Universities, in part by the National Natural Science Foundation of China (grant 62176204) and the Key Research and Development Program of Shaanxi (no. 2022GY-066).

ORCID iDs

Hulin Yang

Junfeng Jing

Zhen Wang

YanQing Huang

ShaoJun Song

References

Anandan

Sabeenian

RJPcs.

Fabric defect detection using discrete curvelet transform. J Procedia Comput Sci 2018; 133: 1056–1065.

Yapi

Allili

Baaziz

NJIToAS

, et al. Automatic fabric defect detection using learning-based local textural distributions in the contourlet domain. IEEE Trans Autom Sci Eng 2017; 15: 1014–1026.

Kang

Zhang

EJTRJ.

A universal defect detection approach for various types of fabrics based on the Elo-rating algorithm of the integral image. Text Res J 2019; 89: 4766–4793.

Ngan

Pang

Yung

NHJI

, et al. Automated fabric defect detection—a review. J Image Vis Comput 2011; 29: 442–458.

Das

Wahi

Kumar

, et al. Moment-based features of knitted cotton fabric defect classification by artificial neural networks. J Nat Fibers 2022; 19: 1498–1506.

Ngan

Pang

Yung

S-P

, et al. Wavelet based methods on patterned fabric defect detection. J Patt Recognit 2005; 38: 559–576.

Yang

Pang

Yung

NHCJOE.

Discriminative fabric defect detection using adaptive wavelets. J Opt Eng 2002; 41: 3116–3126.

Kwak

Ventura

Tofang-Sazi

KJJoIM.

A neural network approach for defect identification and classification on leather fabric. J Intell Manuf 2000; 11: 485–499.

Chetverikov

Hanbury

AJPR.

Finding defects in texture using regularity and local orientation. J Patt Recognit 2002; 35: 2165–2180.

10.

Ali

Alnajjar

Jassmi

, et al. Performance evaluation of deep CNN-based crack detection and localization techniques for concrete structures. J Sensors 2021; 21: 1688.

11.

Wei

Hao

Tang

X-s

, et al. A new method using the convolutional neural network with compressive sensing for fabric defect classification based on small sample sizes. Text Res J 2019; 89: 3539–3555.

12.

Huang

Jing

Wang

ZJIToI

, et al. Fabric defect segmentation method based on deep learning. IEEE Trans Instrum 2021; 70: 1–15.

13.

Tang

Zhu

Yuan

SJAEI.

An improved convolutional neural network with an adaptable learning rate towards multi-signal fault diagnosis of hydraulic piston pump. J Adv Eng Informat 2021; 50: 101406.

14.

Zhang

Wang

, et al. Attention-Gate-based U-shaped Reconstruction Network (AGUR-Net) for color-patterned fabric defect detection. Textile Research Journal 2023: 00405175221149450.

15.

Zhang

Xiong

, et al. QA-USTNet: yarn-dyed fabric defect detection via U-shaped swin transformer network based on quadtree attention. Textile Research Journal 2023: 00405175231158134.

16.

Anitha

Radha

Evaluation of defect detection in textile images using Gabor wavelet based independent component analysis and vector quantized principal component analysis. J Springer Lect Notes Elec Eng 2013; 222: 433–442.

17.

Bodnarova

Bennamoun

Latham

SJPr.

Optimal Gabor filters for textile flaw detection. J Patt Recognit 2002; 35: 2973–2991.

18.

Allili

Baaziz

Mejri

MJIToM.

Texture modeling using contourlets and finite mixtures of generalized Gaussian distributions and applications. IEEE Trans Multimedia 2014; 16: 772–784.

19.

Hanbay

Talu

Özgüven

ÖFJO.

Fabric defect detection systems and methods—a systematic literature review. J Optik 2016; 127: 11960–11973.

20.

Krizhevsky

Sutskever

Hinton

GEJCotA.

Imagenet classification with deep convolutional neural networks. J Commun ACM 2017; 60: 84–90.

21.

Simonyan

Zisserman

AJapa.

Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:14091556 2014.

22.

Szegedy

Liu

Jia

, et al. Going deeper with convolutions. In: proceedings of the IEEE conference on computer vision and pattern recognition, Boston, USA, 1–9 July 2015, pp.1–9.

23.

Zhang

Ren

, et al. Deep residual learning for image recognition. In: proceedings of the IEEE conference on computer vision and pattern recognition, Las Vegas, USA, 1–9 July 2016 pp. 770–778.

24.

Jeyaraj

Samuel Nadar

. Computer vision for automatic detection and classification of fabric defect employing deep learning algorithm. J Int J Clothing Sci Technol 2019; 31: 510–521.

25.

Liu

, et al. Fabric defect detection based on faster R-CNN. In: ninth international conference on graphic and image processing (ICGIP 2017), Qingdao, China, October 2018, pp. 55–63. SPIE.

26.

Miao

Gao

, et al. Online defect recognition of narrow overlap weld based on two-stage recognition model combining continuous wavelet transform and convolutional neural network. J Comput Ind 2019; 112: 103115.

27.

Katiyar

Behal

Singh

Automated defect detection in physical components using machine learning. In: 2021 8th international conference on computing for sustainable global development, New Delhi, India, March 2021, pp. 527–532. IEEE.

28.

Jing

Wang

Rätsch

, et al. Mobile-Unet: an efficient convolutional neural network for fabric defect detection. Text Res J 2022; 92: 30–42.

29.

Wang

Liu

Xie

, et al. Boosted efficientnet: detection of lymph node metastases in breast cancer using convolutional neural networks. J Cancers 2021; 13: 661.

30.

Wang

, et al. MYOLOv3-Tiny: a new convolutional neural network architecture for real-time detection of track fasteners. J Comput Ind 2020; 123: 103303.

31.

Bochkovskiy

Wang

C-Y

Liao

H-YMJapa.

Yolov4: optimal speed and accuracy of object detection. J arXiv preprint arXiv:10934 2020.

32.

Dlamini

Kao

, et al. Development of a real-time machine vision system for functional textile fabric defect detection using a deep YOLOv4 model. Text Res J 2022; 92: 675–690.

33.

Lim

W-H

Bonab

Chua

KH.

An Optimized Lightweight Model for Real-Time Wood Defects Detection based on YOLOv4-Tiny. In: August 2022 IEEE International Conference on Automatic Control and Intelligent Systems (I2CACIS), Shah Alam, Malaysia, 2022, pp.186–191. IEEE.

34.

Zheng

Wang

, et al. A fabric defect detection method based on improved yolov5. In: 2021 7th International Conference on Computer and Communications (ICCC), ChengDu, China, December 2021, pp.620–624. IEEE.

35.

Zheng

Zhang

, et al. Insulator-defect detection algorithm based on improved YOLOv7. J Sensors 2022; 22: 8801.

36. Haleem, Bustreo, Del Bue A. A computer vision based online quality control system for textile yarns. J Comput Ind 2021; 133: 103550.

37. Blanco-Filgueira, Garcia-Lesta, Fernández-Sanjurjo, et al. Deep learning-based multiple object visual tracking on embedded system for IoT and mobile edge computing applications. IEEE Internet Things J 2019; 6: 5423–5431.

38. Goyal, Reeves, Rajbhandari, et al. Robust methods for real-time diabetic foot ulcer detection and localization on mobile devices. IEEE J Biomed Health Informat 2018; 23: 1730–1741.

39. Hoang, Nam, Park KR. Enhanced detection and recognition of road markings based on adaptive region of interest and deep learning. IEEE Access 2019; 7: 109817–109832.

40. Song, Jing, Huang, et al. EfficientDet for fabric defect detection based on edge computing. J Eng Fibers Fabr 2021; 16: 15589250211008346.

41. Haut, Bernabé, Paoletti, et al. Low–high-power consumption architectures for deep-learning models applied to hyperspectral image classification. IEEE Geosci Rem Sens Lett 2018; 16: 776–780.

42. Wang C-Y, Bochkovskiy, Liao H-YM. Scaled-yolov4: Scaling cross stage partial network. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, Online, June 2021, pp. 13029–13038.

43. Tan, Pang, Le QV. Efficientdet: scalable and efficient object detection. In: Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, June 2020, pp. 10781–10790.

YOLOV4-TinyS: a new convolutional neural architecture for real-time detection of fabric defects in edge devices