Abstract
Fabric defect detection is crucial in the textile industry but faces challenges such as small defect sizes, diverse morphologies, and imbalanced sample distributions. Current mainstream methods approach it as an object detection problem. Many fabric defects, particularly small ones, are caused by production faults that disrupt the fabric texture; these defects often lack distinct structural information, which poses a challenge for object detection methods that rely heavily on such features. To address these limitations, we propose a multi-scale detection channel based on Low-level Texture Feature Retention (LTFR), which significantly enhances the detection of small defects. Additionally, we introduce Powerful-IoU (PIoU) to guide anchors along effective regression paths, improving localization accuracy for defects with extreme aspect ratios. To further tackle imbalanced sample distributions, we adopt the SlideLoss function, which adaptively adjusts the weights of easy and hard samples and offers a more effective solution than the traditional binary cross-entropy classification loss. Experimental results on the publicly available fabric defect dataset from Alibaba Cloud Tianchi show that the model achieves mAP@0.5 and mAP@0.5:0.95 values of 49% and 24%, respectively, improvements of 8.4% and 4.8% over the baseline YOLOv8. Even compared with the state-of-the-art YOLOv9-c, the mAP@0.5 value improves by a further 5.1%. These results confirm the effectiveness of the proposed method and highlight its potential as a valuable guide for researchers in the field of fabric defect detection.
Introduction
In manufacturing industries such as apparel, medical, and military instruments, fabric is an essential raw material. However, during the production process, human errors and machine failures are inevitable, leading to potential defects in fabric products. Traditional manual inspection methods heavily rely on human visual perception and personal experience, leading to potential fatigue and other limitations. Over the past few decades, researchers have developed machine vision models and algorithms to automate defect detection processes. These models and algorithms improve the accuracy and efficiency of defect detection, ensuring consistent product quality. 1 However, the increasing complexity of fabric textures and the need for real-time, high-precision detection have posed significant challenges, especially for small defects that are difficult to detect.
Traditional fabric defect detection methods include statistical, spectral analysis, model-based, and dictionary learning methods. Statistical methods 2–14 discriminate defect regions by computing statistical feature differences between the pixels under test and the surrounding pixels; lacking global feature analysis, they struggle with complexly textured fabric images and with small defects. Spectral analysis methods, such as the Fourier transform 6 and wavelet transform, 15 map the test image into the frequency domain and then locate defects through energy functions; their performance depends on the choice of filters and lacks adaptability. Model-based methods model fabric image textures as, for example, Gaussian mixture or Gauss-Markov random field models 16 and then discriminate defects by analyzing the model parameters of the test images; they have high computational complexity and cannot effectively locate small defect areas. Dictionary learning methods 7 train a set of dictionaries on normal images, reconstruct normal images from the test set, and locate defect areas by differencing; they require training a dictionary set for each type of fabric image, lack adaptability, and are relatively slow. Despite these advances, traditional methods still struggle to adapt to the complex and varied nature of fabric defect detection.
With the advancement of deep learning, the use of convolutional neural networks for fabric defect detection has become a significant focus of academic research. Fabric defects are regions of interest that differ markedly in shape, color, and texture from their surroundings and can be localized with bounding boxes. Researchers have applied object detection techniques, including the Single Shot MultiBox Detector (SSD), 17 Faster R-CNN, 18 Cascade R-CNN, 19 and You Only Look Once (YOLO), 20 to achieve high detection accuracy in fabric defect detection. Although directly applying object detection algorithms to fabric defect detection can yield good results, fabric defect detection differs significantly from general object detection. General detectors emphasize extracting structural information from objects while discounting variations in lighting, viewing angle, and texture. However, candidate images of fabric defects mainly come from fixed cameras on production lines, with fixed lighting and viewing angles, and smaller defects primarily reflect disruptions of texture rather than structure, often caused by faults during production; detecting them therefore relies on texture rather than structural information. Moreover, fabric defects are usually small, difficult to localize, challenging to classify, and often suffer from an imbalance between easy and hard samples. Traditional object detection algorithms are therefore often insufficient for fabric defect detection.
To address these challenges, we propose several innovative strategies to enhance the YOLOv8 architecture, with the improved version named LTFR-YOLOv8. First, we propose a multi-scale detection channel based on Low-level Texture Feature Retention (LTFR), which retains finer-grained defect features and thereby improves the detection of small defects. Additionally, we use Powerful-IoU (PIoU), a novel anchor-guidance mechanism (see the Methodology section), which enhances localization accuracy for defects with extreme aspect ratios. Finally, we adopt the SlideLoss function (see the Enhancement of classification loss function section) to address imbalanced sample distributions, further improving detection accuracy by focusing on hard samples that traditional methods often overlook. The main contributions of our work are summarized as follows:
1. We propose a detection channel with low-level texture feature retention as part of YOLOv8, preserving rich low-level texture information and enhancing the network's ability to detect small defects;
2. By incorporating PIoU, with its adaptive penalty factor and non-monotonic attention layer, we enhance the model's ability to accurately locate defects, particularly those with disparate aspect ratios;
3. We adopt the SlideLoss function, which increases the relative loss of hard-to-classify samples, encouraging the model to focus more on difficult and misclassified fabric samples.
Related work
Fabric defect detection is an application of CNN-based multi-scale detection algorithms in industrial production. Therefore, we review related work through the following two aspects.
Fabric defect detection algorithms based on CNN
Convolutional Neural Networks (CNN) consist of convolutional layers and pooling layers. 21 Object detection models based on CNN have been widely applied in computer vision tasks.22,23 Current neural network models for object detection can be broadly categorized into two types: anchor-based and anchor-free detectors. Anchor-based detectors can be further divided into two-stage and single-stage detectors. Two-stage detectors, such as the R-CNN series24–26 and Task-aware Spatial Disentanglement (TSD), 27 first generate a set of candidate boxes as proposals, which are then classified and refined using CNN.
Single-stage detectors, such as the YOLO series,28,29 SSD, 30 RetinaNet, 31 and EfficientDet, 32 directly perform classification and regression without a separate proposal generation step. In recent years, anchor-free detectors such as CornerNet, 33 ExtremeNet, 34 and FCOS 35 have been proposed to remove the reliance on anchor priors. In fabric detection tasks, anchor-based methods outperform anchor-free methods because defect objects are multi-scale and spatially randomly distributed. Additionally, single-stage detectors excel at meeting real-time detection speed requirements, even though two-stage detectors achieve higher accuracy. 36 However, existing single-stage detectors are better suited to natural scenes than textile images because object attributes such as texture structure, size, and spatial distribution differ significantly. As such, the accuracy and speed of current deep convolutional neural networks (DCNNs) remain insufficient for fabric defect detection tasks.
Inspired by the success of deep convolutional neural networks (DCNNs) in various fields,37,38 researchers are leveraging deep learning-based methods to improve the performance of fabric defect detection models. The adaptability of fabric defect detection models has been a key research focus due to the diversity of textures and backgrounds. For example, Li et al. 39 utilized a compact convolutional neural network (CNN) architecture for detecting several common fabric defects. Xie and Wu 40 achieved improved detection results for fabric images with plain backgrounds, regular patterns, and irregular patterns by enhancing the RefineDet method. Liu et al. 41 proposed an effective shallow network called DLSE-Net, which utilized the expansion Up-Weight CAM and Link-SE module to highlight defect regions and improve the adaptability to complex textures. Zhang et al. 42 achieved improved detection accuracy on yarn-dyed fabric defect detection by optimizing the hyperparameters of the YOLOv2 network. Jing et al. 43 enhanced the YOLOv3 framework by incorporating the k-means algorithm for dimension clustering of target frames; they further optimized the detection layer, leading to an improved fabric detection algorithm with enhanced real-time performance. Jin and Niu 44 improved the defect detection capability of the YOLOv5 model by utilizing a teacher-student architecture and incorporating multitask learning to enhance its classification abilities. To enhance real-time detection efficiency, Jing et al. 45 proposed a highly effective end-to-end defect segmentation CNN called Mobile-Unet. However, these methods did not adequately consider the underlying texture information of fabric defects and failed to effectively preserve multi-scale contextual information; as a result, a proper balance between detection efficiency, accuracy, and generalization was not achieved.
Application of multi-scale detection
Research on applying multi-scale object detection techniques to industrial multi-scale defect detection is scarce. In 2017, Zhou et al. 46 addressed the detection of small solder ball defects in high-density chips by constructing a three-dimensional inductive heating finite element model. In a 2021 study, Li et al. 47 proposed a belt layer defect detection method based on an improved Faster R-CNN for detecting small defects in the carcass layer of radial tires; the method employs feature fusion and DIoU to tackle inadequate feature extraction for small defects and loose bounding boxes.
The characteristics of the objects investigated in the aforementioned studies on small object detection are different from the fabric studied in this paper, as they do not involve the complexities of texture and pattern backgrounds. Additionally, conventional small object detection methods have limited research and application in the multi-scale detection of fabric defects, as they are primarily used for tasks like PCB detection and small object recognition in remote sensing imagery. This paper focuses on improving the detection accuracy of fabric defects and addressing the multi-scale detection challenge, specifically targeting small defects on fabric surfaces.
Methodology
YOLOv8 is one of the YOLO series of efficient real-time object detection algorithms. It is based on deep learning and convolutional neural networks (CNN), performs object classification and localization simultaneously, and achieves efficient detection in a single forward pass. YOLOv8 has gained significant attention in object detection for its remarkable detection capability and processing speed. The architecture consists of three primary components: backbone, neck, and head. In the backbone, the Cross Stage Partial (CSP) design is introduced to enhance feature extraction; specifically, the CSPDarknet53 48 network extracts rich and discriminative features from the input images. The PAN-FPN 49 method fuses multi-scale features from the backbone outputs. During prediction, a decoupled head structure is used, and the bounding box regression loss is computed with the CIoU loss function. To address the challenges of locating small fabric defects with diverse morphologies and the imbalanced distribution of easy and hard samples, this paper proposes a multi-scale detection channel with low-level feature retention, which preserves low-level texture information and improves detection accuracy, especially for small defects. In the prediction stage, PIoU guides the effective regression of anchor boxes for improved localization accuracy, and the SlideLoss function addresses the imbalanced distribution of hard and easy samples. The model structure is illustrated in Figure 1.

Figure 1. Architecture of the LTFR-YOLOv8. The feature map first passes through the feature extraction backbone and then enters the multi-scale feature fusion neck, which enriches multi-scale information; classification and regression are finally performed by the head. The Detect head with a size of 160 × 160 × 45 preserves low-level texture features, improving the model's ability to detect small objects.
Multi-scale detection channel based on LTFR (M-LTFR)
For defect detection tasks, the early layers of the feature extraction backbone capture rich texture and structural information. However, as the network deepens and the receptive field expands, image texture details blur: texture details gradually disappear while semantic information increases. Figure 2 depicts the feature maps at different layers for an input image: layers B1 and B2 contain rich low-level information, with B2 less noisy than B1, while B3, B4, and B5 contain more structural and high-level semantic information. The neck of YOLOv8 uses a Path Aggregation Network with Feature Pyramid Network (PAN-FPN) structure to build a top-down and bottom-up architecture; through feature fusion, it makes positional and semantic information complementary, ensuring feature diversity and completeness. However, it overlooks the retention of B2 feature information. Fabric images differ significantly from natural scene images: they possess abundant texture information and simple semantic information. To retain more low-level features, the PAN-FPN needs to provide finer-grained information.

Figure 2. Visualization of feature maps at all levels.
Therefore, we design a multi-scale detection channel based on LTFR (M-LTFR), whose structure is shown in Figure 3. M-LTFR combines feature information from the backbone, preserving rich shallow-level texture features, and transfers the enriched texture and semantic information to the feature fusion stage. In the following, the fused feature map at each level is denoted P2–P5, as in Figure 3.

Figure 3. M-LTFR architecture design.
Compared with PAN-FPN, M-LTFR makes three improvements. First, low-level feature fusion is added while the original structure is retained, so the high-level fusion nodes {P3, P4, P5} remain unaffected; the additional information from input node to output node is thus completely preserved. Second, M-LTFR maximizes the utilization of initial features by incorporating the intermediate features of the low-level P2, enriching the bottom-level features without significant computational overhead. Third, the intermediate features are used to build a new detection scale that targets the difficulty of detecting small defects. Experiments show that this method significantly improves the accuracy of fabric defect detection while maintaining a detection speed that meets the requirements of industrial applications.
In M-LTFR, features are independently transmitted from each relevant layer ({P2, P3, P4, P5}) to two separate classification/regression subnets for subsequent stages of bounding box regression and defect classification.
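To make the low-level retention concrete, the sketch below shows one plausible way to build the extra 160 × 160 P2 channel described above: the high-resolution B2 backbone map is fused with an upsampled neck feature so that texture detail survives into its own detect head. This is an illustrative reconstruction, not the released implementation; the module name, channel widths, and the choice of concatenation-based fusion are our assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowLevelFusion(nn.Module):
    """Illustrative M-LTFR-style fusion: upsample a P3-level neck feature and
    merge it with the backbone's high-resolution B2 map, producing a P2
    detection scale that keeps low-level texture detail."""
    def __init__(self, c_b2=64, c_p3=128, c_out=64):
        super().__init__()
        self.reduce = nn.Conv2d(c_p3, c_b2, kernel_size=1)  # align channel widths
        self.fuse = nn.Sequential(                           # 3x3 conv after concat
            nn.Conv2d(2 * c_b2, c_out, 3, padding=1),
            nn.BatchNorm2d(c_out),
            nn.SiLU(),
        )

    def forward(self, b2, p3):
        p3_up = F.interpolate(self.reduce(p3), scale_factor=2, mode="nearest")
        return self.fuse(torch.cat([b2, p3_up], dim=1))      # P2: 160x160 for a 640 input

# Example: a 640x640 input gives B2 at 160x160 and P3 at 80x80.
b2 = torch.randn(1, 64, 160, 160)
p3 = torch.randn(1, 128, 80, 80)
p2 = LowLevelFusion()(b2, p3)
print(p2.shape)  # torch.Size([1, 64, 160, 160]) -> fed to its own detect head
```

Note that this path only reads from the existing neck features, consistent with the statement above that the high-level fusion nodes {P3, P4, P5} remain unaffected.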
Enhancement of bounding box regression loss function
At present, commonly used loss functions for bounding box regression include the IoU loss 50 and its improved variants, such as the GIoU, 51 DIoU, 52 CIoU, EIoU, 53 and WIoU 54 losses. The IoU is defined as

$$\mathrm{IoU} = \frac{\left|B \cap B^{gt}\right|}{\left|B \cup B^{gt}\right|}$$

where $B$ denotes the predicted box and $B^{gt}$ the ground-truth box.

IoU-based losses. (a) presents the graphical structure and formulas of GIoU, DIoU, CIoU, and EIoU. (b) presents the graphical structure and specific formulas of WIoU and PIoU.
The aforementioned bounding box regression loss functions do not account for the directional mismatch between anchor boxes and target boxes. Without a constraint on the orientation between the target box and the anchor, the anchor box tends to expand during training and fails to adapt to the target box, yielding a poorly trained model; this shortcoming slows convergence and lowers learning efficiency. Additionally, the EIoU and WIoU losses have shortcomings in optimizing anchor box quality.
Pseudo-code of the PIoU algorithm.

Curves of different values of λ for u(λx).
Experiments show that the PIoU loss makes the YOLOv8 model converge faster, improves positioning accuracy, and performs better at locating defects with disparate aspect ratios. Therefore, this paper adopts the better-performing PIoU loss to replace CIoU as the bounding box regression loss function of the YOLOv8 network.
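The published PIoU formulation can be sketched in a few lines. The snippet below follows the PIoU v2 definition (a size-normalized edge-gap penalty P, an anchor-quality term q = e^{-P}, and a non-monotonic attention u(λq)); the default λ = 1.3 and the tensor layout are our assumptions, so treat this as a reference sketch rather than the exact training code.

```python
import torch

def piou_v2_loss(pred, target, lam=1.3, eps=1e-7):
    """Sketch of the PIoU v2 loss; boxes are (x1, y1, x2, y2) tensors.
    P averages the four edge gaps normalized by target size; the attention
    u(lam * q) gives medium-quality anchors the largest gradients."""
    # IoU between predicted and ground-truth boxes
    ix1 = torch.max(pred[..., 0], target[..., 0])
    iy1 = torch.max(pred[..., 1], target[..., 1])
    ix2 = torch.min(pred[..., 2], target[..., 2])
    iy2 = torch.min(pred[..., 3], target[..., 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Penalty P: edge distances normalized by the target's width/height
    wt = (target[..., 2] - target[..., 0]).clamp(min=eps)
    ht = (target[..., 3] - target[..., 1]).clamp(min=eps)
    p = ((pred[..., 0] - target[..., 0]).abs() / wt +
         (pred[..., 2] - target[..., 2]).abs() / wt +
         (pred[..., 1] - target[..., 1]).abs() / ht +
         (pred[..., 3] - target[..., 3]).abs() / ht) / 4

    loss_piou = 1 - iou + (1 - torch.exp(-p ** 2))    # base PIoU loss
    q = torch.exp(-p)                                  # anchor quality in (0, 1]
    attn = 3 * (lam * q) * torch.exp(-(lam * q) ** 2)  # non-monotonic u(lam * q)
    return attn * loss_piou
```

Because the penalty normalizes each edge gap by the target's own width and height, elongated targets with extreme aspect ratios are not over-penalized along their long axis, which matches the localization behavior reported above.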
Enhancement of classification loss function
The choice of the loss function affects the stability of the neural network model during training. YOLOv8 employs the binary cross-entropy loss for classification. For binary classification, the formula is

$$L_{BCE} = -\left[\,y \log(p) + (1 - y)\log(1 - p)\,\right]$$

where $y \in \{0, 1\}$ is the ground-truth label and $p$ is the predicted probability of the positive class.
In fabric defect detection, the defect area is significantly smaller than the background area, and the detection process often yields an imbalance between easy-to-classify and hard-to-classify samples, with a higher proportion of the former; the imbalance extends to positive and negative samples as well. As a result, the loss function used during training can be skewed toward negative and easy-to-classify samples, hurting the model's performance on positive and hard-to-classify samples, which receive less attention during training. To solve this problem, Shrivastava et al. 55 proposed the OHEM algorithm, which selects challenging samples based on their loss and feeds the loss of these difficult samples into training via stochastic gradient descent. To address OHEM's neglect of simple samples, Focal Loss 56 achieved higher accuracy by effectively leveraging all samples through weighting; SRNS 57 followed the same idea. FaceBoxes 58 utilizes IoU for sample classification and keeps the ratio of positive to negative samples within 1:3. Although these methods effectively address sample imbalance, they introduce additional hyperparameters, making tuning more challenging.
SlideLoss is a classification loss function that adaptively adjusts sample weights to mitigate sample imbalance. SlideLoss adopts the average IoU value of all bounding boxes as the threshold µ between easy and hard samples and weights each sample by the piecewise function

$$f(x)=\begin{cases}1, & x \le \mu-0.1,\\ e^{\,1-\mu}, & \mu-0.1 < x < \mu,\\ e^{\,1-x}, & x \ge \mu,\end{cases}$$

where x is the sample's IoU, so that samples near the easy/hard boundary receive the greatest emphasis.

Output curves of SlideLoss functions.
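As a sketch of how this weighting can be applied, the snippet below re-weights a per-sample binary cross-entropy by the piecewise function above, following the published SlideLoss formulation; the function name and the way µ is supplied are our assumptions.

```python
import math
import torch
import torch.nn.functional as F

def slide_loss(logits, targets, iou, mu):
    """Sketch of SlideLoss: per-sample BCE re-weighted by a piecewise function
    of each sample's IoU, with the mean IoU mu as the adaptive threshold."""
    w = torch.ones_like(iou)              # easy samples well below mu: weight 1
    band = (iou > mu - 0.1) & (iou < mu)
    w[band] = math.exp(1.0 - mu)          # hard samples near the boundary: strongest emphasis
    high = iou >= mu
    w[high] = torch.exp(1.0 - iou[high])  # samples above mu: smoothly decaying weight
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    return (w * bce).mean()

# Example usage: mu is taken as the mean IoU of the current predictions.
logits  = torch.randn(8)
targets = torch.randint(0, 2, (8,)).float()
iou     = torch.rand(8)
loss = slide_loss(logits, targets, iou, mu=iou.mean().item())
```

Because µ is recomputed from the data rather than hand-tuned, this avoids the extra hyperparameters that make OHEM- and Focal-Loss-style schemes harder to adjust.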
Experiments and analysis
Experimental setting
Experimental environment and evaluation parameters
All experiments were conducted on an Ubuntu 22.04.4 LTS platform utilizing an NVIDIA L40 GPU (48 GB VRAM) and an Intel® Xeon® Platinum 8458P CPU, with PyTorch 2.1.0 accelerated by CUDA 12.1 and Python 3.8. The enhanced YOLOv8 architecture was optimized using SGD with a base learning rate of 0.01, momentum coefficient 0.937, and L2 regularization weight decay 0.0005 over 300 maximum training epochs. Deterministic modes with fixed random seed (0) and CUDA-configured reproducibility ensured experimental repeatability. Training employed 32-image batches of 640 × 640 resolution inputs processed through 8 parallelized data-loading workers. Critical regularization mechanisms included early stopping (50-epoch patience threshold) to mitigate overfitting and progressive deactivation of mosaic augmentation in the final 10 epochs for loss convergence stability. Initializations excluded pretrained weights to isolate model improvements, while rigorous baseline comparisons maintained identical dataset partitioning (6:2:2 train:val:test ratio). Quantitative evaluations measured detection quality via AP@0.5 (IoU = 50%) and mAP@[0.5:0.95] while assessing computational efficiency through FPS rates and GFLOPs. Memory optimization protocols incorporated gradient checkpointing alongside disabled data caching (RAM/disk), complemented by directory overwrite protection to preserve training environment integrity.
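For reference, the setup above maps almost one-to-one onto standard Ultralytics training arguments; the sketch below reproduces it under that assumption, with the dataset YAML path as a placeholder.

```python
from ultralytics import YOLO

# Hedged reproduction of the training setup described above; the dataset
# YAML path is hypothetical. All arguments are standard Ultralytics options.
model = YOLO("yolov8s.yaml")          # build from config: no pretrained weights
model.train(
    data="tianchi_fabric.yaml",       # placeholder dataset config (6:2:2 split)
    epochs=300, patience=50,          # early stopping after 50 stale epochs
    batch=32, imgsz=640, workers=8,
    optimizer="SGD", lr0=0.01, momentum=0.937, weight_decay=0.0005,
    close_mosaic=10,                  # disable mosaic for the final 10 epochs
    seed=0, deterministic=True,       # fixed-seed reproducibility
    pretrained=False, cache=False,
)
```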
To evaluate the performance of the enhanced YOLOv8 algorithm, key evaluation metrics are employed, including average precision, frames per second (fps), mean average precision (mAP), and parameter count. Among these metrics, mAP holds particular significance as it offers a comprehensive evaluation of the model’s performance.
Dataset and preprocessing
This paper performed experiments on the publicly available Alibaba Cloud Tianchi fabric defect dataset, which is widely recognized for its high-quality defect data. It includes 5913 images with a resolution of 2446 × 1000 pixels and covers 34 defect categories: Knot, Head, Three strands, Coarse weft, Broken spandex, Warping knot, Weft contraction, Loose warp, Starch stain, Hole, Broken ends, Thin file, Stain, Star jump, Thick end, Gouge, Capillus, Centipede, Suspending warp, Retouching, Check jump, Deathfold, Skip, Oil stain, Darts, Grinding mark, Water stain, Reed path, Poor weft, Singe mark, Chromatic crotch, Wavy crotch, Double weft, Double ends, and Cloud weaving. An example of each defect category is shown in Figure 7. The training, validation, and test sets were split in a standard 6:2:2 ratio, with performance on the test set as the final criterion. The collated dataset analysis is visualized in Figure 8, where (a) shows the distribution of defect centroid positions and (b) the distribution of defect sizes; the abscissa and ordinate denote the width and height of the object, respectively. The annotation files of this dataset are JSON files following the PASCAL VOC convention and must be converted into YOLO-format TXT files using

$$x_c = \frac{x_{min}+x_{max}}{2W},\quad y_c = \frac{y_{min}+y_{max}}{2H},\quad w = \frac{x_{max}-x_{min}}{W},\quad h = \frac{y_{max}-y_{min}}{H},$$

where $W$ and $H$ are the image width and height.

Figure 7. Sample images of each type of defect.
Figure 8. Dataset analysis. (a) Distribution of defect centroid positions; (b) distribution of defect sizes.
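A minimal implementation of this conversion, assuming corner-format (x_min, y_min, x_max, y_max) boxes read from the JSON annotations, could look as follows; the function name is illustrative.

```python
def voc_box_to_yolo(x_min, y_min, x_max, y_max, img_w=2446, img_h=1000):
    """Convert a corner-format (PASCAL VOC style) box to normalized YOLO
    format (x_center, y_center, width, height). Defaults use the Tianchi
    image resolution given above."""
    x_c = (x_min + x_max) / (2 * img_w)
    y_c = (y_min + y_max) / (2 * img_h)
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return x_c, y_c, w, h

# Example: a 100x50-pixel defect whose top-left corner is at (1200, 400)
print(voc_box_to_yolo(1200, 400, 1300, 450))
# (0.5110..., 0.425, 0.0408..., 0.05)
```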
Experimental results and analysis
Both the YOLOv8 and LTFR-YOLOv8 models were trained with the same parameter settings and dataset. Figure 9 compares the bounding box regression loss, classification loss, and distribution focal loss (DFL) during training; the x-axis is the number of training epochs and the y-axis the corresponding loss value. LTFR-YOLOv8 exhibits lower bounding box and distribution focal loss values, indicating faster convergence during training. Its classification loss increases with the number of iterations and is larger than that of YOLOv8, because SlideLoss assigns high weights to difficult samples. Both models were trained for up to 300 epochs, but LTFR-YOLOv8 converges in fewer than 250, confirming the significant performance improvement of LTFR-YOLOv8.

Figure 9. Comparison curves of different loss functions.
Comparison of AP values for each class of detection results of YOLOv8 and LTFR-YOLOv8.
Performance evaluation using different bounding box regression loss functions.
Performance evaluation using different classification loss functions.
Performance evaluation using different
Ablation experiment
Results of ablation experiments on the Tianchi dataset.

Comparison of detection effects on the Tianchi dataset.
Comparison of results on the Tianchi dataset between LTFR-YOLOv8 and other models.
Performance of PIoU for different hyperparameter values.

Model performance on the Denim dataset.
Conclusion and future work
In this paper, we proposed an enhanced fabric defect detection network that effectively retains low-level texture features and incorporates adaptive anchor-box regression to improve detection performance. The M-LTFR technique was introduced to preserve crucial fabric texture information, significantly boosting the accuracy of small defect detection. By integrating the PIoU loss, with its adaptive penalty factor and non-monotonic attention layer, our approach guided anchor boxes along effective regression paths, improving their quality and enabling better localization of defects with extreme aspect ratios. The SlideLoss function addressed sample imbalance in fabric datasets, adjusting the weights of easy and hard samples more effectively than traditional loss functions. Experimental results showed that our model outperformed the baseline YOLOv8s in detection accuracy, parameter count, and ease of deployment, meeting both the high-accuracy and real-time requirements of industrial settings.
In the future, we plan to apply the multi-scale fusion method with low-level texture feature retention to other industrial defect detection tasks, such as in electronics manufacturing and automotive components. We also aim to improve the robustness of the model by testing it on a wider range of fabric types and under various environmental conditions, which will help assess its generalization ability in more diverse real-world scenarios. Additionally, we will explore potential optimizations in the model’s architecture and training strategies to further improve its efficiency and accuracy for fabric defect detection. These future studies will help refine our method and expand its practical applications in industry.
Statements and declarations
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by NSFC (No.61873293), Leading talents of science and technology in the Central Plain of China (234200510009), Henan province key science and technology research projects (222102210008, 232102211002, 232102211030).
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
