
Abstract

Distinguishing cashmere fiber from wool fiber is one of the main challenges in the textile industry. Near infrared (NIR) spectroscopy is a fast, nondestructive and easily packaged detection method, but the NIR spectra of cashmere fiber and wool fiber are highly similar, which makes the two difficult to distinguish. To improve prediction accuracy, this paper proposes a NIR spectroscopy prediction model for cashmere fiber and wool fiber based on the Markov transition field (MTF) and an improved YOLOv8. The method calculates the Markov transition matrix of local NIR spectral data between adjacent wavelength intervals and arranges each probability in wavelength order to expand the matrix, forming an MTF of local wavelengths. The backbone network of YOLOv8 is replaced with a hierarchical vision transformer using shifted windows (Swin Transformer, ST), which enhances the network’s attention to local frequency bands and peaks, and Dropout is added to the ST to prevent overfitting. To examine the effectiveness and stability of the model, it is compared with KNN, decision trees, random forests, AlexNet, VGG16, GoogLeNet, ResNet50, YOLOv8 and other models, and ablation experiments further validate the proposed structure. Experimental results show that the method achieves the highest average prediction accuracy, 97.01%. The proposed model achieves rapid and non-destructive prediction of cashmere fiber and wool fiber, providing new ideas for qualitative analysis in the field of NIR spectroscopy.

Keywords

near infrared spectroscopy, wool fiber, cashmere fiber, Markov transition field, YOLOv8, qualitative analysis, deep learning, machine learning

Introduction

As the world’s largest producer of cashmere raw materials and exporter of processed textiles, China occupies an irreplaceable position in the international cashmere trade system. Cashmere fiber has become the core raw material for high-end apparel manufacturing because of its lightness, delicate touch and warmth retention. However, cashmere is extremely scarce, and the accurate identification of cashmere fiber and wool fiber during production has always faced technical bottlenecks. Because the two fibers are highly similar in morphology, traditional detection methods are time-consuming and costly. Establishing a fast and efficient fiber prediction model therefore matters not only for the quality control of raw materials but also for optimizing the textile testing technology system.¹

Traditional identification methods mainly include microscopic methods, DNA methods and image-based methods.² The optical microscopy method relies mainly on the experience of the inspector to identify the fiber type, so it is influenced by subjective factors, is relatively slow and has high labor costs. The DNA method³ identifies animal fibers with polymerase chain reaction (PCR) primers and probes, which are effective in identifying cashmere/wool mixtures. The chemolysis method⁴ predicts the cashmere content (CC) of blended samples by referencing the near-infrared (NIR) band assignments of chemical bonds in proteins and performing a stoichiometric analysis. These two methods are limited by human and material resources and are not suitable for wide-scale use. The main component of both cashmere and wool is keratin, and their spectra are very similar, but their different cysteine contents produce different NIR absorption peaks, resulting in some differences in the NIR spectral band data of cashmere and wool fibers. Chrimatopoulos et al.⁵ combined attenuated total reflection Fourier transform infrared spectroscopy (ATR FT-IR) with partial least squares discriminant analysis (PLS-DA) to establish a prediction model, which provided good differentiation between camelid hair and the hair of eight other species. Zhu et al.⁶ proposed an image identification method for cashmere fiber and wool fiber based on an improved Xception network. Zhu et al.⁷ combined an improved ShuffleNetV2 with transfer learning for cashmere and wool fiber classification, achieving fast and accurate classification. Lv et al.⁸ used principal component analysis (PCA) combined with the wavelength maximum distance method to predict cashmere fibers and wool fibers; the prediction accuracy was 80.76% for cashmere and 87.02% for wool. Wang et al.⁹ used a particle swarm optimization-support vector machine (PSO-SVM) to establish a qualitative model of cashmere fiber and wool fiber, and the prediction accuracy reached 93%.

With the rapid development of deep learning, neural networks have become an effective architecture for image classification,¹⁰ target detection¹¹ and image segmentation.¹² Deep learning networks can be trained on images,¹³ and can also serve as a preferred method for extracting features from spectral data. As a deep learning network model, YOLOv8 excels in a number of tasks, such as image segmentation, classification and object detection.¹⁴ Gu et al.¹⁵ proposed an improved YOLOv8 network model and deployed it on edge mobile devices to detect mango fruit and fruit stalks simultaneously, with good results. Riza et al.¹⁶ developed a YOLOv8-CoLa network model based on the YOLOv8 framework to accurately detect the degree of fermentation of cocoa beans. Wang et al.¹⁷ integrated the Shape-IoU loss function into YOLOv8 to achieve good results in detecting small and medium-sized foreign bodies in Pu’er sun-dried green tea. Wang et al.¹⁸ proposed a YOLOv8 network with an enhanced attention mechanism for detecting colorectal polyps. Duan et al.¹⁹ introduced a small target detection head and Inner-WIoU to improve YOLOv8 for detecting small targets in UAV aerial photography. Li et al.²⁰ used an improved YOLOv8n model, combined with the BiFPN structure and the SPD-Conv module, to improve the detection of mango fruits and stems. Cao et al.²¹ proposed a Pyramid-YOLOv8 model: on the YOLOv8x framework, a multi-attention feature fusion network structure is adopted and a lightweight module is designed to reduce computation, enabling rapid detection of rice leaf blast disease. Tao et al.²² enhanced feature extraction by introducing the Convolutional Block Attention Module (CBAM) and optimized the weighted intersection over union (WIoU) loss function, improving the YOLOv8 algorithm for fast and accurate detection and identification of pavement cracks. The test objects in the above studies differ from one another more obviously than wool and cashmere do, and YOLOv8 is less sensitive to local band differences, so it is necessary to improve the sensitivity of the YOLOv8 algorithm to local band differences.²³,²⁴

To achieve fast and nondestructive identification of cashmere fiber and wool fiber, and inspired by the above discussion, this paper proposes a prediction model based on near-infrared spectroscopy that uses the MTF and an improved YOLOv8 network. The high similarity of traditional one-dimensional spectral features is addressed in three ways: the sequence data are converted into images through the MTF; the backbone network is replaced with a hierarchical vision transformer to enhance sensitivity to local band differences; and a dropout layer is added to the Swin Transformer to prevent overfitting and improve generalization. Together, these changes enable fast, efficient and non-destructive prediction of cashmere fiber and wool fiber by NIR spectroscopy.

Methods

An improved YOLOv8

Based on the characteristics of the NIR spectral data and the structure of the neural network, an ST-YOLOv8 network for predicting cashmere fiber and wool fiber by near infrared spectroscopy is established using the MTF and an improved YOLOv8 network. First, the MTF transforms the near infrared spectroscopy data of cashmere fiber and wool fiber into a form suitable for the enhanced YOLOv8 network model. Second, the consecutive 3 × 3 convolutions in the YOLOv8 Backbone are replaced with a Swin Transformer, which strengthens the network’s focus on local frequency bands and peaks of the spectral data and thereby improves prediction accuracy. Third, a Dropout layer is added to the Swin Transformer to reduce network complexity and prevent overfitting. Finally, the Detect layer in YOLOv8 is modified to produce the final prediction of cashmere fiber and wool fiber. The overall model architecture is illustrated in Figure 1.

Figure 1.

MTF and ST-YOLOv8 network overall model framework.

The network architecture of YOLOv8 consists of three main components²⁵:

(i) Backbone: a series of convolution and deconvolution layers extracts features, and residual connections and bottleneck structures reduce the network size and improve performance.

(ii) Neck: multi-scale feature fusion techniques are used to fuse feature maps from different stages of the Backbone to enhance feature representation.

(iii) Head: mainly responsible for the final target detection and classification tasks, including a detection head and a classification head. The detection head contains a series of convolutional and inverse convolutional layers to generate detection results, while the classification head uses global average pooling to classify each feature map.

Markov transition field

In this paper, MTF transforms the near infrared spectroscopy data of cashmere fiber and wool fiber to ensure its suitability for enhancing the YOLOv8 network model. The first step to establish the Markov transition field is to quantify the one-dimensional spectral data and establish a first-order Markov transition matrix.

For a given set of near infrared spectral data, the wavelength sequence is:

W = \{w_{1}, w_{2}, \dots, w_{n}\}

(1)

The absorbance amplitude corresponding to the wavelength sequence is:

A = \{a_{1}, a_{2}, \dots, a_{n}\}

(2)

The continuous sequence $W$ is discretized with the quantile method: the data are divided into $D$ quantile intervals, each covering the same area under the Gaussian curve, and each point $w_{i}$ is mapped to a quantile interval $b_{j}$. Two adjacent quantiles approximately satisfy:

S (b_{j + 1}) - S (b_{j}) = \frac{1}{D}

(3)

where $S(b_{j})$ denotes the area under the curve up to the quantile $b_{j}$.

For the quantile intervals $A = \{a_{1}, a_{2}, \dots, a_{n}\}$, each quantile interval $a_{j}$ has a corresponding $w_{i}$, and the transition probabilities between quantile intervals are calculated with a first-order Markov chain along the direction of increasing $w_{i}$ to construct the weight matrix $M$. The element $m_{ij}$ of $M$ is the probability that a point in the quantile interval $a_{j}$ is followed, at the next step, by a point in the interval $a_{i}$. The elements of the weight matrix $M$ are normalized, and the resulting matrix of probabilities $p$ is the Markov transition matrix $M^{*}$:

M^{*} = \begin{bmatrix}
m_{11 \mid p(w_{i} \in a_{1} \mid w_{i-1} \in a_{1})} & m_{12 \mid p(w_{i} \in a_{1} \mid w_{i-1} \in a_{2})} & \cdots & m_{1D \mid p(w_{i} \in a_{1} \mid w_{i-1} \in a_{D})} \\
m_{21 \mid p(w_{i} \in a_{2} \mid w_{i-1} \in a_{1})} & m_{22 \mid p(w_{i} \in a_{2} \mid w_{i-1} \in a_{2})} & \cdots & m_{2D \mid p(w_{i} \in a_{2} \mid w_{i-1} \in a_{D})} \\
\vdots & \vdots & \ddots & \vdots \\
m_{D1 \mid p(w_{i} \in a_{D} \mid w_{i-1} \in a_{1})} & m_{D2 \mid p(w_{i} \in a_{D} \mid w_{i-1} \in a_{2})} & \cdots & m_{DD \mid p(w_{i} \in a_{D} \mid w_{i-1} \in a_{D})}
\end{bmatrix}.

(4)

Although the Markov transition matrix built from one-dimensional spectral data captures the Markov dynamics, it is insensitive to the distribution of the wavelength sequence $W$ and to the dependence on wavelength step. The Markov transition field $N$ is formed by expanding the Markov transition matrix, arranging the probabilities in wavelength order:

\begin{aligned}
N &= \begin{bmatrix}
N_{11} & N_{12} & \cdots & N_{1n} \\
N_{21} & N_{22} & \cdots & N_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
N_{n1} & N_{n2} & \cdots & N_{nn}
\end{bmatrix} \\
&= \begin{bmatrix}
m_{11 \mid p(w_{i} \in a_{1} \mid w_{i-1} \in a_{1})} & m_{12 \mid p(w_{i} \in a_{1} \mid w_{i-1} \in a_{2})} & \cdots & m_{1D \mid p(w_{i} \in a_{1} \mid w_{i-1} \in a_{D})} \\
m_{21 \mid p(w_{i} \in a_{2} \mid w_{i-1} \in a_{1})} & m_{22 \mid p(w_{i} \in a_{2} \mid w_{i-1} \in a_{2})} & \cdots & m_{2D \mid p(w_{i} \in a_{2} \mid w_{i-1} \in a_{D})} \\
\vdots & \vdots & \ddots & \vdots \\
m_{D1 \mid p(w_{i} \in a_{D} \mid w_{i-1} \in a_{1})} & m_{D2 \mid p(w_{i} \in a_{D} \mid w_{i-1} \in a_{2})} & \cdots & m_{DD \mid p(w_{i} \in a_{D} \mid w_{i-1} \in a_{D})}
\end{bmatrix}.
\end{aligned}

(5)

In the Markov transition field, $a_{i}$ and $a_{j}$ are the quantiles of wavelength steps $i$ and $j$, respectively, and the transition probability between $a_{i}$ and $a_{j}$ is $m_{ij}$. By taking the wavelength position into account, the Markov transition matrix $M^{*}$, which is defined on the amplitude axis, is extended to $N$. Each $N_{ij}$ is assigned the transition probability between wavelength step $i$ and wavelength step $j$, so the Markov transition field encodes transition probabilities over multi-scale wavelength spans. For example, $N_{ij \mid |i - j| = k}$ denotes the transition probability between points separated by a span of $k$ wavelength steps. Each matrix element $N_{ij}$ can be treated as a pixel, so the spectral sequence can be visualized as an image through the Markov transition field.
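The construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the empirical-quantile binning rule and the sample values are hypothetical, the input is taken to be the absorbance values in wavelength order, and the index convention follows equation (4), where entry $(i, j)$ is the probability of landing in bin $i$ given that the previous point was in bin $j$.

```python
from bisect import bisect_right


def quantile_bins(values, n_bins):
    """Map each value to one of n_bins equal-frequency (quantile) intervals."""
    ordered = sorted(values)
    n = len(ordered)
    # Interior edges at the empirical k/n_bins quantiles (a simple estimator).
    edges = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]
    return [bisect_right(edges, v) for v in values]


def transition_matrix(bins, n_bins):
    """m[i][j] = P(current point in bin i | previous point in bin j)."""
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    for prev, cur in zip(bins, bins[1:]):
        counts[cur][prev] += 1.0
    # Normalise each column so the probabilities leaving a bin sum to 1.
    totals = [sum(counts[i][j] for i in range(n_bins)) for j in range(n_bins)]
    return [[counts[i][j] / totals[j] if totals[j] else 0.0
             for j in range(n_bins)] for i in range(n_bins)]


def markov_transition_field(values, n_bins):
    """Expand the D x D transition matrix to an n x n field over positions."""
    bins = quantile_bins(values, n_bins)
    m = transition_matrix(bins, n_bins)
    n = len(values)
    # Pixel (i, j) holds the transition probability of the bins at steps i, j.
    return [[m[bins[i]][bins[j]] for j in range(n)] for i in range(n)]


absorbance = [0.1, 0.4, 0.2, 0.9, 0.7, 0.3, 0.8, 0.5]  # made-up values
field = markov_transition_field(absorbance, n_bins=2)
```

For a spectrum of $n$ points the result is an $n \times n$ array of probabilities that can be rendered directly as a grayscale or color-mapped image.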

Swin Transformer transfer learning

Swin Transformer is an attention-based architecture that can be used to replace the backbone of the network proposed by Zhang et al.²⁶ in 2023. It adopts a hierarchical construction to enlarge the receptive field so that more spectral band features are perceived, reduces computation by dividing the feature maps into windows, and uses shifted windows to transfer information between the windows created by that division. The Swin Transformer framework diagram is shown in Figure 2.

Figure 2.

Swin Transformer framework diagram.

First, the input image is split by the Patch Partition module: every $4 \times 4$ block of neighboring pixels forms one patch, which is then flattened along the channel direction. Each patch has $4 \times 4 = 16$ pixels and each pixel has three values (R, G, B), so the depth after flattening is $16 \times 3 = 48$. Patch Partition therefore turns an $H \times W \times 3$ image into $[\frac{H}{4}, \frac{W}{4}, 48]$. A Linear Embedding layer then linearly transforms the channel data of each patch from 48 to $C$, changing the shape from $[\frac{H}{4}, \frac{W}{4}, 48]$ to $[\frac{H}{4}, \frac{W}{4}, C]$. Next, feature maps of different sizes are constructed through four Stages: Stage 1 passes through the Linear Embedding layer, while the remaining three Stages each begin with a Patch Merging layer for downsampling.
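The shape arithmetic above can be checked with a short sketch. The function name and the embedding width $C = 96$ are illustrative assumptions; the halving of resolution and doubling of channels at each Patch Merging layer follow the original Swin Transformer design and are not stated in the text above.

```python
def swin_stage_shapes(height, width, embed_dim, patch=4, num_stages=4):
    """Return the (H, W, C) feature-map shape after each step of the backbone."""
    if height % patch or width % patch:
        raise ValueError("height and width must be divisible by the patch size")
    h, w = height // patch, width // patch
    # Patch Partition: each 4 x 4 RGB patch is flattened to 4 * 4 * 3 = 48 values.
    shapes = {"patch_partition": (h, w, patch * patch * 3)}
    # Stage 1: Linear Embedding maps the 48 channels to C.
    c = embed_dim
    shapes["stage1"] = (h, w, c)
    for stage in range(2, num_stages + 1):
        # Patch Merging (assumed as in the original Swin design):
        # resolution is halved and the channel count is doubled.
        h, w, c = h // 2, w // 2, c * 2
        shapes[f"stage{stage}"] = (h, w, c)
    return shapes


# 224 x 224 is the input size listed in Table 3; C = 96 is an assumed width.
shapes = swin_stage_shapes(224, 224, embed_dim=96)
```

With these assumed settings the feature map shrinks from 56 × 56 after Patch Partition to 7 × 7 in the last Stage.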

The Swin Transformer replaces the standard Multi-head Self-Attention (MSA) unit with Windows Multi-head Self-Attention (W-MSA) and Shifted Windows Multi-head Self-Attention (SW-MSA), which are used in two consecutive Swin Transformer blocks. Layer Norm (LN) is applied before each W-MSA, SW-MSA and Multilayer Perceptron (MLP) module, and a residual connection is added around each, to give the model better training stability. The Swin Transformer block structure diagram is shown in Figure 3.

Figure 3.

Swin Transformer block structure diagram.

The update rule for two consecutive Swin Transformer blocks is:

\hat{Z}^{l} = \text{W-MSA}(\text{LN}(Z^{l-1})) + Z^{l-1},

(6)

Z^{l} = \text{MLP}(\text{LN}(\hat{Z}^{l})) + \hat{Z}^{l},

(7)

\hat{Z}^{l+1} = \text{SW-MSA}(\text{LN}(Z^{l})) + Z^{l},

(8)

Z^{l+1} = \text{MLP}(\text{LN}(\hat{Z}^{l+1})) + \hat{Z}^{l+1},

(9)

where $\hat{Z}^{l}$ and $Z^{l}$ represent the output features of the (S)W-MSA module and the MLP module of block $l$, respectively.

The Window Multi-head Self-Attention unit (W-MSA) in the Swin Transformer module, shown in Figure 4, divides the input feature map into a series of non-overlapping windows and performs the attention computation within each window to reduce the amount of network computation. However, this window partitioning leads to a lack of information exchange between different windows, which limits the performance of the network. To solve this problem, a new window layout, Shifted Window Multi-head Self-Attention (SW-MSA), is introduced. The new layout offsets the regularly partitioned windows by M/2 pixels from the top-left corner toward the bottom-right corner, where M is the window size, and performs a cyclic shift. Combining W-MSA and SW-MSA in the Swin Transformer module effectively reduces the computation of the network while ensuring global correlation between pixels, which improves network performance.

Figure 4.

Shift windows layout: (a) input images, (b) W-MSA windows segmentation, (c) shifted windows, and (d) SW-MSA windows segmentation.
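The effect of the shifted layout can be illustrated on a toy grid. The sketch below is an assumption-laden simplification: the helper names are hypothetical, the "feature map" is a 4 × 4 grid of integer labels, and the attention masking that the original Swin design applies to wrapped-around windows is omitted.

```python
def cyclic_shift(grid, shift):
    """Roll a 2D grid up and to the left by `shift` cells, wrapping around."""
    n_rows, n_cols = len(grid), len(grid[0])
    return [[grid[(r + shift) % n_rows][(c + shift) % n_cols]
             for c in range(n_cols)] for r in range(n_rows)]


def window_partition(grid, window):
    """Split a 2D grid into non-overlapping window x window blocks."""
    windows = []
    for r0 in range(0, len(grid), window):
        for c0 in range(0, len(grid[0]), window):
            windows.append([grid[r][c]
                            for r in range(r0, r0 + window)
                            for c in range(c0, c0 + window)])
    return windows


# Toy 4 x 4 feature map whose cells are labelled 0..15, window size M = 2.
grid = [[4 * r + c for c in range(4)] for r in range(4)]
regular = window_partition(grid, window=2)                  # W-MSA layout
shifted = window_partition(cyclic_shift(grid, 1), window=2)  # SW-MSA, shift M/2
```

In the regular layout the first window holds cells 0, 1, 4 and 5. After the shift, the first window holds cells 5, 6, 9 and 10, one from each of the four regular windows, which is how information crosses window boundaries.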

The dropout function

To reduce the dependency on individual neurons and enhance the network’s generalization capability, a Dropout layer is added to the Multi-Layer Perceptron (MLP) within the Swin Transformer Block of Swin Transformer. The Dropout function is shown as follows:

h^{'} = \begin{cases} 0, & \text{with probability } p \\ \frac{h}{1 - p}, & \text{otherwise} \end{cases}

(10)

The structure with the added Dropout layer is illustrated in Figure 5. Dropout prevents the network model from overfitting the training data, thus achieving better performance on the test dataset. The input first passes through a fully connected layer (Linear) and then undergoes a non-linear transformation via the ReLU activation function. It then goes through the Dropout layer, which reduces the number of active hidden units in the hidden layer, thereby decreasing the model’s complexity and preventing overfitting. Finally, the result is fed into the next fully connected layer (Linear).

Figure 5.

Add Dropout layer structure diagram.
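Equation (10) corresponds to what is commonly called inverted dropout. The following is a minimal sketch assuming a plain list of activations; the function name, the `training` flag and the seeded random generator are illustrative choices, not details given in the paper.

```python
import random


def dropout(hidden, p, rng=None, training=True):
    """Zero each activation with probability p; rescale survivors by 1/(1-p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    if not training or p == 0.0:
        # At inference time the layer is the identity.
        return list(hidden)
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else h * scale for h in hidden]


activations = [1.0] * 1000
dropped = dropout(activations, p=0.5, rng=random.Random(0))
```

Dividing the surviving activations by $1 - p$ keeps the expected value of each unit unchanged, so no rescaling is needed when the layer is switched off at test time.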

Model loss function

Since cashmere/wool fiber prediction is a binary classification task, a binary cross-entropy loss function (BceLoss) was used for prediction quality assessment. The BceLoss function is shown as follows:

BceLoss = - \frac{1}{n} \sum_{i = 1}^{n} [P_{1} (y) + P_{2} (y)],

(11)

where $P_{1}(y)$ and $P_{2}(y)$ represent $y_{i} \times \log p(y_{i} = 1)$ and $(1 - y_{i}) \times \log(1 - p(y_{i} = 1))$, respectively; $y_{i}$ is the binary label (0 or 1) of the $i$-th sample; and $p(y_{i} = 1)$ is the probability that the model assigns to label 1 for the $i$-th sample. When the prediction is consistent with the true label, the loss is close to 0, indicating accurate prediction; otherwise the loss is large, indicating a prediction error.
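Equation (11) can be evaluated directly, as in the sketch below. The function name and the clipping constant `eps`, which guards against taking the logarithm of zero, are assumptions added for illustration.

```python
import math


def bce_loss(labels, probs, eps=1e-12):
    """Binary cross-entropy averaged over samples, following equation (11)."""
    if not labels or len(labels) != len(probs):
        raise ValueError("labels and probs must be non-empty and equal length")
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite (assumed guard)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


confident = bce_loss([1, 0], [0.99, 0.01])  # predictions agree with labels
uncertain = bce_loss([1, 0], [0.50, 0.50])  # uninformative predictions
```

Confident correct predictions give a loss near 0, while predictions of 0.5 for every sample give a loss of $\log 2 \approx 0.693$.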

In summary, the local spectral data are encoded into a 2D image by the Markov transition field to strengthen band correlation features, and the YOLOv8 architecture is improved in two ways: a hierarchical vision Transformer with shifted windows enhances the focus on local bands, and a Dropout mechanism suppresses overfitting. Together these significantly improve the NIR spectral prediction accuracy. The method can also be applied to other object recognition²⁷ and face recognition²⁸ tasks.

Materials and experiments

Data collection

The near infrared spectral datasets of wool (210 samples) and cashmere (180 samples) cover seven wool breeds and five cashmere goat breeds, sourced from Australia, Afghanistan and Outer Mongolia as well as Chifeng, Xinjiang, Qinghe and Shaanxi in China. Both the wool and cashmere samples are loose fibers; the sample images and physical parameters are shown in Figure 6 and Table 1.

Figure 6.

The images of samples: (a) wool fiber and (b) cashmere fiber.

Table 1.

Fiber classes and physical information.

No.	Types	M,D^a (μm)	M,L^b (mm)	Locations
1	Australian Wool	21.8	49	Australia
2	Chifeng Aohan Native Combed Wool	19.5	45	Aohan Banner, Chifeng City, Inner Mongolia, China
3	Xinjiang Fine Wool Sheep Wool (Domestic)	19.8	43	Xinjiang, China
4	Qinghe County Mercerized Long Wool	19.5	45	Qinghe County, Hebei Province, China
5	Chifeng Aohan Lamb’s Wool	19.5	38	Aohan Banner, Chifeng City, Inner Mongolia, China
6	Xinjiang Longwool	23.5	46	Xinjiang, China
7	Common Goat Hair	20.5	34	Globally widespread (common goats)
8	Afghan Purple Cashmere	14.5	36	Afghanistan
9	Xinjiang White Cashmere	16.8	34	Xinjiang, China
10	Brown Dehaired Cashmere (Outer Mongolia and Tibet)	16.8	27	Outer Mongolia; Tibet, China
11	Chifeng Aohan Cashmere	15.5	32	Aohan Banner, Chifeng City, Inner Mongolia, China
12	Shaanxi Dehaired Cashmere	14.9	28	Shaanxi Province, China

Note: M,D^a (μm) is the mean diameter; M,L^b (mm) is the mean length.

Spectra collection

The RZNIR 7900 near infrared spectral analyzer was used to collect data in the 1000–2500 nm band by the diffuse reflection method. Because the 1000–1300 nm band is significantly affected by dye components, the 1300–2500 nm band was selected as the effective spectral range for constructing the fiber prediction model. The specific process is:

(i) Preparation: the NIR spectral analyzer is powered on and preheated for half an hour at room temperature. The fibers are then laid flat in the detection aperture of the analyzer so that they are uniformly distributed, with a thickness of not less than 3 mm, and pressed down with a weight so that the fibers adhere tightly to the aperture, which prevents light leakage from introducing noise into the NIR spectral data.

(ii) Scanning: the fiber samples are aligned and scanned with the RZNIR 7900 NIR spectrometer, and the measurement ends after the NIR spectral curve stabilizes.

The reliability of the data was improved by bi-directional scanning, and the spectral dataset was constructed by taking the mean value of each sample after 30 forward/reverse measurements (1170 entries in total: 630 wool and 540 cashmere). Standard normal variate (SNV) preprocessing followed by Savitzky-Golay (SG) preprocessing was then applied to the spectral data of cashmere fiber and wool fiber; the raw and pre-processed spectra are shown in Figures 7 and 8.

Figure 7.

The near infrared spectra curves of the raw cashmere fiber and wool fiber.

Figure 8.

The near infrared spectra curves of the cashmere fiber and wool fiber after pre-treatment (SNV + SG).
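The SNV + SG pipeline can be sketched as follows. This is an illustration rather than the authors' code: the function names are hypothetical, SNV is computed with the sample standard deviation, and the SG smoother uses the standard nine-point quadratic convolution coefficients, matching the second-degree polynomial and nine smoothing points reported in the Results section. Leaving the four points at each end unsmoothed is a simplification.

```python
import math

# Standard 9-point Savitzky-Golay smoothing coefficients for a
# quadratic (or cubic) polynomial; they sum to 231.
SG_9_QUADRATIC = [-21, 14, 39, 54, 59, 54, 39, 14, -21]
SG_9_NORM = 231.0


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    if std == 0.0:
        raise ValueError("spectrum is constant; SNV is undefined")
    return [(x - mean) / std for x in spectrum]


def savgol_9pt(spectrum):
    """Smooth interior points; the 4 points at each edge are left as-is."""
    half = len(SG_9_QUADRATIC) // 2
    out = list(spectrum)
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        out[i] = sum(c * x for c, x in zip(SG_9_QUADRATIC, window)) / SG_9_NORM
    return out


def preprocess(spectrum):
    """SNV first, then Savitzky-Golay smoothing, as described above."""
    return savgol_9pt(snv(spectrum))
```

In practice a library routine such as SciPy's `savgol_filter` would normally be used and would also handle the edges; the hand-written version is shown only to make the computation explicit.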

In this study, the dataset of 630 wool and 540 cashmere entries was divided into a training set (60%), a validation set (20%) and a test set (20%), used for model training, weight optimization and validation of prediction efficacy, respectively. The specific division is shown in Table 2.

Table 2.

Cashmere wool sample data set partitioning.

Category	Total number of samples	Training set	Validation set	Test set
Wool	630	403	101	126
Cashmere	540	346	86	108
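The counts in Table 2 can be reproduced by a two-step split: 20% of each class is held out as the test set, and 20% of the remainder is then used for validation, which gives proportions of roughly 64:16:20. This is an observation from the numbers in the table, not a procedure stated in the text, and the function below is a hypothetical helper written only to check the arithmetic.

```python
def split_counts(total, test_frac=0.2, val_frac_of_rest=0.2):
    """Hold out a test set, then carve a validation set from the remainder."""
    test = round(total * test_frac)
    rest = total - test
    val = round(rest * val_frac_of_rest)
    train = rest - val
    return train, val, test


wool_split = split_counts(630)      # (train, validation, test)
cashmere_split = split_counts(540)
```

Both results match the Training, Validation and Test columns of Table 2 for wool and cashmere.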

Experimental environment and parameter setting

The proposed model is built on the PyTorch deep learning framework. The experimental environment consists of a 13th Gen Intel(R) Core(TM) i9-13980HX 2.20 GHz processor with 1 TB of computer memory and an NVIDIA GeForce RTX 4080 Laptop GPU with 12 GB of memory. For model optimization and comparison with other models, the system is configured with a conda virtual environment using Python 3.7.0 and PyTorch 1.10.0.

The training algorithm hyperparameters used in this study include but are not limited to learning rate, optimizer, batch size and epochs. The specific settings are illustrated in Table 3.

Table 3.

Parameters of YOLOv8 model.

Name	Parameters
Input	224 × 224
Learning rate update strategy	Cosine annealing
Solver	SGD
Learning rate	0.001
Loss	BCE loss
Batch size	32
Epoch	200
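The cosine annealing strategy in Table 3 can be written in closed form. The sketch below uses the standard cosine schedule without restarts; the minimum learning rate of 0 and the per-epoch update are assumptions, since Table 3 gives only the initial learning rate and the number of epochs.

```python
import math


def cosine_annealing_lr(epoch, total_epochs=200, base_lr=0.001, min_lr=0.0):
    """Learning rate at a given epoch under a single cosine decay cycle."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cosine)


schedule = [cosine_annealing_lr(e) for e in range(201)]
```

Under these assumptions the rate starts at 0.001, passes 0.0005 at epoch 100 and decays to 0 at epoch 200.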

Choice of quantile D in prediction

The number of quantiles D dominates the texture distribution of the MTF image (Figure 9). The parameter sensitivity analysis (Figure 10) shows that the best recognition accuracy is achieved at D = 6, where the texture gradient is significantly enhanced and the distribution of numerical densities is balanced. A value of D that is too small weakens the texture features, and a value that is too large causes density clustering; both reduce classification effectiveness, so D = 6 is adopted as the optimal setting.

Figure 9.

MTF images with different values of quantile D: (a) D = 2, (b) D = 5, (c) D = 10, (d) D = 15, and (e) D = 20.

Figure 10.

Prediction accuracy of cashmere fiber and wool fiber with different quantile D.

Model evaluation index

In this study, the model is trained and tested on the NIR spectral dataset: the training set is used to optimize the parameters iteratively and the test set to verify generalization performance. Recall (R), Accuracy (ACC), F1-score (F1), Precision (P) and the confusion matrix are the main indicators used to evaluate the fiber prediction performance of the model. The evaluation indexes are defined as follows:

R = \frac{TP}{TP + FN},

(12)

ACC = \frac{TP + TN}{TP + TN + FP + FN},

(13)

F1 = 2 \times \frac{P \times R}{P + R},

(14)

P = \frac{TP}{TP + FP},

(15)

where $TP$ is the number of positive samples (cashmere or wool, depending on which is taken as the positive class) correctly predicted as positive; $TN$ is the number of negative samples correctly identified as negative; $FP$ is the number of negative samples incorrectly predicted as positive; and $FN$ is the number of positive samples incorrectly predicted as negative. Cashmere/wool prediction is defined as a mutually exclusive binary classification task (if wool is the positive class, cashmere is the negative class, and vice versa). Classification performance was assessed quantitatively by the accuracy (Acc approaching 1 indicates high accuracy) and the confusion matrix (columns correspond to predicted classes, rows to true classes).
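Equations (12) to (15) can be computed from the four confusion-matrix counts, as in the following sketch. The counts in the example are hypothetical, chosen only to match the test-set sizes in Table 2 (108 cashmere as the positive class, 126 wool as the negative class); they are not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Recall, accuracy, precision and F1 from confusion-matrix counts.

    Assumes tp + fn > 0 and tp + fp > 0 so that no denominator is zero.
    """
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2.0 * precision * recall / (precision + recall)
    return {"R": recall, "ACC": accuracy, "P": precision, "F1": f1}


# Hypothetical counts: 108 positives (tp + fn) and 126 negatives (tn + fp).
metrics = classification_metrics(tp=100, tn=120, fp=6, fn=8)
```

Swapping which fiber is treated as the positive class swaps $TP$ with $TN$ and $FP$ with $FN$, which changes R, P and F1 but leaves ACC unchanged.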

Results and discussion

Prediction performance

Near infrared spectral data are usually affected by noise, baseline offset and other factors during collection, so preprocessing is necessary before establishing the NIR fiber prediction model of cashmere and wool. In this paper, Savitzky-Golay (S-G) filtering, Standard Normal Variate (SNV) transform, First-order Derivative (FD) and their combinations are used. The S-G filter uses a polynomial of degree 2 and 9 smoothing points. The prediction accuracy of cashmere fiber and wool fiber under the different preprocessing methods for the KNN, Decision tree, Random forest, AlexNet, VGG16, GoogLeNet, ResNet50, YOLOv8 and improved YOLOv8 models is shown in Table 4. The image of the near infrared spectral data of cashmere and wool after SNV + S-G pretreatment is shown in Figure 11.

Table 4.

Modeling accuracy of models by different preprocess methods.

Models	Preprocessing method (accuracy, %)
Models	Raw spectra	SNV	S-G	FD	SNV + S-G	SNV + FD	S-G + FD	SNV + S-G + FD
KNN	83.33	81.20	83.76	75.64	85.47	74.36	73.93	77.35
Decision tree	76.50	75.21	76.07	60.26	74.79	60.68	57.26	57.26
Random forest	87.61	86.75	88.03	75.21	88.0	73.08	71.79	78.21
AlexNet	75.64	68.80	76.92	61.54	84.62	65.38	53.84	75.38
VGG16	82.91	85.90	85.47	66.66	82.05	65.38	64.96	76.24
GoogLeNet	81.19	82.08	79.91	69.23	82.05	61.11	64.10	64.96
ResNet50	80.77	83.76	85.47	67.95	82.91	63.68	63.25	63.68
MobileNetV2	82.48	85.04	84.32	63.25	85.04	61.11	67.09	63.68
Dp + ST + YOLOv8	96.44	95.29	95.30	81.19	97.01	79.48	80.29	77.79

Figure 11.

The image of near infrared spectral data of cashmere and wool after SNV + S-G pretreatment.

As can be seen from Table 4, appropriate preprocessing of NIR spectral data can improve prediction accuracy. SNV + S-G preprocessing gives the best accuracy for more models than any other method, indicating that it is well suited to the NIR spectral data of cashmere fibers and wool fibers. Meanwhile, the proposed model achieves the highest prediction accuracy under every preprocessing method except SNV + S-G + FD, where random forest is marginally higher (78.21% versus 77.79%), which supports the effectiveness of the proposed model.

Comparison of ablation experiments

To verify the contribution of each component of the proposed network to cashmere and wool fiber prediction, a series of ablation experiments was carried out on the self-built dataset. The configurations are: the original YOLOv8 network, ST + YOLOv8, Dp + YOLOv8 and Dp + ST + YOLOv8. The ablation experiment results are shown in Table 5.

Table 5.

Ablation experiment.

Method	Acc (%)	R (%)	P (%)	F1-score (%)
YOLOv8	94.74	90.52	90.78	88.95
Dp + YOLOv8	95.23	91.88	92.13	90.53
ST + YOLOv8	96.87	93.69	92.87	91.64
Dp + ST + YOLOv8	97.01	97.58	96.83	97.21

From Table 5, it can be seen that adding the Dp module alone improves Acc and R only slightly, while adding the ST module alone improves them more. Combining the two (Dp + ST + YOLOv8) improves Acc by 2.27 percentage points and R by 7.06 percentage points over the original YOLOv8. For P and F1-score, adding the Dp or ST module alone yields gains of at most about 2 and 3 percentage points, respectively, whereas the combination improves P by 6.05 percentage points and F1-score by 8.26 percentage points. This indicates that the proposed Dp + ST + YOLOv8 network maintains high fiber prediction accuracy while significantly improving the core detection metrics.
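The gains quoted above are differences in percentage points between rows of Table 5. The short check below recomputes them from the tabulated values; the dictionary layout and function name are illustrative only.

```python
# Values copied from Table 5, in percent.
TABLE_5 = {
    "YOLOv8":           {"Acc": 94.74, "R": 90.52, "P": 90.78, "F1": 88.95},
    "Dp + YOLOv8":      {"Acc": 95.23, "R": 91.88, "P": 92.13, "F1": 90.53},
    "ST + YOLOv8":      {"Acc": 96.87, "R": 93.69, "P": 92.87, "F1": 91.64},
    "Dp + ST + YOLOv8": {"Acc": 97.01, "R": 97.58, "P": 96.83, "F1": 97.21},
}


def gain_over_baseline(method, baseline="YOLOv8"):
    """Percentage-point gain of each metric relative to the baseline row."""
    return {name: round(TABLE_5[method][name] - TABLE_5[baseline][name], 2)
            for name in TABLE_5[baseline]}


full_gain = gain_over_baseline("Dp + ST + YOLOv8")
```

For the full model this gives gains of 2.27, 7.06, 6.05 and 8.26 percentage points for Acc, R, P and F1-score, respectively.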

Comparison of detection performance of different methods

The Dp-ST-YOLOv8 network in this paper is used as the recognition framework. In terms of prediction accuracy, the MTF method in this paper is compared with other data conversion methods: Gramian Angular Field (GAF), Recursive Graph (RG), Graphical Differentiation Method (GDM) and Relative Position Matrix (RPM). Each experiment was run for 100 iterations and the best results were selected; the Acc, R, P and F1-score of the five prediction methods are shown in Table 6.

Table 6.

The recognition accuracy of five methods for cashmere wool fiber prediction.

Method	Acc (%)	R (%)	P (%)	F1-score (%)
GAF	91.88	92.13	92.86	92.49
RG	84.62	84.09	88.1	86.05
GDM	80.77	83.47	80.16	81.78
RPM	78.63	77.94	84.13	80.92
Dp+ST+YOLOv8	97.01	97.58	96.83	97.21

As can be seen from Table 6, under the same experimental setup the proposed algorithm reaches an Acc of 97.01%, a clear advantage over GAF (91.88%), RG (84.62%), GDM (80.77%) and RPM (78.63%). In terms of R, Dp + ST + YOLOv8 improves on the other methods by more than 5 percentage points. It also improves P by about 4 percentage points or more (96.83% versus 92.86% for GAF, the closest method). F1-Score, the harmonic mean of P and R, improves accordingly, by more than 4.5 percentage points. In conclusion, Dp + ST + YOLOv8 not only improves the detection accuracy; its high recall also provides better detection of small and occluded targets and higher versatility.

Comparison of feature aggregation of different methods

The degree of feature aggregation is visualized with t-distributed stochastic neighbor embedding (t-SNE). t-SNE converts the similarities between data points into probabilities and evaluates the quality of the feature visualization through the Kullback-Leibler divergence between the joint probabilities of the original space and those of the embedded space. The t-SNE feature distributions of the different methods are shown in Figure 12.
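The Kullback-Leibler criterion described above can be illustrated with a small sketch. It is a simplified, pure-Python illustration and not the implementation used in the paper: it assumes a single fixed Gaussian width `sigma` in the original space (standard t-SNE tunes the width per point from a perplexity), and the toy points and all names are hypothetical.

```python
import math


def _sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def joint_p(points, sigma=1.0):
    """Gaussian pairwise affinities in the original space, normalised to sum to 1.

    Assumption: one fixed sigma for all points (real t-SNE fits it per point).
    """
    n = len(points)
    w = [[0.0 if i == j else math.exp(-_sq_dist(points[i], points[j]) / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]


def joint_q(embedding):
    """Student-t (one degree of freedom) affinities in the embedded space."""
    n = len(embedding)
    w = [[0.0 if i == j else 1.0 / (1.0 + _sq_dist(embedding[i], embedding[j]))
          for j in range(n)] for i in range(n)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) summed over all pairs; pairs with p == 0 contribute nothing."""
    return sum(pij * math.log(pij / max(qij, eps))
               for prow, qrow in zip(p, q)
               for pij, qij in zip(prow, qrow) if pij > 0.0)


# Hypothetical toy data: two tight clusters in a 3-D feature space.
features = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0), (5.1, 5.0, 5.0)]
faithful = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)]   # keeps neighbours together
scrambled = [(0.0, 0.0), (10.0, 10.0), (0.1, 0.0), (10.1, 10.0)]  # splits neighbours apart

p = joint_p(features)
kl_faithful = kl_divergence(p, joint_q(faithful))
kl_scrambled = kl_divergence(p, joint_q(scrambled))
```

In this toy case the embedding that keeps neighbouring points together yields the smaller divergence, which is the sense in which a lower Kullback-Leibler value indicates a more faithful visualization.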

Figure 12.

t-SNE feature distribution of different methods.

It can be seen from Figure 12 that the features obtained with the data conversion method of this paper form clear, well-separated clusters, whereas the clusters produced by the other methods are less distinct. The t-SNE feature distribution shows that the near infrared spectral features of cashmere and wool extracted by this method have good separability.

It can be seen from Table 6 that the fiber prediction accuracy of this method is higher than that of the other methods, indicating that the MTF transformation of cashmere and wool near infrared spectral data effectively reduces the complexity of the data and gives this method better fiber prediction performance and stability.
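For orientation, the general idea of a Markov transition field can be sketched as follows. This is the standard whole-sequence construction under simplifying assumptions, not the local-wavelength variant proposed in this paper: values are assigned to equal-population quantile bins by rank (ties broken by position), and the number of bins `n_bins` and all function names are illustrative.

```python
def quantile_bins(series, n_bins):
    """Assign each value to one of n_bins equal-population bins (0 .. n_bins - 1) by rank."""
    order = sorted(range(len(series)), key=lambda i: series[i])
    bins = [0] * len(series)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(series)
    return bins


def transition_matrix(bins, n_bins):
    """First-order Markov transition probabilities between the bins of adjacent points."""
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A bin that is never left (e.g. it only holds the last point) keeps a zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def markov_transition_field(series, n_bins=4):
    """field[i][j] = probability of moving from the bin of point i to the bin of point j."""
    bins = quantile_bins(series, n_bins)
    w = transition_matrix(bins, n_bins)
    return [[w[bi][bj] for bj in bins] for bi in bins]


if __name__ == "__main__":
    # Hypothetical absorbance values at eight consecutive wavelengths.
    spectrum = [0.10, 0.12, 0.15, 0.30, 0.45, 0.40, 0.22, 0.11]
    field = markov_transition_field(spectrum, n_bins=4)
    print(len(field), "x", len(field[0]), "image")
```

A spectrum of N points thus becomes an N x N matrix of probabilities that can be rendered as an image and passed to an image-based network.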

Comparison of prediction results of different methods

From Table 7, it can be observed that the Dp + ST + YOLOv8 method achieved the highest accuracy in the classification tasks for both types of fibers (cashmere 96.44%, wool 95.29%). It outperforms the second-best model, random forest (cashmere 87.61%, wool 86.75%), by approximately 8.8 and 8.5 percentage points, respectively, and the best of the other deep learning models on wool, VGG16 (85.90%), by approximately 9.4 percentage points. This indicates that integrating the improved YOLOv8 framework with the specific optimization strategies (Dp + ST) significantly enhances classification performance.

Table 7.

The prediction results of cashmere fiber and wool fiber in different models.

Method	Cashmere (%)	Wool (%)
KNN	83.33	81.20
Decision tree	76.50	75.21
Random forest	87.61	86.75
AlexNet	75.64	68.80
VGG16	82.91	85.90
GoogLeNet	81.19	82.08
ResNet50	80.77	83.76
MobileNetV2	82.48	85.04
Dp + ST + YOLOv8	96.44	95.29

Conclusion

In this paper, the problem of accurately identifying cashmere fibers and wool fibers from near-infrared spectra is investigated, and an improved YOLOv8 algorithm is proposed for this task. The algorithm uses MTF to convert the one-dimensional spectral data into images, which addresses the high similarity of traditional one-dimensional spectral features and effectively extracts the features in near-infrared spectra. Replacing the backbone network with a hierarchical visual transformer enhances the sensitivity to local band differences and improves the prediction accuracy. Dropout (Dp) is added to the ST module to prevent overfitting, reduce the complexity of the model and improve generalization. The main conclusions of this paper are summarized as follows.

(i) In the ablation experiment, the Acc and P of the improved YOLOv8 model are improved by 2.28% and 6.05%, respectively. Compared with the original YOLOv8 network, the improved model shows significant improvements in all key metrics. In addition, the improved model shows more reliable performance in cashmere and wool fiber prediction, with higher Acc and P resulting in lower missed-detection rates compared to other models.

(ii) In the dataset experiments, the MTF conversion used with the improved YOLOv8 was compared with the GAF, RG, GDM and RPM conversions. The test results show that the proposed method outperforms the others in Acc, R, P and F1-score. The improvement is especially prominent in complex scenes, small target detection and the differentiation of highly similar targets, providing a better solution for accurate identification of cashmere fiber and wool fiber.

The study also shows that combining deep learning with NIR spectral band data can effectively extract deeper feature information from NIR spectra. Therefore, future research can introduce updated deep learning network models into the field of fiber prediction by NIR spectroscopy, providing a new direction for the prediction of cashmere fiber and wool fiber using NIR spectroscopy.

Footnotes

ORCID iD

Yongli Liu

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by Shaanxi Provincial Department of Education Research Project (23JC031); Xi’an Science and Technology Project (23DCYJSGG0008 2023); Yulin city science and technology plan project (CXY-2020-052) funding.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

An intelligent method for NIR-based prediction of cashmere fiber and wool fiber using Markov transition field and improved YOLOv8
