A hyperspectral band selection algorithm for identifying high oleic acid peanuts

Abstract

High oleic acid peanuts have a higher oleic acid content and stronger oxidative stability than common peanuts, but the two are similar in appearance, which makes them difficult to classify. This study therefore aims to classify high oleic acid peanuts by hyperspectral imaging to ensure their purity. However, classification accuracy and efficiency are limited by the large amount of redundant information in hyperspectral images. The band iteration algorithm (BIA) is proposed to select characteristic bands for peanut classification by reducing the redundant information between spectral bands. First, hyperspectral images with 616 bands were collected for 126 high oleic acid peanuts and 126 common peanuts, and the range from 400 nm to 1100 nm was used for analysis. Then, BIA selected the optimal band from each subset of adjacent bands as a characteristic band, according to the classification accuracy obtained with each candidate band. Finally, three classification models, namely linear discriminant analysis, support vector machine, and partial least squares-discriminant analysis (PLS-DA), were employed to compare the performance of BIA with that of the successive projections algorithm and competitive adaptive reweighted sampling. The experimental results show that BIA can effectively improve the classification ability of spectral data. The BIA-PLS-DA model performed best, and its accuracy on the test set reached 93.26%. At the level of individual peanuts, only one sample was misclassified, giving a classification error rate of 1.43%.

Keywords

Hyperspectral imaging, band selection, band iteration algorithm, peanut classification

Introduction

Peanut is a valuable oil and cash crop, known for its abundant protein content, unsaturated fatty acids, and other beneficial nutritional compounds.¹ Based on the oleic acid content, peanut can be categorized into two types: common peanut, which consists of around 45% oleic acid and 35% linoleic acid,² and high oleic acid peanut, which must have over 74% oleic acid content.³ High oleic acid peanut contains abundant nutrients and bioactive ingredients, which can help reduce the risk of cardiovascular diseases and regulate blood glucose levels.⁴ Moreover, it possesses stronger oxidative stability, extending the shelf life in the food processing industry, thus leading to widespread promotion.^5,6 However, it is difficult to identify and segregate high oleic acid peanuts from mixed peanuts during industrial production, for high oleic acid and common peanuts are similar in appearance. Therefore, an accurate and rapid non-destructive testing method is needed to ensure the purity of high oleic acid peanuts during production and processing.

Spectral imaging technologies can record images at hundreds of contiguous wavelengths,⁷ offering more comprehensive information than traditional red, green, and blue imaging technology.⁸ With the recent evolution of imaging technology, hyperspectral imaging sensors have become increasingly promising in applications such as seed purity and vigor testing,^9,10 fruit variety identification and growth monitoring,^11,12 and crop quality evaluation.¹³ Previous research has attempted to utilize near infrared reflectance (NIR) spectroscopy to predict the oleic acid content in peanuts, providing an effective non-destructive quantitative detection method for the early screening of peanuts.^14,15 Yu et al.¹⁶ employed NIR spectroscopy to classify high oleic acid peanuts and predict the content of the main fatty acids. Davis et al.³ used a novel instrument (QSorter Explorer) to evaluate the purity of high oleic acid peanuts. Fox and Cruickshank¹⁷ predicted fatty acid content in peanuts with NIR spectroscopy. NIR spectroscopy has also been applied to the qualitative analysis of peanuts, where its rapidity and convenience suit industrial production. O'Connor et al.¹⁸ employed partial least squares-discriminant analysis (PLS-DA) to classify the oleic acid content of peanuts based on peanut spectra, achieving an overall classification error rate of 3.3%. Yu et al.¹⁹ constructed a peanut varietal classification model based on machine learning algorithms with NIR spectroscopy.

However, the high dimensional spectral data introduces information redundancy, which leads to the Hughes phenomenon in data processing, that is, the deterioration of classifier performance as feature dimensions increase beyond a critical threshold.^20,21 This not only affects the efficiency of data processing but also limits the accuracy of classification models.

Therefore, research is progressing on extracting valuable information from spectral data with machine learning algorithms. Various dimension reduction methods are employed to identify the most representative bands, aiming to enhance the efficiency and accuracy of detection and analysis. Sun et al.²² combined the successive projections algorithm (SPA) with stepwise regression, as well as competitive adaptive reweighted sampling (CARS) with stepwise regression, to select characteristic wavelengths for retrieving the moisture content distribution of tea leaves. Shao et al.²³ employed SPA to select characteristic wavelengths from spectral data to establish a regression model for predicting the overall quality of tomatoes. Pham et al.²⁴ utilized principal component analysis (PCA) to select effective spectral bands within the wavelength range from 468 nm to 760 nm for online detection of jujube surface defects. Xu et al.²⁵ applied uninformative variable elimination (UVE) to extract characteristic wavelengths, enabling the construction of a classification model for identifying the vigor of maize seeds. Moreover, combining multiple wavelength selection algorithms can enhance the integrity of characteristic wavelengths that reflect the original data. He et al.²⁶ employed SPA, CARS, and UVE to select a total of 31 optimal wavelengths for classifying diploid and triploid maize.

Band selection algorithms are based on various theories, which affect the bands selected for a specific application. Ranking-based methods, such as PCA,²⁷ identify the most informative bands by spatial projection, replacing the original data. However, these approaches change the spatial structure of the original spectral data, potentially leading to the loss of crucial information. Search-based methods, like SPA,²⁸ iteratively search for the least redundant wavelengths. Nevertheless, the chosen bands may capture only strong characteristic information, which usually does not represent the complete band information. In recent years, intelligent optimization algorithms have been increasingly employed for band selection, including the simulated annealing algorithm and genetic algorithm.^29,30 These algorithms address wavelength combination optimization problems by simulating natural processes. However, the optimization process of these algorithms is highly complex and prone to getting trapped in local optima.

The main objectives of this study were to propose a band selection algorithm, namely band iteration algorithm (BIA), to eliminate the redundancy of adjacent bands while preserving the spatial structure and spectral details of the original spectrum, and to construct an optimal model for classifying high oleic acid peanuts based on classification accuracy.

Materials and methods

Experimental sample preparation

To preliminarily validate the feasibility of BIA, eight peanut varieties were selected as experimental samples: one high oleic acid variety, Huayu 910, and seven common peanut varieties, as listed in Table 1. The samples were foundation seeds produced in 2022 and 2023, and were divided into a training set (91 high oleic acid and 91 common peanuts) and a test set (35 high oleic acid and 35 common peanuts), as listed in Table 2.

Table 1.

Typical analysis by variety.^31,32

Peanut variety	Oleic acid content (%)	Type
Huayu 60	45.3	Common
Huayu 6301	48.9	Common
Huayu 910	79.3	High oleic acid
Weihua 8	50.5	Common
Jihua 5	41.1	Common
Luhua 1	38.6	Common
Tianfu 3	52.7	Common
Xinhua 4	40.1	Common

Table 2.

Peanut sample composition of training set and test set.

	Training set		Test set
	High oleic acid	Common peanut	High oleic acid	Common peanut
Foundation seeds in 2022	15	21^b	5	14^a
Foundation seeds in 2023	76	70^c	30	21^b
Total	91	91	35	35

^aTwo samples of each variety of common peanut.

^bThree samples of each variety of common peanut.

^c10 samples of each variety of common peanut.

Spectral image acquisition and preprocessing

Peanut spectral data was collected using a hyperspectral imaging system (ISUZU-HSI-VNIR, Isuzu Optics Corp., Taiwan, China) with a dark background, which acquires spectral information from 280 nm to 1167 nm, with 616 spectral bands and an average spectral resolution of 1.44 nm.

Due to limitations of the experimental instrument, the spectral data below 400 nm and above 1100 nm suffered from significant noise.³³ Thus, the actual spectral wavelength range from 400 nm to 1100 nm was selected, resulting in a total of 481 bands for further analysis.

The spectral images were calibrated with formula (1).³⁴

R = \frac{D_{raw} - D_{dark}}{D_{white} - D_{dark}}

(1)

where $R$ is the calibrated hyperspectral image, $D_{raw}$ is the raw hyperspectral image, $D_{dark}$ is the dark current image captured after covering the camera with a black lens, and $D_{white}$ is the white image captured by the camera on a standard white Teflon board.

To eliminate the interference caused by background noise, the peanut spectral images were masked using a spectral reflectance threshold of 0.06 (dimensionless). This kept the spectral reflectance of the peanut samples unchanged while setting the spectral reflectance of the background to zero. Figure 1(a) shows the average spectra of the eight varieties of peanuts after removing background noise.
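The calibration in formula (1) and the background masking can be sketched for a single pixel spectrum as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and applying the 0.06 threshold to the band-averaged reflectance is an assumption, since the text does not state which band or statistic the threshold is applied to.

```python
def calibrate(raw, dark, white):
    """Apply formula (1) band by band to one pixel spectrum.

    raw, dark, and white are equal-length lists of intensities for the same
    pixel. A band where white equals dark would divide by zero; that case is
    not handled in this sketch.
    """
    return [(r - d) / (w - d) for r, d, w in zip(raw, dark, white)]


def mask_background(reflectance, threshold=0.06):
    """Set a background pixel spectrum to zero and leave sample pixels as-is.

    Assumption: a pixel counts as background when its mean reflectance
    over all bands is below the threshold.
    """
    mean_reflectance = sum(reflectance) / len(reflectance)
    if mean_reflectance < threshold:
        return [0.0] * len(reflectance)
    return list(reflectance)


# Two-band toy pixel: (60 - 10) / (110 - 10) and (110 - 10) / (210 - 10).
pixel = calibrate(raw=[60, 110], dark=[10, 10], white=[110, 210])
print(pixel)                    # [0.5, 0.5]
print(mask_background(pixel))   # [0.5, 0.5], kept as a sample pixel
```

In practice the same arithmetic would be applied to every pixel of the image cube at once; the per-pixel form is used here only to keep the example self-contained.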

Figure 1.

The average spectra (a) after removing background noise, and (b) after moving average preprocessing.

In this study, 50 spectral data points were extracted from the corrected spectral image of each peanut, with every collection point located within the valid peanut region of the image. Consequently, there are 9100 points in the training set and 3500 points in the test set.

To further alleviate noise and improve the availability of data, the corrected spectral data was preprocessed by moving average (MA) with 21 smoothing points.³⁵ Figure 1(b) shows the average spectra after preprocessing. There are no significant differences in the spectral reflectance curves of different peanut varieties. Within the spectral range of 400 nm to 900 nm, the reflectance increases rapidly with increasing wavelength. Subsequently, the reflectance decreases gradually, accompanied by the appearance of an absorption valley within the range of 950 nm to 1070 nm.
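A moving average with 21 smoothing points can be sketched as below. The treatment of the spectrum ends is an assumption (the window is shrunk symmetrically so that the output keeps the same number of bands); the text does not say how the edges were handled, and the function name is hypothetical.

```python
def moving_average(spectrum, window=21):
    """Centered moving average over one spectrum (a list of reflectances).

    Near the ends, the window shrinks symmetrically so that every output
    value is still the mean of a window centered on its own band.
    """
    half = window // 2
    n = len(spectrum)
    smoothed = []
    for i in range(n):
        k = min(half, i, n - 1 - i)  # largest symmetric half-width that fits
        segment = spectrum[i - k:i + k + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# A single spike is spread over its neighbours (window of 3 for readability).
print(moving_average([0, 0, 3, 0, 0], window=3))  # [0.0, 1.0, 1.0, 1.0, 0.0]
```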

The absorption region in the NIR spectroscopy curve is influenced by the stretching vibrations of various chemical groups, and the intensity of absorption is proportional to the content of these chemical groups.^36–38 In the spectral region of 950 nm to 1070 nm, the appearance of absorption valleys can be attributed to the third overtone C-H stretches.³⁹ Peanut oil contains a significant number of C-H groups, which explains the presence of these absorption valleys. Notably, Huayu 910 has a deeper spectral absorption valley than the other peanut varieties, which suggests a higher oil content in Huayu 910 than in the others.

Band iteration algorithm

Proposed method

The flowchart of the BIA is depicted in Figure 2. The algorithm’s main principle is to evaluate the classification performance of the training sets corresponding to individual bands within a set of candidate bands. It selects the bands which can achieve the highest classification accuracy, called characteristic bands. In each iteration, the training data incorporates the previously selected characteristic bands, accumulating their effects as the number of iterations increases.

Figure 2.

The flowchart of the band iteration algorithm.

The full-spectrum bands $P_{1}, P_{2}, \dots, P_{n}$ ($n$ denotes the total number of spectral bands) are partitioned into a new data set $I$, which is defined as:

I = \{I_{1}, I_{2}, \dots, I_{i}, \dots, I_{s}\}

(2)

s = \begin{cases} [\frac{n}{t}] + 1, & n \bmod t \neq 0 \\ [\frac{n}{t}], & n \bmod t = 0 \end{cases}

(3)

where $t$ denotes the number of bands in a band subset, $I_{i}$ denotes a band subset, and $[x]$ denotes the largest integer not larger than $x$. When $n \bmod t \neq 0$, that is, when $n$ is not divisible by $t$, the missing bands of the last subset are filled with 0.

So, $I_{i} = \{P_{(i - 1) t + 1}, P_{(i - 1) t + 2}, \dots, P_{(i - 1) t + j}, \dots, P_{(i - 1) t + t}\}$.

The characteristic band set $B$ is initialized to an empty set.
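As a worked example of the partition, the sketch below reads the condition in equation (3) as "$n$ is not divisible by $t$", which is what the zero-filling remark implies. It uses the 481 retained bands and a subset size of 3; the function name is hypothetical, and `None` stands in for a zero-filled slot.

```python
def partition_bands(n, t):
    """Split band indices 1..n into consecutive subsets of t bands.

    Slots past band n in the last subset are marked None, standing in for
    the zero-filled bands described in the text.
    """
    s = n // t + (1 if n % t != 0 else 0)
    subsets = []
    for i in range(1, s + 1):
        bands = [(i - 1) * t + j for j in range(1, t + 1)]
        subsets.append([b if b <= n else None for b in bands])
    return subsets


subsets = partition_bands(481, 3)
print(len(subsets))   # 161
print(subsets[0])     # [1, 2, 3]
print(subsets[-1])    # [481, None, None]
```

With $n = 481$ and $t = 3$, $[n/t] = 160$ and one band is left over, so $s = 161$ and the last subset holds band 481 plus two filled slots.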

In the first step, $I_{1}$ is processed. A training set $Q$ is defined as:

Q = \{Q_{1}, Q_{2}, \dots, Q_{j}, \dots, Q_{t}\}

(4)

where $Q_{j} = B \cup \{P_{j}\} \cup I_{2} \cup I_{3} \cup \dots \cup I_{i} \cup \dots \cup I_{s}$.

Each subset $Q_{j}$ in training set $Q$ is input to the classifier, which generates the corresponding training accuracy, and together these accuracies form the complete training accuracy set $C$. The higher the classification accuracy of $Q_{j}$, the more suitable its corresponding band $P_{j}$ is for classification compared with the other candidate bands of $I_{1}$. Therefore, the band $P_{j}$ whose training subset yields the largest accuracy value is chosen as the characteristic band in $I_{1}$. If the maximum accuracy within a subset corresponds to multiple bands, the band closest to the middle of the subset is selected.

Then, dataset $I$ is updated as:

I = \{I_{2}, I_{3}, \dots, I_{i}, \dots, I_{s}\}

(5)

and the characteristic band set $B$ is updated as:

B = B \cup \{P_{j}\}

(6)

Each subset is processed in turn according to the above steps, with the previously selected characteristic bands carried into each new iteration.

After all subsets have been processed, the characteristic band set $B$ for the whole dataset is obtained. The whole process of the proposed BIA is illustrated in Algorithm 1, which is coded in Python 3.10.6.
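The iteration described above can be sketched as follows. This is an illustrative reading of the steps, not the authors' Algorithm 1. Several things are assumptions: a simple nearest-centroid classifier stands in for LDA, PLS-DA, or SVM as the scorer; band indices are 0-based; and a shorter last subset is used instead of zero-filling, since zero-filled bands carry no information. All function names are hypothetical.

```python
def nearest_centroid_accuracy(rows, labels):
    """Stand-in scorer: training accuracy of a nearest-centroid classifier."""
    centroids = {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    correct = 0
    for r, y in zip(rows, labels):
        pred = min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(r, centroids[c])))
        correct += pred == y
    return correct / len(rows)


def band_iteration(spectra, labels, t, score=nearest_centroid_accuracy):
    """Return the 0-based indices of one characteristic band per subset."""
    n = len(spectra[0])
    # Consecutive subsets of t bands; the last one may be shorter.
    subsets = [list(range(start, min(start + t, n))) for start in range(0, n, t)]
    selected = []  # characteristic band set B
    for i, subset in enumerate(subsets):
        # Bands of all subsets that have not been processed yet.
        remaining = [b for later in subsets[i + 1:] for b in later]
        accuracies = []
        for band in subset:
            cols = selected + [band] + remaining  # Q_j = B + {P_j} + later subsets
            rows = [[spectrum[c] for c in cols] for spectrum in spectra]
            accuracies.append(score(rows, labels))
        best = max(accuracies)
        tied = [k for k, a in enumerate(accuracies) if a == best]
        middle = (len(subset) - 1) / 2
        # Ties go to the candidate closest to the middle of the subset.
        k_best = min(tied, key=lambda k: abs(k - middle))
        selected.append(subset[k_best])
    return selected


# Toy data: only band 2 separates the two classes; the other bands are flat.
spectra = [[0, 0, 0, 0, 0, 0]] * 4 + [[0, 0, 1, 0, 0, 0]] * 4
labels = [0] * 4 + [1] * 4
print(band_iteration(spectra, labels, t=3))  # [2, 4]
```

In the toy run, band 2 is picked from the first subset because it alone gives perfect training accuracy, and band 4 is picked from the second subset by the middle-of-subset tie rule. Any classifier that returns a training accuracy can be passed as `score`.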

Building classification model

To validate the effectiveness of BIA in selecting characteristic bands, a comparison was made with SPA and CARS. Three classification algorithms, namely linear discriminant analysis (LDA),⁴⁰ PLS-DA,⁴¹ and support vector machine (SVM),⁴² were employed to establish the classification models. These algorithms are based on different principles, resulting in variations in their classification performance. Based on the classification results, the best-performing classification model was determined.

Model evaluation method

To analyze the effectiveness of classification models more accurately, four evaluation indexes, accuracy (ACC), precision (P), recall (R), and F1 score (F1), are introduced:

ACC = \frac{TP + TN}{TP + TN + FP + FN}

(7)

P = \frac{TP}{TP + FP}

(8)

R = \frac{TP}{TP + FN}

(9)

F1 = \frac{2 \cdot P \cdot R}{P + R}

(10)

where $TP$ is the number of high oleic acid peanut spectral data points that are predicted correctly; $TN$ is the number of common peanut spectral data points that are predicted correctly; $FP$ is the number of common peanut spectral data points that are incorrectly predicted as high oleic acid; and $FN$ is the number of high oleic acid peanut spectral data points that are incorrectly predicted as common.
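The four indexes can be computed as in the sketch below, which treats high oleic acid as the positive class and uses the standard definition of precision, $TP / (TP + FP)$. The counts are illustrative only: they are scaled to 1000 points per class from the BIA-3-PLS-DA test-set accuracies reported in Table 3 (96.5% and 90.1%) and are not the actual confusion matrix. The function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, recall, and F1 from confusion counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return acc, precision, recall, f1


# Illustrative counts per 1000 points of each class (see the note above).
acc, p, r, f1 = classification_metrics(tp=965, tn=901, fp=99, fn=35)
print(round(acc, 3), round(p, 3), round(r, 3), round(f1, 3))
# 0.933 0.907 0.965 0.935
```

These values agree with the BIA-3-PLS-DA entries in Tables 3 and 4 to the precision reported there.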

Results and discussion

Band iteration algorithm characteristic bands selection

The BIA can select characteristic bands at different band intervals when it is combined with the LDA, PLS-DA, and SVM classifiers. The band intervals (the value of $t$) tested were 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10. Figure 3 illustrates the experimental results of each classifier for different numbers of bands in the subset. It can be observed that the classification accuracy initially increases and then decreases for each algorithm. This trend indicates that the selection of characteristic bands by BIA effectively reduces redundancy between adjacent bands, which is conducive to enhancing the classification accuracy. However, if the number of bands in the subset is too large, the band selected by BIA may not fully represent the information of the entire band subset, leading to a loss of spectral data and a reduction in accuracy. Among the different band subsets, the combination of PLS-DA with BIA using three bands per subset (named BIA-3) achieved the highest classification accuracy. Hence, it can be concluded that the optimal number of bands in the subset is 3, the optimal classifier for selecting characteristic bands is PLS-DA, and their combined classification model is termed BIA-3-PLS-DA.

Figure 3.

Classification accuracy of different classifiers based on different band intervals.

Classification results

Table 3 shows the classification results of different models. The BIA-3-PLS-DA model achieved a test set accuracy of 93.26%. The test set accuracy of the CARS models is slightly lower than that of BIA, while the SPA models suffered from an underfitting problem. For the classification of peanut types, the BIA-3-PLS-DA model has a higher classification accuracy for high oleic acid peanuts than for common peanuts.

Table 3.

Accuracy of different classification models.

Classification model	Training set accuracy (%)			Test set accuracy (%)		
	High oleic acid	Common peanut	Average	High oleic acid	Common peanut	Average
SPA-LDA	87.3	73.2	80.3	73.5	76.9	75.2
SPA-SVM	86.5	71.4	78.9	79.2	79.9	79.6
SPA-PLS-DA	88.5	72.2	80.4	75.7	76.7	76.2
CARS-LDA	97.3	95.0	96.2	95.9	89.0	92.5
CARS-SVM	91.2	91.3	91.2	87.7	87.7	87.7
CARS-PLS-DA	97.3	95.0	96.2	95.9	88.9	92.4
BIA-3-PLS-DA	96.8	95.1	96.0	96.5	90.1	93.3

Additional evaluation metrics computed based on the classification results are presented in Table 4. All metrics for BIA-PLS-DA are superior to those of the other models. This observation suggests that the BIA-PLS-DA model is more robust in the classification of high oleic acid peanut, and further verifies the effectiveness of BIA in identifying high oleic acid peanut through the selection of characteristic bands.

Table 4.

Classification model evaluation indexes.

Classification model	P	R	F1
SPA-LDA	0.761	0.74	0.748
SPA-SVM	0.798	0.79	0.795
SPA-PLS-DA	0.765	0.76	0.761
CARS-LDA	0.897	0.96	0.927
CARS-SVM	0.877	0.88	0.877
CARS-PLS-DA	0.896	0.96	0.926
BIA-3-PLS-DA	0.907	0.97	0.935

The classification accuracy of single peanuts is a key metric to evaluate the accuracy of the overall model. The average classification results of 50 spectral data points for each peanut were calculated to reflect the classification of peanut individuals. The classification effect of the BIA-3-PLS-DA model for the single peanut in the test set is shown in Figure 4. Only one common peanut was misclassified, with a classification error rate of 1.43%.

Figure 4.

BIA-3-PLS-DA prediction of the test set (35 high oleic acid and 35 common peanuts).

Classification results visualization

Figure 5 shows visualization results of the best model built by the three band selection algorithms. The visualization sample set consists of 45 peanuts and is divided into sample 1 and sample 2. Sample 1 consists of peanut foundation seeds in 2022 (5 high oleic acid and 15 common peanuts), and sample 2 consists of peanut foundation seeds in 2023 (10 high oleic acid and 10 common peanuts).

Figure 5.

Visualization of classification effect. (White boxes indicate misclassified peanuts).

The pixel-wise classification is applied to each peanut, and a high oleic acid peanut is identified based on the pixel number of high oleic acid class in a peanut area. Specifically, if the ratio of pixels classified as high oleic acid is greater than a given threshold (α), the peanut is classified as high oleic acid peanut.^43,44 The value of threshold directly influenced the classification outcome, and α is determined to be 0.8 based on empirical values.
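The peanut-level decision rule can be sketched as below, assuming pixel predictions are encoded as 1 for high oleic acid and 0 for common (an encoding chosen here for illustration; the function name is also hypothetical).

```python
def classify_peanut(pixel_labels, alpha=0.8):
    """Label one peanut from its pixel-wise predictions (1 = high oleic acid).

    The peanut is called high oleic acid only when the ratio of pixels
    predicted as high oleic acid is strictly greater than alpha.
    """
    ratio = sum(pixel_labels) / len(pixel_labels)
    return "high oleic acid" if ratio > alpha else "common"


print(classify_peanut([1] * 9 + [0]))      # ratio 0.9 -> high oleic acid
print(classify_peanut([1] * 8 + [0] * 2))  # ratio 0.8 -> common
```

Because the comparison is strict, a peanut with exactly 80% of its pixels predicted as high oleic acid is still labelled common.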

In Figure 5, common peanuts are represented by red, high oleic acid peanuts are represented by blue, and misclassified peanuts are indicated by white boxes. The results show that BIA-3-PLS-DA has a higher correct classification rate than the other models.

Conclusions

In this study, a novel band selection algorithm, named BIA, was proposed and employed to select characteristic bands for the classification of high oleic acid peanuts. Compared with two other characteristic band selection algorithms (SPA and CARS), the PLS-DA model based on BIA obtained the highest accuracy on the test set and a low misclassification rate for individual peanuts. In conclusion, this study offers an effective band selection method to enhance the efficiency and accuracy of identifying high oleic acid peanuts from spectral images. However, the experiment employed a limited number of sample varieties, which could adversely affect the classification performance for high oleic acid peanuts. Further studies should include more samples and more varieties to assess the feasibility of BIA and the stability of the classification model.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by Open Fund of Infrared and Low Temperature Plasma Key Laboratory of Anhui Province [grant number IRKL2023KF04]; Anhui Provincial DOHURD Science Foundation [grant number 2022-YF077]; Hubei Key Laboratory of Optical Information and Pattern Recognition [grant number 202204]; and Doctoral Starting up Foundation of Anhui Jianzhu University [grant number 2020QDZ05, 2022QDZ03].

ORCID iD

Xingyun Li

References

1. Çiftçi, Suna. Functional components of peanuts (Arachis hypogaea L.) and health benefits: a review. Future Foods 2022; 5: 100140.

2. Chamberlin, Grey, Puppala, et al. Comparison of field emergence and thermal gradient table germination rates of seed from high oleic and low oleic near isogenic peanut lines. Peanut Sci 2021; 48: 131–143.

3. Davis, Agraz, Kline, et al. Measurements of high oleic purity in peanut lots using rapid, single kernel near-infrared reflectance spectroscopy. J Am Oil Chem Soc 2021; 98: 621–632.

4. Derbyshire. A review of the nutritional composition, organoleptic characteristics and biological effects of the high oleic peanut. Int J Food Sci Nutr 2014; 65: 781–790.

5. Nawade, Mishra, Radhakrishnan, et al. High oleic peanut breeding: achievements, perspectives, and prospects. Trends Food Sci Technol 2018; 78: 107–119.

6. Talcott, Passeretti, Duncan, et al. Polyphenolic content and sensory properties of normal and high oleic acid peanuts. Food Chem 2005; 90: 379–388.

7. Cao, Zhang, Chen, et al. Identification of species and geographical strains of Sitophilus oryzae and Sitophilus zeamais using the visible/near-infrared hyperspectral imaging technique. Pest Manag Sci 2015; 71: 1113–1121.

8. Okamoto, Murata, Kataoka, et al. Plant classification for weed detection using hyperspectral imaging with wavelet analysis. Weed Biol Manag 2007; 7: 31–37.

9. Liu, Zeng, et al. Rice seed purity identification technology using hyperspectral image with lasso logistic regression model. Sensors 2021; 21: 4384.

10. Liu, Jiang, et al. Comparison of partial least squares-discriminant analysis, support vector machines and deep neural networks for spectrometric classification of seed vigour in a broad range of tree species. J Near Infrared Spectrosc 2021; 29: 33–41.

11. Huang, Yang, Sun, et al. Identification of apple varieties using a multichannel hyperspectral imaging system. Sensors 2020; 20: 5120.

12. Benelli, Cevoli, Fabbri, et al. Ripeness evaluation of kiwifruit by hyperspectral imaging. Biosyst Eng 2022; 223: 42–52.

13. Jiang, et al. Rapid and non-destructive detection of natural mildew degree of postharvest Camellia oleifera fruit based on hyperspectral imaging. Infrared Phys Technol 2022; 123: 104169.

14. Sundaram, Kandala, Butts, et al. Nondestructive NIR reflectance spectroscopic method for rapid fatty acid analysis of peanut seeds. Peanut Sci 2011; 38: 85–92.

15. Tillman, Gorbet, Person. Predicting oleic and linoleic acid content of single peanut seeds using near-infrared reflectance spectroscopy. Crop Sci 2006; 46: 2121–2126.

16. Liu, Wang, et al. Evaluation of portable and benchtop NIR for classification of high oleic acid peanuts and fatty acid quantitation. LWT - Food Sci Technol 2020; 128: 109398.

17. Fox, Cruickshank. Near infrared reflectance as a rapid and inexpensive surrogate measure for fatty acid composition and oil content of peanuts (Arachis hypogaea L.). J Near Infrared Spectrosc 2005; 13: 287–291.

18. O’Connor, Meder, Furtado, et al. Single kernel sorting of high and normal oleic acid peanuts using near infrared spectroscopy. J Near Infrared Spectrosc 2021; 29: 366–370.

19. Erasmus, Wang, et al. Rapid classification of peanut varieties for their processing into peanut butters based on near-infrared spectroscopy combined with machine learning. J Food Compos Anal 2023; 120: 105348.

20. Shahshahani, Landgrebe. The effect of unlabeled samples in reducing the small sample size problem and mitigating the Hughes phenomenon. IEEE Trans Geosci Rem Sens 1994; 32: 1087–1095.

21. Hughes. On the mean accuracy of statistical pattern recognizers. IEEE Trans Inf Theor 1968; 14: 55–63.

22. Sun, Zhou, et al. Visualizing distribution of moisture content in tea leaves using optimization algorithms and NIR hyperspectral imaging. Comput Electron Agric 2019; 160: 153–159.

23. Shao, Shi, Qin, et al. A new quantitative index for the assessment of tomato quality using Vis-NIR hyperspectral imaging. Food Chem 2022; 386: 132864.

24. Thien Pham, Liou N-S. The development of on-line surface defect detection system for jujubes based on hyperspectral images. Comput Electron Agric 2022; 194: 106743.

25. Zhang, Tan, et al. Vigor identification of maize seeds by using hyperspectral imaging combined with multivariate data analysis. Infrared Phys Technol 2022; 126: 104361.

26. Liu, et al. Discriminant analysis of maize haploid seeds using near-infrared hyperspectral imaging integrated with multivariate methods. Biosyst Eng 2022; 222: 142–155.

27. Wold, Esbensen, Geladi. Principal component analysis. Chemometr Intell Lab Syst 1987; 2: 37–52.

28. Araújo, Saldanha, Galvão, et al. The successive projections algorithm for variable selection in spectroscopic multicomponent analysis. Chemometr Intell Lab Syst 2001; 57: 65–73.

29. Pei, Huang, et al. A two-step simulated annealing algorithm for spectral data feature extraction. Sensors 2023; 23: 893.

30. Aghaee, Momeni, Moallem. Semisupervised band selection from hyperspectral images using levy flight-based genetic algorithm. Geosci Rem Sens Lett IEEE 2022; 19: 1–5.

31. Institute of Crop Sciences, CAAS. Crop list-CGRIS. 2021, https://www.cgris.net/query/croplist.php (accessed 28 June 2023).

32. A-seed. Information inquiry of registered varieties 2018. https://www.a-seed.cn/index.php (accessed 28 June 2023).

33. Zhang, Wei, Zhao, et al. A reliable methodology for determining seed viability by using hyperspectral data from two sides of wheat seeds. Sensors 2018; 18: 813.

34. Zhang, Wang, Liu, et al. Vis-NIR hyperspectral imaging combined with incremental learning for open world maize seed varieties identification. Comput Electron Agric 2022; 199: 107153.

35. Zhang, Liu. Identification of coffee bean varieties using hyperspectral imaging: influence of preprocessing methods and pixel-wise spectra analysis. Sci Rep 2018; 8: 2166.

36. Amsaraj, Mutturi. Support vector machine-based rapid detection and quantification of butter yellow adulteration in mustard oil using NIR spectra. Infrared Phys Technol 2023; 129: 104543.

37. Zhang, Gao, Yang, et al. Rapid identification of the storage age of dried tangerine peel using a hand-held near infrared spectrometer and machine learning. J Near Infrared Spectrosc 2022; 30: 31–39.

38. Kobayashi K-I, Matsui, Maebuchi, et al. Near infrared spectroscopy and hyperspectral imaging for prediction and visualisation of fat and fatty acid content in intact raw beef cuts. J Near Infrared Spectrosc 2010; 18: 301–315.

39. Rabanera, Guzman, Yaptenco. Rapid and non-destructive measurement of moisture content of peanut (Arachis hypogaea L.) kernel using a near-infrared hyperspectral imaging technique. J Food Meas Char 2021; 15: 3069–3078.

40. Chang, Nie, Wang, et al. Self-weighted learning framework for adaptive locality discriminant analysis. Pattern Recogn 2022; 129: 108778.

41. Bevilacqua, Marini. Local classification: locally weighted–partial least squares-discriminant analysis (LW–PLS-DA). Anal Chim Acta 2014; 838: 20–30.

42. Hoła, Czarnecki. Random forest algorithm and support vector machine for nondestructive assessment of mass moisture content of brick walls in historic buildings. Autom Construct 2023; 149: 104793.

43. Yuan, Jiang, Gong, et al. Moldy peanuts identification based on hyperspectral images and point-centered convolutional neural network combined with embedded feature selection. Comput Electron Agric 2022; 197: 106963.

44. Jiang, Cui, et al. Moldy peanut kernel identification using wavelet spectral features extracted from hyperspectral images. Food Anal Methods 2020; 13: 445–456.