Review: The evolution of chemometrics coupled with near infrared spectroscopy for fruit quality evaluation. II. The rise of convolutional neural networks

Abstract

The Part 1 prequel to this review evaluated the evolution of modelling techniques used in the evaluation of fruit quality over the past three decades and noted a progression towards the use of artificial neural networks (ANNs) and convolutional neural networks (CNNs). In this review, Part 2, the use of CNNs for NIR fruit quality evaluation is explored, given the success of CNNs in various other fields, such as image, video, speech, and audio processing, and the availability of large (open source) datasets of fruit spectra and reference quality attributes, which are required for the training of CNN models. The review provides an overview of deep learning and the CNN architectures and techniques used in NIR spectroscopy for regression modelling, with advantages and disadvantages identified. Studies using CNN for NIR based fruit quality evaluation are then critically examined. Eight publications have presented models using the same open-source mango dry matter calibration and test set, enabling inter-method comparisons. CNN models have been demonstrated to be accurate, precise and robust. Techniques of transfer learning for CNN models offer an alternative solution to model updating and calibration transfer methods applied in traditional chemometrics. The review has highlighted crucial areas that require resolution and exploration in this application through future research, including (i) data requirements for training a CNN, (ii) optimal spectral pre-processing for CNN, (iii) CNN architecture and hyper-parameter selection and tuning for fruit quality evaluation, and (iv) CNN model interpretability and explainability. Future studies must provide clearer comparisons to partial least squares (PLS) regression and shallow ANNs to better assess the prospective benefit of using CNN, a more complex model. The potential for visualisation of spectra relevance to the CNN model using techniques such as GradCAM, currently employed in visualising 2D-CNN models, remains to be explored.

Keywords

Fruit, review, point spectroscopy, chemometrics, deep learning, artificial neural network, dry matter content

Introduction

Scope

Many industries and applications are accelerating their use of artificial intelligence, implementing machine learning over traditional statistical modelling methods. Traditionally, mechanistic models were used to predict future outcomes, with these models based on an understanding of the underlying mechanism, represented with a statistical model or specific rule set. Machine learning is different in that rather than coding the underlying knowledge, machine learning methods automatically learn patterns and relationships from provided examples.¹ This leads to a shift in focus to appropriate training of the models, such as use of sizable data sets and avoiding overfitting by selection of suitable training data.

This trend is relevant to applications involving near infrared (NIR) spectroscopy. As exemplified by the pioneering work in this discipline by Norris,² the traditional approach to model development involved a statistician with an underlying knowledge of chemistry and related spectroscopy of a commodity, i.e., a chemometrician, working with relatively small data sets. Improvements in computing power and availability of large data sets that span sample conditions have supported a trend to use of machine learning techniques.

The preceding review by Anderson and Walsh³ detailed the evolution of modelling techniques employed in the field of NIR spectroscopy for fruit quality evaluation over the past three decades. A progression in modelling technique usage was noted from multiple linear regression (MLR) to the use of partial least squares (PLS) regression and more recently to techniques such as support vector machines (SVM), artificial neural networks (ANN) and convolutional neural networks (CNN). PLS regression models were noted to have dominated horticultural applications for over two decades. It was also noted that many studies used a validation set drawn from the same population (growing conditions) as the calibration set.³ Those studies that dealt with multiple populations varying in growing condition, season, cultivar, etc., noted a need for model updating.³

The availability of larger training datasets has enabled the development of deep learning techniques which have shown promise in their ability to generalise across seasonal differences and potentially instruments. For example, shallow (single hidden layer) ANN models have been employed in spectroscopy for over two decades.⁴ ANNs have been demonstrated to outperform PLS regression models in some specific cases⁵ and have been adopted into commercial use, e.g., across the FOSS Infratec instruments.⁶ Additionally, several large datasets of fruit spectra and quality attributes have been publicly released^7,8 which enables direct comparison of models developed by any researcher using the same training and test sets.

The current review extends the consideration of Anderson and Walsh³ on the evolution of modelling techniques for NIR spectroscopy for fruit quality evaluation, with a focus on the use of convolutional neural networks (CNNs), which are a particular form of deep learning. Given that the application of CNN is relatively new to the field of NIR spectroscopy, an introduction is provided to deep learning and the CNN architecture, followed by a consideration of the use of CNNs in NIR spectroscopy in general. For example, while spectral pre-treatment techniques have largely been developed to accentuate information specifically for PLS regression modelling, there is some suggestion that CNN models may require less pre-treatment.⁹ CNN applications for the assessment of fruit quality using NIR spectroscopy are then examined, with advantages and limitations of the technique deduced and areas for refinement and improvement suggested.

Conduct of the literature survey

A systematic approach was adopted to identify relevant articles for review in the sections CNN for NIR spectroscopy and CNN-NIR in fruit quality evaluation. The searches were conducted programmatically across three popular databases, Scopus, Web of Science and Google Scholar, utilising the open-source Python packages pybliometrics, wos and scholarly, available from the Python Package Index (PyPI).¹⁰ Search queries were confined to the period from 01/01/2015 to 20/11/2022, and involved a search of the title, abstract and keywords of each article. Pre-prints and non-peer reviewed articles were excluded.

Slight differences were noted between the querying ability of the three databases used. Scopus provides the functionality to search across title, abstract and keywords. Web of Science, however, can only search each section separately. For example, for a search on the string “CNN” AND “NIR”, an article containing “CNN” in the title and “NIR” in the abstract but not the title will be captured by Scopus but not Web of Science. Google Scholar does not provide the functionality to limit searches to abstract or keywords, only to title or the full text. Many non-relevant results were returned when searching across the full text, so Google Scholar searches were restricted to title only.

The specific query term used was refined over several searches to increase the relevancy of the returned results. Table 1 presents the process of refinement used and the number of results returned from each database for each search. The iterative process of refinement involved review of the results of each search to identify terms associated with non-relevant articles. For example, as CNN was developed for use in image analysis, there were many results involving hyper and multi-spectral studies for searches using the string “NIR and CNN”. The search was broadened in query six by inclusion of the term deep learning to increase the number of articles captured for fruit applications.

Table 1.

Summary of returned results on 20 November 2022 for search queries during refinement.

Query	Search term	Scopus	Web of Science	Google Scholar
(1)	(“Convolutional neural network” OR “CNN”) AND (“near infrared spectroscopy” OR “NIR”)	397	275	∼6670^a; 47 (title only)
(2)	(1) AND NOT “hyperspectral”	357	250	45
(3)	(2) AND NOT “image”	176	183	42
(4)	(3) AND NOT “fruit”	158	176	42
(5)	(3) AND “fruit”	18	8	0
(6)	(“Deep learning” OR “convolutional neural network” OR “CNN”) AND (“near infrared spectroscopy” OR “NIR”) AND NOT “hyperspectral” AND NOT “image” AND “fruit”	26	15	2

^aQuery across full text.

The topic of CNN used in the context of NIR spectroscopy in general was reviewed based on query 4, (“convolutional neural network” OR “CNN”) AND (“near infrared spectroscopy” OR “NIR”) AND NOT “hyperspectral” AND NOT “fruit”, with 211 unique articles identified from the three databases. These articles were screened to remove those with the term “classification”, “discrimination”, “detection”, “recognition”, “FNIRS”, “imaging” or “images” present in the title, abstract or keywords. This step was conducted in Python on the combined results rather than the individual database searches. This resulted in 51 articles, each of which was then manually screened for relevancy. A further 18 articles were removed as they did not meet the criteria, resulting in a total of 33 articles which are reviewed in the section CNN for NIR spectroscopy. This survey method is presented in Figure 1(a).
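The keyword screening step described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' script: the record structure (dictionaries with `title`, `abstract` and `keywords` fields), the function names and the de-duplication by lower-cased title are all assumptions made for the example.

```python
# Exclusion terms taken from the text; matching is case-insensitive.
EXCLUDE_TERMS = ["classification", "discrimination", "detection",
                 "recognition", "fnirs", "imaging", "images"]


def is_excluded(record, terms=EXCLUDE_TERMS):
    """Return True if any exclusion term appears in title, abstract or keywords."""
    text = " ".join([
        record.get("title", ""),
        record.get("abstract", ""),
        " ".join(record.get("keywords", [])),
    ]).lower()
    return any(term in text for term in terms)


def screen(records):
    """Drop duplicates (assumed here to share a lower-cased title) and excluded records."""
    seen, kept = set(), []
    for rec in records:
        key = rec.get("title", "").strip().lower()
        if key in seen:
            continue  # same article returned by more than one database
        seen.add(key)
        if not is_excluded(rec):
            kept.append(rec)
    return kept
```

Applying the filter to the merged results, rather than to each database separately, means an article is removed regardless of which database supplied the field containing the exclusion term.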

Figure 1.

Literature survey procedure for the topic of (a) CNN used in NIR spectroscopy, and (b) CNN used in NIR spectroscopy for the application of fruit quality evaluation.

The review was then narrowed to use of CNN in NIR spectroscopy related to fruit quality evaluation using the search term (“deep learning” OR “convolutional neural network” OR “CNN”) AND (“near infrared spectroscopy” OR “NIR”) AND NOT “hyperspectral” AND NOT “image” AND “fruit” (query 6), with 29 unique articles identified and manually screened. A total of 22 articles (2 review papers, nine classification type models and 11 regression models) were reviewed (CNN-NIR in fruit quality evaluation). The survey method for this section is presented in Figure 1(b).

Deep learning

This section provides a brief introduction to ‘deep learning’ with a specific focus on CNN architecture. ANNs are a commonly used modelling technique for non-linear systems, inspired by the biological neural networks that make up animal brains.¹¹ Deep learning techniques are classified as ANNs with more than three layers that utilise representation learning, which may be supervised, semi-supervised or unsupervised.^12,13 There are many ANN architectures that are classed as deep learning, including CNN, recurrent neural networks and recursive neural networks.¹⁴

Artificial neural networks

An ANN is made from numerous interconnected nodes, known as artificial neurons, which receive and process information to signal connected neurons in the following layer.^13,15 Neurons are arranged into layers of the network referred to as the input, hidden and output layers. Each hidden layer performs a transformation function on inputs, while the final output layer produces the prediction of the model. A neuron typically processes information using a non-linear function, with the exception of the output layer. The connections between neurons are weighted based on the training of the model. A bias can also be used to shift the output of a neuron.
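As an illustration of the description above, the following sketch computes the output of a shallow network with one non-linear hidden layer and a linear output neuron. The function names, the choice of tanh as the non-linear function and the argument layout are assumptions for the example; real implementations would normally use a numerical library.

```python
import math


def identity(z):
    """Linear activation, used here for the output neuron."""
    return z


def neuron(inputs, weights, bias, activation=math.tanh):
    """Weighted sum of the inputs, shifted by the bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


def layer(inputs, weight_rows, biases, activation=math.tanh):
    """One layer: every neuron sees all inputs but has its own weights and bias."""
    return [neuron(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]


def forward(spectrum, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> non-linear hidden layer -> linear output neuron."""
    hidden = layer(spectrum, hidden_w, hidden_b)
    return neuron(hidden, out_w, out_b, activation=identity)
```

With all inputs at zero the hidden neurons output tanh(0) = 0, so the prediction reduces to the output bias, which shows how the bias shifts a neuron's output independently of its inputs.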

An ANN is typically trained via supervised learning using a large training data set containing the input and the desired output, known as the supervisory signal.¹ The ANN classically begins with a random set of weights for each neuron connection. The error of the processed output of the ANN compared to the supervisory signal is then computed before updating the weights based on a predefined learning rule to decrease the error. The iterative learning process is terminated based on an error criterion, upon which the final model weights are set.
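The training cycle described above (random initial weights, error against the supervisory signal, weight update by a learning rule, termination on an error criterion) can be sketched for the simplest possible case, a single linear neuron trained by gradient descent on the mean squared error. The learning rate, tolerance and function name are illustrative assumptions, and gradient descent is only one of several possible learning rules.

```python
import random


def train_linear_neuron(X, y, lr=0.05, tol=1e-6, max_epochs=10000, seed=0):
    """Fit y ~ w.x + b by gradient descent, stopping when the MSE falls below tol."""
    rng = random.Random(seed)
    n_features = len(X[0])
    # Start from a random set of weights and bias.
    w = [rng.uniform(-0.5, 0.5) for _ in range(n_features)]
    b = rng.uniform(-0.5, 0.5)
    mse = float("inf")
    for _ in range(max_epochs):
        # Error of the current predictions relative to the supervisory signal.
        preds = [sum(wi * xi for wi, xi in zip(w, row)) + b for row in X]
        errors = [p - t for p, t in zip(preds, y)]
        mse = sum(e * e for e in errors) / len(y)
        if mse < tol:  # error criterion reached: keep the current weights
            break
        # Learning rule: step each parameter against the gradient of the MSE.
        for j in range(n_features):
            grad = 2 * sum(e * row[j] for e, row in zip(errors, X)) / len(y)
            w[j] -= lr * grad
        b -= lr * 2 * sum(errors) / len(y)
    return w, b, mse
```

Deep networks follow the same loop, with the gradients of all layers obtained by backpropagation rather than written out by hand.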

ANNs allow complex problems with many inputs to be modelled with relative ease, and offer many advantages over traditional statistical models for practical applications.¹⁴ Increasingly complex non-linear systems are being accurately modelled with deeper ANN architectures, enabled by enhanced computing power and optimised training methods.¹⁶

Convolutional neural networks

A CNN is a specific feedforward ANN architecture consisting of one or more convolutional layers.¹⁷ A convolutional layer performs a mathematical operation on the input data that can extract meaningful features while reducing the dimensionality of the data. The technology was developed in context of object recognition within images, with the LeCun et al.¹⁸ article “Gradient-Based Learning Applied to Document Recognition” (which has over 25,000 paper citations) being a seminal study in the development of CNNs. This work proposed a method for grouping simple features into progressively more complex features for handwritten character recognition. The success of this work is seen in current use of the technique in postal sorting of handwritten postcodes on envelopes.

A typical CNN is comprised of an input and an output layer sandwiching convolution and pooling/sub-sampling layers. The training process involves the development of filters in the convolution layers that extract information (‘features’) relevant to the attribute level. Data is down-sampled with each layer, such that simpler features are learned in the early layers and more complex features are learned in deeper layers. In image analysis using CNN, a visualisation of the convolutional layer output has been used to highlight features used in the model.^19,20

A CNN architecture can offer computational benefits over a fully connected ANN of the same number of neurons, while maintaining prediction accuracy.¹⁷ Three technique advantages have been documented: (i) In a convolutional layer, a neuron is not connected to every neuron of the previous layer, reducing the number of parameters that need to be trained and increasing the speed of convergence when training. (ii) A CNN can further reduce the parameters by setting groups of neuron connections to share weights. (iii) A pooling layer within the CNN down-samples the input data while retaining useful features, which reduces the amount of data and possibly further reduces the number of model parameters.
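The three points above can be made concrete with a small sketch of a 1D convolution, a max-pooling step and a count of trainable parameters. The function names and the layer sizes in the note that follows are hypothetical and chosen only for illustration.

```python
def conv1d(signal, kernel, bias=0.0):
    """'Valid' 1D convolution as used in deep learning (no kernel flip).

    The same kernel weights are reused at every position (weight sharing),
    and each output depends only on len(kernel) neighbouring inputs.
    """
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k)) + bias
            for i in range(len(signal) - k + 1)]


def max_pool1d(signal, size):
    """Down-sample by keeping the maximum of each non-overlapping window."""
    return [max(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]


def dense_params(n_in, n_out):
    """Trainable parameters of a fully connected layer (weights plus biases)."""
    return n_in * n_out + n_out


def conv1d_params(kernel_size, n_filters, n_in_channels=1):
    """Trainable parameters of a 1D convolutional layer (weights plus biases)."""
    return kernel_size * n_in_channels * n_filters + n_filters
```

For a hypothetical 100-point spectrum, a single filter of width 3 produces 98 outputs from 4 trainable parameters (`conv1d_params(3, 1)`), whereas a fully connected layer producing the same 98 outputs would need 9898 (`dense_params(100, 98)`).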

Convolutions can be one, two or three dimensional, which refers to the number of axes of the input data along which the convolutional filter moves. Higher dimensional convolutions require significantly more computational power to train and implement due to their added complexity.¹⁴

CNN applications

The LeCun et al.¹⁸ application of CNN sparked a wide uptake of such models for a range of applications in the 2000s, but they fell by the wayside with the success of new generation machine learning techniques such as support vector machines. This can be attributed to the need for larger training datasets for a deep CNN to generalise well, and to the greater computational resources needed to train a CNN with a large dataset. However, modern graphical processing capacity allows CNN models to be trained on massive datasets by parallelising the computation. This increase in speed of training, coupled with the success of the ‘AlexNet’ model by Krizhevsky et al.,²¹ has made 2D CNNs the ‘go to’ method for classification and recognition in image analysis.²² In recent years, CNN has also gained popularity with the remote sensing community in dealing with multispectral image sets, typically using visible and short wave NIR.²³

CNN architectures were originally developed for 2D data, and in initial applications 1D data was represented in 2D. This approach is computationally expensive and requires large training datasets. CNN architectures have now been specifically developed for 1D data and have become widely used in many signal processing applications.^24,25 The architecture of a 1D-CNN is more compact than those used for 2D data. Spectroscopy datasets are typically 1D and thus benefit computationally from use of a compact 1D-CNN. This may allow implementation on portable devices with limited computing capability.

CNN for NIR spectroscopy

In this section, applications of CNN in NIR spectroscopy for regression modelling are discussed. As per the conduct of the literature survey, hyperspectral imaging applications and classification problems were excluded. The first applications of CNN models to spectroscopy^26–28 did not occur until two decades after the seminal LeCun et al.¹⁸ work. The technique has not been widely adopted by the NIR community, although publications are increasing: the literature survey (for CNN regression models in NIR spectroscopy, excluding fruit) returned 2, 5, 7, 7 and 12 papers in 2018, 2019, 2020, 2021 and 2022, respectively. To the authors' knowledge, however, the CNN technique has not been adopted by any NIR instrument vendor, indicating that the advantage of the technique over other methods is not yet fully demonstrated for this application.

Cui and Fearn²⁸ used three different wheat NIR datasets to conduct systematic tests to compare CNN and PLS regression approaches. Their smallest dataset consisted of 415 training and 108 testing samples while their largest dataset had 6987 training and 618 independent test samples. The proposed CNN model, consisting of an input layer, convolutional layer, three fully connected layers and an output layer, was more accurate, less noisy, and more robust than a traditional PLS regression for their datasets. Further, they demonstrated that the output of the first layer of the trained CNN model was similar to the output of a Savitzky-Golay (first derivative) smoothing filter, i.e., the first convolutional layer of the CNN acted similarly to a commonly used spectral pre-processing technique. A method was also proposed to numerically visualise the regression coefficients of the CNN in order to quantify wavelength importance and aid in interpretation of the model. They concluded CNN model hyperparameters can be optimised and selected via cross validation and that CNN models could be used in prediction of sample attributes using NIR data. For completeness, Cui and Fearn²⁸ could have added comparison of a simple ANN model to the results of the CNN and PLS regression. This would have provided evidence of the value of adding additional layers, including a convolutional layer.
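The equivalence noted above between a learned convolutional filter and a Savitzky-Golay derivative can be illustrated by writing the derivative as a fixed convolution kernel. The sketch below uses the standard 5-point Savitzky-Golay first-derivative coefficients for a quadratic fit with unit point spacing; the window size and function name are assumptions for illustration and are not the settings or the learned weights reported by Cui and Fearn.²⁸

```python
# 5-point Savitzky-Golay first-derivative coefficients (quadratic fit, unit spacing):
# (-2, -1, 0, 1, 2) / 10
SG_FIRST_DERIV_5 = [-0.2, -0.1, 0.0, 0.1, 0.2]


def apply_filter(spectrum, kernel):
    """Slide the kernel along the spectrum ('valid' positions only)."""
    k = len(kernel)
    return [sum(kernel[j] * spectrum[i + j] for j in range(k))
            for i in range(len(spectrum) - k + 1)]


# Synthetic example: a sloping baseline, and the same spectrum with an added offset.
spectrum = [0.5 + 0.1 * i for i in range(10)]
shifted = [x + 0.3 for x in spectrum]

deriv = apply_filter(spectrum, SG_FIRST_DERIV_5)
deriv_shifted = apply_filter(shifted, SG_FIRST_DERIV_5)
```

Because the coefficients sum to zero, a constant baseline offset added to the spectrum leaves the filtered output unchanged, and a convolutional layer is free to learn a kernel of this kind during training.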

A general criticism of the use of CNNs and ANNs is they provide essentially black box predictions. There is an adage in the NIR scientific community that there ‘should be no prediction without interpretation, and no interpretation without prediction’, i.e., that a model tested to predict well should be interpretable in terms of the band assignments of spectral features.²⁹ Indeed, domain expert chemometricians can hand craft models based on the underlying chemistry of the desired attribute in each given sample by the selection of wavelengths. The ability to visualise the features extracted by the convolutions of a CNN is thus important for a general understanding of how the model is working.²⁸ Further work is required to adapt the visualisation approaches used in 2D-CNN, including use of GradCAM and SHAP.³⁰

Since the Cui and Fearn²⁸ initial application, CNN regression models have been tested in a range of other NIR spectroscopy applications, with the dominant application being soil related, with 17 papers found.^31–47 A possible explanation for this focus is that assessing soil attributes using NIR spectra is complex, given the non-homogeneity of the medium and the likely involvement of secondary correlations. This has led researchers to explore new methods to improve NIR performance in this application. Other application ‘hot spots’ include grain,^28,37,48–52 organic matter such as leaves, wood and beans,^53–59 food powders,^37,60,61 oil⁶² and brain,⁶³ with seven, seven, three, one and one papers, respectively.

The methodology employed by most of these papers is a comparison of the performance of the CNN model to a PLS regression model for a particular dataset. This benchmarking is laudable; however, PLS regression models require operator involvement in the choice of pre-processing technique, the selection of wavelength inputs and the choice of number of latent variables. There is a risk in these comparative studies that much effort is put into optimising the CNN model while little to no effort is put into optimisation of the PLS regression model. The ideal situation involves use of a publicly available data set, with a benchmark optimal PLS regression result and potentially results for other simpler methods such as support vector regression (SVR), Cubist, memory based learner (MBL) and shallow ANN. By providing a thorough benchmark for a particular dataset, the true value of more complex modelling techniques can be more accurately assessed.

Due to the complexity of CNN models, it is widely accepted that large datasets are needed to effectively train the model. For example, Ng et al.⁴⁴ examined the influence of training sample size for prediction of soil properties from NIR spectra. They concluded that for their specific CNN architecture and for their soils, the CNN model was more accurate than other techniques only when the size of the training set was greater than 2000. They also reported that the performance of other models plateaued with about 5000 samples, while that of the CNN model continued to increase with more training samples up to the total size of their dataset of 9000 samples. It is expected that as more complex CNN architectures increase the number of free parameters which must be trained, more training samples will be required. However, the production of datasets with NIR spectra and reference values is typically a time consuming and expensive process. Many of the published studies identified are thus based on relatively small datasets, often involving fewer than 500 samples. Even given a performance benefit, the need for a large data set should be weighed in any cost-benefit consideration of the use of CNNs.

The published studies on use of CNNs in NIR spectroscopy applications are also generally weak on documentation of the computational power and time needed to run the CNN models in prediction. Handheld spectrometers are commonplace in NIR applications, so such information would inform assessment of the potential to use CNN models in these applications.

In summary, the potential advantages of the use of CNNs in NIR spectroscopy identified from the literature include:

(i) Multiple attribute prediction: For example, Alzubaidi et al.⁶⁴ applied a multi-task CNN model to NIR spectra of a global set of Australian wheat, barley, field pea and lentils, predicting protein, moisture, and grain type. It was concluded that the technique provides greater accuracy than PLS regression based prediction. As demonstrated by this study, CNN allows multiple attributes to be predicted using the one trained model. In contrast, PLS regression models are typically constructed for a single attribute or, at most, two attributes. Einarson et al.⁶⁵ compared a multi attribute prediction from a CNN model to a PLS-2 regression model, also finding the CNN model to be more accurate for global modelling. In general, PLS-2 regression should only be employed where the two predicted attributes are highly correlated, while multiple outputs from CNN models do not need to be correlated.

(ii) Use of multiple inputs with data fusion: A CNN architecture can accept multiple sets of inputs, in so-called multi-block data input. The unique sets of input can then have their own parallel layers, such as convolutional layers, before the data is fused internally. For NIR data, this presents the opportunity to feed the neural network with the raw spectra set coupled with various pre-treatments of the data with the aim of increasing the accuracy of the model.

(iii) Interpretation: In general deep learning models are largely considered black box models. Cui and Fearn²⁸ proposed a method to visualise the output of the convolutional layer in their CNN model, allowing for the interpretation of the effect of the convolutional layer which was found to be similar to a Savitzky-Golay first derivative treatment. Other approaches may be applied to visualisation of the regions of the spectra used by the model, e.g. GradCAM.

(iv) Inbuilt pre-processing: NIR spectra frequently contain baseline offsets that vary between samples. In MLR and PLS regression, these offsets are removed using normalization or a derivative (typically first or second derivative) before modelling. Cui and Fearn²⁸ demonstrated that the first convolutional layer in their CNN model produced an output that approximated a first derivative and performed equally to a model developed on derivative absorbance data.

(v) Improved model performance: studies indicate that with large data sets CNN models can outperform traditional approaches such as PLS regression, giving predictions that are more accurate, more robust and less noisy.

(vi) Transfer learning: this is a deep learning technique where a pre-trained neural network model is fine-tuned for a different but related task. This enables the knowledge gained by a neural network on a large diverse dataset to be reused, reducing the amount of data and computational resources needed for the new task.
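As an illustration of point (vi), one common transfer learning variant keeps the pre-trained feature-extracting layers frozen and re-trains only the final output layer on a small new dataset. The sketch below is a toy version of that idea under stated assumptions: the feature extractor is reduced to the mean response of fixed convolution kernels, and the function names, learning rate and epoch count are hypothetical. It is not the procedure of any of the studies cited in this review.

```python
def frozen_features(spectrum, kernels):
    """Pre-trained feature extractor whose kernels are NOT updated:
    one feature per kernel, taken as the mean filter response."""
    feats = []
    for kernel in kernels:
        k = len(kernel)
        responses = [sum(kernel[j] * spectrum[i + j] for j in range(k))
                     for i in range(len(spectrum) - k + 1)]
        feats.append(sum(responses) / len(responses))
    return feats


def fine_tune_head(spectra, targets, kernels, w, b, lr=0.05, epochs=2000):
    """Re-train only the output layer (w, b), starting from pre-trained values."""
    feats = [frozen_features(s, kernels) for s in spectra]
    w = list(w)  # copy so the pre-trained weights are left untouched
    n = len(targets)
    for _ in range(epochs):
        errors = [sum(wi * fi for wi, fi in zip(w, f)) + b - t
                  for f, t in zip(feats, targets)]
        # Gradient descent on the mean squared error, output layer only.
        for j in range(len(w)):
            w[j] -= lr * 2 * sum(e * f[j] for e, f in zip(errors, feats)) / n
        b -= lr * 2 * sum(errors) / n
    return w, b
```

Because only the few output-layer parameters are updated, far fewer new samples are needed than for training the whole network, which is the data saving referred to in point (vi).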

CNN-NIR in fruit quality evaluation

This section focusses on the use of CNN in NIR spectroscopy in the context of fruit quality evaluation. Most attention is given to regression type applications, however a brief summary of classification application studies is also provided.

Classification applications

Classification models have been used in a range of fruit quality evaluation applications, such as variety identification and defect detection. Of 20 non-review papers addressing CNN use in fruit quality evaluation, 10 addressed classification issues, with five involving defect identification,^66–70 three addressing variety identification,^71–73 one the level of fruit freshness⁷⁴ and one the harvest readiness of pears.⁷⁵

Defect classification problems included freeze damaged oranges, insect affected pears, mouldy and water core defects in apples and identification of pesticide residue on melon surfaces. Four of these studies compared the CNN model to other techniques, including partial least squares discriminant analysis (PLSDA) and support vector machines (SVM).^66–69 Three of the four studies reported CNN as the most accurate method while Zhang et al.⁶⁹ reported SVM as the most accurate method (see Table 2).

Table 2.

Reports of CNN-NIR in fruit quality classification applications.

Author	Attribute	Samples	Pre-processing	Models and accuracy
Defect detection
Tian et al.⁶⁶	Orange – freeze damage	Training: 76 Validation: 38	Diameter correction	PLSDA: 84.21% SVM: 81.58% CNN: 91.96%
Hao et al.⁶⁷	Pear – insect damage	Training: 768 Validation: 192	PLSDA: SS + CARS SVM: SNV + CARS CNN: SGS	PLSDA: 90.63% SVM: 81.25% CNN: 92.71% (CNN prediction time of 0.032s)
Chang et al.⁶⁸	Apple – water core	Total: 318	n/a	K-nearest neighbours, ANN, CNN, SVM: (Best) 96%
Zhang et al.⁶⁹	Apple – mouldy core	n/a	PLSDA: principal component analysis SVM: principal component analysis	PLSDA: 89.84% SVM: 93.55% CNN: 98.39%
Yu et al.⁷⁰	Melon – surface pesticide	n/a	n/a	CNN: 99.17%
Variety classification
Ninh et al.⁷¹	Apple, avocado, dragon fruit, guava, mango – fruit type	n/a	SAVGOL first and second derivatives	K-nearest neighbours, naïve bayes, SVM, CNN, Residual network: (Best) 99%
Rong et al.⁷²	Peach – variety	Training + validation: 500	n/a	CNN: 94.4%
Escarate et al.⁷³	Stone fruit – variety	Training: 1424 Test: 356	SNV + MSC + SAVGOL 1st derivative	CNN: 98.9%
Other
Ninh et al.⁷⁴	Apple, avocado, dragon fruit, guava, mango – freshness	n/a	First and second derivative	CNN: 80% Reported to be better than ‘traditional’ machine learning
Liu et al.⁷⁵	Pear – harvest ready	Training: 300 Validation: 60	n/a	PLSDA: 41.67% CNN: 88.33%

Hao et al.⁶⁷ presented a methodology for testing the combination of pre-processing techniques with specific model types, such as PLSDA, SVM and CNN. They compared Savitzky-Golay Smoothing (SAVGOL), spectral standardization (SS), max-min normalization (MMN) and standard normal variate transformation (SNV) in combination with competitive adaptive reweighted sampling (CARS) for the optimal selection of features. The optimal pre-processing technique was determined for each model type using the calibration set before testing on the validation set. Their results demonstrated that optimal pre-processing technique can be different for each model type.
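Two of the pre-processing techniques named above have simple closed forms and can be sketched directly. The functions below apply standard normal variate (SNV) transformation and max-min normalisation to a single spectrum. The function names are illustrative, and the use of the sample standard deviation (n - 1 denominator) in SNV is an assumption, as conventions differ between software packages.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre each spectrum on its own mean
    and scale by its own standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / sd for x in spectrum]


def max_min_normalise(spectrum):
    """Rescale each spectrum to the range 0 to 1."""
    lo, hi = min(spectrum), max(spectrum)
    return [(x - lo) / (hi - lo) for x in spectrum]
```

Both transforms operate on one spectrum at a time, so they remove additive and multiplicative differences between samples without requiring any information from the calibration set.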

Many reports incorporate the use of spectral pre-processing with use of a CNN model (Table 2). For classification of fruit, Ninh et al.⁷¹ reported that using first and second Savitzky-Golay derivatives of spectra as inputs improved the accuracy for both CNN and CNN Residual Network models by more than 8%. Hao et al.⁶⁷ also reported improvement in CNN model performance with input of spectral derivatives for predicting insect damage in pears.

The need to test for model performance in prediction of fruit from a range of growing conditions and seasons is as valid for classification applications as for regression applications. For classification of pear harvest readiness, Liu et al.⁷⁵ found that accuracy for a model tested on samples from the same year as the training set was 100% for both PLSDA and CNN models. However, when the models were tested on data from the next season, accuracy dropped to 42% and 88% respectively. This demonstrates the importance of using new data for testing models and calls into question the accuracy of models such as Yu et al.,⁷⁰ for which a 99.17% accuracy was reported for samples from the same harvest season and location. The generalisation of the models to fruit from a new harvest event and growing condition cannot be deduced.
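The difference between the two test strategies discussed above can be sketched as follows. A random split draws test samples from the same population as the training set, while a season-based split holds out an entire season. The sample structure (dictionaries with a `season` field) and the function names are assumptions for illustration.

```python
import random


def random_split(samples, test_fraction=0.2, seed=0):
    """Random split: test fruit come from the same seasons as the training fruit."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def season_split(samples, test_season):
    """Independent split: every sample from one season is held out for testing."""
    train = [s for s in samples if s["season"] != test_season]
    test = [s for s in samples if s["season"] == test_season]
    return train, test
```

Accuracy estimated with a split of the second kind reflects performance on fruit from a harvest event the model has not seen, which is the situation a deployed model will face.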

Escarate et al.⁷³ report use of a combination of a classification and a regression model. A CNN classification model was used to classify the stone fruit type (peach, yellow pulp nectarine, white pulp nectarine, red plums, black plums and plumcots), with cultivar specific ANN regression models then applied to predict fruit soluble solid content (SSC). An accuracy of 98.9% was reported for the classification task, however again the test set was randomly selected from the total of 1780 fruit.

Regression

NIR-based PLS regression models have been extensively used in prediction of measurable quality attributes of a fruit such as dry matter content (DM), SSC and moisture content (MC). The first application of CNN for NIR attribute estimation of intact fruit occurred in 2021, with an explosion of effort in that 10 papers have been published in 2021 and 2022 (Table 3). Of the publications identified, six involved application to mango DM,^76–81 two to pear attributes,^76,82 two to orange attributes^83,84 and one to pomelo attributes⁸⁵ (Table 3).

Table 3.

Reports of CNN-NIR in fruit quality regression applications. Detail on the Anderson and Walsh dataset is provided in the next section.

Author	Attribute	Datasets	Pre-processing	Model architecture
Yang et al.⁷⁶	Mango – DM Pear – SSC	Mango – Anderson et al.⁷ Pear – Passos et al.⁸ (total: 3300 samples)	nil	1D-CNN (3 convolutional layers), with transfer learning
Mishra and Passos⁸²	Pear – SSC and MC	Original dataset – Training: 413 (used data augmentation) Test: 138	nil	Cui and Fearn²⁸ 1D-CNN with multi-output and a 1D-CNN with 3 convolutional layers and multi-output
Martins et al.⁸³	Orange – SSC	Cavaco et al.⁸⁶ Total: 616 fruit from two seasons	Quantitative normal variant (a variant on SNV)	SpectraNet-53: a deep residual network with 53 layers
Mishra and Passos⁷⁷	Mango – DM	Anderson et al.⁷	Tested nil versus augmented data with raw, SNV, SAVGOL, SNV + SAVGOL	Cui and Fearn²⁸ 1D-CNN
Xu et al.⁸⁴	Orange – SSC and MC	Original dataset – Calibration: 76 Test: 30 (random split)	SAVGOL + GA	1D-convolution used for feature extraction with PLSR
Mishra et al.⁷⁸	Mango – DM	Anderson et al.⁷	Tested various – nil, SNV, VSN, SAVGOL, SNV + SNVGOL, VSN + SAVGOL	Cui and Fearn²⁸ 1D-CNN
Mishra and Passos⁷⁹	Mango – DM	Anderson et al.⁷ + new data set of 540 fruit	Same as Mishra and Passos⁷⁷	Same as Mishra and Passos⁷⁷ with transfer learning
Mishra and Passos⁸⁰	Mango – DM	Anderson et al.⁷	Raw spectra split into VIS and NIR blocks	1D-CNN with parallel convolutional layers (extension of Cui and Fearn²⁸)
Xu et al.⁸⁵	Pomelo – SSC and acidity	Original dataset – Total: 100 fruit	MSC + SAVGOL + GA	1D-convolution used for feature extraction with PLSR
Mishra and Passos⁸¹	Mango – DM	Anderson et al.⁷	nil	Cui and Fearn²⁸ 1D-CNN with different transfer learning techniques where some adjusted the architecture

Datasets

Many publications have made use of open access data (Table 3). The publicly-available dataset from Anderson et al.⁷ consists of over 11,000 spectra of mango fruit, with related metadata. Of the four publications not involving this data set, the maximum number of samples used in any one study is 616. For example, a study of total soluble solid content and acidity in pomelo fruit (a large thick-skinned citrus) was based on only 100 samples.⁸⁵ The CNN model was reported to deliver an inferior performance to PLS regression for prediction of acidity, but nonetheless increased performance in prediction of the total soluble solid content, relative to PLS regression models. The validity of such studies on such small datasets is questionable.

Pre-processing

Various combinations of pre-processed spectra have been recommended as input to CNN models, including standard normal variate (SNV), multiplicative scatter correction (MSC), Savitzky-Golay derivatives (SAVGOL) and genetic algorithm (GA) variable selection. Mishra and Passos⁷⁷ reported that augmentation of the raw mango spectra with several pre-processed data sets resulted in increased accuracy of a CNN model compared to use of the raw spectra only.
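As a minimal sketch of what such pre-processing and stacking can look like, the plain-Python example below applies SNV and a simple first difference (a crude stand-in for a Savitzky-Golay derivative) to one spectrum and concatenates the results with the raw values. The absorbance values, the function names and the choice to stack along the wavelength axis are illustrative assumptions, not the procedure of Mishra and Passos.⁷⁷

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and sample SD."""
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(a - m) / s for a in spectrum]


def first_difference(spectrum):
    """Difference between adjacent points (assumed stand-in for a smoothed derivative)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def stack_preprocessed(spectrum):
    """Concatenate raw, SNV and first-difference forms into one longer input vector."""
    return list(spectrum) + snv(spectrum) + first_difference(spectrum)


raw = [0.50, 0.52, 0.55, 0.60, 0.58]  # hypothetical absorbance values
stacked = stack_preprocessed(raw)
```

In practice a Savitzky-Golay filter would replace the first difference, as it smooths while differentiating.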

In contrast, working with the same Anderson et al.⁷ mango dataset, Mishra et al.⁷⁸ reported that the use of spectra pre-processed using scatter correction techniques reduced the prediction performance of both a PLS regression and CNN model. On the hold-out fourth season, an RMSEP of 0.87 and 0.76% FW was achieved for PLS regression and CNN, respectively, using only the raw absorbance. The best pre-processing for PLS regression was found to be a second derivative, with an RMSEP of 0.88% FW, while standard normal variate (SNV) worked best for CNN (RMSEP of 0.81% FW). The decrease in model performance with input of pre-processed spectra contradicts the general experience of researchers using PLS regression, e.g., Anderson et al.⁸⁷ The issue of the value of derivative pre-processing for the fruit application requires resolution.

As noted earlier, Cui and Fearn²⁸ reported that the first convolutional layer of a CNN model developed for estimation of wheat attributes from NIR spectra acted to produce an output equivalent to a first derivative. Bai et al.⁸⁸ attempted to employ an ANN architecture with an input of pre-processed spectra instead of a convolutional layer to develop a model for predicting the soluble solid content of apple fruit. A deep ANN was used, with a random forest algorithm applied to pre-process the spectra. However, this study was based on only 208 apple samples, and it is recommended that the technique be tested with a larger training set and compared with a CNN with no pre-processing.
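The link between a convolution and a derivative can be illustrated with a short sketch. Assuming a hypothetical two-point kernel of (-1, 1), which is not the kernel learned in the Cui and Fearn²⁸ model, sliding it along a spectrum reproduces the first finite difference. The spectrum values are invented for illustration.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel along the signal without padding (cross-correlation, as in CNN layers)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


spectrum = [0.50, 0.52, 0.55, 0.60, 0.58]  # hypothetical absorbance values

# Convolution with the assumed kernel (-1, 1) ...
derivative_like = conv1d_valid(spectrum, [-1.0, 1.0])

# ... compared with an explicit first finite difference.
finite_difference = [b - a for a, b in zip(spectrum, spectrum[1:])]
```

A trained layer is free to learn a wider, smoother kernel, which is one reason a CNN may need less explicit derivative pre-processing.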

CNN architecture

Mishra and co-workers adopted a 1D-CNN architecture to develop a mango dry matter content model using the Anderson et al.⁷ data set. The CNN architecture of Cui and Fearn²⁸ was adopted, comprising an input layer, one convolution layer, three fully connected layers and an output layer (Figure 2). However, no justification was given for the choice of this architecture beyond its use in other (non-spectroscopy) applications. The CNN model improved the prediction error from the previous best reported RMSEP (achieved by Anderson et al.⁸⁹ with an ensemble of multiple non-linear models) of 0.84 to 0.79% FW.⁷⁷ However, the authors raised the concern that the CNN model might be overfitted to the data set, with a new CNN model proposed in a subsequent paper based on testing on fruit from a new season, cultivar and scanned with a new instrument.⁷⁹ The results of Mishra and co-workers have yet to be replicated by other research groups.

Figure 2.

The 1D-CNN architecture first applied to NIR spectroscopy by Cui and Fearn.²⁸
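The layer sequence described above (input, one convolution layer, three fully connected layers, output) can be sketched as a forward pass in plain Python. This is a structural illustration only: the single filter, the ReLU activation, the hidden layer widths (36, 18, 12), the random untrained weights and the 103-variable input (as would result from 684–990 nm at 3 nm steps) are all assumptions rather than details confirmed by the studies reviewed here.

```python
import random


def conv1d(x, kernel):
    """One convolutional filter applied without padding."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) for i in range(len(x) - k + 1)]


def relu(values):
    return [max(0.0, v) for v in values]


def dense(x, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * a for w, a in zip(row, x)) + b for row, b in zip(weights, biases)]


def build_model(n_inputs, kernel_size=21, hidden=(36, 18, 12), seed=0):
    """Create random (untrained) weights for conv -> 3 dense layers -> 1 output."""
    rng = random.Random(seed)
    model = {"kernel": [rng.gauss(0, 0.1) for _ in range(kernel_size)], "dense": []}
    n_in = n_inputs - kernel_size + 1  # length of the convolution output
    for n_out in hidden + (1,):
        weights = [[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        model["dense"].append((weights, [0.0] * n_out))
        n_in = n_out
    return model


def predict(model, spectrum):
    """Forward pass returning a single regression value (e.g. dry matter content)."""
    h = relu(conv1d(spectrum, model["kernel"]))
    for weights, biases in model["dense"][:-1]:
        h = relu(dense(h, weights, biases))
    weights, biases = model["dense"][-1]
    return dense(h, weights, biases)[0]


model = build_model(n_inputs=103)
prediction = predict(model, [0.5] * 103)  # placeholder spectrum
```

A working model would be trained with a deep learning library; the sketch only shows how few layers the architecture contains.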

A novel development is the use of multi-block parallel inputs. Mishra and Passos⁸⁰ split the visible and near-infrared spectra of mango fruit collected from one instrument to feed into parallel convolutional layers (Figure 3). This technique can also be used to fuse results from two different instruments to improve the prediction accuracy by removing bias from each instrument.⁹⁰

Figure 3.

Adaptation of Cui and Fearn²⁸ 1D-CNN architecture with parallel convolutional layers. Reprinted from Mishra and Passos⁸⁰ 2021 (CC BY 4.0).
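The multi-block idea can be sketched as follows: a spectrum is divided into a visible and an NIR block, each block passes through its own convolution, and the outputs are concatenated before the fully connected layers. The wavelength grid, the single 700 nm boundary, the kernel values and the function names are hypothetical choices for illustration, not the settings of Mishra and Passos.⁸⁰

```python
def conv1d(x, kernel):
    """One convolutional filter applied without padding."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) for i in range(len(x) - k + 1)]


def split_blocks(wavelengths, spectrum, boundary_nm=700):
    """Split a spectrum into a visible block (< boundary) and an NIR block (>= boundary)."""
    vis = [a for w, a in zip(wavelengths, spectrum) if w < boundary_nm]
    nir = [a for w, a in zip(wavelengths, spectrum) if w >= boundary_nm]
    return vis, nir


def parallel_features(vis, nir, vis_kernel, nir_kernel):
    """Convolve each block with its own kernel, then concatenate the two outputs."""
    return conv1d(vis, vis_kernel) + conv1d(nir, nir_kernel)


wavelengths = list(range(450, 1031, 3))      # assumed 3 nm grid, 450 to 1029 nm
spectrum = [0.001 * w for w in wavelengths]  # placeholder absorbance values
vis, nir = split_blocks(wavelengths, spectrum)
features = parallel_features(vis, nir, [0.2] * 5, [0.2] * 5)
```

Because each block has its own kernel, the network can learn different filters for pigment-related and water-related regions of the spectrum.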

Using a multi-output architecture, a recent study demonstrated simultaneous prediction of moisture content and soluble solids content in pear using one CNN model.⁸² Two 1D-CNNs were tested, one with one convolutional layer and the other with three. The three-layer model slightly outperformed the one-layer network. Only 551 samples were collected, but data augmentation was performed to assist in training the model. Again, such techniques should be validated on a larger set.

Transfer learning

A common issue for chemometric models is their inability to generalise across sources of variance in the spectra, for example, for spectra from different instruments or from samples from different growing or storage conditions. Various approaches have been used with PLS regression models, including (i) development of a new model following addition to the calibration set of a small amount of data from samples of the new condition, or correction of predictions for bias, (ii) use of a calibration transfer function to make samples from one condition look like samples from another condition, and (iii) removal from the spectra of the effect of a given influence, e.g., temperature.⁹¹

Using transfer learning, a CNN model initially trained on a large common data set can be localised by fine-tuning it on a small sample set relevant to the new population, generating new weights within the same architecture. This has been demonstrated for a new season mango fruit crop and across instruments.⁸¹,⁹²
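One common form of transfer learning freezes the pre-trained feature extraction layers and re-fits only the output layer on a few samples from the new population. The toy sketch below illustrates that idea under heavy simplification: a fixed two-point kernel with average pooling stands in for the pre-trained convolutional layers, the output layer is a single weight and bias, and the new-season data are simulated with a constant offset. None of these choices is taken from the cited studies.

```python
def pooled_feature(spectrum, kernel):
    """Frozen 'pre-trained' convolution followed by average pooling to one feature."""
    k = len(kernel)
    conv = [
        sum(spectrum[i + j] * kernel[j] for j in range(k))
        for i in range(len(spectrum) - k + 1)
    ]
    return sum(conv) / len(conv)


def mse(feats, targets, w, b):
    return sum((w * f + b - t) ** 2 for f, t in zip(feats, targets)) / len(feats)


def fine_tune_output(feats, targets, w, b, lr=0.1, epochs=500):
    """Gradient descent on the output layer only; the kernel is never updated."""
    n = len(feats)
    for _ in range(epochs):
        errors = [w * f + b - t for f, t in zip(feats, targets)]
        grad_w = 2 / n * sum(e * f for e, f in zip(errors, feats))
        grad_b = 2 / n * sum(errors)
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


frozen_kernel = [0.5, 0.5]  # stands in for pre-trained convolutional weights
w0, b0 = 10.0, 0.0          # stands in for the pre-trained output layer

# Hypothetical spectra from a new season (five samples, four wavelengths each).
new_spectra = [
    [0.40, 0.42, 0.45, 0.47],
    [0.50, 0.53, 0.55, 0.58],
    [0.60, 0.61, 0.66, 0.70],
    [0.70, 0.74, 0.75, 0.79],
    [0.80, 0.83, 0.88, 0.90],
]
feats = [pooled_feature(s, frozen_kernel) for s in new_spectra]
targets = [10.0 * f + 2.0 for f in feats]  # simulated offset in the new population

mse_before = mse(feats, targets, w0, b0)
w1, b1 = fine_tune_output(feats, targets, w0, b0)
mse_after = mse(feats, targets, w1, b1)
```

Other strategies, such as re-training all layers from the pre-trained weights or adding new layers, follow the same pattern but update more parameters.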

Platforms – cloud and portable

In recent years, cloud-based machine learning resources have become available. Solihin et al.⁹³ investigated using one such software platform (Orange data mining; orangedatamining.com) in different NIR spectroscopy applications, including the prediction of mango soluble solids content. However, the platform did not have a PLS regression or CNN toolkit. Anderson et al.⁸⁹ evaluated the DataRobot and Hone cloud chemometric platforms in the context of the mango dry matter data set used by Mishra and Passos.⁷⁷ Again, neither platform provided a CNN resource. However, considerable work is being put into the development of these user-friendly machine learning platforms by both the open-source community and private companies, e.g., Amazon Sagemaker (Amazon Web Services; aws.amazon.com/sagemaker/) and Azure Machine Learning (Microsoft Azure; azure.microsoft.com/services/machine-learning/). There are currently no reports in the literature of their use by the NIR spectroscopy community. Further investigation of these platforms as they develop is warranted for NIR spectroscopy applications, as they allow method evaluation with little to no programming knowledge.

Technique comparison involving publicly available data

The dataset of Anderson et al.⁷ is the largest publicly-available dataset of NIR spectra and mango dry matter content. This data set contains 11,834 spectra from one F750 Produce Quality Meter (Felix Instruments, Camas, WA, USA) spectrometer using 4685 mango samples, representing 112 unique populations drawn from four growing seasons and multiple cultivars and growing locations (Table 4).

Table 4.

Specifications of the Anderson et al.⁷ mango dry matter content and spectra dataset.

Attribute	Mango dry matter content
Instrument	F750 produce quality meter (Felix Instruments, Camas, WA, USA)
Wavelength range	350–1200 nm, 3 nm steps
Growing regions	Darwin and Katherine regions of Northern Territory, Far north, Burdekin, central and south Queensland
Cultivars	Calypso (Caly), Honey gold (HG), Keitt, Kensington Pride (KP), Lady Grace (LadyG), Lady Jane (LadyJ), R2E2, National mango breeding program (NMBP) 1201, 1243, 4069
Physiological Stages	Ranging from hard green through to eating ripe
Seasons	Season 1: 2015 Season 2: 2016 Season 3: 2017 Season 4: 2018
Unique populations	Total: 112 Train (first three seasons): 94 Test (fourth season): 18
Samples and reference values	Total: 4685 samples Outliers removed: 10 Train (first three seasons): 3950 Test (fourth season): 725
Spectra	Total: 11,834 Outliers removed: 143 Train (first three seasons): 10,243 Test (fourth season): 1448

As the largest publicly available fruit data set, it has been utilised in several studies exploring the use of global modelling techniques.⁷⁶–⁸¹,⁸⁷,⁸⁹,⁹⁴ In these works, data of the first three seasons is used for training, and the fourth season for independent validation. Table 5 presents a direct comparison of the reported RMSEP on the same independent test set.
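For reference, the RMSEP used for these comparisons is the square root of the mean squared difference between predicted and reference values on the test set. A minimal sketch follows, using invented dry matter values; bias is included as a companion statistic, and the function names are illustrative.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction on an independent test set."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def bias(reference, predicted):
    """Mean signed difference between predicted and reference values."""
    return sum(p - r for r, p in zip(reference, predicted)) / len(reference)


reference_dm = [15.0, 16.0, 17.0]  # hypothetical reference dry matter, % FW
predicted_dm = [16.0, 16.0, 15.0]  # hypothetical model predictions, % FW
error = rmsep(reference_dm, predicted_dm)
offset = bias(reference_dm, predicted_dm)
```

Because the statistic depends on which test samples are included, values are only comparable across studies when the test set is identical.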

Table 5.

A comparison of published models based on the three season training set and fourth season test set of Anderson et al.⁷ Models predict intact mango dry matter content.

Publication (wavelength range)	Outlier removal	Data pre-processing	Model	RMSEP (% FW)
Anderson et al.⁸⁷ (684–990 nm)	No additional	MC + SAVGOL (deriv = 2, window = 17, poly = 2)	PLS regression	1.014
Anderson et al.⁸⁷ (684–990 nm)	No additional	MC + SAVGOL (deriv = 2, window = 17, poly = 2)	ANN	0.892
Anderson et al.⁸⁹ (684–990 nm)	No additional	MC + SAVGOL (deriv = 2, window = 17, poly = 2)	Gaussian process regression (GPR)	0.898
			Memory based learner (MBL)	0.903
			DataRobot light gradient boosting	0.976
			Support vector regression (SVR)	1.048
			Cubist	1.135
			Hone create stacked ensemble	0.85
			DataRobot ElasticNet ensemble	0.963
Mishra and Passos⁷⁷ (684–990 nm)	No additional	Raw absorbance spectra	PLS regression	1.06
	No additional	Data augmented by stacking: - Raw spectra - SNV - SAVGOL (deriv = 1, window = 13, poly = 2) - SAVGOL (deriv = 2, window = 13, poly = 2) - SNV + SAVGOL (deriv = 1, window = 13, poly = 2) - SNV + SAVGOL (deriv = 2, window = 13, poly = 2)	PLS regression	1.03
	Removed using Hotelling’s T² and Q stats with PLSR decomp Train set: 9914 (2015–2017 seasons) Test set: 1413 (2018 season)		PLS regression	0.99; 0.95**
			Cui and Fearn²⁸ 1D-CNN (kernel size = 21, batch size = 128) Involves decision making based on user experience to choose best hyper-parameters	0.79; 0.75**
			Cui and Fearn²⁸ 1D-CNN (kernel size = 19, batch size = 160)	0.8; 0.75**
		Raw absorbance spectra	Cui and Fearn²⁸ 1D-CNN (kernel size = 21, batch size = 128)	0.95**
		Raw absorbance spectra	Cui and Fearn²⁸ 1D-CNN (kernel size = 19, batch size = 160)	0.98**
Passos and Mishra⁹⁴ (684–990 nm)	Removed using Hotelling’s T² and Q stats with PLSR decomp Train set: 9914 (2015–2017 seasons)	Data augmented by stacking: - Raw spectra - SNV - SAVGOL (deriv = 1, window = 13, poly = 2) - SAVGOL (deriv = 2, window = 13, poly = 2) - SNV + SAVGOL (deriv = 1, window = 13, poly = 2) - SNV + SAVGOL (deriv = 2, window = 13, poly = 2)	Cui and Fearn²⁸ 1D-CNN with automatic hyperparameter tuning	0.838
Mishra et al.⁷⁸ (742–990 nm)	Removed using Hotelling’s T² and Q stats with PLSR decomp Train set: 10,135 (2015–2017 seasons) Test set: 1285 (2018 season)	Raw absorbance spectra	PLS regression (8 LVs)	0.87**
		Raw absorbance spectra	Cui and Fearn²⁸ 1D-CNN	0.76**
		SNV	PLS regression (7 LVs)	0.96**
		SNV	Cui and Fearn²⁸ 1D-CNN	0.81**
		VSN	PLS regression (7 LVs)	1.0**
		VSN	Cui and Fearn²⁸ 1D-CNN	0.91**
		SAVGOL (deriv = 2, window = 13, poly = 2)	PLS regression (8 LVs)	0.88**
		SAVGOL (deriv = 2, window = 13, poly = 2)	Cui and Fearn²⁸ 1D-CNN	0.92**
		SNV + SAVGOL (deriv = 2, window = 13, poly = 2)	PLS regression (7 LVs)	0.98**
		SNV + SAVGOL (deriv = 2, window = 13, poly = 2)	Cui and Fearn²⁸ 1D-CNN	0.9**
		VSN + SAVGOL (deriv = 2, window = 13, poly = 2)	PLS regression (9 LVs)	0.95**
		VSN + SAVGOL (deriv = 2, window = 13, poly = 2)	Cui and Fearn²⁸ 1D-CNN	0.98**
Mishra and Passos⁸⁰ (450–1030 nm)	Removed using Hotelling’s T² and Q stats with PLSR decomp Train set: 9914 (2015–2017 seasons) No outliers removed from test set	Raw spectra	Cui and Fearn²⁸ 1D-CNN	0.855
		Spectra partitioned into two blocks: - 450–697 nm - 700–1030 nm	1D-CNN with parallel convolutional layers (extension of Cui and Fearn²⁸ 1D-CNN)	0.818
			Sequential Orthogonalized PLS regression	1.03
Yang et al.⁷⁶ (684–990 nm)	No additional	Raw spectra	1D CNN (3 convolutional layers), with transfer learning applied with x% of season 4 data	1*** (x = 5%)
				0.79*** (x = 10%)
				0.74*** (x = 15%)
				0.71*** (x = 20%)
Mishra and Passos⁸¹ (684–990 nm)	Removed using Hotelling’s T² and Q stats	Raw spectra	Cui and Fearn²⁸ 1D-CNN with transfer learning applied with 60% of season 4 data	0.58**

**Additional outliers were removed from the test set.

***Transfer learning applied with data used from and subsequently removed from test set.

Model comparisons

The lowest RMSEP on the independent fourth season was achieved by Mishra and Passos⁷⁷ using the Cui and Fearn²⁸ 1D-CNN model architecture with augmented data utilising various pre-processing techniques. A similar result, although not directly comparable due to different outliers being removed from the test set, was achieved by Mishra et al.⁷⁸ with the same CNN architecture but using only the raw absorbance spectra on a slightly narrower wavelength range. Using a wider wavelength range including the visible spectrum, as in Mishra and Passos,⁸⁰ was less successful.

The outlier removal process is an area of contention. The published Anderson et al.⁷ dataset already has a number of samples removed as outliers. Further outlier removal (Table 5) prevents direct comparison of model results between the published papers. Additionally, the outlier removal process described by some studies, such as Mishra and Passos,⁷⁷ is not replicable due to its manual nature. It is, however, evident that implementation of an improved outlier removal process over that originally used by Anderson et al.⁷ can achieve more accurate model predictions.

Mishra et al.⁷⁸ provide no reasoning or justification for using the slightly narrower wavelength range of 742–990 nm compared with the 684–990 nm range used by Anderson et al.⁸⁷ Utilising a multi-block CNN model with the visible wavelength range of 450–697 nm is an interesting proposition; however, its rationale is questionable in terms of relationship to dry matter content of the fruit. Anderson et al.⁸⁷ justify the wavelength range used in terms of the underlying chemistry of the fruit, noting that the spectral feature peaking around 680 nm but stretching out to 720 nm is associated with chlorophyll. Fruit left on the tree longer than commercial harvest timing will have higher dry matter content and lower chlorophyll content, but the relationship between chlorophyll content and dry matter content will vary with fruit maturity and growing condition. In contrast, absorbance features at 840 and 960 nm can be associated with O-H.⁸⁷ Justification can be made for narrowing the wavelength range to remove wavelengths that offer no correlation with dry matter.

Similarly, various data pre-treatments have been proposed (Table 5). As Cui and Fearn²⁸ discussed, the convolutional layer of a CNN model may remove the need for data pre-treatment for these models. This is partially supported by the findings of Mishra et al.,⁷⁸ who, using raw spectra and the same CNN architecture, achieved similar results to those of Mishra and Passos⁷⁷ with augmented pre-treated data. However, further work should be conducted to ensure that results are directly comparable and to evaluate other pre-treatments.

Yang et al.⁷⁶ experimented with a CNN architecture with three convolutional layers as opposed to the Cui and Fearn²⁸ architecture implemented by Mishra and Passos⁷⁷ which employs only one convolutional layer (Figures 2 and 4). Although the results are not directly comparable, as Yang et al.⁷⁶ used up to 20% of the fourth season dataset to train the model, it can be inferred that the Yang et al.⁷⁶ CNN architecture is much less successful for the mango dry matter application. Yang et al.⁷⁶ could have easily produced a directly comparable result by keeping the fourth season as the independent test set, as per the previous publications.

Figure 4.

1D-CNN architecture proposed by Yang et al.⁷⁶

Data set issues

There is a broad push in the scientific community for the publication of data sets. The Anderson et al.⁷ dataset is an early example of a large open access data set in the NIR-fruit quality space. This resource facilitates experimentation with deep learning, given the need for large training data sets, and, with discipline in use of training and test sets, allows for across-literature comparison of results. Hopefully this open data set will be expanded, e.g., with the additional data collected in Mishra and Passos.⁷⁹

The slow adoption of newer techniques like CNNs in NIR spectroscopy for fruit quality evaluation may be due to the divide between those proficient in CNN techniques, such as computer scientists and mathematicians, and those with access to, and domain understanding of, large data sets, such as chemometricians. This disparity can result in a “siloed” approach and missed opportunities for collaboration between computer science and mathematics on one side and chemometrics on the other. Sharing large public datasets is critical to advancing this field, as it bridges the gap between disciplines and encourages interdisciplinary collaboration. This will drive scientific progress in the NIR field.

It is important to exercise caution when selecting the test set from the overall dataset for fruit quality evaluation using NIR spectroscopy. In real-world applications, a model is trained on historical data and applied to new data. Similarly, when evaluating a model for publication purposes, it is recommended to use the same logic. Use of random splits to create training and test sets is not advised if all future variability is not captured in the existing data. Also, prediction results from different studies will not be directly comparable if researchers have altered the test set.
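The recommended logic, training on historical seasons and testing on a later season, can be sketched as a simple split function. The record structure (a dictionary with `season` and `dm` keys) and the values are hypothetical; the season years follow those of the Anderson et al.⁷ data set.

```python
def split_by_season(records, test_season):
    """Train on seasons before `test_season`; test on `test_season` only."""
    train = [r for r in records if r["season"] < test_season]
    test = [r for r in records if r["season"] == test_season]
    return train, test


# Hypothetical records: one per spectrum, with harvest season and reference DM (% FW).
records = [
    {"season": 2015, "dm": 15.2},
    {"season": 2016, "dm": 16.8},
    {"season": 2017, "dm": 14.9},
    {"season": 2018, "dm": 17.3},
    {"season": 2018, "dm": 16.1},
]
train, test = split_by_season(records, test_season=2018)
```

Unlike a random split, this keeps every spectrum of the test season out of training, so the test error reflects performance on an unseen season.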

The comparison of techniques across studies using the same dataset in this section highlights the importance of clear and fair benchmarking when presenting new techniques. When utilizing publicly available datasets, authors are encouraged to employ quantitative and repeatable methods. The use of subjective techniques for outlier removal is discouraged as they cannot be reliably replicated in other studies.

Conclusion

CNNs have been demonstrated to be appropriate for use in fruit quality evaluation using NIR. With the increased availability of large open-source datasets and the widespread success of CNNs in other applications, such as image and speech processing, it is likely that this technique will continue to gain popularity and be applied to a wider range of NIR applications in the future.

For future studies, it is crucial to conduct a thorough comparison of CNN models to traditional chemometric techniques, such as PLS regression, to fully appreciate the benefits of using complex models in this context. While PLS regression operates under the assumption of a linear relationship between the spectra and the attribute of interest, this assumption may not hold true with larger datasets. Furthermore, simply comparing a CNN model to PLS regression is not always sufficient, especially if the PLS regression results have not been optimized through traditional chemometric techniques. To provide a clearer understanding of the advantages of using deep learning techniques such as CNN, a comparison to a shallow ANN should also be performed.

The publication of data sets and open-source code plays a crucial role in facilitating the reproducibility of research results and advancing the field. Open access datasets provide a valuable resource for experimentation with deep learning and allow for across-literature comparison of results. The tutorial and code repository provided by Passos and Mishra⁹⁴ serve as an excellent example for the NIR community, demonstrating the benefits of open and accessible resources in advancing the field.

The highly encouraging results achieved by Mishra and co-workers in predicting mango dry matter using a 1D-CNN architecture are a significant step towards wider adoption of CNNs in fruit quality evaluation using NIR spectroscopy. So far, most implementations in this application have been based on the CNN architecture proposed by Cui and Fearn,²⁸ which consists of an input layer, one convolution layer, three fully connected layers, and an output layer. Further investigation is necessary to optimize the CNN models for this specific use case and other fruit applications, including exploring different model architectures, hyper-parameter tuning, and data pre-processing.

To the authors’ knowledge, CNNs are not currently used in any commercial applications of fruit quality evaluation with NIR. At present, caution should be exercised before utilising them in commercial applications due to the lingering issues that require resolution by further research. It cannot be said that CNNs outperform traditional techniques such as PLS regression in all cases, but rather that they outperform PLS in certain scenarios, depending on the instrument and the attribute. It is crucial to consider the context of the application, available resources, and expertise to achieve the “best result” for the application. The best model should not solely be judged by an RMSEP value, as a simpler model may be easier to implement and maintain.

In summary, the rise of CNNs for fruit quality evaluation using NIR spectroscopy is a promising development. Further work is necessary to fully understand the benefits of using complex models in this context and to optimize the models for this specific application. The publication of data sets and open-source code will play a crucial role in advancing the field and facilitating the reproducibility of research results.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by the Central Queensland University (Master of Research Part Scholarship).

ORCID iDs

Jeremy Walsh

Nicholas Anderson

References

1. Janiesch, Zschech, Heinrich. Machine learning and deep learning. Electron Mark 2021; 31: 685–695. DOI: 10.1007/s12525-021-00475-2.

2. Norris. Direct spectrophotometric determination of moisture content of grain and seeds. Proc of the 1963 International Symposium on Humidity and Moisture 1965; 19.

3. Anderson, Walsh. Review: the evolution of chemometrics coupled with near infrared spectroscopy for fruit quality evaluation. Journal of Near Infrared Spectroscopy 2022; 30: 3–17. DOI: 10.1177/09670335211057235.

4. Cirovic. Feed-forward artificial neural networks: applications to spectroscopy. TrAC Trends in Analytical Chemistry 1997; 16: 148–155. DOI: 10.1016/S0165-9936(97)00007-1.

5. Santos, Oliveira FCC, Lima, et al. A comparative study of diesel analysis by FTIR, FTNIR and FT-Raman spectroscopy using PLS and artificial neural network analysis. Analytica Chimica Acta 2005; 547: 188–196. DOI: 10.1016/j.aca.2005.05.042.

6. Nilsson. Annual comparison of the FOSS NIR global ANN calibration against reference methods: global ring test study overview 2007 - 2020. Accessed 23 Jan 2023. Foss 2021 https://www.fossanalytics.com/en-au/products/infratec

7. Anderson, Walsh, Subedi. Mango DMC and spectra. Mendeley Data 2020; 1. DOI: 10.17632/46htwnp833.1

8. Passos, Rodrigues, Cavaco, et al. Non-destructive soluble solids content determination for 'rocha' pear based on vis-swnir spectroscopy under 'real world' sorting facility conditions. Sensors (Basel) 2019; 19. DOI: 10.3390/s19235165.

9. Mishra, Passos, Marini, et al. Deep learning for near-infrared spectral data modelling: Hypes and benefits. TrAC Trends in Analytical Chemistry 2022; 157: 116804. DOI: 10.1016/j.trac.2022.116804.

10. Rose, Kitchin. Pybliometrics: scriptable bibliometrics using a python interface to scopus. SoftwareX 2019; 10: 100263. DOI: 10.1016/j.softx.2019.100263.

11. Chollet. Deep learning with Python. Shelter Island, NY: Manning Publications Co, 2021.

12. Sarker. Deep learning: a comprehensive overview on techniques, taxonomy, applications and research directions. SN Comput Sci 2021; 2: 420. DOI: 10.1007/s42979-021-00815-1.

13. LeCun, Bengio, Hinton. Deep learning. Nature 2015; 521: 436–444. DOI: 10.1038/nature14539.

14. Abiodun, Jantan, Omolara, et al. State-of-the-art in artificial neural network applications: a survey. Heliyon 2018; 4: e00938. DOI: 10.1016/j.heliyon.2018.e00938.

15. Okwu, Tartibu. Artificial neural network. In: Metaheuristic optimization: nature-inspired algorithms swarm and computational intelligence, theory and applications. Switzerland AG: Springer Nature, 2021, pp. 133–145.

16. Goodfellow, Bengio, Courville. Deep learning. Cambridge, MA: MIT Press, 2016.

17. Liu, Yang, et al. A survey of convolutional neural networks: analysis, applications, and prospects. IEEE Trans Neural Netw Learn Syst 2022; 33: 6999–7019. DOI: 10.1109/TNNLS.2021.3084827

18. LeCun, Bottou, Bengio, et al. Gradient-based learning applied to document recognition. Proc IEEE 1998; 86: 2278–2324. DOI: 10.1109/5.726791.

19. Koirala, Walsh, Wang, et al. Deep learning for real-time fruit detection and orchard fruit load estimation: benchmarking of ‘MangoYOLO’. Precis Agric 2019; 20: 1107–1135. DOI: 10.1007/s11119-019-09642-0.

20. Zeiler, Fergus. Visualizing and understanding convolutional networks. European Conference on Computer Vision 2014; 8689: 818–833. DOI: 10.1007/978-3-319-10590-1_53.

21. Krizhevsky, Sutskever, Hinton. ImageNet Classification with Deep Convolutional Neural Networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, Nevada, USA, 3–6 December 2012, pp. 1097–1105. DOI: 10.5555/2999134.2999257.

22. Koirala, Walsh, Wang, et al. Deep learning – method overview and review of use for fruit detection and yield estimation. Computers and Electronics in Agriculture 2019; 162: 219–234. DOI: 10.1016/j.compag.2019.04.017

23. Review on convolutional neural networks (CNN) in vegetation remote sensing. ISPRS Journal of Photogrammetry and Remote Sensing 2021; 173: 24–49. DOI: 10.1016/j.isprsjprs.2020.12.010

24. 1D convolutional neural networks and applications: A survey. Mechanical Systems and Signal Processing 2021; 151: 107398. DOI: 10.1016/j.ymssp.2020.107398.

25. Neural architecture search for 1D CNNs-different approaches tests and measurements. Sensors (Basel) 2021; 21: 7990. DOI: 10.3390/s21237990.

26. One-dimensional convolutional neural networks for spectroscopic signal regression. Journal of Chemometrics 2018; 32: e2977. DOI: 10.1002/cem.2977.

27. Convolutional neural networks for vibrational spectroscopic data analysis. Anal Chim Acta 2017; 954: 22–31. DOI: 10.1016/j.aca.2016.12.010.

28. Cui, Fearn. Modern practical convolutional neural networks for multivariate regression: applications to NIR calibration. Chemometrics and Intelligent Laboratory Systems 2018; 182: 9–20. DOI: 10.1016/j.chemolab.2018.07.008.

29. The uses of near infra-red spectroscopy in postharvest decision support: a review. Postharvest Biology and Technology 2020; 163: 111139. DOI: 10.1016/j.postharvbio.2020.111139.

30. Transparency of deep neural networks for medical image analysis: a review of interpretability methods. Comput Biol Med 2021; 140: 105111. DOI: 10.1016/j.compbiomed.2021.105111

31. Achieving joint calibration of soil Vis-NIR spectra across instruments, soil types and properties by an attention-based spectra encoding-spectra/property decoding architecture. Geoderma 2022; 405: 115449. DOI: 10.1016/j.geoderma.2021.115449.

32. Yang, Wang, et al. Combination of convolutional neural networks and recurrent neural networks for predicting soil properties using Vis–NIR spectroscopy. Geoderma 2020; 380: 114616. DOI: 10.1016/j.geoderma.2020.114616.

33. Wang, et al. Convolutional neural network application in prediction of soil moisture content. Guang Pu Xue Yu Guang Pu Fen Xi/Spectroscopy and Spectral Analysis 2018; 38: 36–41. DOI: 10.3964/j.issn.1000-0593(2018)01-0036-06.

34. Minasny, Montazerolghaem, et al. Convolutional neural network for simultaneous prediction of several soil properties using visible/near-infrared, mid-infrared, and their combined spectra. Geoderma 2019; 352: 251–267. DOI: 10.1016/j.geoderma.2019.06.016.

35. Chen, Whiting, et al. Convolutional neural network model for soil moisture prediction and its transferability analysis based on laboratory Vis-NIR spectral data. International Journal of Applied Earth Observation and Geoinformation 2021; 104: 102550. DOI: 10.1016/j.jag.2021.102550.

36. Hong, Chen, et al. Data mining of urban soil spectral library for estimating organic carbon. Geoderma 2022; 426: 116102. DOI: 10.1016/j.geoderma.2022.116102.

37. Zhang, Lin, et al. DeepSpectra: an end-to-end deep learning approach for quantitative spectral analysis. Anal Chim Acta 2019; 1058: 48–57. DOI: 10.1016/j.aca.2019.01.002.

38. Effective prediction of soil organic matter by deep SVD concatenation using FT-NIR spectroscopy. Soil and Tillage Research 2022; 215: 105223. DOI: 10.1016/j.still.2021.105223.

39. Bai, Xie, et al. Estimation of soil organic carbon using vis-nir spectral data and spectral feature bands selection in southern Xinjiang, China. Sensors (Basel) 2022; 22: 6124–/27. DOI: 10.3390/s22166124.

40. Estimation of heavy metals using deep neural network with visible and infrared spectroscopy of soil. Sci Total Environ 2020; 741: 140162–142020. DOI: 10.1016/j.scitotenv.2020.140162.

41. Fusion of Vis-NIR and XRF spectra for estimation of key soil attributes. Geoderma 2021; 385: 114851. DOI: 10.1016/j.geoderma.2020.114851.

42. Jia, Huang, et al. Multi-task convolution neural network regression prediction model based on vis-NIR spectroscopy. IOP Conf Ser: Mater Sci Eng 2020; 768: 072049.

43. Yin, Cong, et al. Simultaneous prediction of soil properties using multi_CNN model. Sensors (Basel) 2020; 20: 6271. DOI: 10.3390/s20216271.

44. The influence of training sample size on the accuracy of deep learning models for the prediction of soil properties with near-infrared spectroscopy data. Soil 2020; 6: 565–578. DOI: 10.5194/soil-6-565-2020.

45. Using a one-dimensional convolutional neural network on visible and near-infrared spectroscopy to improve soil phosphorus prediction in madagascar. Remote Sensing 2021; 13: 1519. DOI: 10.3390/rs13081519.

46. Using deep learning to predict soil properties from regional spectral data. Geoderma Regional 2019; 16: e00198. DOI: 10.1016/j.geodrs.2018.e00198.

47. Prediction of various soil properties for a national spatial dataset of scottish soils based on four different chemometric approaches: a comparison of near infrared and mid-infrared spectroscopy. Geoderma 2021; 396: 115071. DOI: 10.1016/j.geoderma.2021.115071.

48. Chen, Wang. End-to-end quantitative analysis modeling of near-infrared spectroscopy based on convolutional neural network. Journal of Chemometrics 2019; 33: e3122. DOI: 10.1002/cem.3122.

49. Markov transition field combined with convolutional neural network improved the predictive performance of near-infrared spectroscopy models for determination of aflatoxin b(1) in maize. Foods 2022; 11: 2210–2307. DOI: 10.3390/foods11152210.

50. NIR instruments and prediction methods for rapid access to grain protein content in multiple cereals. Sensors (Basel) 2022; 22: 3710. DOI: 10.3390/s22103710.

51. Gan, Luo. Simple dilated convolutional neural network for quantitative modeling based on near infrared spectroscopy techniques. Chemometrics and Intelligent Laboratory Systems 2023; 232: 104710. DOI: 10.1016/j.chemolab.2022.104710

52. Multi-task deep learning of near infrared spectra for improved grain quality trait predictions. Journal of Near Infrared Spectroscopy 2020; 28: 275–286. DOI: 10.1177/0967033520939318.

53. Jiang, Peng. The protective effect of decoction of rehmanniae via PI3K/Akt/mTOR pathway in MPP⁺-induced Parkinson's disease model cells. J Recept Signal Transduct Res 2021; 41: 74–84. DOI: 10.37965/jait.2020.0037.

54. Convolutional neural networks for quantitative prediction of different organic materials using near-infrared spectrum. In: Proceedings of the 14th international joint conference on biomedical engineering systems and technologies, Vienna, Austria, 11–13 February 2021, pp. 169–176

55. Determination of leaf water content with a portable NIRS system based on deep learning and information fusion analysis. Trans ASABE 2021; 64: 127–135. DOI: 10.13031/trans.13989.

56.

Tian

Wang

, et al. Nicotinamide nucleotide transhydrogenase mutation analysis in Chinese patients with thyroid dysgenesis. Am J Med Genet A 2022; 188: 89–98. DOI: 10.1177/09670335211057234.

57.

Zhang

. Prediction approach of larch wood density from visible-near-infrared spectroscopy based on parameter calibrating and transfer learning. Front Plant Sci 2022; 13: 1006292. DOI: 10.3389/fpls.2022.1006292.

, et al. Quantitative Regression Modeling of Cocoa Bean Content Based on Gated Dilated Convolution Network. https://hdl.handle.net/2381/14186198.v1 (2021, accessed 1 Feb 2023).

59.

Wang

Tao

. Variable weighted convolutional neural network for the nitrogen content quantization of Masson pine seedling leaves with near-infrared spectroscopy. Spectrochim Acta A Mol Biomol Spectrosc 2019; 209: 32–39. DOI: 10.1016/j.saa.2018.10.028.2018/10/22.

60.

Zhou

Tan

Zhang

, et al. A portable NIR-system for mixture powdery food analysis using deep learning. Lwt 2022; 153: 112456. DOI: 10.1016/j.lwt.2021.112456.

61.

Aulia

Khodra

Koesoema

. Predicting macronutrient of baby food using near infrared spectroscopy and deep learning approach. IOP Conf Ser: Mater Sci Eng 2020; 803: 012019.

62.

Chen

Y-y

Wang

Z-b

. Feature selection based convolutional neural network pruning and its application in calibration modeling for NIR spectroscopy. Chemometrics and Intelligent Laboratory Systems 2019; 191: 103–108. DOI: 10.1016/j.chemolab.2019.06.004.

63.

Wang

Huang

Chou

, et al. Characteristics of brain connectivity during verbal fluency test: Convolutional neural network for functional near-infrared spectroscopy analysis. J Biophotonics 2022; 15: e202100180. DOI: 10.1002/jbio.202100180.

64.

Alzubaidi

Zhang

Humaidi

, et al. Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. J Big Data 2021; 8: 53. DOI: 10.1186/s40537-021-00444-8.

65.

Einarson

Baum

Olsen

, et al. Predicting pectin performance strength using near-infrared spectroscopic data: a comparative evaluation of 1-D convolutional neural network, partial least squares, and ridge regression modeling. Journal of Chemometrics 2021; 36. DOI: 10.1002/cem.3348.

66.

Tian

Wang

. Early detection of freezing damage in oranges by online Vis/NIR transmission coupled with diameter correction method and deep 1D-CNN. Computers and Electronics in Agriculture 2022; 193: 106638. DOI: 10.1016/j.compag.2021.106638.

67.

Hao

Zhang

, et al. Establishment of online deep learning model for insect-affected pests in "Yali" pears based on visible-near-infrared spectroscopy. Front Nutr 2022; 9: 1026730. DOI: 10.3389/fnut.2022.1026730.

68.

Chang

Tian

, et al. Non-destructive identification of internal watercore in apples based on online vis/nir spectroscopy. Trans ASABE 2020; 63: 1711–1721. DOI: 10.13031/TRANS.13844.

69.

Zhang

Liu

, et al. Nondestructive detection of moldy core in apples based on one-dimensional convolutional Neural Networks. In ASABE annual international virtual meeting. St. Joseph, MI: ASABE, 2021, p. 1.

70.

Chen

, et al. Nondestructive identification of pesticide residues on the Hami melon surface using deep feature fusion by Vis/NIR spectroscopy and 1D-CNN. J Food Process Eng 2020; 44. DOI: 10.1111/jfpe.13602.

71.

Ninh

Doan

T-N-C

Ninh

, et al. Fruit recognition based on near-infrared spectroscopy using deep neural networks. In: The 5th International Conference on Machine Learning and Soft Computing. Da nang, Viet Nam, 29-31 January 2021, p. 90–95.Association for Computing Machinery

72.

Rong

Wang

Ying

, et al. Peach variety detection using VIS-NIR spectroscopy and deep learning. Computers and Electronics in Agriculture 2020; 175: 105553. DOI: 10.1016/j.compag.2020.105553.

73.

Escarate

Farias

Naranjo

, et al. Estimation of soluble solids for stone fruit varieties based on near-infrared spectra using machine learning techniques. Sensors (Basel) 2022; 22: 6081. DOI: 10.3390/s22166081.

74.

Ninh

Phan

Ninh

, et al. Determination of fruit freshness using near-infrared spectroscopy and machine learning techniques. Intel Sys Netw, 2022; 471: 455–464. DOI: 10.1007/978-981-19-3394-3_52.

75.

Liu

H-J

Wei

C-Y

Han

, et al. Determination of huanghua pear's harvest time based on convolutional neural networks by visible-near infrared spectroscopy | 基于全卷积神经网络的黄花梨采收期可见-近红外光谱检测方法. Guang Pu Xue Yu Guang Pu Fen Xi/Spectroscopy and Spectral Analysis 2020; 40: 2932–2936. DOI: 10.3964/j.issn.1000-0593(2020)09-2932-05.

76.

Yang

Luo

Zhang

, et al. A deep learning approach to improving spectral analysis of fruit quality under interseason variation. Food Control 2022; 140: 109108. DOI: 10.1016/j.foodcont.2022.109108.

77.

Mishra

Passos

. A synergistic use of chemometrics and deep learning improved the predictive performance of near-infrared spectroscopy models for dry matter prediction in mango fruit. Chemometrics and Intelligent Laboratory Systems 2021; 212: 104287. DOI: 10.1016/j.chemolab.2021.104287.

78.

Mishra

Rutledge

Roger

, et al. Chemometric pre-processing can negatively affect the performance of near-infrared spectroscopy models for fruit quality prediction. Talanta 2021; 229: 122303. DOI: 10.1016/j.talanta.2021.122303

79.

Mishra

Passos

. Deep chemometrics: validation and transfer of a global deep near-infrared fruit model to use it on a new portable instrument. Journal of Chemometrics 2021; 35. DOI: 10.1002/cem.3367.

80.

Mishra

Passos

. Deep multiblock predictive modelling using parallel input convolutional neural networks. Anal Chim Acta 2021; 1163: 338520. DOI: 10.1016/j.aca.2021.338520.

81.

Mishra

Passos

. Realizing transfer learning for updating deep learning models of spectral data to be used in new scenarios. Chemometrics and Intelligent Laboratory Systems 2021; 212: 104283. DOI: 10.1016/j.chemolab.2021.104283.

82.

Mishra

Passos

. Multi-output 1-dimensional convolutional neural networks for simultaneous prediction of different traits of fruit based on near-infrared spectroscopy. Postharvest Biology and Technology 2022; 183: 111741. DOI: 10.1016/j.postharvbio.2021.111741.

83.

Martins

Guerra

Pires

, et al. SpectraNet–53: a deep residual learning architecture for predicting soluble solids content with VIS–NIR spectroscopy. Computers and Electronics in Agriculture 2022; 197: 106945. DOI: 10.1016/j.compag.2022.106945.

84.

Ference

, et al. An accuracy improvement method based on multi-source information fusion and deep learning for TSSC and water content nondestructive detection in “luogang” orange. Electronics 2021; 10: 80. DOI: 10.3390/electronics10010080.

85.

Wang

, et al. Nondestructive detection of internal flavor in ‘shatian’ pomelo fruit based on visible/near infrared spectroscopy. HortScience 2021; 56: 1325–1330. DOI: 10.21273/hortsci16136-21.

86.

Cavaco

Pires

Antunes

, et al. Validation of short wave near infrared calibration models for the quality and ripening of ‘Newhall’ orange on tree across years and orchards. Postharvest Biology and Technology 2018; 141: 86–97, DOI: 10.1016/j.postharvbio.2018.03.013.

, et al. Achieving robustness across season, location and cultivar for a NIRS model for intact mango fruit dry matter content. Postharvest Biology and Technology 2020; 168: 111202. DOI: 10.1016/j.postharvbio.2020.111202.

, et al. Accurate prediction of soluble solid content of apples from multiple geographical regions by combining deep learning with spectral fingerprint features. Postharvest Biology and Technology 2019; 156: 110943. DOI: 10.1016/j.postharvbio.2019.110943.

89.

Anderson