Abstract

Bringing artificial intelligence to analytical imaging
Analytical chemistry increasingly relies on imaging techniques from visible and near-infrared (NIR) hyperspectral cameras, Raman, and X-ray computed tomography (CT) systems to visualize spatial chemical distributions. Yet, while these techniques capture large, information-rich datasets, extracting meaningful spatial features remains challenging. Traditional chemometric tools such as PCA or PLS excel at interpreting spectral data but are less adept at handling the complex textures, edges, and structural nuances present in images. Deep learning (DL), particularly pre-trained convolutional neural networks (CNNs), can automatically extract hierarchical, multiscale image features. These “deep features” capture fine spatial patterns without the need for manually designed filters. Our recent MATLAB 1 tutorial bridges a critical gap, guiding analytical scientists in combining pre-trained deep models with chemometric analysis for diverse imaging modalities. Rather than training networks from scratch, this tutorial leverages open-source CNNs (such as ResNet 2 or VGG) as plug-and-play feature extractors. The workflow is simple: load a pre-trained model (e.g. ResNet-18); resize and normalize images to match the model input dimensions (e.g. 224 × 224 × 3); pass images through the network to extract high-level feature vectors; and apply multivariate models, typically PLS regression, to link deep features with analytical targets.
Deep feature extraction in MATLAB: From pixels to predictive power
In deep feature extraction, raw image data are transformed into condensed numerical vectors that summarize the spatial characteristics learned by CNNs. For example, the ResNet-18 model, trained on millions of ImageNet images, automatically detects patterns at multiple abstraction levels (edges, shapes, and textures) through convolutional layers and residual connections. When an image is passed through such a network, intermediate layers generate feature maps representing the response of thousands of filters. A Global Average Pooling (GAP) operation condenses these maps into a compact 512-dimensional feature vector. This vector can be viewed as an “image spectrum”: multivariate but abstract, ready for chemometric analysis. Importantly, these features are generic: they can be extracted from RGB photographs, NIR or Raman pseudo-color maps, X-ray projections, or hyperspectral images after compression into three representative bands. The resulting deep features are then linked to target variables such as composition, texture, or quality using methods like PLS regression, which handle multicollinearity and high dimensionality efficiently.
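The extraction step described above can be sketched in a few lines of MATLAB. This is a minimal illustration, assuming the Deep Learning Toolbox and the ResNet-18 support package are installed; the image file name is a hypothetical placeholder.

```matlab
% Load a pre-trained CNN and query its expected input size
net = resnet18;                         % trained on ImageNet
inputSize = net.Layers(1).InputSize;    % [224 224 3]

% Read an image (placeholder file name) and resize to the network input
img = imread('sample.png');
img = imresize(img, inputSize(1:2));

% 'pool5' is ResNet-18's global average pooling layer; its activations
% form one 512-dimensional feature vector per image
features = activations(net, img, 'pool5', 'OutputAs', 'rows');
```

The resulting `features` row vector is the “image spectrum” that feeds directly into PLS or other multivariate models.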
Hands-on demonstrations
To make these ideas tangible, three case studies are demonstrated using MATLAB codes that readers can adapt for their own datasets.
RGB imaging of plant-based meat analogues
The first case involves RGB images of 82 plant-based meat analogues designed to mimic fibrous muscle texture. 3 Each sample was photographed under controlled lighting and rated by experts for visual fibrousness on a 0–100 scale. By extracting deep features using ResNet-18 and correlating them with expert ratings via PLS regression, a model with R² > 0.9 and RMSEP ≈ 9 was achieved. This means visual texture, a complex, subjective property, could be objectively quantified using deep learning and chemometrics. Traditional texture features (e.g. GLCM statistics) would require manual engineering; deep features, however, capture fibrousness patterns automatically, illustrating the power of AI-driven feature extraction.
X-ray CT imaging of beef rib chops
The second example uses X-ray CT images of beef rib sections to predict fat content. 4 Typically, CT-derived Hounsfield values are computed from full 3D scans, a time- and data-intensive process. Instead, the tutorial extracts deep features from single 2D X-ray projections to approximate the same predictions. Although predictions using 2D features showed slightly higher errors (RMSEP ≈ 196 g) compared to full CT-based models (RMSEP ≈ 130 g), the approach is computationally efficient and far more practical for rapid industrial screening. This highlights how deep features can serve as surrogates for physical image metrics like Hounsfield values, enabling near real-time estimation of quality attributes from inexpensive imaging setups.
Hyperspectral imaging of pork belly fat
The third demonstration combines visible-NIR hyperspectral imaging and deep learning to predict pork belly fat hardness,5,6 a key quality trait linked to fatty acid composition. Here, two information streams are available: spectral (mean reflectance across wavelengths, 386–1015 nm) and spatial (deep features from pseudo-RGB images created from selected bands at 750, 670, and 500 nm). The two data sources were fused using Sequentially Orthogonalized PLS (SO-PLS), 7 a multiblock chemometric method that integrates complementary information. The results are striking: the spatial-only model gave Rc² = 0.75, Rp² = 0.69, RMSEP = 0.32; the spectral-only model gave Rc² = 0.82, Rp² = 0.67, RMSEP = 0.32; and the fusion model gave Rc² = 0.91, Rp² = 0.78, RMSEP = 0.27. The fused model outperformed either modality alone, underscoring the benefit of combining spatial and spectral features. This “spatial–spectral synergy” opens new possibilities for non-invasive meat quality assessment, aligning with the growing emphasis on digitalization and sustainability in the food industry.
Practical guidelines for researchers
Start with pre-trained models. MATLAB’s Deep Learning Toolbox includes ready-to-use CNNs such as ResNet, VGG, and DenseNet. Install them via Add-Ons > Deep Learning Toolbox Model for ResNet-18.
Resize and normalize images. Use imresize() and channel-wise normalization to match the model’s input (typically 224 × 224 × 3 pixels).
Extract activations. Use the activations() function on a chosen layer. The “pool5” layer typically yields rich, condensed features. Users can experiment with other layers for different abstraction levels.
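Assuming a network and a resized image are already in the workspace (as in the earlier steps), the extraction call is a one-liner:

```matlab
% Extract condensed deep features from the global average pooling layer;
% returns one feature vector per image row when 'OutputAs' is 'rows'
features = activations(net, img, 'pool5', 'OutputAs', 'rows');
```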
Apply chemometric modeling. Use MATLAB’s PLS regression (plsregress) to correlate features with reference values. Cross-validation helps determine the optimal number of latent variables.
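A minimal sketch of this step, assuming a feature matrix `X` (samples × 512 deep features) and a reference vector `y` are already assembled; the variable names and the 10-fold setting are illustrative.

```matlab
% Pick the number of latent variables by 10-fold cross-validation
maxLV = 15;
[~, ~, ~, ~, ~, ~, MSE] = plsregress(X, y, maxLV, 'CV', 10);
[~, bestLV] = min(MSE(2, 2:end));      % row 2 = response MSE; col 1 = 0 LVs

% Refit with the selected number of latent variables
[~, ~, ~, ~, beta] = plsregress(X, y, bestLV);
yhat = [ones(size(X, 1), 1), X] * beta;  % beta includes the intercept
```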
Fuse data when available. Combine deep spatial features with spectra using multi-block methods (e.g. SO-PLS), which preserve each modality’s independent information.
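SO-PLS is not built into MATLAB; implementations are available from chemometrics toolboxes. The sketch below only illustrates the sequential-orthogonalization idea with plsregress, assuming a spectral block `Xspec`, a spatial deep-feature block `Xspat`, a response `y`, and numbers of latent variables already chosen by cross-validation; it is not a full SO-PLS implementation.

```matlab
lv1 = 5; lv2 = 5;                        % illustrative choices

% Step 1: fit PLS on the first (spectral) block
[~, ~, T1, ~, b1] = plsregress(Xspec, y, lv1);

% Step 2: orthogonalize the second block against the first block's scores
Xorth = Xspat - T1 * (T1 \ Xspat);

% Step 3: model the response residuals with the orthogonalized block
res = y - [ones(size(Xspec, 1), 1), Xspec] * b1;
[~, ~, ~, ~, b2] = plsregress(Xorth, res, lv2);

% Fused prediction: sum of the two sequential models
yhat = [ones(size(Xspec, 1), 1), Xspec] * b1 ...
     + [ones(size(Xorth, 1), 1), Xorth] * b2;
```

Because the second block is orthogonalized before modeling, each modality contributes only the information the other does not already explain, which is the property that makes the fusion interpretable.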
Conclusions
This tutorial demonstrates how deep feature extraction can empower analytical chemists to leverage the power of AI without extensive coding or retraining. By using pre-trained deep learning models as feature generators and combining them with chemometric modeling, spatial and spectral information can be integrated for accurate, non-destructive, and interpretable predictions. From plant-based meat texture to beef fat content and pork belly firmness, these case studies show that hybrid deep learning and chemometrics methods are not just theoretical; they are ready for application in real-world analytical workflows. Researchers are encouraged to adapt the MATLAB scripts to their own datasets and explore the diverse potential of deep features in food science, pharmaceutical analysis, and environmental monitoring.
Footnotes
Acknowledgements
The authors thank the MATLAB Deep Learning community for accessible model libraries and acknowledge ChatGPT for editorial grammar correction.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the KB-54 Sustainable Nutrition & Health program at Wageningen University & Research, financed by the Dutch Ministry of Agriculture, Fisheries, Food Security and Nature, and by the Project RTI2018-096993-B-I00 (Spanish Ministry of Science and Innovation and FEDER).
