Abstract
This study presents a machine learning model for predicting Chemical Ionization Mass Spectrometry (CIMS) signal intensity in pesticide detection with dibromomethane (DBrMe) as the reagent. Accurate pesticide detection is crucial for agricultural safety and regulatory compliance. The model relates signal intensity to ten input features: molar mass, COO, N-O, N-N, N-S, C-C, S, Cl, P, and pesticide concentration in DBrMe (ppm). Four approaches were compared: Decision Tree, AdaBoost, Random Forest, and Ensemble Learning. A dataset of 2460 samples was used for training and validation. Pesticide concentration had the strongest influence on the predicted signal, followed by N-O, COO, and molar mass, and SHAP analysis confirmed this ranking. A leverage-based method was used to identify and remove outliers, which improved model reliability. Random Forest outperformed the other models, achieving the highest R² (0.401) and the lowest error, whereas Decision Tree and AdaBoost showed signs of overfitting. Sensitivity analysis showed that every variable contributes to the prediction, which supports the model's robustness. The approach offers a cost-effective, accurate alternative to traditional experimental methods for estimating CIMS signal intensity across a range of pesticides and conditions, supporting faster and more efficient chemical analysis in agricultural monitoring.
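The leverage-based outlier screening mentioned above can be illustrated with a minimal sketch. The abstract does not give the thresholds or the exact procedure, so everything specific here is an assumption: the function names `leverage_values` and `flag_outliers` are hypothetical, the warning leverage h* = 3(p + 1)/n and the cutoff of 3 on standardized residuals are common conventions rather than values reported by the authors, and the residual standardization is deliberately simplified.

```python
import numpy as np

def leverage_values(X):
    """Return the diagonal of the hat matrix H = X (X^T X)^+ X^T."""
    X = np.asarray(X, dtype=float)
    # Pseudo-inverse keeps the computation stable if features are collinear.
    XtX_inv = np.linalg.pinv(X.T @ X)
    # Row-wise quadratic form x_i^T (X^T X)^+ x_i, without building all of H.
    return np.einsum("ij,jk,ik->i", X, XtX_inv, X)

def flag_outliers(X, residuals, k=3.0):
    """Flag samples with high leverage or large standardized residuals.

    Assumed conventions (not taken from the source):
      - warning leverage h* = 3 (p + 1) / n
      - residual cutoff |standardized residual| > k, with k = 3
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    h = leverage_values(X)
    h_star = 3.0 * (p + 1) / n
    r = np.asarray(residuals, dtype=float)
    # Simplified standardization: center and scale by the sample std.
    std_res = (r - r.mean()) / r.std(ddof=1)
    return (h > h_star) | (np.abs(std_res) > k)
```

Under these assumptions, `X` would be the feature matrix (one row per sample, one column per feature) and `residuals` the differences between measured and predicted signal intensity. Rows where the returned mask is `True` would be dropped before refitting the model.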
