Abstract
We have built infrared spectroscopy-based partial least-squares (PLS) models for molecular polarizabilities using a 97-member training set and a 59-member independent prediction set. These 156 compounds span a very wide range of chemical structures. Our goal was to use this well-defined chemical property to test the breadth of application of a method ultimately intended to predict poorly defined, environmentally important properties and activity parameters (e.g., microbial transformation rate constants). Separate models were built from gas-phase mid-infrared spectra and, alternatively, from their Fourier transforms (i.e., interferograms). The optimum spectrum- and interferogram-based models produced approximately the same error (root mean square deviation divided by the parameter value range) for the independent prediction set, 9.53 and 9.92%, respectively. With spectrum-based models, we found that deresolving the spectra from a point spacing of 6 cm⁻¹ to about 40 cm⁻¹ produced much lower error (under leave-one-out cross-validation) when all 156 compounds were included, but much higher error when a model was built from a structurally narrow subset of the compounds (namely, 38 alkanes). Qualitative interpretation of the first PLS weight-loading vector from the spectrum-based model provided important information on the relationship between chemical structure and molecular polarizability.
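The error measure quoted above (root mean square deviation divided by the parameter value range, expressed as a percentage) can be illustrated with a minimal Python sketch. The function name `range_normalized_rmsd` is hypothetical, and the sketch assumes the range is taken from the reference values of the set being evaluated; the abstract does not state which set defines the range.

```python
import numpy as np

def range_normalized_rmsd(y_true, y_pred):
    """Return RMSD divided by the range of the reference values, in percent.

    Assumption: the range is max(y_true) - min(y_true) for the evaluated set;
    the abstract does not specify which set defines the range.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Root mean square deviation between predicted and reference values
    rmsd = np.sqrt(np.mean((y_pred - y_true) ** 2))
    # Spread of the reference property values
    value_range = y_true.max() - y_true.min()
    return 100.0 * rmsd / value_range

# Made-up illustrative values, not data from the study
reference = [4.0, 8.0, 12.0, 20.0]
predicted = [5.0, 7.0, 13.0, 19.0]
error_percent = range_normalized_rmsd(reference, predicted)
```

Normalizing by the range makes the error comparable across properties with different units and magnitudes, which is presumably why a single percentage can summarize both the spectrum-based and interferogram-based models.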
