Abstract
Chemicals possessing persistence (P) and high mobility (M) can present a hazard to drinking water resources by traversing natural barriers like riverbanks and artificial barriers found in water treatment plants. If the chemical is also toxic (T), i.e. classifiable as a PMT, the agent might be of particular concern as a potential drinking water contaminant. During routine water sampling, detection and quantitation of polar substances with high mobility can be problematic. The German Environment Agency (UBA) is considering the use of the Log Koc value as a proxy for mobility (M). Log Koc is related to Log P by the equation Log Koc = 0.69 Log P + 0.22. In this study, we demonstrate that chemicals with log P values at or very close to 2.0, 3.0 or 4.0 (and their concomitant log Koc values) can vary significantly in their chemical structures, molecular weights, molar volumes, and calculated molar refractivity (CMR), which is related to the mean polarizability of a molecule. The large degree of potential diversity in chemical structure and molecular parameters related to chemical behavior at a particular log P or log Koc value suggests that log Koc might not contain enough information to function as a standalone surrogate for the mobility (M) of a chemical, i.e. as related to its ability to move from a drinking water resource through the water plant purification process.
Introduction
Certain persistent (P) and mobile (M) chemicals inadvertently released into the environment can pass through natural barriers like riverbanks and soil layers above aquifers, and through artificial barriers found in water plants, eventually working their way into drinking water. 1 The time required for an environmental chemical to traverse a particular natural barrier and make its way to a water treatment plant varies greatly, from a few days for surface water runoff, 1 to 2 weeks to percolate through a riverbank, or up to the scale of years to enter groundwater wells. 1 Several factors can influence the time lag between a chemical’s release into the environment and its entry into drinking water. First, higher quantity emissions increase the probability of downstream contamination. Second, persistent chemicals (P) have a longer half-life thereby increasing the chance of eventually entering drinking water. Third, chemicals with higher mobility (M) are more likely to complete the journey from point of environmental entry to drinking water. 1
The German Environment Agency (UBA) is considering the use of the Log Koc value as a proxy for mobility (M). 2–6 The Soil Adsorption Coefficient, denoted by either Kd or Kf, measures the amount of a chemical adsorbed onto soil per amount of water. (The Kf designation comes from the Freundlich solid-water distribution coefficient.) Soil Adsorption Coefficient values vary greatly because the organic content of soil varies greatly. Adsorption of chemicals onto soil occurs predominantly by partition into the soil organic matter. Therefore, the Kd or Kf is normalized to the amount of organic carbon in a soil and is expressed as either Koc or Kfoc, which are interchangeable for practical purposes. The Koc is known as the organic carbon-water partition coefficient. 7,8 Log Koc is related to Log P (Log of the octanol/water partition coefficient) by the equation Log Koc = 0.69 Log P + 0.22. 9,10
In this study, we demonstrate that 3 sets of 25 chemicals with log P values at or very close to 2.0, 3.0 or 4.0, respectively (and their concomitant log Koc values), can vary significantly in their Hansch Quantitative Structure-Activity Relationship (QSAR) parameters. The parameters examined in this study are the calculated base 10 logarithm of the octanol–water partition coefficient (ClogP), the McGowan molecular volume (MgVol), and the calculated molar refractivity (CMR). These QSAR parameters represent, respectively, the hydrophobic (ClogP), steric/size (MgVol), and polarizability (CMR) contributions of a chemical to its biological activity. These parameters have proven extremely useful in developing QSAR models describing the quantitative relationships between the biological activity of chemicals and their physicochemical characteristics.
Previously, Smith et al. 11 showed that smaller molecular volumes were associated with higher levels of tumorigenicity, and that lower rather than higher levels of lipophilicity were associated with higher levels of tumorigenicity. Positive Ames test results were positively correlated with overall tumorigenicity and with possession of structural alerts of carcinogenicity. Since larger organic molecules have more chemical reaction centers, it was not surprising that higher ClogP values were positively correlated with the number of structural alerts of carcinogenicity. The results from this earlier study 11 demonstrated the ability to devise rational rules for relative tumorigenicity in rodents that correlated with known parameters of toxicity.
Log P values of 2.0 represent 100-fold more solubility in lipid than in water, log P values of 3.0 represent 1000-fold more lipid solubility, and log P values of 4.0 represent 10,000-fold more lipid solubility. Thus, the 75 molecules (3 sets of 25 molecules) examined in this study span a wide range of lipid solubility. As log Koc is linearly related to log P, the range of log Koc values spanned in this study is also very large. The large degree of potential diversity in chemical structure and molecular parameters related to chemical behavior at a particular log P or log Koc value suggests that log Koc might not contain enough information to function as a standalone surrogate for the mobility (M) of a chemical as related to its ability to move from a drinking water resource through the water plant purification process.
Methods
The octanol-water partition coefficients (P) for the 75 chemicals were taken from literature values located with a web scraper that searched the Super Natural II Database (http://bioinf-applied.charite.de/supernatural_new/index.php) for chemicals with log P values of 2.0, 3.0 and 4.0. 12 Super Natural II is a highly curated online database for natural products. Initially designed in 2006, the database contains 325,508 natural products extracted from various resources, including vendor information. Super Natural II offers both search and analysis options and provides toxicity predictions for the database compounds. 12 The log Koc values were calculated from the literature values for log P using the equation: 10
$$\log K_{oc} = 0.69 \log P + 0.22$$
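The conversion from log P to log Koc can be illustrated with a minimal sketch. The function name `log_koc_from_log_p` and its keyword arguments are hypothetical names chosen here for illustration; only the slope (0.69) and intercept (0.22) come from the equation quoted in the text.

```python
def log_koc_from_log_p(log_p, slope=0.69, intercept=0.22):
    """Estimate log Koc from log P with the linear relationship quoted in the text."""
    return slope * log_p + intercept


# The three log P levels examined in the study.
for log_p in (2.0, 3.0, 4.0):
    partition_ratio = 10 ** log_p  # octanol/water concentration ratio implied by log P
    log_koc = log_koc_from_log_p(log_p)
    print(f"log P = {log_p:.1f}  P = {partition_ratio:,.0f}  log Koc = {log_koc:.2f}")
```

Rounded to two decimals, the three levels give the log Koc values of 1.60, 2.29 and 2.98 that are quoted alongside log P in the Results.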
Calculation of Clog P, CMR and MgVol
Bio-Loom (version 1.6; Biobyte Corp., Claremont, CA, USA) 13 was used to compute the three parameters used in our QSAR analysis (ClogP, CMR, and MgVol) from the Simplified Molecular Input Line Entry System (SMILES) representation of each chemical compound (Online Appendix 4). The utility of Bio-Loom for comparative QSAR (C-QSAR) and comparative correlation analysis has been discussed in Hansch and Leo. 14 The parameters used in this study are also discussed in detail in Hansch and Leo. 14 In brief, ClogP is the calculated logarithm of the partition coefficient in octanol/water and is a measure of hydrophobicity (or lipophilicity) of a chemical. 14,15 MgVol is the molar volume calculated by the method of Abraham and McGowan 16,17 and CMR is the calculated molar refractivity (MR) for the whole molecule. MR is calculated as follows:
$$MR = \frac{n^2 - 1}{n^2 + 2} \cdot \frac{MW}{d}$$
where n is the refractive index, MW is the molecular weight, and d is the density of a substance. Since there is very little variation in n, 18 MR is largely a measure of volume with a small correction for polarizability. The MR values are scaled by 0.1. MR can be used for a substituent or for the whole molecule. CMR values are calculated using the same program as that used to calculate ClogP. 13 Note that the Clog P and CMR values are for the neutral form of acids and bases that may be partially ionized. If the degree of ionization is about the same for a set of congeners, the ionization factor can be neglected; otherwise, good correlation can be obtained using electronic terms. 14,18 The correlation between experimental Log P and Clog P values for 13,815 chemicals in the CLOG program, which is a part of Bio-Loom, 13 is 0.98 (experimental Log P = 1.00 Clog P − 0.03; n = 13,815, r = 0.98, s = 0.35). The Clog P parameter used in this study has been widely used and cited by the QSAR community, both for environmental studies and for drug design. 19–30 This very high correlation between experimental Log P and Clog P gives confidence in using Clog P values whenever experimental Log P values are not available.
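As a worked illustration of the molar refractivity definition (the Lorentz-Lorenz form, MR = (n² − 1)/(n² + 2) × MW/d) and the 0.1 scaling, the sketch below evaluates it for benzene. The benzene inputs (n ≈ 1.5011, MW ≈ 78.11 g/mol, d ≈ 0.8765 g/cm³) are approximate handbook values assumed here for illustration and are not taken from the study; the function name `molar_refractivity` is likewise hypothetical. This computes MR from measured bulk properties, whereas Bio-Loom's CMR is a calculated value, so the two need not agree exactly.

```python
def molar_refractivity(n, mw, density):
    """Lorentz-Lorenz molar refractivity in cm^3/mol.

    n: refractive index (dimensionless)
    mw: molecular weight in g/mol
    density: density in g/cm^3
    """
    return (n ** 2 - 1) / (n ** 2 + 2) * mw / density


# Assumed, approximate handbook values for benzene (not from the source study).
benzene_mr = molar_refractivity(n=1.5011, mw=78.11, density=0.8765)
benzene_mr_scaled = 0.1 * benzene_mr  # same 0.1 scaling the text applies to MR values

print(f"MR = {benzene_mr:.2f} cm^3/mol, scaled MR = {benzene_mr_scaled:.2f}")
```

Because the refractive-index factor changes little from one organic liquid to another, the result is dominated by the molar volume term MW/d, which is the point made in the text about MR being largely a measure of volume.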
Statistical methods
Analysis of variance (ANOVA) for one factor 30
The following scenario is considered. A set of data consisting of
A test for the statistical significance of differences in the means
Thus, the Total sum of squares
Under the assumption (Null Hypothesis) that all the x’s are independent observations taken from the same population with a normal distribution (fixed mean and standard deviation) the Total, Between families, and Within families sum of squares after each being divided by the variance of this normal distribution will have
Fisher goes on to define the correlation
The modification
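For reference, the one-factor ANOVA decomposition that this subsection relies on can be written in standard textbook form. The symbols used below ($k$ families, $n_i$ observations in family $i$, $N$ observations in total, observations $x_{ij}$, family means $\bar{x}_i$, grand mean $\bar{x}$) are notation assumed here for illustration and are not taken from the source.

```latex
\begin{align}
SS_{\mathrm{Total}} &= \sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}\right)^2
  = SS_{\mathrm{Between}} + SS_{\mathrm{Within}},\\
SS_{\mathrm{Between}} &= \sum_{i=1}^{k} n_i\left(\bar{x}_i-\bar{x}\right)^2,\qquad
SS_{\mathrm{Within}} = \sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}_i\right)^2,\\
F &= \frac{SS_{\mathrm{Between}}/(k-1)}{SS_{\mathrm{Within}}/(N-k)} \sim F_{k-1,\,N-k}
  \quad\text{under the null hypothesis.}
\end{align}
```

The Total, Between and Within sums of squares carry $N-1$, $k-1$ and $N-k$ degrees of freedom, respectively. For the design in this study (three log Koc levels with 25 chemicals each), $k = 3$ and $N = 75$, so the F ratio has 2 and 72 degrees of freedom.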
Results
For the chemicals with log P values at or near 2.0 (for log P of 2, concomitant log Koc value is 1.6), the molecular weights vary widely from a low value of 150.104 to a high of 888.451 (Table 1). In ascending order, the molecular weights of the chemicals with log P values at or near 2.0 are as follows: 150.104; 163.124; 164.131; 220.085; 256.167; 260.105; 270.183; 288.16; 288.171; 308.116; 328.058; 348.157; 350.19; 354.067; 358.069; 376.225; 404.255; 420.19; 432.323; 480.183; 482.217; 483.226; 486.262; 594.377; and 888.451. For the chemicals with log P values at or near 3.0 (for log P of 3, concomitant log Koc value is 2.29), the molecular weights range from 125.12 to 694.263 (Table 1). In ascending order, the molecular weights of the chemicals with log P values at or near 3.0 are as follows: 125.12; 156.151; 198.068; 203.095; 222.126; 250.157; 252.173; 317.128; 328.067; 330.183; 334.214; 342.194; 352.154; 356.126; 385.213; 389.257; 395.185; 398.137; 424.271; 436.163; 476.241; 489.154; 493.273; 530.324; and 694.263. For the chemicals with log P values at or near 4.0 (for log P of 4, concomitant log Koc value is 2.98), the molecular weights range from 216.151 to 839.555 (Table 1). In ascending order, the molecular weights of the chemicals with log P values at or near 4.0 are as follows: 216.151; 218.167; 228.115; 250.063; 254.094; 259.194; 260.095; 266.182; 304.167; 320.046; 324.184; 339.183; 351.147; 357.984; 376.142; 377.126; 402.11; 439.307; 450.14; 451.076; 472.319; 486.173; 538.387; 583.295; and 839.555. In summary, chemicals sharing very similar log P values of 2.0, 3.0 and 4.0 display quite wide ranges of molecular weights, indicative of the great diversity of chemical structures capable of having similar or the same log P or log Koc values.
Table 1. A diverse set of chemicals with log P values at or very near 2.0, 3.0 or 4.0.
log Koc = 0.69 log P + 0.22
For the chemicals with log P values at or near 2.0 (for log P of 2, concomitant log Koc value is 1.6), the volumes of 1 mole of each compound at Standard Temperature and Standard Pressure (MgVols) display a wide range from a low value of 1.34 to a high of 6.54 (Table 1). In ascending order, the MgVols of the chemicals with log P values at or near 2.0 are as follows: 1.34; 1.39; 1.41; 1.6; 1.92; 2.07; 2.12; 2.19; 2.19; 2.31; 2.33; 2.4; 2.42; 2.66; 2.68; 2.94; 3.03; 3.16; 3.5; 3.56; 3.58; 3.63; 3.78; 4.66; and 6.54. For the chemicals with log P values at or near 3.0 (for log P of 3, concomitant log Koc value is 2.29), the MgVols display a wide range from a low value of 1.25 to a high of 4.89 (Table 1). In ascending order, the MgVols of the chemicals with log P values at or near 3.0 are as follows: 1.25; 1.47; 1.54; 1.58; 1.77; 2.05; 2.1; 2.16; 2.34; 2.49; 2.54; 2.65; 2.65; 2.71; 2.84; 2.96; 2.97; 3.18; 3.19; 3.36; 3.39; 3.53; 3.74; 4.24; and 4.89. For the chemicals with log P values at or near 4.0 (for log P of 4, concomitant log Koc value is 2.98), the MgVols fall over a wide range from a low value of 1.61 to a high value of 6.72 (Table 1). In ascending order, the MgVols of the chemicals with log P values at or near 4.0 are as follows: 1.61; 1.76; 1.86; 1.91; 1.93; 1.94; 1.96; 2.2; 2.22; 2.34; 2.38; 2.55; 2.69; 2.72; 2.74; 2.78; 2.83; 2.98; 3.09; 3.59; 3.62; 3.88; 4.4; 4.41; and 6.72. In summary, chemicals sharing very similar log P values of 2.0, 3.0 and 4.0 display quite wide ranges of molar volumes indicative of the great diversity of chemical structures capable of having similar or the same log P or log Koc values.
For the chemicals with log P values at or near 2.0 (for log P of 2, concomitant log Koc value is 1.6), the CMRs of each compound display a wide range from a low value of 4.79 to a high of 22.65 (Table 1). In ascending order, the CMRs of the chemicals with log P values at or near 2.0 are as follows: 4.79; 4.98; 5.08; 5.79; 6.92; 6.96; 7.49; 7.94; 8.06; 8.15; 8.65; 8.87; 9.37; 9.49; 9.84; 10.25; 10.69; 11.31; 12.49; 12.81; 12.98; 13.05; 13.57; 15.83; and 22.65. For the chemicals with log P values at or near 3.0 (for log P of 3, concomitant log Koc value is 2.29), the CMRs display a wide range from a low value of 3.9 to a high of 17.67 (Table 1). In ascending order, the CMRs of the chemicals with log P values at or near 3.0 are as follows: 3.9; 4.79; 5.83; 5.85; 6.22; 7.15; 7.17; 7.77; 9.08; 9.24; 9.31; 9.36; 9.61; 10.09; 10.32; 10.83; 10.84; 11.08; 11.79; 12.1; 12.3; 12.7; 13.9; 14.27; and 17.67. For the chemicals with log P values at or near 4.0 (for log P of 4, concomitant log Koc value is 2.98), the CMRs fall over a wide range from a low value of 4.68 to a high value of 23.32 (Table 1). In ascending order, the CMRs of the chemicals with log P values at or near 4.0 are as follows: 4.68; 6.84; 6.9; 6.94; 7.22; 7.36; 7.93; 8.23; 8.27; 8.3; 8.77; 9.43; 10.01; 10.14; 10.43; 10.67; 10.8; 10.82; 12.12; 12.96; 13.47; 13.8; 14.89; 15.54; and 23.32. In summary, chemicals sharing very similar log P values of 2.0, 3.0 and 4.0 display quite wide ranges of CMRs indicative of the great diversity of chemical structures capable of having similar or the same log P or log Koc values.
Table 2 shows the results from a one-factor analysis of variance (ANOVA) comparing the means of the molecular parameters MgVol, CMR, and molecular weight across the log Koc values derived from log Ps at or very near to 2, 3 or 4. The p value for the comparison of the means of MgVol values at the three different log Koc values was 0.896065. The p value for the comparison of the means of CMR values at the three different log Koc values was 0.759057. The p value for the comparison of the means of molecular weights at the three different log Koc values was 0.889793. Therefore, no statistically significant relationship was detected between log Koc and MgVol, CMR or molecular weight.
Table 2. ANOVA of log Koc (from log Ps at or near 2, 3 or 4) with MgVol, CMR, and molecular weight.
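The one-factor ANOVA for MgVol can be reproduced, as a minimal sketch, from the MgVol values listed in the Results. The values below are transcribed from that text; the function names `one_way_anova` and `p_value_two_numerator_df` are hypothetical, and the p value uses the closed form of the F distribution that holds only when the numerator has exactly 2 degrees of freedom (three groups). This is not necessarily the software or procedure the authors used.

```python
from statistics import mean

# MgVol values transcribed from the Results text, keyed by log P level.
mgvol = {
    2.0: [1.34, 1.39, 1.41, 1.6, 1.92, 2.07, 2.12, 2.19, 2.19, 2.31, 2.33, 2.4, 2.42,
          2.66, 2.68, 2.94, 3.03, 3.16, 3.5, 3.56, 3.58, 3.63, 3.78, 4.66, 6.54],
    3.0: [1.25, 1.47, 1.54, 1.58, 1.77, 2.05, 2.1, 2.16, 2.34, 2.49, 2.54, 2.65, 2.65,
          2.71, 2.84, 2.96, 2.97, 3.18, 3.19, 3.36, 3.39, 3.53, 3.74, 4.24, 4.89],
    4.0: [1.61, 1.76, 1.86, 1.91, 1.93, 1.94, 1.96, 2.2, 2.22, 2.34, 2.38, 2.55, 2.69,
          2.72, 2.74, 2.78, 2.83, 2.98, 3.09, 3.59, 3.62, 3.88, 4.4, 4.41, 6.72],
}


def one_way_anova(groups):
    """Return (F, df_between, df_within, ss_between, ss_within) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within, ss_between, ss_within


def p_value_two_numerator_df(f_stat, df_within):
    """Upper-tail probability of F(2, df_within); valid only for 2 numerator df."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


f_stat, df_b, df_w, ss_b, ss_w = one_way_anova(list(mgvol.values()))
p = p_value_two_numerator_df(f_stat, df_w)
print(f"F({df_b}, {df_w}) = {f_stat:.4f}, p = {p:.4f}")
```

On the transcribed values the p value comes out at roughly 0.90, in line with the MgVol p value reported in the text. The same functions can be applied to the CMR and molecular weight lists given in the Results.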
Discussion
There are additional examples that illustrate the utility of employing Hansch molecular parameters toward better understanding of complex toxicological issues impacting the environment. Previously, Garg and Smith 32 conducted a QSAR study to address an important problem encountered in the prediction of the bioconcentration factor (BCF) of highly hydrophobic chemicals. They noted that the linear relationship between the BCF and hydrophobic parameter, i.e. calculated octanol-water partition coefficient (ClogP), breaks down for highly hydrophobic chemicals. Their results suggested that a non-linear relationship between BCF and the hydrophobic parameter, along with inclusion of additional molecular size, weight and/or volume parameters, should be considered while developing a QSAR model for more reliable prediction of the BCF of highly hydrophobic chemicals.
In the current study, 75 compounds with log P values at or very near 2.0, 3.0 or 4.0 were randomly selected by searching the Super Natural II database with a web scraper. Many more chemicals could have been selected via the same method, but 25 chemicals in each log P/log Koc category well illustrated the high degree of structural diversity that can occur at the same log P/log Koc value. The widths of the molecular weight ranges found within a given log P/log Koc value were notable, i.e. 150.104-888.451 at log P = 2/log Koc = 1.6; 125.12-694.263 at log P = 3/log Koc = 2.29; and 216.151-839.555 at log P = 4/log Koc = 2.98. There are similarly large ranges within a given log P/log Koc value for molar volumes and for CMRs, which are related to polarizability. The German Environment Agency (UBA) is considering the use of the Log Koc value as a proxy for mobility (M). The large degree of potential diversity in chemical structure and molecular parameters related to chemical behavior at a particular log P or log Koc value suggests that log Koc might not contain enough information to function as a standalone surrogate for the mobility (M) of a chemical as related to its ability to move from a drinking water resource through the water plant purification process.
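The width of each molecular weight range can be put on a common footing with a short calculation. The sketch below uses only the minimum and maximum molecular weights quoted above; the helper `span_and_fold` and the dictionary layout are hypothetical names introduced for illustration.

```python
# (minimum, maximum) molecular weights quoted in the text, keyed by log P level.
mw_ranges = {
    2.0: (150.104, 888.451),
    3.0: (125.12, 694.263),
    4.0: (216.151, 839.555),
}


def span_and_fold(low, high):
    """Return the absolute width of a range and the ratio of its maximum to its minimum."""
    return high - low, high / low


summary = {log_p: span_and_fold(low, high) for log_p, (low, high) in mw_ranges.items()}

for log_p, (span, fold) in summary.items():
    print(f"log P = {log_p:.1f}: span = {span:.3f}, max/min = {fold:.2f}")
```

Expressing each range as a max/min ratio shows how many-fold the heaviest compound exceeds the lightest at a single log P/log Koc value, which is the diversity the paragraph above describes.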
Industrial chemicals and pesticides do not share similar use and release patterns into the environment. Development of regulatory schema based on persistent, mobile, toxic (PMT) and very persistent, very mobile (vPvM) criteria should account for the differences between these chemical classes. 1 Application of synthetic pesticides and herbicides to agricultural fields is a precise, highly technical, expensive and time-consuming process. 33 The number of applications required is specific to the soil type and fertility, rainfall, erosion and weathering, field slope and runoff pathways, potency toward the intended pests, and crop type. 34 –40 US EPA and the European Union (EU) promulgate regulations toward minimization of pesticide use via programs of Integrated Pest Management (IPM). 41,42 Several factors influence the environmental fate of pesticides including rates of abiotic or biotic degradation and dissipation, bioconcentration and sorption (mobility). In turn, these factors vary with sunlight intensity, pH of the water or soil, hydroxyl radical concentration, number and type of microbial organisms in contact with the pesticide, soil composition, and qualitative characteristics of the organic carbons present. 7,43
In summary, the German Environment Agency (UBA) is considering additional steps to protect the integrity of the drinking water supply. The chemicals of particular concern to the UBA as potential drinking water contaminants represent a subset of chemicals classified under REACH as PMT or vPvM substances. 2 –5 During routine water sampling, detection and quantitation of polar substances with high mobility can be problematic. 5,44,45 The UBA is considering the use of the Log Koc value as a proxy for mobility (M). Log Koc is related to Log P by the equation Log Koc = 0.69 Log P + 0.22. In this study, we demonstrate that chemicals with log P values at or very close to 2.0, 3.0 or 4.0 (and their concomitant log Koc values) can vary significantly in their chemical structures, molecular weights, molar volumes, and calculated molar refractivity (CMR), which is related to the mean polarizability of a molecule. The large degree of potential diversity in chemical structure and molecular parameters related to chemical behavior at a particular log P or log Koc value suggests that log Koc might not contain enough information to function as a standalone surrogate for the mobility (M) of a chemical, i.e. as related to its ability to move from a drinking water resource through the water plant purification process. 46
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
