Abstract
A quantitative structure-activity relationship (QSAR) system for estimating skin
sensitization potency has been developed that incorporates skin metabolism and
considers the potential of parent chemicals and/or their activated metabolites to
react with skin proteins. A training set of diverse chemicals was compiled and their
skin sensitization potency assigned to one of three classes: significant, weak, or
nonsensitizing. Because skin sensitization potential
depends upon the ability of chemicals to react with skin proteins either directly or
after appropriate metabolism, a metabolic simulator was constructed to mimic the
enzyme activation of chemicals in the skin. This simulator contains 203
hierarchically ordered spontaneous and enzyme-controlled reactions. Phase I and phase
II metabolism were simulated by using 102 and 9 principal transformations,
respectively. The covalent interactions of chemicals and their metabolites with skin
proteins were described by 83 reactions that fall within 39 alerting groups. The
SAR/QSAR system developed was able to correctly classify about 80% of the chemicals
with significant sensitizing effect and 72% of nonsensitizing chemicals. For some
alerting groups, three-dimensional (3D)-QSARs were developed to describe the
multiplicity of physicochemical, steric, and electronic parameters. These 3D-QSARs,
so-called pattern recognition-type models, were applied each time a latent alerting
group was identified in a parent chemical or its generated metabolite(s). The concept
of the mutual influence amongst atoms in a molecule was used to define the structural
domain of the skin sensitization model. The utility of the structural model domain
and the predictability of the model were evaluated using sensitization potency data
for 96 chemicals not used in the model building. The
Allergic contact dermatitis (ACD) is a skin condition estimated to affect a significant proportion of the general worldwide population (∼1%) (Smith and Hotchkiss 2001). In the United Kingdom, skin disease accounts for 22% of all occupational diseases; 80% of these are due to ACD (Gawkrodger 2001). There is a steady increase in the incidence of ACD due to repeated exposure to sensitizing chemicals. In vivo methods are still used for assessing the sensitization potential of new chemical entities (NCEs) (Smith Pease, Basketter, and Patlewicz 2003).
The guinea pig maximization test (GPMT) developed by Magnusson and Kligman (1970) and the occluded patch test of Buehler (Buehler 1965) have provided the core of predictive skin sensitization testing for many years. More recently, the local lymph node assay (LLNA) has been accepted as a valid alternative for skin sensitization hazard identification (Basketter, Smith Pease, and Patlewicz 2003). In the guinea pig maximization test, chemicals are classified as skin sensitizers if at least 30% of tested animals show a positive response; in the Buehler test, classification requires a positive response in at least 15% of tested animals (e.g., EU classification). In the LLNA, a threefold or greater increase in lymph node cell proliferation activity in treated groups as compared to concurrent vehicle controls is required for classification. The LLNA is now an acceptable hazard identification assay not only from an animal welfare point of view, but also because it provides a quantitative measure of sensitization potency.
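The classification thresholds quoted above can be written down as a small lookup. The sketch below is illustrative only: the function names `stimulation_index` and `is_sensitizer` and the dictionary keys are assumptions introduced here, and the stimulation index is taken as the simple ratio of treated to vehicle-control proliferation.

```python
# Thresholds as stated in the text: percentage of responding animals for the
# guinea pig tests, stimulation index (treated / control) for the LLNA.
THRESHOLDS = {"GPMT": 30.0, "Buehler": 15.0, "LLNA": 3.0}


def stimulation_index(treated: float, vehicle_control: float) -> float:
    """Ratio of proliferation in the treated group to the vehicle control."""
    if vehicle_control <= 0:
        raise ValueError("vehicle control signal must be positive")
    return treated / vehicle_control


def is_sensitizer(assay: str, value: float) -> bool:
    """True if `value` meets or exceeds the threshold for `assay`.

    `value` is a percentage of positive animals for "GPMT" and "Buehler",
    and a stimulation index for "LLNA" (hypothetical convention).
    """
    return value >= THRESHOLDS[assay]
```

For example, `is_sensitizer("LLNA", stimulation_index(900.0, 300.0))` evaluates the threefold criterion for a single dose group.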
In February 2001, the Commission of the European Communities presented a White Paper entitled “Strategy for a Future Chemicals Policy” which proposed that about 30,000 existing chemicals be evaluated for a range of toxic effects (Commission of the European Communities 2001). This will be an extremely expensive, time-consuming, and animal-intensive undertaking. In addition, the adoption of the 7th Amendment to the EU Cosmetics Directive is likely to lead to a virtual ban on practically all animal testing for evaluating the toxicity hazard of cosmetic ingredients by 2009 (some tests will be exempted until 2013) (The EU Cosmetic Directive—The 7th Amendment 2002).
Alternative hazard testing methods and strategies could potentially reduce the time and
monetary cost to the chemical industry of complying with REACH (Registration,
Evaluation, Authorization of Chemicals) and other EU legislative initiatives such as the
7th Amendment. One alternative to animal testing for toxicity is the use of quantitative
structure-activity relationships (QSARs). These are mathematically derived rules or models
that qualitatively or quantitatively describe a property or activity in terms of
descriptors for a chemical structure. By mathematically modeling the binding of the
sensitizing chemical to the skin protein as a competition between chemical reaction and
loss of sensitizer by extraction from a lipid environment into a polar lymphatic fluid,
Roberts and Williams (1982) derived a
relative alkylation index (RAI). The RAI quantified the relative extent of sensitizer
binding to skin protein as a function of the dose given, the chemical reactivity (which may
be expressed in terms of relative rate constants for reaction with model nucleophile or in
terms of Hammett or Taft substituent constants), and the hydrophobicity expressed as a
partition coefficient. They derived quantitative relationships between RAI and skin
sensitization potential for saturated 1,3- and 1,4-sultones, and
Often these models were derived for specific chemical classes. Attempts have also been made
to analyze diverse chemical data sets. Discriminant and correspondence analyses have been
applied to model sensitization potential for heterogeneous databases (Cronin and Basketter 1994; Magee, Hostynek, and Maibach 1994; Cronin and Dearden 1997). Enslein et al. (1997) developed QSAR models for assessing
chemical sensitization for 315 chemicals. These models were incorporated in the TOP-KAT
system (acronym for “
Skin sensitization is a complex toxicological process involving a number of biochemical and
physiological events. This suggests a need for parallel QSARs to reflect the potential of
chemicals to cross the stratum corneum in sufficient quantities, the steric effects around
the active center, overall molecular size and shape, secondary toxicities such as skin
irritancy as well as skin metabolism (Ashby et al.
1995). Metabolism is frequently acknowledged to significantly affect
sensitization (Ashby et al. 1995; Smith and Hotchkiss 2001; Smith Pease, Basketter, and Patlewicz 2003). The complexity in
describing it might explain why traditional QSAR approaches for modeling skin sensitization
have shown limited success. The recently developed modeling tool,
MATERIALS AND METHODS
Skin Sensitization Data
The model was derived from a data set compiled from chemicals tested in the LLNA (185 chemicals), GPMT (307 chemicals), as well as from the BgVV list (248 chemicals).
The LLNA was initially developed as an alternative approach to hazard identification
(Kimber and Basketter 1992; Kimber et al. 1994, 2002; Dearman,
Basketter, and Kimber 1999; Gerberick et al. 2000; Basketter et
al. 2002). In the local lymph node assay, the response of the immune system
to the topical application of chemicals is measured by the lymph node cell
proliferation. Typically, groups of mice are exposed to various concentrations of a
test chemical or to the vehicle alone for 3 consecutive days on the dorsum of both
ears. The practice is to select three consecutive suitable concentrations of the
chemical, ideally in a standard vehicle, aiming for the highest possible
concentration whilst avoiding unacceptable dermal trauma or systemic toxicity.
Five days after the initiation of the exposure, mice are injected intravenously with
[3H]thymidine. Five hours later, the animals are sacrificed and, for
each dose group, a stimulation index (SI) is determined by measuring the increase in
[3H]thymidine incorporation in the lymph nodes relative to the
vehicle-treated control group. Currently, the preference is to estimate the
concentration of a chemical required to generate a threefold increase in
radioisotope incorporation as compared to the control, known as the
The guinea pig maximization test was specifically designed for hazard identification and is not well suited to potency estimation (ECETOC 2000). First, a pilot study is performed to determine the maximum nonirritating dose. Test and control groups contain 15 to 20 animals. Testing lasts approximately 5 weeks and includes an induction phase, rest period, and 24-h topical challenge. Red and swollen sites on test animals indicate sensitization. Sensitization potential is characterized based on the percentage of animals exhibiting reactions. The classification of sensitization potential for GPMT is presented in Table 1 (Barratt et al. 1994; ECETOC 2003). Classification schemes for the LLNA and both guinea pig test methods have recently been proposed by ECETOC (2003).
At the German Federal Institute for Health Protection of Consumers and Veterinary Medicine (BgVV), a group of experts including dermatologists from universities and representatives of the chemical industry and of regulatory authorities was established in 1985 (Schlede et al. 2003). The aim of the project was to bring together a broad range of expertise to collect and evaluate data from the literature on substances with documented contact allergic properties in humans and animal experiments. The evaluation results for selected chemicals were published as a loose-leaf book (Kayser and Schlede 2001). In this publication, chemicals were listed as category A (significant contact allergen), category B (solid-based indication for a contact allergenic potential), or category C (contact allergen with insignificant or questionable contact allergenic potential). The definition of these categories is given in Table 1. The authors realize that the BgVV classification scheme differs from the pure hazard identification schemes applied in the LLNA or guinea pig testing because classification according to the BgVV scheme can also take into account prevalence in the population or clinical relevance.
Ideally, modeling would have been carried out using one experimental protocol,
preferably the LLNA. Because the goal was to derive a QSAR as widely applicable
as possible, the different training set test protocols were combined to expand the
chemical diversity and the sensitization data reassessed using a unified potency
scale. This did mean that some of the Category C BgVV data needed to be re-reviewed
by the authors and reclassified from weak sensitizers to nonsensitizers. This unified
classification scheme (Table 2)
mirrors that for the LLNA and the
It should be noted that 17 chemicals exist in all three data sets; 34 chemicals were
present in both the LLNA and BgVV data sets; 44 chemicals were common to both GPMT
and BgVV data sets, and 41 chemicals in both the LLNA and GPMT data sets. The
strength of the association between the different data sets was estimated using the
adjusted Pearson’s contingency coefficient,
Modeling Approach
The single biodegradation pathway probabilistic scheme used in CATABOL (microbial metabolism simulation model) (Jaworska et al. 2001) was modified in TIMES (Mekenyan et al. 2004) to enable the modelling of multiple biotransformation pathways in the skin. The multipathway scheme was predicated on the condition that a chemical could be metabolically transformed across both the most probable pathway as well as less significant pathways. The abiotic molecular transformations and enzyme-mediated reactions such as oxidative, redox, reductive, hydrolytic, and conjugative reactions, and reactions with skin proteins, were mimicked by a hierarchically ordered list of principal molecular transformations.
Each molecular transformation comprises a parent submolecular fragment, transformation products, inhibiting fragments (masks), and a reactivity SAR/QSAR model. The masks act as reaction inhibitors; thus, a fragment assigned as a mask and attached to a target subfragment can prevent the transformation of the parent chemical from occurring. The presence of groups that can promote or inhibit enzymatic reactions significantly increases the number of principal transformations. A probability of occurrence is assigned to each transformation to determine its hierarchy in the transformation list.
A parent molecule with the source fragment is matched against a list of hierarchically ordered transformations, starting with those having the highest probability. This effectively produces a set of first level metabolites. Each of these derived metabolites is then submitted to the same list of hierarchically ordered transformations, to produce a second level of metabolites. The procedure is continued until a constraint for metabolism propagation is satisfied [e.g., low probability of obtaining a metabolite; application of Phase II reaction (glucuronidation, sulfate conjugation, etc.); protein conjugation or formation of a hydrophilic metabolite]. The metabolic profile map that is generated is then used to calculate both the quantities of metabolites and the probabilities of the predicted metabolites. TIMES is completely integrated with both a QSAR library and a modelling engine to provide an estimate of physicochemical properties and the toxicity of each metabolite. A detailed explanation of the mathematical formalism for the TIMES approach for metabolite simulation and a discussion of the practical applications for forecasting plausible activated metabolites have been published elsewhere (see Mekenyan et al. 2004).
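The level-by-level generation of metabolites described above can be illustrated with a toy sketch. This is not the TIMES implementation: molecules are plain strings, substructure matching is replaced by substring matching, the transformation names and probabilities are invented for illustration, and metabolite quantities are not modeled. Only the control flow (ordering by probability, masks, probability threshold, and stopping at Phase II or protein conjugation) follows the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformation:
    name: str
    source: str              # fragment that must be present (toy substring match)
    product: str             # fragment that replaces the source
    probability: float       # probability of occurrence; sets the hierarchy
    masks: tuple = ()        # fragments that inhibit the transformation
    terminal: bool = False   # Phase II or protein conjugation: stop propagating


def applicable(t: Transformation, molecule: str) -> bool:
    """A transformation fires if its source is present and no mask is."""
    return t.source in molecule and not any(m in molecule for m in t.masks)


def simulate(parent, transformations, min_probability=0.1, max_depth=3):
    """Return (level, transformation name, metabolite, probability) tuples."""
    ordered = sorted(transformations, key=lambda t: t.probability, reverse=True)
    results, frontier = [], [(parent, 1.0)]
    for level in range(1, max_depth + 1):
        next_frontier = []
        for molecule, p_molecule in frontier:
            for t in ordered:
                if not applicable(t, molecule):
                    continue
                p = p_molecule * t.probability
                if p < min_probability:      # low-probability branch is pruned
                    continue
                metabolite = molecule.replace(t.source, t.product, 1)
                results.append((level, t.name, metabolite, p))
                if not t.terminal:           # terminal products are not expanded
                    next_frontier.append((metabolite, p))
        frontier = next_frontier
        if not frontier:
            break
    return results


# Hypothetical transformation list (names and numbers are illustrative only).
TRANSFORMATIONS = [
    Transformation("O-demethylation", "OCH3", "OH", 0.8),
    Transformation("oxidation", "Ar-OH", "Quinone", 0.6, masks=("NO2",)),
    Transformation("protein conjugation", "Quinone", "Quinone-Protein", 0.9,
                   terminal=True),
    Transformation("glucuronidation", "OH", "O-Gluc", 0.3, terminal=True),
]

if __name__ == "__main__":
    for row in simulate("Ar-OCH3", TRANSFORMATIONS):
        print(row)
```

Raising `min_probability` prunes the less likely branches first, which is one simple way to mimic the constraint on metabolism propagation.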
The initial simulation of metabolism by TIMES is based mainly on two-dimensional (2D)
chemical structures. Whilst it is a widely accepted hypothesis that skin
sensitization involves the covalent binding of the sensitizing chemical to protein in
skin, models based only on 2D molecular structure may be insufficient to account for
reactivity and, hence, sensitization potential. For this reason, the
RESULTS AND DISCUSSION
Model Construction
There is a long-established connection between the ability of chemicals to react with
proteins to form covalently linked conjugates and their skin sensitization potential
(Landsteiner and Jacobs 1936; Dupuis and Benezra 1982; Basketter et al. 1995; Lepoittevin et al. 1998). For skin
sensitization to occur, once a chemical has penetrated, it must be able to partition
into relevant cellular compartments of the epidermis, in order to be sufficiently
bioavailable. More importantly, in terms of mechanism, the chemical, or its
metabolite, must be sufficiently electrophilic to react covalently with nucleophilic
groups on skin protein to produce the complete antigens capable of invoking an immune
response. The skin sensitization model was built as a composite of the following
submodels:
(1) Skin metabolism simulator: This mimics the metabolic fate of parent chemical
controlled by skin enzymes and thus the potential formation of protein
adducts with reactive agents. 2D structural information of parent chemicals
is used to model metabolism. Metabolic pathways are generated based on a set
of hierarchically ordered principal transformations including spontaneous
reactions, enzyme-catalyzed phase I and phase II drug metabolism reactions,
and reactions with protein nucleophiles. The formation of macromolecular
immunogens was used to identify probable structural alerts in parent
chemicals or their metabolites.
(2) COREPA 3D-QSARs for intrinsic reactivity of compounds having substructures
associated with activity: These models depend on both the structural alert
and the rate of skin sensitization. Steric effects around the active site,
molecular size, shape, solubility, lipophilicity, and electronic properties
are taken into account. These models generally may involve combinations of
molecular parameters or descriptors, which trigger (“fire”) the alerting
group.
Skin Metabolism Model
The core of the skin metabolism model is a set of 203 principal transformations.
These were defined on the basis of empirical and theoretical knowledge and were
peer-reviewed by human experts. A representative subset of these transformations, as
well as their reliability, probability, and inhibiting masks are outlined in Table 3. Due to the limited quantitative
skin metabolism data reported for chemicals, many transformation reliabilities were
assigned on the basis of available information or expert knowledge in one of five
confidence levels. The highest confidence level of reliability (
The principal transformations were separated into two major classes:
non–rate-determining and rate-determining reactions. Non–rate-determining included
abiotic and biologically mediated transformations, which occur at very high rates.
They also included nine non–rate-determining reactions, predominantly hydrolysis of
salts. Eighty-three other transformations simulating the reactions of highly reactive
groups and intermediates such as quinones,
Phase I reactions were modelled through 102 transformations. These transformations
included amongst others,
The set of hierarchically ordered principal transformations within TIMES effectively
simulates metabolism by generating plausible and likely metabolic pathways. The
established hierarchy between the transformations and hydrophobicity effectively
control the propagation of the generated metabolic pathways. The process of metabolic
tree generation continues until user-defined thresholds for the probability of obtaining
metabolites and their hydrophobicity (log
Isoeugenol is shown to undergo demethylation to give rise to the corresponding
catechol, followed by oxidation to the
Modeling skin metabolism led to the following percentages of correctly classified S sensitizers: LLNA 78%, GPMT 81%, and BgVV 82%. It is worth mentioning that these percentages were lower than 50% when transformations modeling metabolic activation were not considered. Good predictions were also obtained for nonsensitizers (N): 68%, 70%, and 57%, respectively, for the LLNA, GPMT, and BgVV datasets. The percentage of correctly classified weak (W) sensitizers was 26%, 22%, and 39%, respectively, in the three data sets. The poor classifications for weak sensitizing chemicals may be attributed primarily to the use of 2D structural information and some weakness in the underlying biological data set. The majority of chemicals with structural alerts were classified as significant (S) sensitizers.
QSAR Models of Reactivity
Pattern recognition models were derived for ketones; sulfuric, sulfonic, or phosphonic acid esters; aldehydes; and conjugated double bonds containing O, S, or N atoms in order to improve the classification of chemicals (containing these functional groups) on the basis of skin metabolism simulation. Models were derived at a probability threshold of 0.6 (Mekenyan et al. 1997, 1999; Bradbury et al. 2000). These four specific alerting groups and the number of experimentally observed S, W, and N sensitizers are presented in Table 4.
Due to limited data, chemicals from the three datasets having only one of the above-mentioned four alerts were combined to form the corresponding training sets. For some of the alerts, such as ketones, acid esters, and aldehydes, QSAR models were only trained to discriminate between S and N (or between S and W) sensitizers as there were insufficient data for weak (or nonsensitizing) chemicals available.
For example, the COREPA decision model for classifying chemicals with conjugated
double bonds containing either O, S, or N atoms as strong sensitizers uses as
discriminating parameters in the logic boxes of the decision tree the lowest
unoccupied molecular orbital (
Integration of Skin Metabolism and COREPA Models
QSAR models derived for identifying skin sensitization potency for some alerting groups were used to reduce misclassifications based on skin metabolism only. During metabolism simulation, chemicals having select alerts (such as ketones; sulfuric, sulfonic, or phosphonic acid esters; aldehydes; and conjugated double bonds containing O, S, or N atoms) were screened by the COREPA models to determine the level of activation of the alerting groups and subsequently their ability to interact with protein nucleophiles. The results of the combined application of skin metabolism and partial COREPA models in predicting the skin sensitization potential (as S, W, or N) of the chemicals in the LLNA, GPMT, and BgVV datasets are summarized in Figure 2.
Model predictions for the group of W sensitizers improved the most, with the percentage of chemicals correctly classified increasing to 42%. As can be seen from Figure 2, COREPA models do not significantly improve the identification of S and N sensitizers. The present QSAR model has some limitations with chemicals such as acrylates, fused acyclic hydrocarbons, and multifunctionalized chemicals containing phenolic, ether, ester, lactone, azolactone, or isothiocyanate moieties in the same molecule. In addition, metals are limited primarily to sodium (predominantly as sulfonate salts or acetate salts); lead as lead acetate; mercury as methylmercury ion; and zinc as its thiocarbamate salt.
Applicability Domain of the Skin Sensitization Model
The number and breadth of chemicals underpinning the model are limited, so the model is not universally applicable. Consequently, it is critical to establish whether the model is applicable or appropriate for a given new query chemical submitted for screening. The basic principles of the theory of chemical structure introduce the concept of the mutual influence amongst the atoms in a molecule and this can be used to help define the applicability domain of the model. It is common knowledge that besides the first neighbors, all other atoms in a molecule affect each other. For example, a chlorine atom attached to a carbon atom differs substantially depending on whether that carbon atom belongs to an alkane, alkene, or aromatic-ring fragment.
Based on the concept of mutual influence between the atoms in a molecule, the
structural domain of an atom participating in a number of molecules, ions, or
radicals can be defined by the set of all atom-centered fragments containing this
atom and its first, second, etc., neighbors. The influencing effect of neighbors
decreases with the increase of their distance to a specified atom; it also depends on
the atomic type of the neighbors. For example, conjugated systems and aromatic-ring
systems may transmit the electron withdrawing or donating effects of an atom much
further. The following rules are proposed to reflect the effect of different
neighbors on the specified atom:
(1) Hydrogen atoms are treated as an inherent part of the atom to which they are
bound; so effectively, they are ignored and are considered to be a
characteristic of the atoms to which they are bound.
(2) The first, second, …, and nth neighbors are selected to determine the
atom-centered fragment.
(3) If some of the first, second, …, and nth neighbors are aromatic carbon atoms
(C{ar}), then the entire aromatic ring containing this atom is considered as
a single neighbor.
(4) If the nth neighbor is C{sp3}- or C{ar}-atom, the (n+1)th,
(n+2)th, and further, neighbors are assumed to have insignificant effect on
the properties of the centered atom.
(5) If the nth neighbor is not C{sp3}- or C{ar}-atom, then the
atom-centered fragment is propagated until a C{sp3}- or
C{ar}-atom is reached.
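One possible reading of these rules is sketched below as a breadth-first walk over a hydrogen-suppressed molecular graph. The data layout (a dictionary of atom types, a list of bonds, and a list of aromatic rings given as sets of atom indices), the function name, the extra atom-type labels such as `C{sp2}` and `O`, and the simplified fragment key (a sorted tuple of distance and atom type, which ignores connectivity) are all assumptions made for illustration; the original software may encode fragments differently.

```python
from collections import deque

# Atom types at which propagation stops once the nth neighbor is reached.
STOP_TYPES = {"C{sp3}", "C{ar}"}


def atom_centered_fragment(atoms, bonds, rings, center, n=1):
    """Return a simplified key for the fragment centered on `center`.

    atoms: {index: type}, hydrogens omitted (rule 1)
    bonds: iterable of (index, index) pairs
    rings: list of sets of indices, one set per aromatic ring
    """
    # Rule 3: collapse every aromatic ring into a single node of type C{ar}.
    node_of = {a: a for a in atoms}
    label = dict(atoms)
    for i, ring in enumerate(rings):
        for a in ring:
            node_of[a] = f"ring{i}"
        label[f"ring{i}"] = "C{ar}"

    adjacency = {}
    for a, b in bonds:
        u, v = node_of[a], node_of[b]
        if u != v:  # bonds inside a collapsed ring disappear
            adjacency.setdefault(u, set()).add(v)
            adjacency.setdefault(v, set()).add(u)

    start = node_of[center]
    depth = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        # Rules 4 and 5: beyond the nth shell, continue only through atoms
        # that are neither C{sp3} nor C{ar}.
        if depth[u] >= n and label[u] in STOP_TYPES:
            continue
        for v in adjacency.get(u, ()):
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return tuple(sorted((d, label[u]) for u, d in depth.items()))


# Chlorine on an alkane carbon, on an aromatic ring, and on a carbonyl carbon.
chloroethane = ({0: "Cl", 1: "C{sp3}", 2: "C{sp3}"}, [(0, 1), (1, 2)], [])
chlorobenzene = (
    {0: "Cl", **{i: "C{ar}" for i in range(1, 7)}},
    [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)],
    [{1, 2, 3, 4, 5, 6}],
)
acetyl_chloride = (
    {0: "Cl", 1: "C{sp2}", 2: "O", 3: "C{sp3}"},
    [(0, 1), (1, 2), (1, 3)],
    [],
)
```

With `n=1`, the three chlorine atoms yield three different keys, which mirrors the earlier remark that a chlorine atom differs depending on the fragment its carbon belongs to.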
These rules were used to determine the structural domain of the training set and applicability domain of the QSAR model. In the first case, the set of substructures is extracted from all chemicals that belong to the training set. In the second case, the fragments are extracted among those chemicals from the training set for which the QSAR model correctly predicts the modeled end point. The extracted atom-centered fragments can be used to determine whether a certain chemical belongs to the training set domain and applicability domain of the model or not. The latter is a part of the training set domain where one could expect a good performance of the model.
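The distinction between the training set domain and the model domain can be sketched with plain sets. The membership criterion used here, that a query chemical is in a domain only if every one of its atom-centered fragments occurs in that domain, is an assumption; the text does not state the exact rule. Fragment identifiers such as `"f1"` are hypothetical placeholders.

```python
def build_domain(fragments_by_chemical, keep=None):
    """Union of fragments over the chemicals in `keep` (all chemicals if None)."""
    domain = set()
    for chemical, fragments in fragments_by_chemical.items():
        if keep is None or chemical in keep:
            domain |= set(fragments)
    return domain


def in_domain(query_fragments, domain):
    """Assumed criterion: every fragment of the query is known to the domain."""
    return set(query_fragments) <= domain


# Hypothetical training set: chemical name -> set of fragment identifiers.
training = {"A": {"f1", "f2"}, "B": {"f2", "f3"}, "C": {"f4"}}
correctly_predicted = {"A", "B"}

training_domain = build_domain(training)                    # all chemicals
model_domain = build_domain(training, correctly_predicted)  # correct ones only
```

By construction the model domain is a subset of the training set domain, so a query can fall inside the training set domain yet outside the model domain.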
In the present work, the chemicals from the training set that were correctly classified as significant, weak, and nonsensitizers were used to extract the domain of the model. The first or second neighbors of the atom-centered fragments were used to define structural similarity between chemicals for which the model worked correctly. The above approach was applied to assess the structural domain of the training set and QSAR model for skin sensitization. Table 5 summarizes the statistics of the obtained model domain. The categorization of the chemicals from the training set identified all correctly predicted chemicals as belonging to the model domain. Additional chemicals were incorrectly classified as a part of the model domain. This misclassification may potentially be explained by such factors as experimental error, model inadequacy, or the empirical approach used to determine the structural domain. Regardless of the empirical nature of the outlined approach, the goodness of the determined model domain was very high if the atom-centered fragments included the second neighbors. As can be expected, determination of the model domain accounting for first neighbors only resulted in a significant increase of chemicals that were incorrectly classified as elements of the model domain.
External Model Validation
From the various validation approaches available, external validation was applied. This is the most demanding way of testing the validity of a model. The approach consists of making predictions for an independent set of data not used in the model training (Eriksson, Jonsson, and Berglind 1993).
A set of 96 chemicals tested for skin sensitization by LLNA (47 chemicals) and GPMT (49 chemicals) tests were collected randomly from unpublished and published data sets (Cronin and Basketter 1994). These chemicals were not used in the model building process. The chemicals identified as in the model domain and their predictions for their skin sensitization potentials are presented in Table 6. As can be seen from the table, the model predicts the external data compounds reasonably well if the model domain is determined by accounting for the second neighbors of centered atoms. In this case, the correctness of predictions is 87% as compared with 71% if the model domain is determined by accounting for first neighbors only. It should be mentioned that ignoring the model domain reduces the predictability of the model to only 52%.
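The effect of restricting the evaluation to the model domain can be expressed as a short calculation. The sketch below assumes that "correctness of predictions" means the percentage of chemicals whose predicted class equals the observed class; the record layout and the four example records are hypothetical and are not the validation data of this study.

```python
def correctness(records, domain_only=True):
    """Percentage of records with predicted == observed class.

    If `domain_only` is True, only records flagged as inside the model
    domain are scored; returns None when nothing is selected.
    """
    selected = [r for r in records if r["in_domain"] or not domain_only]
    if not selected:
        return None
    hits = sum(r["observed"] == r["predicted"] for r in selected)
    return 100.0 * hits / len(selected)


# Hypothetical external chemicals with classes S, W, or N.
external = [
    {"observed": "S", "predicted": "S", "in_domain": True},
    {"observed": "N", "predicted": "N", "in_domain": True},
    {"observed": "W", "predicted": "S", "in_domain": False},
    {"observed": "N", "predicted": "S", "in_domain": False},
]
```

Calling `correctness(external)` scores only the in-domain chemicals, while `correctness(external, domain_only=False)` ignores the domain.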
SUMMARY AND CONCLUSIONS
The German ‘Sachverständigenrat für Umweltfragen’ estimated in 1999 that the socioeconomic costs of allergies alone in Europe were €29 billion per year (EU 2001). Chemical substances are considered to play a major role in inducing allergies and even small advances in understanding the factors that are responsible for inducing skin sensitization may have a significant impact on reducing allergy-related health and economic costs. The focus of the present study is to develop a QSAR model for assessing the potential sensitization of chemicals based on a mechanistic understanding of the fate of chemicals in skin.
The study began with an analysis of skin sensitization data collected from three
databases: LLNA with 185
It should be acknowledged that the combination of three data sets of different origin might not be the ideal choice; however, it allowed the use of a larger dataset containing a wider range of chemical classes. Given the latest advances in categorization of data based on the relation between LLNA and GPMT described by Kimber et al. (2003), further work is envisioned to apply the present modeling approach to a more detailed categorization scheme. This would require more good quality test data (especially LLNA) covering a larger domain or diversity of chemical structures.
To mimic the metabolism of chemicals in the skin compartment, a metabolism simulator was
developed in the present QSAR model. A total of 203 principal transformations, separated
into two classes (non–rate-determining and rate-determining) was defined based on
literature data and expert knowledge. Non–rate-determining transformations were used to
model hydrolysis of salts and molecular transformations or reactions with cell proteins
of highly reactive groups, such as quinones,
A set of 96 chemicals tested for skin sensitization by LLNA (47 chemicals) and GPMT (49 chemicals) and not used in the training set were used for external validation of the model. The model predicts the external data fairly well if the model domain was determined based on the second neighbors of centered atoms. In this case, the correctness of predictions is 87% compared with 71% if the model domain was determined accounting for first neighbors only. It should be mentioned that ignoring the model domain reduced the predictability of the model to 52%.
The proposed modelling approach of skin sensitization is mechanistically sound and transparent. Integration of the skin metabolism simulator with 3D pattern recognition techniques enables the reactivity of parent chemicals (and their metabolites) towards proteins to be translated into a suitable mathematical formalism for sensitization assessment. This proposed skin sensitization model will need to be improved by further refinement of both the metabolic simulator and the COREPA pattern recognition models, by increasing the chemical domain and diversity of structural alerting groups, and by continued external validation. Further investigation is needed to take into account factors not included in the present model, such as skin absorption and non-covalent modes of protein-hapten association. It is hoped that the present model will accelerate the further development of mechanistic and metabolism-based computational predictive tools that would serve as alternative methods in our efforts to reduce the reliance on the use of animals for skin sensitization testing and classification.
Footnotes
This work was partially funded by Exxon Mobil, who with the support of Unilever, Procter & Gamble, and the Danish EPA have helped in the building and validation of the skin sensitization model.
