Abstract
The two-year cancer bioassay in rodents remains the primary testing strategy for in-life screening of compounds that might pose a potential cancer hazard. Yet experimental evidence shows that cancer is often secondary to a biological precursor effect, the mode of action is sometimes not relevant to humans, and key events leading to cancer in rodents from nongenotoxic agents usually occur well before tumorigenesis and at the same or lower doses than those producing tumors. The International Life Sciences Institute (ILSI) Health and Environmental Sciences Institute (HESI) hypothesized that the signals of importance for human cancer hazard identification can be detected in shorter-term studies. Using the National Toxicology Program (NTP) database, a retrospective analysis was conducted on sixteen chemicals with liver, lung, or kidney tumors in two-year rodent cancer bioassays, and for which short-term data were also available. For nongenotoxic compounds, results showed that cellular changes indicative of a tumorigenic endpoint can be identified for many, but not all, of the chemicals producing tumors in two-year studies after thirteen weeks utilizing conventional endpoints. Additional endpoints are needed to identify some signals not detected with routine evaluation. This effort defined critical questions that should be explored to improve the predictivity of human carcinogenic risk.
I. Introduction
As part of the hazard identification of regulated chemicals and of those substances nominated to programs such as the National Toxicology Program (NTP), a lifetime bioassay of carcinogenic potential is routinely undertaken in rats and mice. This applies to most drugs (depending on the likely duration of treatment), pesticides, veterinary medicines, and food additives. Many industrial chemicals and natural compounds are also subject to such testing. The rodent bioassay used for this purpose was originally developed in the 1940s and 1950s (Berenblum 1969; E. Weisburger 1981, 1983), and the underlying principles of the assay have remained largely unchanged since that time. The bioassay was based on the observation that exposure of experimental animals, as well as humans, to a number of chemicals led to development of cancer. However, at the time, there was little mechanistic understanding of chemical carcinogenesis. Analysis of the results of initial studies led to the conclusion that the “majority of all cancer” is caused by chemical or environmental factors (Epstein 1979; Roe 1989). However, it should be noted that, at that time, “environmental” (which simply meant that the etiological factor was extrinsic) was often assumed to mean “chemical.” This led to a major focus on identifying chemical carcinogens on the assumption that this would enable the burden of cancer to be substantially reduced.
Inherent in the use of animals for the carcinogenicity bioassay is the assumption that humans and animals behave in a similar way (interspecies extrapolation). In addition, two experimental concepts form the scientific basis on which the bioassay is based.
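The first is the empirical relationship developed by Druckrey (1967), commonly written as
$d \cdot t^{n} = \text{constant}$
where $d$ is the daily dose, $t$ is the time to tumor, and $n$ is an exponent greater than 1.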
The experimental work, mostly with nitrosamines in liver, which led to this relationship, indicated that tumor incidence was directly proportional to dose (dose extrapolation). Thus, incidence could be increased by increasing the dose, or the time to tumor could be decreased, although there was a minimum interval before tumors developed. This approach, however, only worked for genotoxic (DNA reactive) carcinogens. It implied a multistage process for carcinogenesis. A version of this multistage theory derived from epidemiologic data had been previously postulated by Armitage and Doll (1954). However, numerous human tumors, such as Hodgkin lymphoma, breast cancer, osteogenic sarcomas, and childhood tumors, did not show this age relationship.
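The trade-off between dose and time to tumor described above can be illustrated numerically. The following is a minimal sketch, assuming the commonly cited form of Druckrey's relationship, d × t^n = k; the values of `k` and `n` are hypothetical placeholders chosen for illustration, not fitted parameters from any study, and the sketch ignores the minimum latency noted in the text.

```python
def time_to_tumor(dose, k, n):
    """Solve d * t**n = k for t (assumed form of Druckrey's relationship)."""
    if dose <= 0 or k <= 0 or n <= 0:
        raise ValueError("dose, k, and n must be positive")
    return (k / dose) ** (1.0 / n)


# Hypothetical parameters, for illustration only.
K = 1000.0
N = 2.3

if __name__ == "__main__":
    for dose in (1.0, 2.0, 4.0):
        t = time_to_tumor(dose, K, N)
        print(f"dose={dose:.1f}  time to tumor={t:.2f} (arbitrary units)")
```

Under this assumed form, doubling the dose shortens the time to tumor by a factor of 2^(-1/n) rather than by half, because n is greater than 1.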
The second concept is that carcinogenesis proceeds through multiple stages, which was first demonstrated by the model of tumor initiation and promotion. This was developed by Berenblum and Shubik (1947, 1949) to explain the observed data for chemical carcinogenesis in mouse skin. These studies showed that skin carcinogenesis first required a short exposure to certain chemicals, resulting in an irreversible change that was termed “initiation.” This had to be followed by prolonged exposure to other chemicals that acted to promote the initiated cells, the effects of which were reversible up to a certain time. This was termed “promotion.” In this model, chemicals that act as promoters do not act as initiators. Promotion has to be preceded by initiation, but it does not need to commence immediately after initiation. It is now recognized that this distinction is not as clear-cut as once believed (see Goodman and Watson 2002).
The model was later shown to apply to a number of other cancer types in rats and mice. It also subsequently acquired a mechanistic interpretation, although the molecular events responsible for the two stages have yet to be completely defined. It is now known that initiation usually involves primary damage to DNA, leading to a critical mutation; while promotion involves proliferation and subsequent steps allowing expression of oncogenicity through acquisition of other changes, which are either genetic or epigenetic (Foulds 1954; Hanahan and Weinberg 2000).
Numerous difficulties were identified with the initiation-promotion model (Cohen 1998b; Cohen and Ellwein 1991). A more definitive model of carcinogenesis, incorporating the concepts of time, genetics, and multiple stages, was postulated by Knudson (1971) based on his investigation of retinoblastoma in children. This model led to the concept of tumor suppressor genes. Utilizing DNA damage and increased cell proliferation (the two fundamental precepts set forth in Knudson’s model), Moolgavkar and Knudson (1981) and Greenfield, Ellwein, and Cohen (1984) developed more generalized models based on epidemiologic and animal studies, respectively. J. Weisburger and Williams (1981) also distinguished two classes of carcinogens: genotoxic (more specifically, DNA reactive) and nongenotoxic. Cohen and colleagues have shown that the common factor for the nongenotoxic carcinogens is increased cell proliferation. Although not precisely correct (Cohen and Ellwein 1991), many have used the term “initiator” interchangeably with genotoxic carcinogen and “promoter” with nongenotoxic (non-DNA reactive) carcinogen.
The current carcinogenicity bioassay owes much to lessons learned from the NTP bioassay program originally developed at the National Cancer Institute (NCI). In establishing this program, a key consideration was that because chemically induced tumors are relatively rare, rather than use very large numbers of animals, the maximum dose should be the highest tolerated by the animals (see Haseman 1984), a natural conclusion from the relationship established by Druckrey (1967). The early studies were designed to determine whether industrial chemicals, with structural similarities to established rodent carcinogens such as 2-acetylaminofluorene (2-AAF) and benzo[
From the late 1960s to mid 1970s, on the assumption that most carcinogens were DNA-reactive, considerable effort was spent in developing reliable, short-term tests of genotoxicity. The most significant outcome of this effort was the Salmonella bacterial mutation assay (Ames et al. 1973). It was initially believed that tests such as this could predict most carcinogens. Indeed, as the majority of chemical carcinogens identified up to that time were potent, DNA-reactive compounds, the Ames test was > 90% predictive. The concept was clearly stated in the title of a manuscript by Ames et al. (1973): “Carcinogens are mutagens: a simple test system combining liver homogenates for activation and bacteria for detection.” Reflecting the views of a number of scientists at the time, in another paper Ames (1973) stated, “We … suggest that the combined bacteria/liver system be used as a simple procedure for carcinogen detection.”
As the number and chemical diversity of those chemicals tested within programs such as the NTP increased, a range of chemicals with no structural similarities to known DNA-reactive carcinogens and negative in the Ames assay were found to be carcinogenic, the proportion of these that were positive for carcinogenic activity in rats and mice being similar to that for DNA-reactive compounds. However, the tumor profile obtained with these chemicals differed (Fung, Barrett, and Huff 1995). It subsequently became apparent that many of these chemicals caused cancer by mechanisms that did not involve direct reactivity with DNA, and, indeed, they were negative in short-term tests of genotoxicity. It is now known that there are a number of mechanisms by which a chemical can increase tumor incidence in rats and mice in addition to genotoxicity (MacDonald and Scribner 1999).
In general, such nongenotoxic carcinogens act by increasing DNA replications in the pluripotential cells of a tissue, by increasing cell proliferation and/or by inhibiting apoptosis (Cohen and Ellwein 1990, 1991; Greenfield, Ellwein, and Cohen 1984; Moolgavkar and Knudson 1981). This increases the probability of producing or selecting cells that develop spontaneous errors or that carry damage induced by primary initiators or secondary mediators such as reactive oxygen species (Ames and Gold 1997; Cohen 1998a). Hence, although many nongenotoxic carcinogens may act through mechanisms that include a DNA damage component (Klein and Klein 1984), a biological threshold for the carcinogenic response to such compounds will exist (Butterworth and Bogdanffy 1999). To induce the production of a secondary genotoxic species, the nongenotoxic carcinogen still has to achieve a threshold concentration to trigger the precipitating biological event, such as cytotoxicity or inflammation (Butterworth and Bogdanffy 1999; Cohen and Ellwein 1991). This contrasts with genotoxic carcinogens for which, at least in theory, there is the potential of a linear, nonthreshold response (U.S. Environmental Protection Agency [EPA] 2005), although a number of groups are strongly challenging this assumption (e.g., Hoshi et al. 2004; Swenberg et al. 2002; Williams, Iatropoulos, and Jeffrey 2004).
The current testing strategy for carcinogenic potential is based on a dual approach:
assessment of genotoxic potential and
assessment of carcinogenic potential in the lifetime bioassay in rats and mice.
The results from such studies may be supported by other investigations to determine the mode of action (MOA) and its relevance to humans (dose, metabolism, etc.). Such studies have shown that for those compounds that cause cancer by a nongenotoxic MOA, it is usually a secondary consequence of another toxicological perturbation, such as inflammation or cytotoxicity (Cohen et al. 2004; Sonich-Mullin et al. 2001). Indeed, there is evidence that under the right circumstances, almost any agent can cause cancer in experimental animals (Ashby and Purchase 1993; Norton 1981). A key consideration in this respect is that the high doses necessarily used in the cancer bioassay often cause effects unrelated to those observed at lower doses (MacDonald and Scribner 1999). Effects seen under such circumstances often have no relevance to the assessment of human risk.
As knowledge of MOAs of nongenotoxic carcinogens has increased, three concepts have emerged:
A number of MOAs for carcinogenicity are rodent-specific.
Tumors occur at detectable incidences at the same doses as, and often only at higher doses than, those producing the primary toxicological perturbation.
There is a biological threshold for carcinogens with such MOAs.
As a consequence of the above, there is increasing concern that the current cancer bioassay in the rat and mouse in which compounds are tested at up to the maximum tolerated dose is not very predictive of the potential for human carcinogenicity, and in particular that it has a high false positive rate (Alden et al. 1996; Cohen 2004; Ennever and Lave 2003; Gaylor 2005; Rhomberg et al. 2007; Van Oosterhout et al. 1997). Compounds that are carcinogenic as a consequence of direct reactivity with DNA are now identifiable in short-term tests for genotoxicity (Kirkland et al. 2005). Hence, the majority of compounds subject to a cancer bioassay today that give a positive result act by a nongenotoxic mechanism. Many regulatory authorities will permit exposure to compounds, albeit usually for less than a lifetime, that are negative in an adequate range of genotoxicity tests (U.S. Food and Drug Administration [FDA] 1996). This implies that the bioassay is currently required to identify only those compounds that can cause cancer by a nongenotoxic mechanism. In part, the unreliability of the bioassay in rats and mice is due to the existence of rodent-specific mechanisms of carcinogenicity or responsiveness at very high doses via mechanisms not occurring at lower doses.
Advances in both genotoxicity testing and the biomedical sciences justify a critical reevaluation of the need for the cancer bioassay, or even the proposed genetically engineered mouse alternatives (MacDonald et al. 2004), and of whether these tests can be replaced by a more systematic, mechanistically based approach. Currently, use of the bioassay creates risk communication problems, incurs significant development costs, and provides a system in which it is difficult to apply advances in biomedical science. Often, mechanistic research approaches are used only retrospectively to explain false positives, not prospectively to help in evaluation. During product development, it is the elimination of compounds with the potential to cause cancer that is of primary concern, rather than whether they will definitively produce tumors if given for a lifetime. Hence, an important goal is the development of an efficient, reliable, and cost-effective means of assessing human carcinogenic
The Health and Environmental Sciences Institute (HESI) established a project to explore the feasibility of such an approach. This was based on developments in the analysis of MOA for chemical carcinogenesis and its human relevance, primarily by the International Life Sciences Institute (ILSI) and the International Programme on Chemical Safety (IPCS) (Boobis et al. 2006; Meek et al. 2003; Sonich-Mullin et al. 2001). An MOA is characterized by a series of key events, which are the biological processes occurring on the causal path to cancer. Qualitative and quantitative comparison of these key events between experimental animals and humans enables conclusions to be reached about the human relevance of the carcinogenic effect of the chemical. The ultimate objective of the HESI initiative was to test a strategy in which compounds are evaluated for carcinogenic potential in rats and mice after exclusion of those that are genotoxic or immunosuppressive, as determined in relevant tests routinely undertaken in hazard identification and characterization, on the premise that such compounds are known often to possess carcinogenic potential (Cohen 2004). Compounds negative for such effects would be evaluated in subchronic tests (initially thirteen-week studies) for the induction of key events associated with known MOAs for carcinogenicity, which should then be predictive of carcinogenic potential in rats and mice. The human relevance of the MOA would then need to be evaluated.
In this study, the NTP database was evaluated for all compounds that were positive for liver, kidney, or lung tumors over the period 2000 to 2005. This database was selected because it contains comprehensive data on both chronic and thirteen-week studies in rats and mice, is publicly accessible, and is among the most comprehensive available. The period 2000 to 2005 was chosen because, prior to this time, the information available was not comprehensive, precluding full evaluation of the compounds. The target tissues were those most commonly showing a tumorigenic response in rats and mice.
The reliability of short-term tests of genotoxicity to detect genotoxic carcinogens was critically evaluated, that is, it was necessary to establish the confidence that could be placed in a negative response and where the weaknesses were, if any. Precursor effects were sought for nongenotoxic carcinogenicity in subchronic tests (thirteen weeks). As part of this exercise, MOAs for which there were no suitable conventional endpoints would be identified. Potentially suitable endpoints to cover these deficiencies that, if possible, could be assessed in conventional subchronic studies, would be identified. These studies would enable the false negative rate to be determined, that is, those compounds for which no relevant key events could be identified. In a subsequent stage it would be necessary to establish the false positive rate, where the occurrence of key events was not accompanied by a carcinogenic response. Analysis of the key events and carcinogenic response would test the hypothesis that protection against such effects would be adequately protective against carcinogenicity and that by understanding the key events, it would be possible to determine human relevance. Ultimately, it is hoped that the results of such studies will enable the development of a science-based, hierarchical approach to assessing the carcinogenic potential of compounds.
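The false negative and false positive rates described above can be made concrete with a small calculation. The sketch below is illustrative only: the choice of denominators (carcinogens for the false negative rate, noncarcinogens for the false positive rate) is an assumption, since the text does not define them explicitly, and the example records are hypothetical rather than taken from the NTP database.

```python
def error_rates(records):
    """Compute false negative and false positive rates.

    records: iterable of (key_event_detected, tumor_in_bioassay) booleans,
    one pair per compound. Denominators are an assumption of this sketch.
    """
    records = list(records)
    tp = sum(1 for key, tumor in records if key and tumor)
    fn = sum(1 for key, tumor in records if not key and tumor)
    fp = sum(1 for key, tumor in records if key and not tumor)
    tn = sum(1 for key, tumor in records if not key and not tumor)
    return {
        # Carcinogens for which no key event was identified.
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else None,
        # Noncarcinogens for which a key event was nevertheless observed.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else None,
    }


# Hypothetical compounds: (key event at thirteen weeks, tumor at two years).
example = [
    (True, True), (True, True), (True, True), (False, True),
    (True, False), (False, False), (False, False), (False, False),
]

if __name__ == "__main__":
    print(error_rates(example))
```

Because the present data set contains only compounds that were tumorigenic in at least one cell of the bioassay, only the false negative rate can be estimated from it at the compound level; the false positive rate requires noncarcinogens as well.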
The current study was designed to test the hypothesis that the signals of importance for human cancer hazard identification can be detected in shorter-term studies, rather than routinely relying on data from two-year cancer bioassays in rats and mice.
II. Methods
The HESI Cancer Hazard Identification Strategies (CHIS) Project Committee elected to use the NTP database in this study because it constitutes the most comprehensive, accessible repository of matching subchronic and long-term information on both pathology and other endpoints, for example, clinical chemistry, available to the participants. The Project Committee was greatly assisted in interrogating the database by scientists from the NTP.
The period 2000 to 2005 was evaluated. Prior to 2000, the NTP database does not contain comprehensive information on all aspects of hazard relevant to the project objectives. Hence, only those reports from the beginning of the year 2000 were considered for evaluation. The study was based solely on the tumorigenicity and genotoxicity data available in the NTP database for the chosen compounds, although some slides from male kidneys in the thirteen-week studies were reviewed (see below). Additional literature searches were not conducted.
Compounds in the NTP database were queried for carcinogenic effects in at least one “cell” of the bioassay, that is, male or female mice or male or female rats, in liver, kidney, or lung. These tissues/organs were selected for study as they are by far the most common targets for carcinogenicity of chemicals in rats and mice. Sixteen chemicals were identified on this basis. Only studies in Fischer 344 (F344/N) rats and/or B6C3F1 mice were included in the analysis because these two strains were most frequently used by NTP during that period. A customized, defined query tool (Excel spreadsheet) was developed by the CHIS Project Committee to assist with searching and recording the results of subchronic (thirteen-week) rat and mouse toxicity studies for each of the sixteen carcinogenic compounds identified from the two-year rodent bioassay database. Teams of scientists (the authors) reviewed thirteen-week data and data from mutagenicity assays for the sixteen chemicals, with the objective of identifying early signals of carcinogenic potential, for example, cytotoxicity, hyperplasia, and local irritation. Peer reviewers checked the data recorded. Some peer reviewers were CHIS Project Committee participants; others were not. In all cases, the peer reviews were independent of the query exercise.
Table 1 shows the chemicals identified in the NTP database that were found to produce tumors in one or more of the target organ systems and in one or both of the species tested. To obtain some information on false positives as well as false negatives, the histomorphologic findings from thirteen-week studies were examined for all sixteen compounds for all target tissues, including those without evidence of tumors in two-year studies.
III. Genotoxicity
A. Methods
For the sixteen chemicals in Table 1, genotoxicity data were obtained from the NTP database, summarized, and evaluated. An overall call for genotoxic, not genotoxic, or equivocal was assigned based on all the test data. A summary table (Table 2) indicates potential
B. Results (Table 2)
In seven cases, data were available for at least three tests (Ames,
In three cases, the “call” used by NTP for
Indium phosphide was considered to be negative by NTP; the present authors considered it equivocal because the level of micronucleated polychromatic erythrocytes (MN-PCE) increased from 1.7 in controls to 4.11 in treated females, with a lesser increase in micronucleated normochromatic erythrocytes (MN-NCE).
Propylene glycol mono-t-butyl ether was considered a weak positive in female mice by NTP; the authors of this article considered it negative because the maximum level of micronuclei seen in females was in the range of concurrent (male) and historical controls for the data set examined.
Decalin was considered a weak positive in males and negative in females by NTP; the present authors considered it equivocal or negative overall, again because the maximum level of micronuclei seen was in the range of historical controls for the data set examined.
In some cases, the data from the
The data for anthraquinone are considered suspect because other carcinogenicity studies were negative, and the NTP carcinogenicity study used a batch of anthraquinone contaminated with the potent mutagen 9-nitroanthracene at a level of 1,200 ppm (Butterworth, Mathre, and Ballinger 2001). (A purified sample was negative in the Ames test.) Certainly, it can be said that the material used by the NTP was mutagenic (Doi, Irwin, and Bucher 2005).
C. Discussion/Conclusions
Overall, only three compounds were clearly genotoxic (that is, 2-methylimidazole, riddelliine, and urethane), in addition to anthraquinone. One of these was called positive based on
Of the nongenotoxic conclusions, four were based on data from two tests (Ames and
Of the equivocal conclusions, three were based on equivocal
It is unusual to find positive results
An examination of the hematology data from these studies indicates that in some cases regenerative anemia associated with altered erythropoiesis may have caused a false positive result, that is, one not associated with genotoxicity. Indium phosphide and 2-methylimidazole both had changes including hematopoietic cell proliferation of the spleen; decalin and o-nitrotoluene did not. It is interesting that even anthraquinone had increases in red blood cell proliferation with large increases in circulating reticulocytes. This might explain why increases in micronuclei were seen only after a three-month treatment and not after three daily doses of anthraquinone and points to the difficulty of interpreting results
Other possible explanations for
For complete transparency, all sixteen compounds, regardless of genotoxicity results, were included in the tabular presentations of data in this study.
IV. Immunosuppressive Activity
A. Methods
The NTP database was searched for clinical and anatomical pathology findings related to disorders of the immune system, including evidence of downregulation (possible immune suppression) and proliferation. The database was also searched for any evidence of neoplasia related to elements of the immune system.
Data were obtained from subchronic studies (thirteen weeks) in F344 rats and B6C3F1 mice for all sixteen compounds derived from the NTP database (Table 1). Findings included changes in hematology (total leukocyte, segmented neutrophil, lymphocyte, and monocyte counts); changes in spleen and/or thymus weights; and histopathological findings in the bone marrow, spleen, thymus, and lymph nodes.
Of the sixteen chemicals evaluated, information for riddelliine, triethanolamine, and fumonisin B1 was very limited, and the absence of any effect on the immune system for these three chemicals should be interpreted with caution.
B. Results (Table 3)
Ten of the sixteen chemicals showed changes in one or more endpoints related to the immune system. Eight of these ten had changes suggesting down-regulation of the immune system, which, in all cases, were likely secondary to significant stress or illness with release of endogenous glucocorticoids. Two chemicals (o-nitrotoluene and Elmiron [sodium pentosanpolysulfate]) caused a slight increase in lymphocyte counts in male and female rats and the accumulation of vacuolated histiocytes in multiple organs including the lung (see Lung section). It has been suggested that Elmiron may induce a lysosomal disorder that is characterized by histiocytes containing mucins and lipidic material within membrane-bound vacuoles (Nyska et al. 2002).
Gallium arsenide caused contact dermatitis in female mice during a contact hypersensitivity study but no evidence of immunotoxicity in standard toxicity studies. There was an increased incidence of mononuclear cell leukemia in female rats at the end of the two-year carcinogenicity study, with incidences of twenty-two, eighteen, twenty-one, and thirty-three of fifty in the control, low-, mid-, and high-dose groups, respectively. This finding was originally considered significant, but this interpretation is debatable. The pathogenesis of this putative increased incidence of mononuclear cell leukemia is uncertain, and it is unlikely to be related to immunosuppression because the doses used in the two-year studies were substantially (75-fold) lower than the high dose used in the thirteen-week study, in which there was no evidence of direct immunosuppression (NTP 2000a).
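The incidences quoted above are easier to compare as percentages. The following short sketch simply converts the reported counts (out of fifty animals per group) to percentages; it is an arithmetic aid only and does not replace the statistical trend analysis used in the original NTP report.

```python
GROUP_SIZE = 50  # female rats per dose group, as reported in the text

# Mononuclear cell leukemia counts from the two-year gallium arsenide study.
incidence = {"control": 22, "low": 18, "mid": 21, "high": 33}


def percent(count, n):
    """Return the incidence as a percentage of the group size."""
    return 100 * count / n


percentages = {group: percent(count, GROUP_SIZE) for group, count in incidence.items()}

if __name__ == "__main__":
    for group, pct in percentages.items():
        diff = pct - percentages["control"]
        print(f"{group:>7}: {pct:.0f}%  (difference from control: {diff:+.0f} points)")
```

Expressed this way, the low- and mid-dose groups fall slightly below the control incidence and only the high-dose group exceeds it, which is consistent with the text's caution about the interpretation of this finding.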
C. Discussion/Conclusions
The interest in evaluating immunosuppressive activity is related to the putative protective role of the immune system in development of cancer. Current ICH guidance lists increased incidence of tumors as one of the five signs of possible immunosuppression in short-term toxicity studies (ICH 2005). The relationship between immunosuppression and cancer is still under investigation, and immunosuppression is currently linked to neoplasia mostly related to infectious agents. In humans, these include Epstein-Barr virus, human herpes virus-8, hepatitis B and C viruses, human papilloma viruses, and
Of the sixteen chemicals reviewed, none caused direct immunosuppression in thirteen-week studies in rats and mice. Many chemicals (eight of sixteen) caused down-regulation of the immune system by one or more standard endpoints in subchronic studies, but in all instances, these changes were attributed to stress. There were no instances where chemicals that did not show any evidence of immunosuppression in subchronic studies were subsequently tested in a specific immunotoxicity study. Current ICH guidance for chemicals supports the view that specific immunotoxicity investigation is not warranted in these situations (ICH 2005).
There was no clear evidence of neoplasia in elements of the immune system.
For complete transparency, all sixteen compounds, regardless of immunosuppressive activity, were included in the tabular presentations of data in this study.
V. Liver
A. Methods
In the subchronic (thirteen-week) toxicity studies, the recorded organ weight, clinical pathology, and histopathology data were reviewed for each compound. These included increased relative liver weight, hepatocellular hypertrophy, altered foci, hepatocyte necrosis, hepatocyte vacuolation, hepatocyte degeneration, bile duct hyperplasia, increased alanine transaminase (ALT) levels, increased sorbitol dehydrogenase (SDH) levels, and increased bile acid/bilirubin levels. A similar analysis by Allen et al. (2004) investigated the predictive value of hepatocyte hypertrophy, necrosis, cytomegaly, and increased liver weight in subchronic studies. In that study, the authors concluded that these four criteria detected 100% of potential liver carcinogens; however, the detection rate included several false positives.
B. Results (Table 4)
Of the sixteen chemicals in our evaluation, thirteen were recorded as rodent liver carcinogens in the rat (male and/or female), the mouse (male and/or female), or both species; except where footnoted, these showed increased incidences of hepatocellular adenomas and/or carcinomas. For each sex and each species, Table 4 shows the tumor outcome, histopathologic changes, significant clinical pathology, and increased relative liver weight.
Increased relative liver weight was recorded for at least one sex of one species (rat/mouse) from the thirteen-week NTP toxicity studies in ten of thirteen positive liver carcinogenic compounds. Other single endpoints at thirteen weeks were associated less frequently with tumor outcomes. These included (in at least one sex of one species) hepatocellular hypertrophy or increased bile acids for five of thirteen carcinogens, hepatocellular necrosis or increased ALT levels for four of thirteen carcinogens, hepatocellular vacuolation/degeneration for three of thirteen carcinogens, and altered foci or increased SDH levels for two of thirteen carcinogens.
Association with tumor outcome was strengthened by grouping together thirteen-week toxicological endpoints. Combining the presence of hepatocellular hypertrophy and/or necrosis with increased relative organ weight demonstrated an association with twelve of thirteen liver chemical carcinogens for at least one sex of one species of the NTP bioassay. This increased predictive rate is similar to the results of the previous retrospective study (Allen et al. 2004) (see Liver Discussion section below).
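The endpoint counts reported above can be expressed as detection rates among the thirteen liver carcinogens. The sketch below uses only the counts stated in the text; endpoint labels are kept as the text groups them (for example, "hepatocellular hypertrophy or increased bile acids"), and the variable names are illustrative.

```python
N_LIVER_CARCINOGENS = 13

# Number of liver carcinogens flagged by each thirteen-week endpoint
# (at least one sex of one species), as reported in the text.
endpoint_counts = {
    "increased relative liver weight": 10,
    "hepatocellular hypertrophy or increased bile acids": 5,
    "hepatocellular necrosis or increased ALT": 4,
    "hepatocellular vacuolation/degeneration": 3,
    "altered foci or increased SDH": 2,
    "hypertrophy and/or necrosis with increased relative organ weight": 12,
}


def detection_rates(counts, total):
    """Fraction of carcinogens associated with each endpoint."""
    return {name: count / total for name, count in counts.items()}


rates = detection_rates(endpoint_counts, N_LIVER_CARCINOGENS)

if __name__ == "__main__":
    for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
        print(f"{endpoint_counts[name]:>2}/{N_LIVER_CARCINOGENS}  {rate:6.1%}  {name}")
```

On these counts, the grouped endpoint raises the detection rate from roughly 77% (liver weight alone) to roughly 92%, which is the strengthening of association the text describes.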
When positive tumor outcomes were collectively considered for both sexes of both species of the cancer bioassay for this particular set of thirteen liver carcinogens, no false positives were recorded. Therefore, if liver-associated changes were observed in any sex/species from the thirteen-week studies, there were always tumors apparent in one of the long-term bioassays.
However, several false positives occurred when single associations were considered between one sex and one species. For example, male and female rats exposed to benzophenone showed no treatment-related liver tumor response, but in the thirteen-week studies there was increased relative liver weight, an increased incidence of hepatocellular hypertrophy and vacuolation, increased bile acids, and increased SDH levels. Likewise, Elmiron, while inducing increased relative organ weight and hepatocellular vacuolation in the male rat at thirteen weeks, did not induce an increased incidence of liver tumors after two years of treatment.
Similarly, when positive tumor outcomes are collectively considered for both sexes and both species of the cancer bioassay for these thirteen liver carcinogens, only one false negative was apparent. Inhalation exposure to indium phosphide resulted in liver tumors in the male and female mouse long-term bioassay, while there were no changes observed at thirteen weeks.
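The difference between evaluating each sex/species "cell" separately and pooling all cells for a compound can be sketched as follows. The compounds and values here are hypothetical (labeled "A" and "B") and only mimic the two patterns described in the text: a compound with thirteen-week signals in a species without tumors, and a compound with tumors but no thirteen-week signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    compound: str
    species: str
    sex: str
    signal_13wk: bool  # any liver-associated change at thirteen weeks
    tumor_2yr: bool    # treatment-related liver tumors at two years


def classify(signal, tumor):
    """Label one signal/tumor pairing."""
    if signal and tumor:
        return "true positive"
    if signal:
        return "false positive"
    if tumor:
        return "false negative"
    return "true negative"


def per_cell(cells):
    """Classify every sex/species cell on its own."""
    return {(c.compound, c.species, c.sex): classify(c.signal_13wk, c.tumor_2yr) for c in cells}


def pooled(cells):
    """Classify each compound after pooling signals and tumors across all cells."""
    combined = {}
    for c in cells:
        signal, tumor = combined.get(c.compound, (False, False))
        combined[c.compound] = (signal or c.signal_13wk, tumor or c.tumor_2yr)
    return {name: classify(signal, tumor) for name, (signal, tumor) in combined.items()}


# Hypothetical data, for illustration only.
cells = [
    Cell("A", "rat", "male", True, False),
    Cell("A", "rat", "female", True, False),
    Cell("A", "mouse", "male", True, True),
    Cell("A", "mouse", "female", True, True),
    Cell("B", "rat", "male", False, False),
    Cell("B", "rat", "female", False, False),
    Cell("B", "mouse", "male", False, True),
    Cell("B", "mouse", "female", False, True),
]

if __name__ == "__main__":
    print(per_cell(cells))
    print(pooled(cells))
```

In this sketch, compound A is a false positive in the rat cells but a true positive when pooled, whereas compound B remains a false negative under either view.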
C. Discussion
Increased relative liver weight, histopathological changes, and increases in clinical pathology parameters in rat and/or mouse thirteen-week subchronic toxicity studies in the NTP database were positively associated with the majority of tumorigenic outcomes. As mentioned above, this concurs with a previous retrospective study using the NTP database (Allen et al. 2004).
In findings similar to those for the set of thirteen liver carcinogens examined here, Allen et al. (2004) demonstrated that increased liver weight was associated with eight of eleven rat liver carcinogens. When considered as separate entities, hepatocellular hypertrophy identified five of eleven carcinogens, and hepatocellular necrosis identified four of eleven carcinogens. Grouping hepatocyte hypertrophy, necrosis, cytomegaly, and increased liver weight identified eleven of eleven liver carcinogens.
Likewise, in another retrospective review of nine nongenotoxic NTP carcinogens, increased relative liver weight was the most highly specific predictor of mouse liver tumors (Elcombe et al. 2002). It has also been noted by the U.S. EPA (2002) that when hepatocellular hypertrophy (and corresponding increased liver size/weight) is accompanied by another more severe toxic change (e.g., clinical pathology changes/other histopathology changes), the combination of these changes may reflect underlying carcinogenic potential in rats and mice.
D. Conclusions
Conventional mammalian toxicological endpoints identified at thirteen weeks are associated with most tumor outcomes as mentioned above, but these indicators produce a number of false positives for compounds tested in the overall NTP database. Conventional endpoints such as increased relative liver weights and corresponding hepatocellular hypertrophy often represent temporal adaptations that demonstrate reversibility upon withdrawal of treatment. In future studies, it will be important to analyze the magnitude and dose response for these effects to determine whether predictivity can be improved. One chemical, indium phosphide, was a false negative on the basis of an absence of any treatment-related, conventional liver changes for male and female mice at thirteen weeks (as similarly reported by Allen et al. 2004). (Note: Supporting evidence for this chemical compound’s tumorigenic response in the mouse was demonstrated by increased incidences of nonneoplastic eosinophilic foci in a dose-response relationship [for both sexes], as compared to controls, at the two-year time point [as described in the NTP report].)
The authors conclude that conventional liver endpoints currently identified in subchronic (thirteen-week) toxicity studies in rats and mice are not adequate to identify all chemicals with carcinogenic potential.
Additional endpoints may identify other key events that might more accurately predict carcinogenic potential in rats and mice. These key events, in turn, will enhance analysis for defining MOAs to better assess human carcinogenic potential/risk. Specifically, these endpoints include increases in cell proliferation (S-phase response) and induction/inhibition of apoptosis (measurement of labeling indices for both events), constitutive androstane receptor (CAR) nuclear receptor activation (reporter assays), cytochrome P450 induction (direct biochemical measurement), and peroxisome proliferation (measurement of palmitoyl coenzyme A oxidase activity). Such key precursor events could be measured in short-term investigative studies, using three-, seven-, fourteen-, twenty-eight-, and/or ninety-day exposure scenarios.
Further key indicators may be identified from the variety of developing -omics technology platforms, particularly as MOA studies expand into exploring genomic signatures and pathway mapping associated with commonly accepted key events, including CAR activation and peroxisome proliferation.
VI. Kidney
The renal tumors referred to in this section are of renal tubular origin. The histologic changes are indicators of tubule injury or change.
A. Methods
Five of the sixteen chemicals identified in the NTP database produced tumors in the rat kidney. No kidney tumors were induced in mice. Four of these chemicals (benzophenone, decalin, Fumonisin B1, methyleugenol) produced kidney tumors only in the male rat, not in the female. Anthraquinone produced tumors in both the female and male rat.
Initial evaluation for assessment of renal alterations after thirteen weeks of study included parameters that were reported in the histopathology tables by the NTP. The renal alterations and data presented included hyaline droplets, inflammation, chronic progressive nephropathy, and absolute and relative kidney weights. The histopathology evaluation was based on the NTP report, except for Fumonisin B1, which was based on results of short-term studies that had been previously published (Dragan et al. 2001; Howard et al. 2001; Voss et al. 1995). Kidney weights for Fumonisin B1 were not available.
Subsequently, and as part of a concurrent evaluation conducted by the NTP, an author of this article (Dr. Gordon Hard) reviewed the slides from male rat kidneys from thirteen-week studies for most of the sixteen chemicals (except Fumonisin B1), including the additional histopathologic indicators of necrosis/apoptosis, hyperplasia, karyomegaly, vacuolization, tubular basophilia (not associated with chronic progressive nephropathy), and increased mitotic activity. The slides for Fumonisin B1 were not reexamined during this review because they were not available. However, the slides for Fumonisin B1 had been reviewed as part of another project (Hard et al. 2001; Bucci et al. 1998).
B. Results (Table 5)
All four chemicals that produced kidney tumors, and for which data were available regarding kidney weight at the thirteen-week time point (anthraquinone, benzophenone, decalin, methyleugenol), had elevated kidney weights (Table 5), both absolute and relative to body weight. For anthraquinone, kidney weight was elevated in both the female and male rats, and both sexes developed renal tumors. Benzophenone treatment increased kidney weight in both female and male rats, but tumors only occurred in the male. Decalin and methyleugenol increased the kidney weights and caused renal tumors only in male rats.
The kidney findings for all of the chemicals are listed in Table 5. The standard histopathologic criteria for evaluating the kidney resulted in a lack of detection of renal alterations after thirteen weeks of treatment with anthraquinone, benzophenone, decalin, or methyleugenol. There was evidence of regeneration associated with benzophenone and decalin treatment. In contrast, Fumonisin B1 induced extensive apoptosis, and degenerative and regenerative changes at early time points (Dragan et al. 2001; Howard et al. 2001).
Additional targeted analysis that described renal alterations in greater detail than is typical in the standard NTP report demonstrated that there was a significant increase in renal tissue responses with a number of chemicals including the tumorigens. Hyaline droplets were present in female and male rat kidneys following anthraquinone administration, and decalin treatment in male rats only. Regenerative changes were present in the kidneys from male rats treated with benzophenone and decalin. Chronic progressive nephropathy was increased in female rats treated with anthraquinone to a limited extent but significantly in the male rats treated with anthraquinone. Inflammatory changes were also present in male rats treated with benzophenone and decalin. No changes were seen in the mouse kidneys for the five chemicals producing kidney tumors in rats except for nonspecific cellular alterations in male mice administered decalin.
Of the eleven chemicals evaluated in this study that did not produce kidney tumors, five (urethane, oxymetholone, 2-methylimidazole, propylene glycol mono-t-butyl ether, and indium phosphide) produced alterations in the kidneys after thirteen weeks of treatment. Urethane produced nephropathy (not further defined) in male and female mice and male and female rats. Oxymetholone treatment resulted in an increase in kidney weight in the female mouse and in the female and male rat. In addition, there were regenerative changes in kidneys of the female and male rat administered oxymetholone. No renal lesions were seen in the male mouse treated with oxymetholone. 2-Methylimidazole treatment resulted in increased nephropathy in the male rat. No renal alterations were present in the female rat, and there were no elevations of kidney weight in either sex of either species. Indium phosphide exposure resulted in chronic progressive nephropathy in both female and male rats. No increase in kidney weight and no other kidney findings were observed with indium phosphide.
C. Discussion/Conclusions
Based on this limited sample of chemicals that produced kidney tumors in rats in two-year bioassays, all caused detectable alterations after thirteen weeks of treatment. The feature that consistently gave a positive signal was the nonspecific finding of an increase in kidney weight, both absolute and relative. This is similar to what has been reported for rodent liver carcinogens (Allen et al. 2004). Significantly, all exposure groups that had no effects in the kidney after thirteen weeks of treatment had no renal tumors after two years, and all exposure groups that had tumors after two years had renal alterations at thirteen weeks.
Apart from kidney weight, the additional criteria, including changes that indicate cell death (necrosis and/or apoptosis) and evidence of regeneration (basophilia, karyomegaly, mitoses), were not consistently diagnosed in the standard NTP study reports for the chemicals that proved to be rodent kidney carcinogens. All of the rodent renal carcinogens could be detected in the thirteen-week assays through the diagnosis of hyaline droplets and increased chronic progressive nephropathy, in addition to the above lesions diagnosed on subsequent review. In this set of studies, there were no false negatives; however, there were false positives in that some exposures caused renal lesions after thirteen weeks, but no renal tumors in two-year bioassays. Thus, utilizing kidney weight and thorough histologic review of the kidneys after thirteen weeks of treatment detected all of the rodent renal tumorigens in this set of studies. For screening purposes, it is essential that false negatives do not occur.
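The kidney counts described in this section can be restated as simple screening metrics. The short Python sketch below is illustrative only and was not part of the original analysis. It assumes that the six non-tumorigenic chemicals not named as producing renal alterations had none, and it adds a standard exact one-sided 95% bound on the miss rate to show how little a sample of five tumorigens can establish. All variable names are hypothetical.

```python
from fractions import Fraction

# Chemical-level kidney counts taken from the text of this section.
tumorigens_detected = 5            # all five renal tumorigens had 13-week alterations
tumorigens_missed = 0              # no false negatives were reported
nontumorigens_with_signal = 5      # renal alterations without tumors (false positives)
nontumorigens_without_signal = 6   # assumption: the remaining 11 - 5 had no alterations

sensitivity = Fraction(tumorigens_detected,
                       tumorigens_detected + tumorigens_missed)
specificity = Fraction(nontumorigens_without_signal,
                       nontumorigens_without_signal + nontumorigens_with_signal)

# Exact one-sided 95% upper bound on the true miss rate when zero misses are
# observed in n independent trials: solve (1 - p)**n = 0.05 for p.
# This formula applies only to the zero-miss case.
n = tumorigens_detected + tumorigens_missed
upper_miss_rate = 1 - 0.05 ** (1 / n)

print(f"sensitivity = {sensitivity}, specificity = {specificity}")
print(f"95% upper bound on miss rate with 0 of {n} missed: {upper_miss_rate:.2f}")
```

Under these assumptions, the absence of false negatives among five tumorigens is compatible with a substantial underlying miss rate, which is in keeping with the description of this data set as a limited sample.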
This screening approach does not directly demonstrate mode of toxic or carcinogenic action, nor does it provide definitive information on likelihood of human carcinogenicity. However, the findings in these short-term studies, combined with the genotoxicity assessment, can provide helpful clues. For example, for Fumonisin B1, the MOA appears to include the key events of stimulation of extensive apoptosis with significant regeneration that could lead to kidney tumors (Dragan et al. 2001). Such an MOA potentially could occur in humans. In contrast, the other four chemicals that induced kidney tumors in the two-year bioassay from the current group of chemicals produced kidney tumors by either an increase in chronic progressive nephropathy or by binding to α2u-globulin (as indicated by increased hyaline droplets), leading to tubular cytotoxicity, regeneration, and eventually tumors. These two MOAs are detectable in the thirteen-week screening process. However, neither of these MOAs is considered to be relevant for human cancer risk (Dybing and Sanner 1999; Hard, Johnson, and Cohen 2009; Lock and Hard 2004).
VII. Lung
A. Methods
This section reviews data from the NTP database obtained for the sixteen compounds in Table 1 and focuses on evidence of histomorphologic alterations of the lung identified in thirteen-week studies in two species (B6C3F1 mice and F344 rats) and the presence or absence of lung tumors in these same species from two-year carcinogenicity studies. This evaluation attempts to draw correlations between the occurrences of pulmonary pathology identified in thirteen-week studies and the subsequent emergence of lung tumors. It is important to note that the routes of exposure are variable among the compounds tested and include dosing by drinking water, feed, and inhalation. Therefore, care must be exercised in interpreting the outcomes of localized intrapulmonary high particle burden versus systemic exposure.
As the purpose of this exercise was to identify signals in thirteen-week studies that might predict tumor generation, the data do not take into account the presence or absence of similar signals of inflammation or hyperplasia identified in the two-year bioassay itself. The analysis is only concerned with the presence of those signals at thirteen weeks under the conditions of that particular study. The lack of a lesion, such as inflammation, at thirteen weeks does not nullify a mechanistic association with the emergence of a tumor—only that it was not detected with these routine evaluations at a time point that would allow such signals to be consistent predictors of subsequent tumor formation. Such a lesion might yet occur at a time beyond thirteen weeks and possibly still be associated with the final tumorigenic outcome. Should this be the case, consideration would need to be given as to how it might be taken into account in developing a cancer hazard identification strategy based on the findings of the present study.
The following diagnostic terms for histomorphologic alterations were used by NTP to describe lung lesions in thirteen-week studies: chronic active inflammation, inflammation NOS (not otherwise specified), alveolar epithelial hyperplasia, bronchiolar hyperplasia, proteinosis, fibrosis, histiocytic infiltration, and foreign body. The following diagnostic terms for lung tumors were used by NTP in the two-year bioassay studies: alveolar/bronchiolar adenoma, alveolar/bronchiolar carcinoma, alveolar/bronchiolar adenoma or carcinoma, and squamous cell carcinoma.
It is important to recognize that the diagnoses described herein are based solely on the written terms presented in various reports and tables in the NTP archives. As per NTP procedures, Pathology Working Groups reviewed the accuracy of lesion diagnoses and descriptive nomenclature at the time each study was conducted. However, examination of histology slides was not repeated for the purposes of this data review.
B. Results (Table 6)
The data in Table 1 show that the correlation of genotoxicity with lung tumor outcome is poor. Two (one equivocal) compounds were genotoxic but failed to induce lung tumors (2-methylimidazole and anthraquinone), three compounds were not genotoxic but did induce lung tumors (oxymetholone, gallium arsenide, vanadium pentoxide), and two compounds were positive for both genotoxicity and lung tumor formation (urethane, riddelliine). It is of value to note that lung tumors were identified in animals given compound by different routes of exposure (Table 6) that include inhalation (gallium arsenide, vanadium pentoxide, indium phosphide), oral gavage (oxymetholone, riddelliine), drinking water (urethane), and diet (o-nitrotoluene), suggesting that direct irritancy that might occur during inhalation is not a prerequisite for initiation of lung tumors, and that other mechanisms of action are also relevant.
As shown in Table 1, seven of the sixteen compounds were identified as inducing lung tumor formation in at least one cell of the two-year bioassay. Four of these seven compounds (urethane with/without 5% ethanol, vanadium pentoxide, indium phosphide, gallium arsenide) also had diagnoses of inflammation and/or hyperplasia at thirteen weeks (Table 6). For animals given urethane (with/without 5% ethanol), inflammation, hyperplasia, and lung tumors were seen only in male and female B6C3F1 mice, but not F344 rats. Riddelliine induced lung tumors only in female B6C3F1 mice without any prior diagnoses of inflammation or hyperplasia at thirteen weeks. Vanadium pentoxide was associated with inflammation, hyperplasia, and lung tumors in male and female B6C3F1 mice and male F344 rats; female F344 rats were without lung tumors. Indium phosphide was associated with inflammation, hyperplasia, proteinosis, fibrosis, foreign body at thirteen weeks, and lung tumors in the two-year studies in male and female B6C3F1 mice and male and female F344 rats. Gallium arsenide was associated with inflammation and hyperplasia in both species in the thirteen-week study, but lung tumors were only identified in female F344 rats in the two-year bioassay.
The presence of inflammation and/or hyperplasia at thirteen weeks without emergence of lung tumors at two years was seen in animals given Elmiron or benzophenone. The lung lesion identified in Elmiron-treated rats was a combination of chronic inflammation and infiltration of alveoli by histiocytes, and has been suggested to be a drug-induced lysosomal storage disorder (Nyska et al. 2002). The lesion identified in benzophenone-treated rats was identified only as chronic active inflammation. These data would be considered false positive findings for the thirteen-week studies as predictors for lung tumor formation.
False negative findings were identified for oxymetholone, riddelliine, and o-nitrotoluene based on the absence of lung pathology identified from the thirteen-week studies but positive findings of lung tumors in the two-year bioassays. Of these, riddelliine is considered clearly genotoxic. A review of the incidence data for the bioassay studies for each of these compounds clearly supports the identification of compound-induced lung tumors for each.
Seven of sixteen compounds (Fumonisin B1, triethanolamine, propylene glycol mono-t-butyl ether, methyleugenol, 2-methylimidazole, anthraquinone, and decalin) that were given to males and females of both species had no lung pathology at thirteen weeks and no lung tumors at two years. These results were rated as being a positive correlation between the findings of the thirteen-week studies and the lack of lung tumors in the two-year bioassay.
Carcinogenicity studies were conducted in only a single species for oxymetholone and urethane.
C. Discussion/Conclusions
The presence or absence of inflammation and/or alveolar hyperplasia within the lung following thirteen weeks of exposure appeared to correlate with the presence or absence of lung tumors in eleven of sixteen of the chemicals tested, suggesting an association of events occurring after thirteen weeks of exposure with the ultimate expression of neoplasia. However, there were two false positives in which the identification of inflammation and/or alveolar epithelial hyperplasia did not correctly predict the emergence of tumors in a two-year study. There were two cases of compounds considered nongenotoxic in which lung tumors were identified in a two-year study in the absence of lung pathology in a thirteen-week study in either species tested. It is perhaps not surprising that two of the three compounds administered by inhalation (vanadium pentoxide and indium phosphide) induced the broadest and most consistent degree of pulmonary inflammation and subsequent lung tumors in all species tested. The association of particle burden-induced inflammation in the lung and the occurrence of lung tumors have been well studied (Oberdörster 1995), and the results of the analysis presented are consistent with previous findings.
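The lung findings reported in the Results can be tallied as a two-by-two comparison. The Python sketch below is illustrative only and was not part of the original analysis. It assigns each of the sixteen chemicals to one category as described in the text, collapsing across sex and species, which is a simplification (for example, gallium arsenide produced thirteen-week lesions in both species but tumors only in female rats). The dictionary and function names are hypothetical.

```python
from fractions import Fraction

# (lesion reported at thirteen weeks, lung tumor at two years), per the Results text.
lung = {
    "urethane": (True, True),
    "vanadium pentoxide": (True, True),
    "indium phosphide": (True, True),
    "gallium arsenide": (True, True),
    "oxymetholone": (False, True),
    "riddelliine": (False, True),
    "o-nitrotoluene": (False, True),
    "Elmiron": (True, False),
    "benzophenone": (True, False),
    "Fumonisin B1": (False, False),
    "triethanolamine": (False, False),
    "propylene glycol mono-t-butyl ether": (False, False),
    "methyleugenol": (False, False),
    "2-methylimidazole": (False, False),
    "anthraquinone": (False, False),
    "decalin": (False, False),
}


def tally(outcomes):
    """Count true/false positives and negatives at the chemical level."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for signal, tumor in outcomes.values():
        if signal and tumor:
            counts["TP"] += 1
        elif signal:
            counts["FP"] += 1
        elif tumor:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


counts = tally(lung)
total = sum(counts.values())
concordance = Fraction(counts["TP"] + counts["TN"], total)
sensitivity = Fraction(counts["TP"], counts["TP"] + counts["FN"])
specificity = Fraction(counts["TN"], counts["TN"] + counts["FP"])

print(counts)
print(f"concordance {concordance}, sensitivity {sensitivity}, specificity {specificity}")
```

The tally gives the eleven concordant chemicals stated above (four true positives and seven true negatives) together with the two false positives. It also shows three chemicals with tumors but no thirteen-week lung pathology, two of which are the nongenotoxic cases; riddelliine, the third, is genotoxic.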
VIII. Summary and Conclusions
Genotoxicity
Four of the sixteen chemicals were considered genotoxic based on NTP data (riddelliine, urethane, 2-methylimidazole, and the anthraquinone preparation). Fumonisin B1 had some published positive genotoxicity data, and three others were considered equivocal genotoxins (decalin, indium phosphide, and o-nitrotoluene).
Immunosuppressive Activity
None of the sixteen chemicals showed evidence of direct immunosuppression at doses relevant to the bioassay. There was no clear evidence of neoplasia in elements of the immune system.
Liver
Six of the sixteen chemicals evaluated in the HESI CHIS project showed hepatocellular tumors in rats in the two-year bioassay. Of these six, one parameter alone (liver weight) correctly predicted five of six tumor outcomes. Grouping any other precursor with liver weight (i.e., hypertrophy, necrosis, vacuolation, degeneration, liver enzyme) resulted in six of six correct predictions. For mouse liver, nine of the sixteen chemicals showed hepatocellular tumors. Of these nine, liver weight correctly predicted six of nine tumor outcomes. Grouping other precursors with liver weight (i.e., hypertrophy and cellular foci) resulted in eight of nine correct predictions.
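The gain from grouping endpoints can be made explicit by expressing the liver counts above as detection rates. The Python sketch below is an illustration using only the figures quoted in this paragraph; the data structure and function name are hypothetical and not part of the original analysis.

```python
from fractions import Fraction

# Counts quoted in the liver summary: tumorigenic chemicals per species, and
# how many were flagged at thirteen weeks by liver weight alone versus by
# liver weight grouped with other precursor findings.
liver = {
    "rat": {"tumorigens": 6, "liver_weight_alone": 5, "grouped_endpoints": 6},
    "mouse": {"tumorigens": 9, "liver_weight_alone": 6, "grouped_endpoints": 8},
}


def detection_rates(entry):
    """Return (rate for liver weight alone, rate for grouped endpoints)."""
    n = entry["tumorigens"]
    return (Fraction(entry["liver_weight_alone"], n),
            Fraction(entry["grouped_endpoints"], n))


rates = {species: detection_rates(entry) for species, entry in liver.items()}

for species, (alone, grouped) in rates.items():
    print(f"{species}: liver weight alone {float(alone):.0%}, "
          f"grouped endpoints {float(grouped):.0%}")
```

On these figures, grouping raises detection in the rat from five of six to six of six, and in the mouse from six of nine to eight of nine, leaving one mouse liver tumorigen undetected.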
Kidney
Five of the sixteen chemicals showed kidney tumors in the rat two-year bioassay, and none caused kidney tumors in mice. All five chemicals caused detectable renal alterations in rats after thirteen weeks of treatment. The feature that consistently gave a positive signal was the nonspecific finding of an increase in kidney weight, both absolute and relative. The combination of kidney weight and a thorough histologic review of the kidneys after thirteen weeks of treatment detected all of the rodent renal tumorigens in this set of studies.
Lung
Seven of the sixteen chemicals produced tumors of the lung in either rats and/or mice. The presence of inflammation and/or alveolar hyperplasia in the lung following thirteen weeks of treatment was observed for four of these sixteen chemicals and for three others, suggesting some degree of a possible correlation between short-term events and the ultimate expression of neoplasia. Two compounds that were not clearly genotoxic produced lung tumors in the absence of any discernible precursor effects in the lung.
Overall Conclusions
Cellular changes indicative of a tumorigenic endpoint can be identified for most, but not all, of the chemicals producing tumors in two-year studies after thirteen weeks of chemical administration using routine evaluations (see Table 7). Thirteen-week studies utilizing conventional endpoints are currently not adequate to identify all nongenotoxic chemicals that will eventually produce tumors in rats and mice after two years.
Additional endpoints are needed to identify some signals not detected with routine evaluation. Such endpoints might include BrdU labeling and a measure of apoptosis.
Detection of “critical” endpoints, or a critical magnitude of effect, in thirteen-week studies may help distinguish between chemicals that will and will not be tumorigenic after two years (i.e., exclude false positives).
The information obtained in the present study provides a foundation for developing alternative strategies for cancer hazard identification. However, a number of issues were identified that will need to be addressed before such a strategy can be implemented with confidence. A key component of the strategy is the identification of compounds that may be carcinogenic because of their ability to damage DNA directly. For this purpose, a series of genotoxicity tests is used, that is,
The strategy also relies upon the reliable detection of direct immunosuppressive effects of compounds from conventional endpoints measured in short-term studies (e.g., twenty-eight or ninety days). To the extent that it was possible to test this on the basis of the chemicals studied, the approach appears reliable. However, further work is necessary using a range of known positive and negative compounds.
The endpoints assessed in the thirteen-week studies were based on common key events in the MOAs that have been established for nongenotoxic carcinogens (e.g., organ weight as a surrogate for hyperplasia and inflammation). While many of the compounds that were carcinogenic caused signal effects in thirteen-week studies consistent with a nongenotoxic MOA, there were exceptions, particularly in lung and to a lesser extent in liver. However, a known limitation of the study was that the endpoints studied at thirteen weeks did not encompass all of the known key events for potential MOAs of concern. Hence, there was no direct information available on cell proliferation rate, hyperplasia, or apoptosis. For the proposed strategy to succeed, measures of these endpoints will need to be incorporated into conventional study design, or novel biomarkers of these effects will have to be developed and included in some screening level assessment. This could either be in short-term (perhaps even
The rapid advances in toxicogenomics hold promise of delivering biomarkers that will enable identification of the key biological pathways affected by chemicals. This should provide a basis for defining potential MOAs for these compounds (Frijters et al. 2007).
The present study was designed such that it was possible to evaluate the false negative rate of the proposed strategy. The false positive rate was not determined systematically—it would be necessary to evaluate all of the chemicals in the database over the interval 2000 to 2005. However, even with the limited number of chemicals studied here, it was apparent that the false positive rate in the sixteen that were carcinogenic in at least one of the target organs studied was not inconsiderable. Further work is necessary to determine the basis of this. It is possible that more detailed analysis of the magnitude of the response and the dose-response relationship for carcinogens and noncarcinogens would permit such discrimination. In the longer term, incorporation of some of the novel endpoints discussed above should enable much better discrimination between true and false positives.
The two-year bioassay in rats and mice is, at best, only an indicator of potential hazard. Where the MOA for the (nongenotoxic) carcinogenic response is known, it is apparent that the results of the two-year bioassay are frequently falsely positive with respect to risk of human carcinogenicity (Boobis et al. 2006; Cohen 2004; Holsapple et al. 2006; Meek et al. 2003). This suggests that findings in thirteen-week studies would also be falsely positive with respect to their relevance to cancer in humans. The goal of the proposed strategy is the detection of compounds that are potentially carcinogenic to humans. Hence, rather than having to detect all carcinogens in rats and mice, by utilizing histopathologic and other biomarkers of key events for MOAs relevant to humans, such as degeneration, apoptosis, and regeneration, combined with knowledge of the pathways leading to these effects, it would be possible to focus effort on those compounds that are of potential concern. This is an issue that requires critical consideration, since the overall intent of these screening assays, whether two-year bioassays or otherwise, is to detect potential human carcinogens.
The association between MOAs and key events needs to be evaluated in terms of human relevance. Such an evaluation needs to include an understanding of exposure levels in terms of both compound kinetics and dynamics in the rodent and human model (Cohen 2004; Holsapple et al. 2006). It is therefore proposed that there should be a prospective approach to define and understand key carcinogenic events with a well-defined dose-response relationship. This information should then be used to determine human relevance in association with human exposure risk assessment (Cohen et al. 2004). This new approach would reduce routine reliance on data from the two-year bioassay in rats and mice.
The successful development of a strategy such as that proposed here would enable a more mechanistic, science-based approach to the identification of cancer hazard of chemicals. It would provide a systematic means of implementing the insights provided by consideration of MOA and human relevance. Ultimately, the decisions made would be more reliable yet less resource-consuming.
About HESI
The Health and Environmental Sciences Institute (HESI) is a global branch of the International Life Sciences Institute (ILSI), a public, nonprofit scientific foundation with branches throughout the world. HESI provides an international forum to advance the understanding and application of scientific issues related to human health, toxicology, risk assessment and the environment. HESI is widely recognized among scientists from government, industry and academia as an objective, science-based organization within which important issues of mutual concern can be discussed and resolved in the interest of improving public health. As part of its public benefit mandate, HESI’s activities are carried out in the public domain, generating data and other information for broad scientific use and application. Further information about HESI can be found at http://www.hesiglobal.org.
Acknowledgments
The authors extend their sincere appreciation to the following scientists: Dr. John R. Bucher for providing guidance for searching and extracting data from the National Toxicology Program (NTP) database; Dr. Vijay Reddy for analysis of structural alerts for mutagenic potential and Dr. John Ashby for confirming the results; Drs. Julian Preston, James Klaunig, Mark Cartwright, and Michael Holsapple for peer-reviewing the results of NTP database queries; and Dr. Jay Goodman for his leadership and guidance during the Health and Environmental Sciences Institute (HESI) peer review of this article prior to journal submission.
Conflict of Interests: The authors have not declared any conflict of interests.
This article does not necessarily reflect the opinions or policy of the U.S. Environmental Protection Agency; nor does mention of trade names constitute endorsement. James S. MacDonald’s current affiliation is Chrysalis Pharma Consulting, LLC.
