Abstract
The National Toxicology Program (NTP) has historically used Fischer 344/N (F344/N) rats for the majority of its bioassays. Recently, the NTP began using the Harlan Sprague Dawley (SD) rat as the primary rat model for NTP studies. The NTP had previously used female SD rats in nine bioassays. This article compares historical control (HC) tumor incidence rates from these nine SD rat studies with HC tumor rates from matched NTP F344/N rat bioassays to identify similarities and differences. Matching on sex, laboratory, diet, and route led to nine comparable F344/N rat studies. Our analyses revealed statistically significant strain differences, with female SD rats having lower incidence rates for clitoral gland adenoma (0.2% vs. 5.8%) and mononuclear cell leukemia (0.9% vs. 16.7%) and higher incidence rates for mammary gland fibroadenoma (67.4% vs. 48.4%), mammary gland carcinoma (10.2% vs. 2.4%), and thyroid gland C cell adenoma (25.4% vs. 13.6%) relative to female F344/N rats. These represent five of the seven most common tumor types among female SD and F344/N rats in the NTP HC database. When vehicle was included as an additional matching criterion, the number of comparable F344/N rat studies dropped to four, but similar results were obtained.
Introduction
The inbred F344/N rat substrain has been used for NTP rodent toxicity and carcinogenicity bioassays for more than thirty years. Over time, this specific colony of rats developed certain undesirable traits such as decreased fecundity, sporadic seizures, and idiopathic chylothorax. Moreover, the spontaneous incidences of mononuclear cell leukemia and testicular interstitial cell tumors increased over time. According to the May 2009 NTP HC report (http://ntp.niehs.nih.gov; all routes/all vehicles), the average incidence rate for leukemia in males was 38.3% (536/1398; range 8–58%) and in females was 21.3% (288/1350; range 8–40%), and the average incidence rate for testicular interstitial cell tumors was 82.9% (1159/1398; range 58–98%). Such high background tumor incidence rates and wide ranges in control animals may decrease the ability to detect subtle treatment-related effects or may discount statistically significant increases in test animals relative to concurrent controls when the tumor rate falls within its historical control range.
Consequently, the NTP decided to use the Harlan Sprague Dawley (Hsd:Sprague Dawley SD) rat for most of its future bioassays (King-Herbert, Sills, and Bucher 2009; King-Herbert and Thayer 2006). Previously in NTP studies, different rat models were used for different study types. Nine NTP bioassays had been performed using female Harlan SD rats to evaluate the chronic toxicity and carcinogenicity of dioxins and dioxin-like compounds. Historical data on spontaneous tumor rates in vehicle controls were available from those studies (http://ntp.niehs.nih.gov, accessed 12/2009). The SD rat is commonly used in NTP reproductive and developmental studies, so its use as the “default” rat model would allow all types of NTP studies, including toxicity, carcinogenicity, metabolism, and kinetic studies, to be performed within the same strain.
The goal of this project was to compare the HC tumor incidence rates between F344/N and SD rats used in NTP bioassays, after accounting for potential sources of variability beyond strain. After matching on sex, laboratory, diet, route, and in some cases vehicle, formal statistical analysis was used to compare these two strains with adjustments for body weight, survival, chronologic time, study-to-study variability, and multiple testing.
Materials and Methods
Studies
The data for our analyses were obtained from the control groups of 18 NTP chronic rodent carcinogenicity bioassays: nine SD rat bioassays and nine F344/N rat bioassays. We used all of the conventional NTP studies conducted in SD rats. Only females were used in seven of these nine SD rat studies; therefore, we restricted our analyses to female rats. Female Harlan SD (Hsd:Sprague Dawley SD) rats were acquired from Harlan Laboratories, Inc. (Indianapolis, IN, USA). All nine SD rat studies were performed at Battelle Columbus Laboratory (Columbus, OH, USA), under contract from the NTP. In each study, the SD rats were fed the NTP-2000 diet, the route of exposure was gavage, and the vehicle was corn oil.
For comparison, we selected all F344/N rat studies that matched the SD rat studies with respect to sex (female), laboratory (Battelle Columbus), diet (NTP-2000), and route of exposure (gavage). This matching led to nine F344/N rat studies, in which corn oil was the vehicle in four and water was the vehicle in the other five. These female F344/N rats were obtained from Taconic Farms, Inc. (Germantown, NY, USA).
Fifty to fifty-three female rats were used in the control group of each core study (Table 1), which led to 473 SD rats and 450 F344/N rats, for a total of 923 rats. These rats were sacrificed when moribund or after two years on study. The NTP technical report numbers for these eighteen studies are listed in Table 1, along with other summary information, including strain and vehicle. For access to the full technical reports and all the data, see the NTP Web site (http://ntp.niehs.nih.gov).
Table 1. Summary information for control female rats from 18 long-term carcinogenicity bioassays conducted by the U.S. National Toxicology Program (NTP), matched on laboratory (Battelle Columbus), diet (NTP-2000), and route (gavage).
Animal handling and husbandry were conducted in accordance with National Institutes of Health (NIH) and Institutional Animal Care and Use Committee (IACUC) policies and guidelines (http://www.iacuc.org/index.html). Animal studies were conducted in accordance with NTP two-year study protocol (http://ntp.niehs.nih.gov/go/9989) and NTP specifications (http://ntp.niehs.nih.gov/files/Specifications_2006Oct1.pdf).
Data
Our primary focus was on strain differences in tumor incidence. We retrieved response information on many tumor classifications, but we excluded any with fewer than three occurrences in the entire NTP HC database when combined across strains, as those tumors were considered too rare. Although there were at least three tumors of each type in the HC database, in some cases there were only two, one, or even no tumors in our subset of eighteen studies. After further excluding metastases and combinations of tumors, such as “carcinomas or adenomas” or “all malignant tumors,” a total of eighty-two tumor types remained (Table 2).
Table 2. Tumor rates (percentages) and p values for strain differences between control female Fischer 344/N and Hsd:Sprague Dawley SD rats in eighteen NTP studies matched on laboratory (Battelle Columbus), diet (NTP-2000), and route (gavage).
a This list includes the eighty-two tumor types with at least three occurrences in the entire NTP HC database when combined across both strains, even though some totals are less than three for the eighteen studies analyzed here.
b An asterisk (*) indicates the p value for a strain difference is statistically significant at the .05 level, in the absence of a correction for multiple testing. An additional superscript B indicates the p value for a strain difference remains statistically significant after applying a Bonferroni correction for multiple testing, to adjust for having performed eighty-two tests.
c Malignant lymphoma includes histiocytic, lymphocytic, mixed, not otherwise specified, and undifferentiated cell types.
In addition to tumor incidence data, we retrieved information on chronologic time, survival, and body weight. Chronologic time was summarized by the year in which each study began, and study-specific summaries of survival and body weight were calculated from individual animal data (Table 1). For each rat, time on study was recorded in days and body weight was measured in grams at various times during the experiment. Study-specific survival was summarized in terms of both the average life span and the proportion of rats surviving to the end of the two-year bioassay. Study-specific body weight was summarized by the average body weight at one year on study. Only those body weights recorded between days 355 and 375 were averaged; if a rat was weighed multiple times during that period, the weight closest to day 365 was used; and any rat not weighed during that period, including those dying before one year, did not contribute to the average.
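The one-year body weight rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the original analysis: the function names, the data layout (a list of (day, grams) pairs per rat), and the example weights are all hypothetical, and the handling of ties (two weighings equally close to day 365) is an assumption, since the text does not specify it.

```python
from statistics import mean

def one_year_weight(weighings, lo=355, hi=375, target=365):
    """Return the weight (g) recorded closest to day `target` within [lo, hi].

    `weighings` is a list of (day_on_study, grams) pairs for one rat.
    Returns None if the rat was not weighed in the window (for example,
    because it died before one year). Ties go to the earlier listed
    weighing; that tie rule is an assumption.
    """
    in_window = [(day, grams) for day, grams in weighings if lo <= day <= hi]
    if not in_window:
        return None
    _, grams = min(in_window, key=lambda rec: abs(rec[0] - target))
    return grams

def study_mean_weight(rats):
    """Average the one-year weights over the rats that have one."""
    weights = [w for w in (one_year_weight(r) for r in rats) if w is not None]
    return mean(weights) if weights else None

# Hypothetical control group of three rats.
rats = [
    [(346, 270.0), (360, 272.0), (374, 276.0)],  # day 360 is closest to 365
    [(362, 300.0)],                              # single weighing in window
    [(200, 250.0), (340, 260.0)],                # no weighing in window: excluded
]
print(study_mean_weight(rats))
```

Rats without a weighing in the window are dropped from the denominator rather than counted as zero, which matches the statement that such rats "did not contribute to the average."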
Statistical Methods
We used a linear mixed model to analyze tumor incidence as a function of tumor type, rat strain, body weight, and chronologic time, while accounting for study-to-study variability. Rather than focusing directly on the empirical tumor rate, which is the number of tumor-bearing rats divided by the total number of rats, we incorporated the poly-3 survival adjustment of Bailer and Portier (1988). This survival adjustment, which accounts for the fact that not all rats had equal life spans, is important because rats that died early were at less risk of developing tumors than rats that died late. In addition, we applied an arcsine-root transformation to the survival-adjusted tumor rate to stabilize the variance of the response variable, which is important because our analysis assumes homogeneous variances. This linear mixed model analysis was performed using Proc Mixed in the SAS software package, version 9.00 (SAS Institute Inc., 2002, Cary, NC, USA).
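To make the response variable concrete, the following Python sketch computes a poly-3 survival-adjusted tumor rate and its arcsine-root transform for one control group. It follows the usual form of the poly-3 adjustment, in which a tumor-bearing rat counts as one animal at risk and a tumor-free rat that left the study on day t counts as (t/T)^3 of an animal, where T is the study length. The study length of 730 days, the function names, and the example animals are assumptions made for illustration; the original analysis was run in SAS, not Python.

```python
import math

def poly3_adjusted_rate(animals, study_days=730, k=3):
    """Poly-k survival-adjusted tumor rate (k = 3 gives poly-3).

    `animals` is an iterable of (days_on_study, has_tumor) pairs.
    `study_days` = 730 is an assumed length for a two-year study.
    """
    tumors = 0
    at_risk = 0.0
    for days, has_tumor in animals:
        if has_tumor:
            tumors += 1
            at_risk += 1.0  # tumor-bearing rats get full weight
        else:
            # tumor-free rats are down-weighted by fractional time on study
            at_risk += min(days / study_days, 1.0) ** k
    return tumors / at_risk if at_risk > 0 else 0.0

def arcsine_root(p):
    """Variance-stabilizing transform applied to a rate in [0, 1]."""
    return math.asin(math.sqrt(p))

# Hypothetical group: two survivors (one with a tumor) and two early deaths.
animals = [(730, False), (730, True), (365, False), (500, True)]
rate = poly3_adjusted_rate(animals)
print(rate, arcsine_root(rate))
```

In this toy group the unadjusted rate is 2/4 = 0.50, whereas the adjusted rate is higher because the tumor-free rat that died at one year contributes only (1/2)^3 = 0.125 to the denominator.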
Our analysis adjusted for the various explanatory factors as follows. We included a separate indicator variable for each of the eighty-two tumor types, each of the two rat strains, and each of the eighteen studies. This approach is flexible and avoids assumptions about the exact form of the relationship between tumor response and the explanatory factors. Because some increases in tumor incidence rates are associated with increased body weight (Haseman et al. 2003), we wanted to adjust our strain comparisons for differences in body weight. Thus, for each control group, we incorporated a quantitative variable equal to the average body weight (in grams) after one year on study. To adjust for chronologic time, we also included a quantitative variable equal to the year the study started, which ranged from 1995 to 2004. We modeled the mean of the response variable as a linear function of these explanatory variables, plus strain-by-tumor interaction terms to allow different strain effects for different tumor types. Our analysis treated study, nested within strain, as a random effect to account for study-to-study variability. We treated all other factors as fixed effects.
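As a reading aid, the model described above can be written schematically as follows. This is only a sketch: the symbols are introduced here for illustration and do not appear in the original analysis, and the normal distributions shown are the conventional assumptions for a linear mixed model rather than details stated in the text.

```latex
y_{ijk} = \mu + \alpha_i + \tau_k + (\alpha\tau)_{ik}
          + \beta_w\, w_{ij} + \beta_c\, c_{ij}
          + s_{j(i)} + \varepsilon_{ijk},
\qquad
s_{j(i)} \sim N(0, \sigma_s^2), \quad
\varepsilon_{ijk} \sim N(0, \sigma^2)
```

Here y_{ijk} denotes the arcsine-root transformed, survival-adjusted rate of tumor type k in study j of strain i; \alpha_i and \tau_k are the fixed strain and tumor-type effects; (\alpha\tau)_{ik} is the strain-by-tumor interaction; w_{ij} and c_{ij} are the study's average one-year body weight and starting year; and s_{j(i)} is the random effect of study nested within strain.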
Our analysis evaluated the statistical significance of each factor with an F test from an analysis of variance (ANOVA). We also performed a two-sided t-test based on least squares means to compare the strains for each tumor type. Each t-test assessed the null hypothesis that the two strains had equal mean responses, after adjusting for possible effects of study, chronologic time, and body weight. In a broad search for any tumor types with incidence rates that appeared to differ between female F344/N and SD rats, we examined the individual p values for all eighty-two tumor types and considered any p value below .05 as possibly being statistically significant. However, we also conducted a more conservative analysis, which incorporated a Bonferroni correction for multiple testing to account for having performed eighty-two tests. This stricter test declared a strain difference to be significant if the observed p value was below .05/82 (or approximately .0006) rather than below the nominal .05 level, thus controlling the familywise error rate at .05.
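The two-tier significance rule can be illustrated with a short Python sketch. The function name and the labels are hypothetical, chosen to mirror the asterisk and "B" markers used in the tables. The first two p values are the ones quoted later in the Results for malignant lymphoma and uterine carcinoma; the third is invented purely to show a value that survives the Bonferroni cutoff.

```python
def flag_p_values(p_values, n_tests, alpha=0.05):
    """Label each p value as '', '*', or '*B'.

    ''   : not below alpha
    '*'  : below alpha, without a multiple-testing correction
    '*B' : also below the Bonferroni cutoff alpha / n_tests
    """
    cutoff = alpha / n_tests
    flags = {}
    for name, p in p_values.items():
        if p < cutoff:
            flags[name] = "*B"
        elif p < alpha:
            flags[name] = "*"
        else:
            flags[name] = ""
    return cutoff, flags

example = {
    "malignant lymphoma": 0.139,        # quoted in the text
    "uterine carcinoma": 0.037,         # quoted in the text
    "hypothetical tumor type": 0.0001,  # invented for illustration
}
cutoff, flags = flag_p_values(example, n_tests=82)
print(f"Bonferroni cutoff: {cutoff:.4f}")
for name, flag in flags.items():
    print(f"{name}: {flag or 'not significant'}")
```

Note that the number of tests is passed explicitly (82) rather than inferred from the number of p values supplied, because the correction must reflect every test performed, not just the ones being displayed.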
We performed two sets of analyses. One analysis compared the nine SD rat studies with the nine F344/N rat studies that matched on all criteria except vehicle, and the other analysis compared the nine SD rat studies with the four F344/N rat studies that matched on all criteria, including vehicle (corn oil).
Results
Body Weight
The two strains of female rats differed with respect to body weight, with average study-specific weights at one year ranging from 255 to 293 g for F344/N rats and from 317 to 345 g for SD rats (Table 1). Among studies matched on sex, lab, diet, and route, the overall average body weight at one year was 276 g for F344/N rats and 331 g for SD rats (Table 3). When further matched on corn oil vehicle, the average one-year body weights did not change appreciably.
Table 3. Comparison of Fischer 344/N and Hsd:Sprague Dawley (SD) rats with respect to mean body weight, survival percentage, and life span in eighteen NTP studies (nine per strain), matched on laboratory (Battelle Columbus), diet (NTP-2000), and route (gavage).
Survival
The two strains also exhibited different mortality patterns, with female F344/N rats tending to live longer than female SD rats. Among the eighteen NTP bioassays matched on sex, lab, diet, and route, a greater proportion of F344/N rats (range 60–76%) per control group survived to the end of the two-year study than did their SD counterparts (range 28–51%; Table 1), with respective overall means of 67% and 43% (Table 3). Similarly, the average time on study ranged from 668 to 710 days for the F344/N rats, compared with 567 to 656 days for the SD rats (Table 1), with respective overall means of 688 and 630 days (Table 3). Among the subset of four F344/N rat studies with the same vehicle (corn oil) as the nine SD rat studies, the proportion surviving two years was 62–68% and the average life span was 668 to 699 days. The observed variation in survival, both within and between strains, illustrates the need for a survival adjustment in the statistical analysis of the tumor rates.
Tumor Incidence
Several tumor types were common in both strains of female rats (Table 2). Among all eighteen studies, there were four tumor types for which both strains had incidence rates above 10%: mammary gland fibroadenoma (48% in F344/N and 67% in SD), pituitary gland pars distalis adenoma (45% in F344/N and 39% in SD), thyroid gland C cell adenoma (14% in F344/N and 25% in SD), and uterine stromal polyp (14% in F344/N and 14% in SD). There were also two tumor types with incidence rates of 10% or higher in one strain but not the other: mammary gland carcinoma (2% in F344/N and 10% in SD) and mononuclear cell leukemia (17% in F344/N and 1% in SD). The incidence rates for the remaining seventy-six tumor types were below 10% in both strains. In fact, the rates were only 1% or less in both strains for seventy of the eighty-two tumor types considered. Each tabulated percentage is the observed number of tumor-bearing rats divided by the total number of rats examined for that tumor type.
Explanatory Factors
The only factors that had a statistically significant impact on tumor incidence were tumor type and the interaction between strain and tumor type (each with p < .001). All other factors had p values above .15 and thus were not considered statistically significant.
Strain Differences
As a first step toward identifying which tumor types exhibited different incidence rates between female F344/N and SD rats, we calculated a p value for each of the eighty-two tumor types, based on all eighteen NTP bioassays that matched on sex, lab, diet, and route (Table 2). Initially, we viewed any p value below .05 as statistically significant and worthy of further attention. This criterion flagged fourteen tumor types as possibly having different strain-specific incidence rates (see Table 2 and the first column of Table 4). The p values below .05 are indicated with an asterisk.
Table 4. Strain and vehicle comparisons with respect to the incidence of select tumor types among control female Fischer 344/N and Hsd:Sprague Dawley (SD) rats in NTP studies matched on laboratory (Battelle Columbus), diet (NTP-2000), and route (gavage).
a An asterisk (*) indicates the p value is statistically significant at the 0.05 level, in the absence of a correction for multiple testing. An additional superscript B indicates the p value remains statistically significant after applying a Bonferroni correction for multiple testing, to adjust for having performed eighty-two tests.
b The first column of p values corresponds to strain comparisons (Fischer 344/N vs Hsd:Sprague Dawley SD), regardless of vehicle (corn oil or water), among all eighteen studies. These p values were obtained from Table 2.
c The second column of p values corresponds to strain comparisons among the subset of thirteen studies having the same vehicle (corn oil).
d The third column of p values corresponds to vehicle comparisons (corn oil vs water) among the nine Fischer 344/N rat studies.
Several tumor types had identical strain-specific incidence rates but different p values for the strain comparison. For example, the incidence of malignant lymphoma was 1/450 in F344/N rats and 6/473 in SD rats, as was the incidence of uterine carcinoma, but the corresponding p values were .139 and .037, respectively. This phenomenon can occur for several reasons. One reason is that the tumor-bearing rats could all come from the same study or from multiple studies; between-study variability would make the overall variability larger in the former case, and thus the p value would be less significant. Another reason is that our analysis adjusted for factors such as body weight and chronologic time, which varied across studies; the same tumor rates could arise from animals in different studies, and thus the adjustments for these factors could vary across tumor types and give different p values.
When we also matched on vehicle and compared the nine SD rat studies with the four F344/N rat studies that used corn oil as the vehicle, there were ten tumor types with a p value below .05 (Table 4, second column). Of these ten tumor types, nine were on the previous list based on all eighteen studies and one was not. Those on both lists include: clitoral gland adenoma, clitoral gland carcinoma, lung alveolar/bronchiolar adenoma, mammary gland carcinoma, mammary gland fibroadenoma, mononuclear cell leukemia, malignant mesothelioma, thyroid gland C cell adenoma, and thyroid gland C cell carcinoma. The only strain difference that was newly identified after matching on vehicle was for pituitary gland pars distalis adenoma. There were five tumor types, however, with p values that rose above .05 (and thus provided less evidence of a strain difference) after matching on vehicle: adrenal medulla pheochromocytoma benign, pituitary gland pars intermedia adenoma, skin basal cell adenoma, thyroid gland follicular cell carcinoma, and uterine carcinoma.
At the next stage, to guard against false positives, we incorporated a Bonferroni correction for multiple testing to account for having performed eighty-two tests. This correction produced a much stricter test, which declared a strain difference to be significant only if the observed p value was below .05/82 (or approximately .0006) rather than below the nominal .05 level. The corrected test identified strain differences in incidence for seven tumor types when using all eighteen studies (which matched on sex, lab, diet, and route) and six tumor types based on the subset of thirteen studies that also matched on vehicle. The p values below the Bonferroni cutoff are indicated with a “B” in Tables 2 and 4. Five tumor types were significant whether matching on vehicle or not: clitoral gland adenoma, mammary gland carcinoma, mammary gland fibroadenoma, mononuclear cell leukemia, and thyroid gland C cell adenoma. Two tumor types lost their significance after matching on vehicle: adrenal medulla pheochromocytoma benign and thyroid gland C cell carcinoma. One tumor type became significant only after matching on vehicle: lung alveolar/bronchiolar adenoma (Table 4).
Vehicle Effects in F344/N Rats
For most tumor types, the statistical significance of the strain difference in incidence did not depend on whether or not we matched on vehicle. That is, whether the nine corn oil SD rat studies were compared with only the four corn oil F344/N rat studies or with both the four corn oil F344/N rat studies and the five water F344/N rat studies, the results were usually similar. In a few cases, however, the vehicle used in the F344/N rat studies appeared to have an effect, and we decided to investigate further. Thus, we focused only on F344/N rats and, for each tumor type, we used a two-sided t-test to assess the equality of responses in the five water and four corn oil F344/N rat studies (Table 4, third column). Of the fifteen tumor types flagged by one of our previous analyses, four had an unadjusted p value for this vehicle comparison that was below .05, though only one was still significant after adjusting for multiple testing. The one tumor type that appeared to depend on which vehicle was used in female F344/N rats was pituitary gland pars distalis adenoma. The three other tumor types that were suggestive of a vehicle effect were: adrenal medulla pheochromocytoma benign, lung alveolar/bronchiolar adenoma, and mononuclear cell leukemia.
Discussion
Since the NTP was established in 1978, the F344/N rat has been the default strain of rat for the majority of its carcinogenicity bioassays. During the past thirty years, the NTP has maintained a historical control database with strain- and sex-specific spontaneous tumor incidence data from control animals. This database is updated annually to consist of data from all control animals from two-year bioassays within the most recent five-year window. Although the concurrent control tumor incidence data are always the most appropriate to use when evaluating the significance of tumor rates among treated groups, there are certain situations in which the use of historical control data can be helpful, such as the interpretation of rare tumors and marginally increased tumor incidences (Deschl et al. 2002; Greim et al. 2003; Keenan, Elmore, Francke-Carroll, Kemp, et al. 2009).
When the NTP switched from the F344/N rat to the SD rat for the majority of its carcinogenicity bioassays, there was a question of how the control tumor incidence data might differ between these two strains. The NTP had previously used female SD rats for the evaluation of nine bioassays that involved dioxins and dioxin-like compounds. These were primarily mechanistic studies and, for this group of chemicals, the SD rat was chosen for three reasons: (1) most of the effects of dioxins and dioxin-like compounds are mediated through binding to the aryl hydrocarbon receptor via CYP1A1 gene expression, and Sprague Dawley rats are sensitive to the effects of these toxins; (2) results could be compared with previously published data on dioxins and dioxin-like compounds that had been obtained in SD rats; and (3) a strain with a lower incidence rate of mononuclear cell leukemia would simplify the interpretation of complex hepatic lesions (Brix et al. 2005; Hailey et al. 2005; Walker et al. 2005). Control tumor incidence data were collected from these nine SD rat studies and were therefore available within the NTP HC database for comparison to matched F344/N rat data.
Challenges in Comparing Tumor Rates
When comparing spontaneous tumor rates, especially from different sources, it is crucial to match on as many experimental conditions as possible and to adjust for suspected risk factors to avoid bias (Haseman 1995). In addition, rather than simply controlling the usual Type I error rate (or false-positive rate), we want to control the familywise error rate (FWER) when testing multiple hypotheses (Hochberg and Tamhane 1987). The FWER is the probability of rejecting at least one true null hypothesis among all hypotheses tested. For example, suppose we test twenty-five independent hypotheses, each at a Type I error rate of 0.05, and that all twenty-five null hypotheses are true. The probability of rejecting at least one of them is 1 − (0.95)^25, which exceeds 0.72. Thus, there is roughly a 72% chance of falsely rejecting at least one true null hypothesis by chance alone. Ideally, we would like to limit the FWER to a small probability, such as 0.05. To accomplish this, we use the Bonferroni method, which controls the FWER at the desired level, even if the tests (or tumors) are not independent.
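The arithmetic behind the twenty-five-test example, and the effect of the Bonferroni correction in the eighty-two-test setting of this study, can be checked with a short Python sketch. The function name is hypothetical, and the calculation assumes independent tests with every null hypothesis true; the Bonferroni bound itself does not need independence, so the independent case serves only as an illustration.

```python
def fwer_independent(n_tests, per_test_alpha=0.05):
    """Familywise error rate for n independent tests of true null
    hypotheses, each performed at level `per_test_alpha`."""
    return 1.0 - (1.0 - per_test_alpha) ** n_tests

# Twenty-five uncorrected tests at the .05 level (the example in the text).
print(f"25 tests, uncorrected: {fwer_independent(25):.3f}")

# Eighty-two tests, as in this study, without and with Bonferroni.
print(f"82 tests, uncorrected: {fwer_independent(82):.3f}")
print(f"82 tests, Bonferroni:  {fwer_independent(82, 0.05 / 82):.3f}")
```

With the per-test level lowered to .05/82, the familywise error rate under independence falls just below .05, which is the behavior the correction is designed to guarantee.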
In addressing these many challenges, our comparison of F344/N and SD rat tumor incidence data controlled for potential sources of variation by matching on sex, lab, diet, route, and in some cases vehicle, and by adjusting for body weight, survival, chronologic time, and study-to-study variability. Specifically, our statistical analysis accounted for survival differences, which is important because rats that died early were at less risk of developing tumors than rats that died late. Also, we made a conservative (Bonferroni) correction for multiple testing by performing each of the eighty-two comparisons at the .05/82 significance level to keep the familywise error rate at or below 0.05.
Strain Differences
Eight tumors had incidence rates greater than 5% in either the F344/N or SD rats: adrenal medulla benign pheochromocytoma, clitoral gland adenoma, mammary gland carcinoma, mammary gland fibroadenoma, mononuclear cell leukemia, pituitary gland pars distalis adenoma, thyroid gland C cell adenoma, and uterus polyp (Table 2). Of these eight most common tumors, five had statistically significant strain differences in incidence rates after matching on all criteria (sex, lab, diet, route, vehicle) and adjusting for multiple testing: clitoral gland adenoma, mammary gland carcinoma, mammary gland fibroadenoma, mononuclear cell leukemia, and thyroid gland C cell adenoma (Table 5). After relaxing the stringency of our comparison by not matching on vehicle, the difference in incidence rates for adrenal medulla benign pheochromocytoma became significant (Table 5). Alternatively, if we matched on all criteria but did not adjust for multiple testing, the difference in incidence rates for pituitary gland pars distalis adenoma became significant (Table 5). Uterine polyps were common in both F344/N (14.4%) and SD (13.7%) rats, but the strain differences were not statistically significant, even without adjusting for multiple testing, whether we matched on all criteria or all except vehicle (Table 2).
Table 5. Summary of neoplasms with statistically significant differences in incidence rates between the two strains of control female rats (Fischer 344/N vs. Hsd:Sprague Dawley) by at least one criterion in our analysis of eighteen NTP studies.
a Rates obtained from Table 2 and based on nine studies per strain, matched on laboratory (Battelle Columbus), diet (NTP-2000), and route (gavage).
b Some analyses further matched on vehicle (corn oil), and some did not (corn oil or water).
c The Bonferroni correction adjusts for multiple testing because a separate test was performed for each of the eighty-two tumor types.
Of the seven common tumors listed above with statistically significant strain differences, three had lower tumor incidence rates in SD rats compared to F344/N rats: clitoral gland adenoma (0.22% vs. 5.82%), mononuclear cell leukemia (0.85% vs. 16.67%), and pituitary gland pars distalis adenoma (38.85% vs. 44.67%). Conversely, four had higher tumor incidence rates in SD rats compared to F344/N rats: mammary gland carcinoma (10.15% vs. 2.44%), mammary gland fibroadenoma (67.44% vs. 48.44%), thyroid gland C cell adenoma (25.43% vs. 13.56%), and adrenal medulla benign pheochromocytoma (7.23% vs. 2.89%). The high incidence rates of mammary gland fibroadenoma and thyroid gland C cell adenoma in the SD rat may make interpretation of subtle treatment-related effects difficult at these sites.
Several of the less common tumors also suggested strain differences. After adjusting for multiple testing, strain differences for lung alveolar/bronchiolar adenoma were significant when matching on all criteria, despite not being significant when matching on all criteria except vehicle (Table 5). Without a correction for multiple testing, these differences were significant whether matching on sex, lab, diet, and route only, or also including vehicle. In contrast, the strain differences for thyroid gland C cell carcinoma were significant with or without the Bonferroni correction when matching on all criteria except vehicle, but only without the multiple testing adjustment when including vehicle (Table 5).
None of the other tumors showed significant evidence of strain differences after adjusting for multiple testing, though some suggested strain differences if the Bonferroni correction was not performed. In this latter category, clitoral gland carcinoma and malignant mesothelioma suggested strain differences, whether matching on all criteria or all criteria except vehicle, whereas pituitary gland pars intermedia adenoma, skin basal cell adenoma, thyroid gland follicular cell carcinoma, and uterus carcinoma suggested a strain difference only when the matching criteria did not include vehicle (Table 5). It is important to note that some rare tumor types could have real strain differences that our analysis did not find significant because the low number of tumors caused a decrease in the power of the analysis.
Biological Relevance of Strain Differences in Survival, Body Weight, and Tumors of the Pituitary and Mammary Glands
In addition to the difference in tumor rates noted above, there were significant differences in survival and body weight between the F344/N and SD rat strains used in these studies (Tables 1 and 3). The SD rats had shorter life spans (629.9 ± 26.5 days) compared to the F344/N rats (688.4 ± 14.1 days). Pituitary gland tumors are reported to be the major neoplastic cause of death in both male and female SD rats, and mammary gland tumors are the second most common cause of death in female SD rats (Keenan et al. 1992; Keenan et al. 1994; Keenan, Soper, Smith, et al. 1995). In our comparison, the incidences of mammary gland fibroadenomas and carcinomas were significantly higher in the SD rats (67.44% and 10.15%, respectively) than in the F344/N rats (48.44% and 2.44%), which most likely accounted for this difference in life span. For the pituitary gland pars distalis adenoma, however, the incidence was higher in F344/N rats (44.67%) than in SD rats (38.85%), but this difference was not statistically significant.
Pituitary gland adenomas and mammary gland neoplasms are recognized as the most common spontaneous neoplasms in female SD rats (Chandra, Riley, and Johnson 1992; Ettlin, Stirnimann, and Prentice 1994; McMartin et al. 1992; Son 2004). Pituitary gland adenomas produce multiple hormones, including prolactin, which is implicated in the development of mammary gland neoplasms (Attia 1985; McComb et al. 1984). Previous studies have shown that, in F344 and SD strains, pituitary gland adenomas immunoreactive for prolactin are the most common type (McComb et al. 1984; Sandusky et al. 1988). It has also been demonstrated that inhibition of prolactin secretion can decrease the development of mammary gland adenomas (Welsch et al. 1981). However, in our comparison, the higher incidence of mammary gland fibroadenomas in SD rats (67.44%) was not accompanied by a correspondingly higher incidence of pituitary gland pars distalis adenomas in SD rats (38.85%).
There is also a positive correlation between body weight and the incidence and time of onset of tumors of hormonal tissues such as mammary and anterior pituitary glands in rats and mice (Nold et al. 2001; Thurman et al. 1994). Development of these tumors is mediated through increased levels of estrogens, and adipose tissue is considered to be one of the major sources of extraglandular estrogen (Rao 1996). Alternatively, weight reduction decreases the estrogen levels via a decrease in body fat, thus decreasing the incidence of these tumors (Rao 1996). In our comparison, the SD rats were heavier than the F344/N rats and had a higher incidence of mammary gland fibroadenomas. The lack of a similar increase in the incidence of pituitary gland adenomas suggests that, in this group of studies, the increased incidence in mammary gland tumors may have been owing to the estrogenic effects of higher body weight rather than the effects of prolactin-secreting pituitary tumors. The cause for the difference in body weight between these two strains of rats is unknown, but there is a general trend for Sprague Dawley rats to be heavier than F344 rats, and the degree of weight difference can depend on a number of factors including, but not limited to, diet, exercise, and genetics. Moderate diet restriction has been shown to reduce the incidence of these two tumors in SD rats and, since these are the main neoplastic causes of death, also increase lifespan (Keenan et al. 1992; Keenan et al. 1994; Keenan, Soper, Herzog, et al. 1995; Keenan, Soper, Smith, et al. 1995; Keenan et al. 1997).
Other causes of decreased lifespan in the SD rats were considered. Mononuclear cell leukemia, one of the causes of early death in the F344/N rat, was not a contributory cause of mortality in the SD rats since this is a tumor type that is common only in the F344 rat strain (F344/N 16.67%; SD 0.85%; Caldwell 1999). Other tumors that occurred with relatively high frequency were benign pheochromocytoma (F344/N 2.89%; SD 7.23%), clitoral gland adenoma (F344/N 5.82%; SD 0.22%), thyroid C cell adenoma (F344/N 13.56%; SD 25.43%), and uterus stromal polyp (F344/N 14.44%; SD 13.74%). Of these, the pheochromocytomas and thyroid C cell adenomas were significantly increased in the SD rats but were not likely contributors to the difference in early mortality between the two strains. Certain nonneoplastic lesions, such as chronic progressive nephropathy, may also contribute to early death in rats (Ettlin, Stirnimann, and Prentice 1994). A review of the data did not reveal any significant difference in incidence or severity of this condition between the two strains (data not shown).
Corn Oil Vehicle Effect on Pituitary Gland Pars Distalis Adenoma
In the absence of a correction for multiple testing, the difference in incidence rates for pituitary gland pars distalis adenoma between female F344/N (201/450, 45%) and SD (183/471, 39%) rats was significant after matching on all criteria, including vehicle, but not if vehicle was dropped as a matching criterion (Table 5), suggesting a possible vehicle effect in female F344/N rats. Among female F344/N rats, the incidence rates by vehicle were 106/200 (53%) for corn oil and 95/250 (38%) for water, which are significantly different, even after applying a Bonferroni correction for multiple testing (Table 4, last column).
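The within-strain vehicle comparison above can be checked with a short calculation. The following is a minimal sketch only: it uses a pooled two-proportion z-test, which is a stand-in and not necessarily the test used for Table 4, and the number of comparisons used for the Bonferroni threshold (N_COMPARISONS) is a hypothetical placeholder, not the count applied in this article. Only the tumor counts are taken from the text.

```python
from math import erfc, sqrt


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for a difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail area
    return p1, p2, z, p_value


# Counts from the text: pituitary gland pars distalis adenoma,
# female F344/N rats, by gavage vehicle (tumor-bearing, examined).
CORN_OIL = (106, 200)
WATER = (95, 250)

# Hypothetical placeholder: the size of the comparison family is an
# assumption made for illustration, not a value from the article.
N_COMPARISONS = 20
ALPHA = 0.05

rate_corn, rate_water, z, p = two_proportion_z_test(*CORN_OIL, *WATER)
bonferroni_threshold = ALPHA / N_COMPARISONS

print(f"corn oil: {rate_corn:.1%}, water: {rate_water:.1%}")
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
print(f"below Bonferroni threshold {bonferroni_threshold:.4f}: "
      f"{p < bonferroni_threshold}")
```

Because the Bonferroni threshold is the significance level divided by the number of comparisons, the conclusion depends on the size of the comparison family; the placeholder should be replaced with the actual number of tumor types tested.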
Neoplasms of the pars distalis are one of the most common neoplasms in the laboratory rat (MacKenzie and Boorman 1990). The cause of spontaneous neoplasms of the pars distalis is unknown, but hormonal imbalances related to aging or stress may be a factor (Greim et al. 2003; Attia 1985). Estrogen has a trophic effect on the development of spontaneous neoplasms of the pars distalis, and the F344 rat is known to be more sensitive to estrogen-induced hyperplasia of the pars distalis than the Sprague Dawley rat (Fujimoto et al. 1987; MacKenzie and Boorman 1990). The incidence of pituitary neoplasms in F344/N rats is also reported to be directly correlated with body weight (Gries and Young 1982; Haseman et al. 1997). Dietary restriction results in significant reductions in body weight and is also reported to reduce the incidence of spontaneous pituitary neoplasms in laboratory rats (Keenan et al. 1994).
Haseman et al. (1985) performed a retrospective study of NTP carcinogenicity bioassays to determine if F344/N rats receiving corn oil by gavage showed tumor incidences that differed from those of untreated control animals. The reported effects of corn oil gavage on rats were increases in body weight, survival, and pancreatic acinar cell tumors and decreases in mononuclear cell leukemia, all occurring in male but not female F344 rats. Some of these effects in male rats appear to be interrelated. Since corn oil gavage reduces leukemia, the male rats live longer. Similarly, the increased rates of pancreatic acinar cell tumors may be a direct effect of corn oil, in combination with other dietary fatty acids, on the metabolism of initiated pancreatic cells (Haseman and Rao 1992).
Previous investigations have not shown vehicle to affect the incidence rates of pituitary neoplasms in female F344/N rats from NTP studies (Haseman and Rao 1992; Haseman et al. 1985). However, there is evidence that decreased or increased body weight can reduce or increase the incidence of pituitary gland tumors, respectively, in both male and female F344/N rats (Haseman et al. 1997; Haseman et al. 2003). In the current evaluation, the female F344/N rats were gavaged with either corn oil or water, but they had similar mean body weights at one year on study, 278.9 g for corn oil gavage studies and 274.5 g for water gavage studies. The lack of a correlation between body weight and pituitary gland tumors in our analysis suggests that the increase in pituitary gland pars distalis adenomas in female F344/N rats may have been a direct effect of the corn oil.
Comparison to Previous Analysis
In 2005, Brix et al. reported spontaneous tumor incidence data in female control SD rats and also compared tumor incidence data for six common tumors between SD and F344/N rats used in NTP studies. They informally noted similarities and differences among those six tumor types, but they did not perform formal statistical comparisons of the two strains. Some of their incidence rates agree well with ours and others do not, but this lack of consistency was expected for several reasons. Brix et al. (2005) combined adenomas and carcinomas in their comparisons of clitoral gland neoplasms, pituitary gland pars distalis neoplasms, and thyroid gland C cell neoplasms, whereas we listed adenomas and carcinomas separately. They used data on 371 SD rats from seven NTP studies, whereas we used data on 473 SD rats from those same studies plus two additional NTP studies. Among studies using the NTP-2000 diet, they compared SD rats from gavage studies with F344/N rats from feed studies, whereas we held route constant by using only gavage studies for both SD and F344/N rats. Brix et al. (2005) also presented incidence rates from gavage studies in F344/N rats, but those rats were fed the NIH-07 diet, whereas the SD rats were fed the NTP-2000 diet. Thus, our formal comparisons matched on sex, lab, diet, and route (and in some cases vehicle), whereas their informal comparisons matched either on sex and diet (but not lab, route, and vehicle) or else on sex, route, and vehicle (but not lab and diet).
Concluding Remarks
In some specific cases involving rare tumors, pooling incidence rates across vehicles may provide greater power to detect differences between strains. For example, the incidence of uterus carcinoma was 6/473 (1.27%) in SD rats given corn oil gavage, 1/200 (0.50%) in F344/N rats given corn oil gavage, and 0/250 (0.00%) in F344/N rats given water gavage. By matching on all criteria, including vehicle, we found no significant difference in tumor rates between SD and F344/N rats (p = .248; Table 4). However, when we did not match on vehicle, the strain difference became significant (p = .037; Table 4). This change in significance was due to the sample size for the F344/N group increasing from 200 to 450, which gave the statistical test greater power.
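The effect of pooling on the uterus carcinoma comparison can be illustrated with a small calculation. This is a sketch under stated assumptions: it substitutes a one-sided Fisher's exact test for whatever procedure produced the p-values in Table 4, so its p-values are not expected to reproduce the reported .248 and .037. It is meant only to show the direction of the change when the F344/N comparison group grows from 200 to 450 animals. The function name is illustrative.

```python
from math import comb


def fisher_one_sided(x1, n1, x2, n2):
    """One-sided Fisher's exact p-value.

    Probability, under the hypergeometric null, that group 1 contains
    at least x1 of the (x1 + x2) tumor-bearing animals.
    """
    total_tumors = x1 + x2
    denom = comb(n1 + n2, total_tumors)
    upper = min(total_tumors, n1)
    numer = sum(
        comb(n1, k) * comb(n2, total_tumors - k)
        for k in range(x1, upper + 1)
    )
    return numer / denom


# Uterus carcinoma counts from the text (tumor-bearing, examined).
SD_CORN_OIL = (6, 473)
F344_CORN_OIL = (1, 200)
F344_WATER = (0, 250)

# Pooling across vehicles within the F344/N strain.
f344_pooled = (
    F344_CORN_OIL[0] + F344_WATER[0],
    F344_CORN_OIL[1] + F344_WATER[1],
)

p_matched = fisher_one_sided(*SD_CORN_OIL, *F344_CORN_OIL)
p_pooled = fisher_one_sided(*SD_CORN_OIL, *f344_pooled)

print(f"SD: {SD_CORN_OIL[0] / SD_CORN_OIL[1]:.2%}")
print(f"F344/N pooled: {f344_pooled[0]}/{f344_pooled[1]}")
print(f"matched on vehicle: p = {p_matched:.3f}")
print(f"pooled across vehicles: p = {p_pooled:.3f}")
```

With the same number of tumors in the SD group, the larger pooled comparison group yields a smaller p-value than the vehicle-matched group, which is the power gain described above. Whether a given threshold is crossed depends on the test used.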
The major concern with using historical control data from different studies or sources is the comparability of the study under evaluation with the studies in the historical control database, in light of known and unknown sources of variability (Haseman and Rao 1992; Keenan, Elmore, Francke-Carroll, Kemp, et al. 2009; Keenan, Elmore, Francke-Carroll, Kerlin, et al. 2009). Potential sources of variability in chronic rodent bioassays include, but are not limited to, species, strain, age, sex, laboratory, dietary factors, route of exposure, vehicle, body weight, survival, animal room environment, gross necropsy, slide preparation techniques, and histopathology diagnoses (Haseman and Rao 1992; Haseman et al. 1989; Rao, Piegorsch, and Haseman 1987; Rao and Crockett 2003). As shown in this investigation, sources of variability within a strain, such as diet, vehicle, and body weight, can be interrelated and can affect specific tumor rates, such as those of pituitary gland tumors. Furthermore, as we illustrated, pooling of tumor data across vehicles within a strain may be reasonable for rare tumors, such as uterus carcinoma. Historical control data can be useful when investigating rare tumors, but our study illustrates that a tumor that is rare for one strain may not be rare for another strain. Thus, the rarity of tumors is strain specific.
There are a number of factors that must be considered when choosing rodent strains for a particular study (carcinogenicity, reproductive, etc.) or for evaluation of chemicals with known or suspected target organ sites. Issues include, but are not limited to, chemical susceptibility, spontaneous tumor incidence rates, type of spontaneous tumors, time of tumor onset, type of spontaneous nonneoplastic lesions, survival, body weight, litter size, sex ratio, and pathogen profile. Although all strains have one or more features that could have a potential impact on study protocol or study interpretation, the primary goal is to choose the strain with the best profile for the specific chemical to be studied. Studies should be designed to exploit the uniqueness of the chemical being evaluated; therefore, flexibility is important in the protocol development of each study, including the choice of rodent strain.
Footnotes
Acknowledgments
This research was supported (in part) by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences (Z01ES101744; Z01ES045007; N01ES55547). We are grateful for the constructive comments from Michelle Hooth, Grace Kissling, and David Malarkey.
