Society of Toxicologic Pathology Position Paper: Organ Weight Recommendations for Toxicology Studies

Abstract

The evaluation of organ weights in toxicology studies is an integral component in the assessment of pharmaceuticals, chemicals, and medical devices. The Society of Toxicologic Pathology (STP) has created recommendations for weighing organs in GLP general toxicology studies lasting from 7 days to 1 year. The STP recommends that liver, heart, kidneys, brain, testes, and adrenal glands be weighed in all multidose general toxicology studies. Thyroid gland and pituitary gland weights are recommended for all species except mice. Spleen and thymus should be weighed in rodent studies and may be weighed in non-rodent studies. Weighing of reproductive organs is most valuable in sexually mature animals. Variability in age, sexual maturity, and stage of cycle in non-rodents and reproductive senescence in female rodents may complicate or limit interpretation of reproductive organ weights. The STP recommends that testes of all species be weighed in multidose general toxicology studies. Epididymides and prostate should be weighed in rat studies and may be weighed on a case-by-case basis in non-rodent and mouse studies. Weighing of other organs including female reproductive organs should be considered on a case-by-case basis. Organ weights are not recommended for any carcinogenicity studies including the alternative mouse bioassays. Regardless of the study type or organs evaluated, organ weight changes must be evaluated within the context of the compound class, mechanism of action, and the entire data set for that study.

Introduction

Organ weights are widely accepted in the evaluation of test article-associated toxicities (Black, 2002; Bucci, 2002; Wooley, 2003). A recent survey by the Society of Toxicologic Pathology (STP) on the practice of weighing organs during toxicology studies performed in pharmaceutical, animal health, chemical, food/nutritional, and consumer product industries worldwide revealed notable variation in organ weighing practices (Michael et al., 2007). The present document defines STP’s recommendations for weighing organs in general toxicology studies.

Organ Weights in General Toxicology Studies

Organ weight changes are often associated with treatment-related effects. The choice of appropriate organs to weigh in toxicology studies involves understanding the test article’s mechanism of action, metabolism and toxicokinetics; the physiology of the test species; and the cumulative data set from previous studies of the same or similar compounds or materials. The STP recommends that organ weights be included routinely in multidose GLP general toxicity studies with durations from 7 days to 1 year. Weights should be recorded in metric units using equipment calibrated for the range of weights measured (Long et al., 1998a).

Organ Weights Recommended for Collection in General Toxicology Studies for all Species

The STP recommends that liver, heart, kidneys, brain, adrenal glands, and testes (preferably from sexually mature animals) be weighed in all species in multidose GLP general toxicology studies of 7 days to 1 year in duration. Alterations in liver weight may suggest treatment-related changes including hepatocellular hypertrophy (e.g., enzyme induction or peroxisome proliferation) (Greaves, 2000; Amacher et al., 2006; Juberg et al., 2006). Liver weights may be elevated in studies of less than 7 days’ duration for potent hepatic enzyme-inducing compounds. Elevated heart weight may be the only evidence of myocardial hypertrophy, which is often difficult to recognize macroscopically and microscopically (Thiedemann, 1991; Greaves, 2000). Changes in kidney weight may reflect renal toxicity, tubular hypertrophy, or chronic progressive nephropathy (Greaves, 2000). Variations in adrenal gland weight may indicate hypertrophy, hyperplasia, or atrophy associated with stress, endocrinopathies, or test article effects (Greaves, 2000).

Changes in brain weights are rarely associated with neurotoxicity. The utility of brain weight rests in the ability to calculate organ to brain weight ratios. Some consider evaluation of organ to brain weight ratios helpful when terminal body weights are affected by the test article or to normalize organ weight data when there is large interanimal variability. The STP recommends collection of brain weights so that organ to brain weight ratios may be calculated if needed.

Weighing Endocrine Organs

When tissues have been properly collected, thyroid and pituitary weight changes may reflect endocrine perturbations that may not be immediately apparent upon histological evaluation (Greaves, 2000; Capen et al., 2002). The STP recommends that thyroid gland and pituitary gland weights be gathered routinely in all species except mice. Weighing the thyroid gland and pituitary gland in mice may have similar value in principle, but should be optional because the collection and weighing process may produce artifacts that can complicate or even prevent microscopic assessment. In rodents, fixing the thyroid and pituitary glands prior to weighing may provide accurate weight measurements and improve morphology (presumably due to less postmortem or handling artifact) (Kanerva et al., 1983; Bucci, 2002).

Weighing Reproductive Organs

Changes in testes weights may reflect changes in seminiferous tubules or interstitial edema. Changes in epididymal weight may be a sensitive indicator of decreased sperm production or may reflect edema or inflammation. Prostate weights may be associated with compound-related effects arising from modulation of androgenic or estrogenic signaling. For multidose rat GLP toxicology studies, testes, epididymides, and prostate gland should be weighed routinely. Testes should be weighed in mice; prostate and epididymides can be weighed in mice on a case-by-case basis. Seminal vesicle weights in male rodents provide information similar to that afforded by prostatic weights (Creasy, 2002), and therefore usually contribute little further understanding of toxicity.

In non-rodents, testes are recommended to be weighed routinely. Epididymides and prostate may be weighed on a case-by-case basis. Regardless, organ weights for prostate, epididymis, and other accessory sex organs in males are more valuable when assessed in mature animals than in immature animals (Creasy et al., 2002; Lanning et al., 2002). This limitation arises from the generally lower testicular, epididymal, and prostate weights found in sexually immature animals compared to their sexually mature counterparts (Foley, 2001) and the interanimal variability in non-rodents associated with the onset of puberty.

In non-rodents, epididymal and prostate weights should be optional, as differences in age, sexual maturity, and certain spontaneous changes (e.g., nonspecific hypertrophy and hyperplasia in the prostate) can produce variation in organ weights among individuals that makes interpretation difficult. In non-rodents, toxic effects in the prostate and epididymis resulting in weight changes likely will be reflected by other parameters, such as altered microscopic structure or aberrant testicular weights. If collected, reproductive organ weights from immature animals must be evaluated with caution (Johnston et al., 2000).

In both rodents and non-rodents, normal reproductive cycling and the effects of age cause notable interanimal variation in uterine and ovarian weights. Most toxicities of the female reproductive tract can be adequately identified by light microscopy. In addition, interpretation of reproductive organ weights from animals with evidence of stress or exhibiting significant body weight loss must take into account that organ weight changes might represent secondary effects of treatment on the reproductive cycle rather than a direct toxic effect of the test article. If ovarian weights are collected in rodents, the STP recommends weighing the ovaries in repeat-dose studies of less than 6 months’ duration. Reproductive organ weights in female rodents may have greater value in these shorter studies because reproductive senescence in mature rats can begin as early as 6 months of age (Peluso and Gordon, 1992).

Weighing Lymphoid Organs

The STP advocates that splenic and thymic weights be incorporated into GLP rodent toxicity studies lasting longer than 7 days (excluding carcinogenicity studies). Thymic weights can be valuable in non-rodents, but spontaneous thymic involution complicates interpretation of thymic weights in postpubertal dogs and monkeys, particularly in studies over 3 months in duration. In non-rodents, splenic weights may be influenced by the method of anesthesia and degree and consistency of exsanguination. Thymic and splenic weights may be collected in non-rodents on a case-by-case basis. If the nonhuman primate is the definitive test species for a chemical with a primate-specific target (e.g., monoclonal antibody or biologic agent), spleen and thymus should be weighed in at least one nonhuman primate study. If treatment-associated alterations in splenic or thymic weights are not identified in general toxicology studies of short duration in non-rodents, these weights may be omitted from studies of longer duration, where data interpretation may be confounded by aging changes (Morishima et al., 1990; Snyder et al., 2001; Bucci et al., 2002).

Splenic and thymic weights should always be interpreted in conjunction with histopathologic findings because of the inherent variability in lymphoid organ weights. Lymphoid organ weight changes that are measured in the absence of a corresponding histopathological alteration should be interpreted with caution (Haley et al., 2005). Weighing lymph nodes is not recommended because measurements vary markedly between and within animals, and because these tissues (especially mesenteric nodes) are difficult to isolate from adjacent fat (Haley et al., 2005). The STP previously published recommendations for evaluating lymphoid organs in nonclinical toxicology studies (Haley et al., 2005). Readers are referred to this article for more detailed guidance for assessing lymphoid tissues.

Organs to be Weighed on a Case-by-Case Basis

The STP recommends that certain organs be weighed on a case-by-case basis rather than routinely in every study. These organs include uterus, ovary, lung, lymph nodes, gastrointestinal tract, pancreas, seminal vesicles, and salivary glands. The STP considers lung weights to be standard and valuable endpoints in inhalation studies. For studies that do not involve administration of test article by inhalation, the STP believes that evaluation of lung weights adds little value to microscopic assessment and should be optional.

Weights of gastrointestinal segments are generally too variable to be useful because of uneven filling and inconsistent trimming. Cecal weights occasionally may be useful in antibiotic and nutritional studies, but these alterations usually are adaptive rather than toxic responses and are evident during gross and/or microscopic examination (Delaney et al., 2003).

While the pancreas can be isolated in non-rodent species, it is difficult to define in rodents as it is relatively disseminated and interspersed with adipose tissue and lymph nodes (Feldman and Seeley, 1988). Seminal vesicle weights in male rodents provide information similar to that afforded by prostatic weights (Creasy, 2002), and therefore usually contribute little further understanding of toxicity.

Salivary gland weight may increase or decrease in response to administration of some test articles (Greaves, 2000). In most instances, histopathology is sufficient to detect salivary gland toxicity, so salivary gland weights should be considered only in situations in which the test article is known or suspected to affect secretory glands.

Organ Weighing in Acute Studies

The STP believes that organ weights have value in studies in which animals have systemic exposure to the compound, device, or treatment for at least 7 days. Organ weights are considered to be of limited value and are not recommended by the STP in studies in which only a single dose is administered (e.g., acute/MTD single-dose studies, single-dose/microdose IND-enabling studies) or in which a single treatment group receives multiple doses (e.g., escalating dose studies). Likewise, organ weights are not recommended by the STP in single dose (2-week duration) IND-enabling studies unless the test article (e.g., monoclonal antibodies) demonstrates prolonged exposure lasting over most of the study period.

Organ Weighing in Carcinogenicity Studies

The STP believes that organ weights are not appropriate for carcinogenicity studies, including those utilizing alternative mouse models such as p53 +/− and Tg rasH2 mice. In these long-term studies, normal physiological aging changes and intercurrent disease may contribute to interanimal variability, which will confound organ weight interpretation (Long et al., 1998a, 1998b). Examples of such intercurrent conditions include spontaneous neoplasia, cachexia related to toxicity or neoplasia, morbid obesity, chronic progressive nephropathy, cardiomyopathy, endocrine imbalance, and amyloidosis (mice), to mention only a few (Keenan et al., 1997; Morton et al., 2002; Haseman et al., 2003; Hard et al., 2005).

Recommendations on Collecting Organ Weights

Attention to the order of necropsy is essential for obtaining meaningful organ weight data. Ideally, the animal necropsy order should be randomized or appropriately rotated to prevent bias. Organ weights change throughout the day, particularly in rodents if food is removed from all cages at one time (i.e., for a large study, some mice may be necropsied hours after the first animal) (Bucci, 2002). Prosector assignments should be rotated to ensure that an individual prosector does not necropsy animals from only one dose group, as differences in organ removal and trimming techniques might impact results. Organs must be gently isolated and all extraneous tissues (e.g., fat) removed.
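One way to implement a rotated, randomized necropsy order is sketched below in Python. This is a minimal illustration, not a validated procedure: the function name `necropsy_schedule`, the block design (one animal per dose group per block), the round-robin prosector assignment, and all animal and prosector identifiers are assumptions made for the example.

```python
import random


def necropsy_schedule(animals_by_group, prosectors, seed=0):
    """Build a necropsy order that interleaves dose groups and rotates prosectors.

    animals_by_group: dict mapping group name -> list of animal IDs.
    Returns a list of (animal_id, group, prosector) tuples in necropsy order.
    """
    rng = random.Random(seed)  # fixed seed so the schedule can be reproduced
    # Shuffle within each group so the order of animals inside a group is random.
    pools = {g: rng.sample(ids, len(ids)) for g, ids in animals_by_group.items()}
    schedule = []
    slot = 0
    while any(pools.values()):
        # One block: at most one animal per group, with groups in random order,
        # so no dose group is necropsied consistently earlier or later in the day.
        block = [g for g in pools if pools[g]]
        rng.shuffle(block)
        for group in block:
            animal = pools[group].pop()
            prosector = prosectors[slot % len(prosectors)]  # round-robin rotation
            schedule.append((animal, group, prosector))
            slot += 1
    return schedule


# Hypothetical animal IDs and prosector labels, for illustration only.
example_animals = {
    "control": ["C1", "C2", "C3"],
    "low": ["L1", "L2", "L3"],
    "high": ["H1", "H2", "H3"],
}
for row in necropsy_schedule(example_animals, ["P1", "P2"], seed=1):
    print(row)
```

Round-robin assignment combined with a shuffled group order inside each block tends to spread each prosector across dose groups, but it does not guarantee exact balance; a study-specific design may need stricter constraints.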

If organs are not weighed and placed in fixative immediately after removal from the carcass, isolated organs should be kept moist until they are weighed and fixed because small organs may dehydrate rapidly (Bucci, 2002). Paired organs routinely should be weighed together. Organ weights from immersion-fixed tissues cannot be compared to fresh organ weights from the same tissues (Kanerva et al., 1983). In like manner, organ weights from perfusion studies may be altered by the infusion of fixative or other fluids, so organ weights taken following perfusion may not be comparable to organ weights collected from fresh or immersion-fixed organs. Handling and fixation of control organs should be appropriately matched to those of treated groups in all cases.

Terminal body weights should be taken consistently for all animals in a given study. Ideally, the terminal body weight, which is useful in calculation of organ weight ratios, should be collected at necropsy, not as a weight taken the morning of the day of necropsy, to control for potential variations induced by novel and stressful handling and diurnal fluctuations. This recommendation is particularly important if rodents are to be fasted prior to necropsies scheduled to last over the course of one or more days.

Weighing Organs in Animals Euthanized Before Scheduled Necropsy

The STP recommends that organ weights not be collected from animals that die or are euthanized prior to scheduled necropsy. Differences in nutritional status, exsanguination, tissue congestion and edema, and the absence of matched concurrent control data confound interpretation of organ weights from animals that are necropsied before the scheduled termination period in ways which do not reflect a direct treatment-related effect.

Use of Concurrent vs. Historical Controls in the Evaluation of Organ Weights

Organ weights from treatment groups in a given toxicology study are best compared to concurrent controls. Comparisons to historical control values can be useful when the concurrent control group animals have abnormally high or low weight values or very limited interanimal variation. Historical controls must be appropriately matched for age, sex, strain, body weight, fasting status, vehicle used, etc. Historical control data are also useful when the number of animals in each group is low, as in non-rodent studies. Historical control data should be updated regularly to minimize drift in the organ weight data set over time. Data compiled over a recent period of less than 5 years are generally considered more appropriate than older data.
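The matching criteria above can be expressed as a simple filter over a historical control database. The Python sketch below is illustrative only: the `ControlRecord` fields, the function name `matched_controls`, the 2-week age tolerance, and the use of 365-day years for the 5-year window are all assumptions for the example, and only a subset of the matching factors named in the text is shown.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ControlRecord:
    """One historical control animal (hypothetical record layout)."""
    strain: str
    sex: str
    age_weeks: int
    fasted: bool
    necropsy_date: date
    liver_g: float


def matched_controls(records, strain, sex, age_weeks, fasted, study_date,
                     max_age_diff_weeks=2, max_years=5):
    """Return historical controls matched to the current study.

    The age tolerance and the 365-day year are assumed values for illustration.
    """
    window_days = max_years * 365
    return [
        r for r in records
        if r.strain == strain
        and r.sex == sex
        and r.fasted == fasted
        and abs(r.age_weeks - age_weeks) <= max_age_diff_weeks
        and 0 <= (study_date - r.necropsy_date).days < window_days
    ]


# Hypothetical records, for illustration only.
history = [
    ControlRecord("SD", "M", 12, True, date(2005, 3, 1), 14.1),
    ControlRecord("SD", "M", 13, True, date(1998, 6, 1), 13.2),   # too old
    ControlRecord("SD", "F", 12, True, date(2005, 3, 1), 9.0),    # wrong sex
]
matches = matched_controls(history, "SD", "M", 12, True, date(2006, 1, 15))
print(len(matches), "matched historical control(s)")
```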

Recommendation for Using Organ-to-Brain and Organ-to-Body Weight Ratios

The STP advocates the routine calculation and evaluation of organ-to-body weight ratios in toxicology studies lasting from 7 days to 1 year. The use of organ-to-body weight ratios is often helpful for clarifying treatment-related organ weight changes, particularly in non-rodents in which there can be notable variations in organ and body weights (Wooley, 2003). For example, a mean organ weight of 250 grams in control animals and 300 grams in high-dose animals implies the presence of a treatment effect (20% increase). However, if the mean terminal body weights of control and high-dose animals were 8 kg and 9 kg, respectively, the organ-to-body weight ratios would be only slightly different between the 2 groups (approximately 7%), thus suggesting that the apparent difference in organ weight might have been the result of differences in body weight and unrelated to treatment.
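The arithmetic in this example can be checked with a short calculation. The Python sketch below is illustrative only; the function names and the choice to express the ratio as a percentage of body weight are assumptions for the example and are not part of the STP recommendations.

```python
def relative_organ_weight(organ_g: float, body_kg: float) -> float:
    """Return organ weight as a percentage of terminal body weight."""
    return organ_g / (body_kg * 1000.0) * 100.0


def percent_change(control: float, treated: float) -> float:
    """Return the change in the treated value relative to control, in percent."""
    return (treated - control) / control * 100.0


# Group means taken from the example in the text.
control_ratio = relative_organ_weight(250.0, 8.0)    # 3.125% of body weight
high_dose_ratio = relative_organ_weight(300.0, 9.0)  # about 3.333% of body weight

absolute_change = percent_change(250.0, 300.0)                    # 20% increase
relative_change = percent_change(control_ratio, high_dose_ratio)  # about 6.7% increase

print(f"absolute: {absolute_change:.1f}%  relative: {relative_change:.1f}%")
```

The ratio of group means is used here only because the example gives group means. In practice, ratios would normally be calculated for each animal and then summarized by group, which can give a slightly different value.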

Understanding the biological behavior of an organ in response to body weight changes is important in the evaluation of organ weights. While normalization of organ weights to body weight helps eliminate variations due to body weight differences, alterations in body weight may lead to increases or decreases in some organ-to-body weight ratios. Because test materials that alter body weight generally do not alter brain weight (Wilson et al., 2001), organ-to-brain weight ratios may be useful when notable changes in body weight, particularly decreases, affect organ-to-body weight ratios. Organ-to-brain weight ratios also may be helpful in normalizing animal-to-animal variability, especially in non-rodent studies. Studies indicate that evaluation of organ-to-brain weight ratios may be more appropriate for evaluation of organ toxicity in the ovary and adrenal glands, while organ-to-body weight ratios may be more appropriate for evaluation of liver and thyroid gland weights (Bailey et al., 2004). Organ-to-brain weight ratios can be calculated routinely or can be requested on a case-by-case basis. At the very least, organ-to-brain weight ratios may provide additional confidence that absolute organ weight data reflect the presence or absence of treatment-related alterations in organ weights.

Statistical Evaluation of Organ Weights

Excellent reviews on statistical evaluation of organ weights have been published recently (e.g., Gad et al., 2002; Gad, 2006), so this document will not discuss the topic extensively. Although statistics are commonly utilized in the evaluation of organ weights in general toxicology studies, organ weights may be reliably interpreted with only descriptive statistics (individual animal data, number of animals evaluated, mean, standard deviation) in coordination with other study data. Statistical methods for analyzing alterations in organ weights vary widely according to the preference of the statistician and the assumptions required by a specific statistical test.
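The descriptive statistics listed above (number of animals, mean, standard deviation) can be produced with a few lines of code. The Python sketch below is a minimal illustration; the function name `describe`, the use of the sample standard deviation, and the liver weights shown are hypothetical and are not data from any study.

```python
from statistics import mean, stdev


def describe(weights_g):
    """Return n, mean, and sample standard deviation for one group's organ weights."""
    n = len(weights_g)
    # The sample standard deviation is undefined for a single animal.
    sd = stdev(weights_g) if n > 1 else float("nan")
    return {"n": n, "mean": mean(weights_g), "sd": sd}


# Hypothetical liver weights (g), for illustration only.
groups = {
    "control": [10.2, 9.8, 10.5, 10.1],
    "high_dose": [11.9, 12.4, 11.6, 12.1],
}
summary = {name: describe(values) for name, values in groups.items()}
for name, stats in summary.items():
    print(f"{name}: n={stats['n']} mean={stats['mean']:.2f} sd={stats['sd']:.2f}")
```

Individual animal values should still be reviewed alongside these summaries, since a group mean can conceal an outlier or a subset of responders.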

Reliance on statistical significance (or the lack thereof) alone in the evaluation of organ weight changes is not satisfactory, particularly in studies with a small sample size. It is important to note that organ weight alterations may be test article-related but not statistically different from controls, or conversely, statistically different from controls but not related to treatment (Gad et al., 2002). When organ weight changes are statistically different from control values or in any way outstanding, interpretations should clearly distinguish treatment-related findings from incidental findings and provide perspective on the reasons for these distinctions.

Interpretation of Organ Weight Data

The proper evaluation of absolute organ weights and organ-to-body weight or organ-to-brain weight ratios should include examination of both individual animal values and group means. While organ weights provide useful signals indicating test article-related effects, organ weight data must be interpreted in an integrated fashion with gross pathology, clinical pathology, and histopathology findings. Detectable weight changes in and of themselves may not necessarily be treatment-related or adverse. Organ weight changes without macroscopic or microscopic correlation should be interpreted with caution.

The STP recommends that the study pathologist be responsible for evaluating organ weight data because the study pathologist is most qualified to correlate organ weight changes with the clinical pathology, gross, and microscopic findings. The STP further recommends that the study pathologist have access to and examine all organ weight data (and gross pathology data) prior to initiating the histopathological evaluation in order to better identify and characterize any treatment-related effects.

Conclusions

These recommendations have been developed to assist in the selection and interpretation of organ weight data acquired in general toxicology studies. Priority has been given to weighing organs that more frequently contribute to detecting test article-related effects (Appendix 1). It is important to note that organs weighed should be added or removed from the necropsy protocol as appropriate for the test article. Regardless of the study type or organs evaluated, organ weight changes must be evaluated by the pathologist within the context of the compound class, mechanism of action, and the entire data set for that study.

Footnotes

The recommendations in this paper are endorsed and supported by the European Society of Toxicologic Pathology and the British Society of Toxicological Pathologists.

Table

References
