Abstract
Hormonally mediated effects on the female reproductive system may manifest as pathologic changes of endocrine-responsive organs and altered reproductive function. Identification of these effects requires proper assessment, which may include investigative studies to profile female reproductive hormones. Here, we briefly describe normal hormonal patterns across the estrous or menstrual cycle and provide general guidance on measuring female reproductive hormones and characterizing hormonal disturbances in nonclinical toxicity studies. Although species used in standard toxicity studies share basic features of reproductive endocrinology, there are important species differences that affect both study design and interpretation of results. Diagnosing female reproductive hormone disturbances can be complicated by many factors, including estrous/menstrual cyclicity, diurnal variation, and age- and stress-related factors. Thus, female reproductive hormonal measurements should not generally be included in first-tier toxicity studies of standard design with groups of unsynchronized intact female animals. Rather, appropriately designed and statistically powered investigative studies are recommended in order to properly identify ovarian and/or pituitary hormone changes and bridge these effects to mechanistic evaluations and safety assessments. This article is intended to provide general considerations and approaches for these types of targeted studies.
Introduction
Pathologic findings in female reproductive tissues from general first-tier toxicity studies in sexually mature animals should be evaluated in the context of the estrous cycle and collective changes in endocrine-responsive organs. Hormonally mediated effects on the female reproductive system may manifest in diverse ways, including disruption of the estrous cycle, organ weight changes, developmental abnormalities, decreased reproductive function, and histopathological effects (Table 1). Evaluating these effects is complicated by the dynamic nature of female reproductive hormones, which may influence the nature and severity of observed changes. Developmental and pathologic consequences of alterations in female reproductive hormones have been detailed previously (e.g., Hart 1990; Yuan and Foley 2002). The primary reasons for measuring reproductive hormones are to identify potential safety concerns, better understand the mechanism of action of a test article to determine relevance to humans, and evaluate whether hormonal measurements may potentially serve as biomarkers in a clinical setting.
Table 1. General histopathologic changes of the vagina, ovary, uterus, and mammary gland associated with hormonal changes.
Note: FSH, follicle-stimulating hormone; LH, luteinizing hormone; PRL, prolactin.
Test article–related effects on female reproductive tissues and hormones may be a consequence of a direct (primary) effect by the test article or an indirect (secondary) interference with the hormonal balance. For example, direct estrogen- or progestogen-like activity at the tissue level may manifest as hypertrophic or hyperplastic changes in uterus, vagina, and mammary gland and atrophic changes in the ovary (Nuttall et al. 1998; Cline 2007; Jordan 1994; Gobello 2006; Okazaki et al. 2002; Slayden and Brenner 2004; Rehm et al. 2007). Secondary effects of a test article might be mediated via interference with hypothalamic–pituitary pathways that would more likely manifest as ovarian changes with downstream effects on reproductive tract and mammary gland (Alison, Capen, and Prentice 1994; Nilson et al. 2000; Gobello 2006; Sinkevicius et al. 2009; Chakravarty et al. 1979; Topaloglu et al. 2009). There could also be an exaggerated effect of 17β-estradiol (E2) or progesterone (P4) in sex steroid–responsive tissues due to increased serum concentrations or an imbalance of these two hormones (Harleman et al. 2012). Reproductive tissue alterations may manifest as asynchrony of tissue changes with the stage of the estrous or menstrual cycle or as atrophy due to cessation of the cycle (Mirsky et al. 2011). Other hormonal effects may lead to proliferative tissue responses, including hyperplasia and neoplasia (Rehm, Dierksen, and Deerberg 1984; Alison, Capen, and Prentice 1994; Liehr 2000; Milliken et al. 2002; Capen 2004; Munson and Moresco 2007; Harvey 2011; Harleman et al. 2012). Stress and reduced food consumption associated with generalized systemic toxicity may also affect reproductive hormones, resulting in reproductive organ toxicity such as altered reproductive organ weight and tumors (Knuth and Friesen 1983; Koizumi et al. 1989; Molon-Noblot et al. 2003).
Identifying hormonally mediated mechanisms thus requires consideration of both endocrine and morphologic context (Attia 1998; Li and Davis 2007; Rehm, Stanislaus, and Williams 2007; Westwood 2008). For example, diestrus accounts for the longest portion of the rodent estrous cycle (see below); therefore, at necropsy of a 4-week female rat study, most rats should display features of diestrus, and deviations from this pattern, even when reproductive tissues lack overt lesions, can be an early sign suggesting a disturbance of estrous cyclicity (which should be confirmed by an investigative or female fertility study). For the detection of reproductive lesions related to hormonal disturbances in the rat, dosing should generally cover 2 to 3 cycles (roughly 2 weeks), but hormone-associated lesions may become apparent following dosing as short as 5 days (Mirsky et al. 2011). In nonrodents, depending on dose, species, age of animal, and mechanism of action, time to appearance of lesions related to hormonal disturbances may vary considerably (Rehm et al. 2007; Vo and Jeung 2009). Most commonly, the first indications of a test article–related effect on the female reproductive system occur in general toxicity studies in rats by 4 weeks. Anatomic pathology changes include alterations in ovarian weight with associated findings such as vaginal mucification, secretory change and hyperplasia of the mammary gland, or tissue atrophy (Table 1). Less commonly, microscopic alterations observed in 6-month or carcinogenicity studies are the trigger to investigate hormonal effects.
Prior to initiating targeted endocrine or reproductive toxicity studies, several issues regarding study design should be considered. Appropriate dose selection is necessary to maximize effect on the reproductive system and minimize confounding toxicity. Timing of treatment is important because treatment at specific times of the estrous cycle (staged/synchronized animals) may also help determine possible mechanisms of action and narrow the field of potential hormones to evaluate. Hormonal changes tend to be progressive, and it is not unusual that the effect level, and hence safety margin, drops with increased dosing duration. In these cases, investigative studies may be essential to gain insight into the relevance of the findings to humans.
To assist with these types of evaluations, here we describe normal hormonal patterns across the estrous or menstrual cycle and provide general recommendations on measuring reproductive hormones and detecting hormonal disturbances in toxicity studies. We focus on the measurement of female reproductive hormones, including E2, P4, follicle-stimulating hormone (FSH), luteinizing hormone (LH), and prolactin (PRL), in species commonly used for nonclinical safety studies.
The Estrous or Menstrual Cycle
The pubertal activation of the female reproductive endocrine axis is the start of cyclical events that directly influence the function and anatomy of the pituitary, ovary, uterus, external genitalia, and mammary gland. Rodents, canines, and primates share several basic features of female reproductive endocrinology. FSH from the pituitary drives ovarian follicular development and E2 synthesis. E2 feeds back to the central nervous system and pituitary to produce a surge of pituitary–LH causing ovulation, luteinization of follicular cells, and P4 production. Following lysis of the corpus luteum, a new wave of follicular development begins, and the cycle is repeated. In addition to sex steroids, PRL and growth hormone support the development and function of the mammary gland (Brisken and O’Malley 2010; Kleinberg and Ruan 2008), particularly during pregnancy and lactation. Apart from these similarities, significant differences in ovarian cyclicity exist across species which are important to consider when assessing potential hormonal effects.
Determination of cycle status is an important part of any study design in adult female animals and often essential for the proper interpretation of female reproductive hormone measurements. Various methods exist for assessing cycle stage. The vaginal epithelium in particular is rapidly responsive to cyclic hormonal alterations (especially estrogen), and vaginal cytology or histology provides an early and highly sensitive indication of a hormonal disturbance, both for rodent (Li and Davis 2007; Goldman, Murr, and Cooper 2007) and for primate (Weinbauer et al. 2008; Stute et al. 2004) studies. Uterine histology may also provide additional information regarding cycle stage for hormone values measured at or near the time of death (e.g., van Esch et al. 2008). These types of assessments provide important ancillary information for interpreting morphologic changes in the reproductive system and mammary gland.
Mice and Rats
After reaching sexual maturity and showing vaginal opening around 24 to 30 days and 36 to 38 days postnatally in mice and rats, respectively (Safranski, Lamberson, and Keisler 1993; Beckman and Feuston 2003), females have short 4- to 5-day rapidly recurring estrous cycles consisting of 2 to 3 days of diestrus (called diestrus-1 and diestrus-2, respectively), 1 day of proestrus, and 1 to 2 days of estrus (Westwood 2008). Metestrus in rodents is considered an early part of diestrus rather than a late part of estrus. Cycles are characterized by an increase in E2 on the second day of diestrus, which peaks midday on the day of proestrus, resulting in LH and PRL surges in the late afternoon of the same day, followed by a short luteal phase and estrus (Figure 1; Goldman, Murr, and Cooper 2007; Murr, Geschwind, and Bradford 1973; Organization for Economic Cooperation and Development [OECD] 2009). Both E2 and P4 synthesis by the ovary are under basal stimulation of pituitary gonadotropins in rodents, but PRL is essential to sustain P4 production by the corpus luteum by inhibiting 20α-hydroxysteroid dehydrogenase (which inactivates P4), and upregulating LH receptors (Freeman et al. 2000). During the ovarian cycle and initiation of pregnancy, pituitary-derived PRL performs this role and the long form of the PRL receptor is essential for the maintenance of the corpus luteum in cycling females (Bouilly et al. 2012). However, both long and short forms of the PRL receptor are required for normal ovarian function after mating (Le et al. 2012). PRL in female rats is also needed for luteolysis, and if the preovulatory PRL peak is blocked, inactive (non-P4 producing) corpora lutea remain in the ovarian tissue (Bowen et al. 1996), resulting in increased ovarian weights in long-term studies due to the accumulation of nonfunctioning luteal cells.

Figure 1. Schematic pattern of typical endocrine changes during the rat estrous cycle.
Since the pituitary lactotrophs are under negative regulation by hypothalamic dopamine, rat and mouse reproductive biology are very sensitive to any effect on the dopaminergic system (Freeman et al. 2000; Rehm, Stanislaus, and Wier 2007). Changes of the dopaminergic/PRL system in rodents can lead to effects associated with altered P4 levels (Table 1) and, indirectly, signs of increased estrogen effects secondary to hypoprolactinemia and shifts in the P4/E2 balance (Harleman et al. 2012). Ovarian-derived androgen levels generally follow the pattern of E2 during the estrous cycle in female rodents (Dupon and Kim 1973; Rush and Blake 1982), and adult estrous cyclicity does not depend on androgen receptor activation (Clark, Kelton, and Whitney 2003). With increasing age, irregular reproductive cycles leading up to reproductive senescence are common in female rodents (Huang and Meites 1975; Dudley 1982). These reproductive aging effects in rodents are not primarily the result of follicle depletion in the ovary but stem from age-related changes in the hypothalamus, involving both gonadotropin-releasing hormone (GnRH; Brann and Mahesh 2005) and dopaminergic (Sánchez et al. 2003) neurons, which increases the individual variability in female reproductive hormones in animals over 1 year of age.
Cycling in female rodents can be characterized using daily vaginal smears evaluated for the presence of different cell types (refer to Goldman, Murr, and Cooper 2007; Cooper, Goldman, and Vandenbergh 1993, for images of smears). Vaginal cytology features across the estrous cycle include the following: diestrus (predominance of leukocytes mixed with some cornified epithelial cells); proestrus (predominance of round nucleated epithelial cells often in clumps); and estrus (predominance of cornified epithelial cells). In rodents, it is recommended that vaginal smears be collected at approximately the same time each day. Disturbances in the hypothalamic–pituitary–gonadal (HPG) axis may present as irregular cycles (e.g., diestrus >3 days or estrus >2 days) or absence of cycles (prolonged periods of either vaginal cornification or leukocytic smears).
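The example irregularity criteria given here (diestrus >3 days or estrus >2 days) lend themselves to a simple screen of daily smear records. The following is a minimal sketch in Python, assuming one stage call per animal per day; the function name, the stage labels, and the record layout are hypothetical and are not part of any cited method.

```python
from itertools import groupby

# Maximum run length (days) still regarded as regular, taken from the
# example criteria in the text: diestrus > 3 days or estrus > 2 days.
MAX_REGULAR_DAYS = {"diestrus": 3, "estrus": 2}


def irregular_runs(daily_stages):
    """Return (stage, first_day, run_length) for each over-long run.

    daily_stages: list of stage names, one per consecutive day
    (hypothetical record layout; day numbering starts at 1).
    """
    flagged = []
    day = 1
    for stage, run in groupby(daily_stages):
        length = len(list(run))
        limit = MAX_REGULAR_DAYS.get(stage)
        if limit is not None and length > limit:
            flagged.append((stage, day, length))
        day += length
    return flagged


# Illustrative (made-up) 13-day smear record for one animal.
record = (["diestrus"] * 2 + ["proestrus", "estrus"]
          + ["diestrus"] * 5 + ["proestrus"] + ["estrus"] * 3)
print(irregular_runs(record))
```

Runs that touch the first or last day of observation are truncated, so their true length may be longer than the value reported.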
Dogs
Bitches are monoestrous and polyovulatory with nonseasonal ovarian cyclicity characterized by a long luteal phase (similar to pregnancy, with detectable mammary enlargement) and obligatory periods of anestrus between cycles (Concannon 2011). The canine estrous cycle consists of a 5- to 20-day proestrus with increasing E2, 5- to 15-day estrus with increasing P4 and decreasing E2, 50- to 80-day metestrus/diestrus with elevated P4, and 80- to 240-day anestrus characterized by minimal ovarian activity and rising FSH levels in the latter part (refer to figure 1 in Concannon 2011 for hormonal pattern; Kooistra et al. 1999). In the bitch, circulating androgens increase during proestrus and decline across estrus and generally show high interindividual variation (Olson et al. 1984; Rota et al. 2007). Based on stage lengths, more than 50% of bitches in a nonsynchronized group will be in anestrus at any given time (Chandra and Adler 2008). Individual variation in the duration of the different phases of the canine cycle complicates study design with hormonal monitoring. Regulation of ovarian E2 and P4 is driven by gonadotropins, similar to other species. Both LH and PRL are luteotropic in dogs and suppression of either of these hormones will lower serum P4 from day 10 (LH) and day 13 (PRL) after ovulation (Concannon 2011).
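The stage-length argument can be made concrete with a short calculation. The sketch below assumes that animals in an unsynchronized group are spread uniformly across the cycle, so that the expected fraction in a stage equals that stage's share of total cycle length. The midpoints of the ranges quoted above are used as representative durations; this is an illustrative assumption, not a value reported by the cited sources.

```python
def stage_fractions(durations_days):
    """Expected fraction of unsynchronized animals in each stage,
    assuming occupancy is proportional to stage duration."""
    total = sum(durations_days.values())
    return {stage: days / total for stage, days in durations_days.items()}


# Midpoints of the ranges given in the text (assumed representative).
canine_midpoints = {
    "proestrus": (5 + 20) / 2,
    "estrus": (5 + 15) / 2,
    "metestrus/diestrus": (50 + 80) / 2,
    "anestrus": (80 + 240) / 2,
}

for stage, fraction in stage_fractions(canine_midpoints).items():
    print(f"{stage}: {fraction:.0%}")
```

With these midpoints, anestrus accounts for roughly two-thirds of the cycle. Because stage lengths vary widely between individuals, the fraction in any one colony may differ.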
Cycle stage in the dog can be evaluated using vaginal cytology (refer to Raskin and Meyer 2001 for images of smears). Cytologic changes include the following: metestrus/diestrus (predominance of smaller intermediate cells and lesser numbers of neutrophils, often with phagocytosed erythrocytes and bacteria); proestrus (predominance of nondegenerate neutrophils admixed with parabasal, intermediate, and superficial epithelial cells); estrus (predominance of cornified epithelial cells); and anestrus (predominance of parabasal and intermediate cells). Disturbances in the HPG axis may present as irregular or absent cycles (McRae et al. 1985; Valiente et al. 2009). Cytologically, diestrus can appear similar to early proestrus; therefore, serial cytologic sampling is recommended to distinguish these stages.
Nonhuman Primates
Considerable variation exists in ovarian hormone cycles across primate species. The two primary Old World primate research species of macaque, the cynomolgus monkey and the Rhesus monkey, have an ovarian cycle of 28 to 32 days. The first day of menses is by convention considered the first day of the menstrual cycle. These Old World primate species have a follicular phase of 12 to 14 days, a periovulatory interval around 3 days, and a luteal phase of 14 to 16 days. E2 gradually increases during the follicular phase, with a marked increase 1 to 2 days before the ovulatory peak of LH and FSH (refer to figure 2 in Weinbauer et al. 2008 for hormonal pattern). Following ovulation, P4 increases as the corpus luteum develops, reaching peak levels in mid-luteal phase (Weinbauer et al. 2008). PRL is not luteotropic in primates and does not show a preovulatory surge, although there is a circadian component (Quadri and Spies 1976). In addition to ovarian-derived testosterone and androstenedione (Ethun et al. 2012), the androgen precursor dehydroepiandrosterone and its sulfate released from the zona reticularis of the adrenal glands (Abbott and Bird 2009) contribute at least 50% of the total plasma pool of androgens in females of many Old World primate species. This large amount of adrenal androgen production is unique to primates and represents a potentially important species difference in background hormonal context, particularly in ovariectomized females (Labrie et al. 2003).
The pubertal onset of cyclicity starts on average at 18 months of age in cynomolgus monkeys and 24 months of age in Rhesus monkeys, but shows high interindividual variation (Watanabe et al. 2006; Wilen and Naftolin 1976). Cynomolgus monkeys are sexually active year round (Kavanagh and Laursen 1984), while Rhesus monkeys (both males and females) are seasonal breeders with reproductive activity during winter (short day) months. This seasonal pattern persists for years under artificial light conditions (Wickings and Nieschlag 1980). Reproductive hormone cycles for New World primate species may vary considerably from those seen in Old World primate species. For example, the marmoset ovarian cycle is around 28 days with a follicular phase of 8 days and a luteal phase of 20 days (Fuchs and Weinbauer 2006). Marmosets are nonseasonal breeders and exhibit multiple ovulations of 1 to 4 eggs (Tardif et al. 2003; Gilchrist et al. 2001).
Cycle stage in macaques is best evaluated by daily vaginal cytology to detect menses in combination with cycle-timed ovarian hormone measurements (Weinbauer et al. 2008). Ancillary methods for determining cycle stage include evaluation of vaginal cytology maturation (higher in follicular phase; Stute et al. 2004) and assessment of endometrial histology if available (van Esch et al. 2008). Hormone-dependent changes in perineal sex skin occur in some female macaques, but these changes are not considered a reliable indicator of cycle stage (Weinbauer et al. 2008). Variability in normal ovarian function is common, particularly in socially housed macaques (Weinbauer et al. 2008; Kaplan et al. 2010). Marmosets do not show overt external signs of ovarian cyclicity. Therefore, regular measurement of P4 is used for cycle monitoring (Weinbauer et al. 2008). In order to monitor the ovarian cycle from a defined stage, luteolysis needs to be induced (e.g., by administration of prostaglandin F2-α; Harlow, Hearn, and Hodges 1984).
Morphologic Effects
An extensive literature is available describing morphologic effects resulting from changes of female reproductive hormones, particularly in rodents. Basic histological changes across species are summarized in Table 1, while more detailed effects are reviewed elsewhere (e.g., OECD 2009; Yuan and Foley 2002). In general, morphologic changes resulting from alterations in female reproductive hormones can be sorted into three basic patterns of toxicity (Yuan and Foley 2002). The first type is characterized by atrophy of the ovary, uterus, and vagina (antiestrogenic effect). This pattern occurs secondary to decreased gonadotropin production in the pituitary (most common), decreased follicular development in the ovary, or decreased ovarian steroidogenesis. Hypothalamic and pituitary disturbances can lead to inhibition of the HPG axis.
The second type features ovarian atrophy with uterine and vaginal hypertrophy and hyperplasia secondary to increased endogenous (or exogenous) female sex steroid exposure (estrogenic effect). As an example, increased estrogen will cause ovarian atrophy secondary to decreased gonadotropin release while stimulating endometrial hyperplasia and vaginal maturation. Similarly, in ovariectomized rodents (with or without E2 pretreatment), androgens cause vaginal mucification and increased vaginal weight (Kennedy and Armstrong 1976), while androgen treatment in adult cyclic females inhibits the HPG axis (Bronson, Nguyen, and De la Rosa 1996).
The third basic toxicity pattern is associated with increased pituitary hormone concentrations (gonadotropic or luteotropic effect). Here, higher gonadotropin concentrations lead to increased ovarian follicular development and uterine/vaginal trophism, while higher PRL may lead to persistent corpora lutea (e.g., in mice and rats) and mammary gland lobuloalveolar hyperplasia. It should be noted that, in practice, the morphological pattern can be very complex when imbalances in the female reproductive hormonal system occur because the morphological presentation reflects the integration of the entire hormonal milieu. In particular, identification of these morphologic patterns in rodents may be confounded by reproductive senescence (e.g., Harleman et al. 2012) and other factors, as described in detail in the following section.
Study Design and Number of Animals
Intraindividual variation, cyclicity, and pulsatility in reproductive hormones make careful study design and appropriate sample size essential for accurate assessment of potential treatment effects. The purpose of an investigative study with reproductive hormone measurements can be either to characterize the endocrine changes (e.g., investigating whether the basal P4/E2 balance is altered) or to elucidate the underlying mechanism (e.g., diagnosing a hypothalamic suppression of the HPG axis). Thus, targeted hypotheses and study designs are particularly important. In addition to the findings from in vivo studies, information that can help decide on an appropriate study design may include biochemical and in vitro data such as structure–activity relationships, sex steroid receptor activation profiles, and secondary pharmacology screening.
Here, we present two different approaches for sampling and data analysis that may assist in determining sample sizes needed for evaluation of female reproductive hormones in intact rats. In Table 2, power calculations (two-group t-test for fold change assuming log-normal distribution) for serum concentrations of P4, E2, LH, and PRL in single-point samples from staged female rats on day 1 of diestrus are shown. During diestrus-1, P4 values have declined, E2 is beginning to increase, and LH and PRL are low (Figure 1). Based on the variability in this data set it can be concluded, for example, that when taking single-point measurements at diestrus-1, group sizes should not be less than 23 to detect a 2-fold change in P4 with a power of 80%. In contrast, a group size of around 5 is sufficient to have the same power for E2 and LH, mostly because E2 and LH levels are very low, and variability is minimal at this part of the cycle. For PRL, a notoriously variable hormone, a group size of 72 would be needed to detect a 2-fold change.
Table 2. Power calculation for ovarian and pituitary hormones in female rats.
Note: CV, coefficient of variation. Shown here are the numbers of animals needed to detect 3-, 2-, 1.5-, and 1.25-fold changes (increase or decrease) in P4, E2, LH, and PRL, in a single-point sample during diestrus-1, with power values of 50% and 80%.
(a) Hormone data came from intact Sprague-Dawley rats 10–12 weeks of age, sampled by decapitation.
(b) Log-normal transformation was performed before analysis.
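To illustrate how group sizes of this kind follow from hormone variability, the sketch below uses the standard normal-approximation sample size formula for a two-group comparison on the log scale. It assumes log-normally distributed concentrations with equal CV in both groups, so that the log-scale variance is ln(1 + CV^2), and it assumes a two-sided alpha of 0.05. The function name and the CV value in the usage lines are hypothetical; the tabulated values were derived from the study data and may differ, and an exact t-test calculation typically adds about one animal per group.

```python
import math
from statistics import NormalDist


def animals_per_group(fold_change, cv, power=0.80, alpha=0.05):
    """Approximate animals per group for a two-group comparison of
    log-transformed hormone values (normal approximation).

    fold_change: ratio of group means to detect (e.g., 2.0 or 0.5)
    cv: coefficient of variation on the original scale (1.0 = 100%)
    """
    sigma_sq = math.log(1.0 + cv ** 2)       # variance of ln(concentration)
    delta = abs(math.log(fold_change))       # effect size on the log scale
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)   # two-sided significance level
    z_power = z.inv_cdf(power)
    n = 2.0 * sigma_sq * (z_alpha + z_power) ** 2 / delta ** 2
    return max(2, math.ceil(n))


# Hypothetical CV of 100%; not a value taken from the tables.
for fc in (3.0, 2.0, 1.5, 1.25):
    print(fc, animals_per_group(fc, cv=1.0))
```

Because the calculation works on the log scale, an increase and a decrease of the same fold change require the same number of animals, and the required number rises steeply as the fold change to be detected approaches 1.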
Another example is shown in Table 3, which shows power calculations based on area-under-the-curve (AUC) values calculated from multiple (3–6) samples of P4 and E2 from staged female rats during proestrus and the early morning of estrus. During proestrus, E2 peaks and then sharply declines, whereas P4 levels start out low and then show an LH-induced increase. Here, a 2-fold change with 80% power requires 18 animals per group for E2 but only 8 animals per group for P4.
Table 3. Power calculation of AUC of P4 and E2 in female rats.
Note: AUC = area-under-the-curve; CV = coefficient of variation. Shown here are the numbers of animals needed to detect 3-, 2-, 1.5-, and 1.25-fold changes (increase or decrease) in the AUC of the P4 and E2 pattern during proestrus/early estrus with power values of 50%, 60%, 70%, 80%, and 90%.
(a) Hormone data came from intact Wistar Hannover rats 10–45 weeks of age, sampled by tail vein.
(b) Log-normal transformation of the AUC was performed before analysis. In case of missing values, imputation was used to estimate AUC.
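For readers unfamiliar with the AUC approach, the sketch below shows one common way to reduce a serial hormone profile to a single value per animal: the linear trapezoidal rule, followed by a log transformation before group comparison. The source does not state which integration rule or imputation method was used, so both the trapezoidal rule and the decision to reject incomplete profiles here are assumptions; the sampling times and concentrations are made up.

```python
import math


def auc_trapezoid(times_hr, concentrations):
    """Area under the concentration-time curve by the linear
    trapezoidal rule. Requires at least two time-ordered samples
    with no missing values (no imputation is attempted here)."""
    if len(times_hr) != len(concentrations) or len(times_hr) < 2:
        raise ValueError("need two or more paired time/concentration values")
    area = 0.0
    for i in range(1, len(times_hr)):
        dt = times_hr[i] - times_hr[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        area += dt * (concentrations[i] + concentrations[i - 1]) / 2.0
    return area


# Made-up profile: samples at 0, 2, and 4 hr.
auc = auc_trapezoid([0.0, 2.0, 4.0], [1.0, 3.0, 1.0])
log_auc = math.log(auc)   # log scale used for the group comparison
print(auc, log_auc)
```

Summarizing several samples per animal in this way averages over short-term fluctuation, which is one reason an AUC endpoint can need fewer animals than a single-point measurement of the same hormone.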
In nonrodent species, such as dogs and nonhuman primates, where small group size is standard in nonclinical toxicity studies, sample size may limit the use of single-point measurements for female reproductive hormones. This is illustrated in Table 4. Power calculations are based on the variability of sex steroids in 52 female macaques, showing that the number of animals needed to detect a 2-fold change in E2 with a power of 80% from a single-point measurement during the follicular phase is 28. However, in these species, longitudinal sampling following the estrous or menstrual cycle can be applied (see below).
Table 4. Power calculation of P4 and E2 in adult female cynomolgus macaques.
Note: CV = coefficient of variation. Shown here are the numbers of animals needed to detect 3-, 2-, 1.5-, and 1.25-fold changes (increase or decrease) in P4 and E2 levels during the follicular and luteal phase with power values of 50% and 80%.
(a) Hormone data came from intact adult female cynomolgus macaques 8–20 years of age, housed in all-female social groups of 3–5 (Stute et al. 2004). Follicular and luteal phase samples were paired from the same group of animals. Mean day of cycle (±SD) based on vaginal cytology was 11.2 ± 1.5 for follicular phase samples and 21.2 ± 1.6 for luteal phase samples.
(b) Log-normal transformation was performed before analysis.
Reproductive Hormone Measurements
In nonclinical toxicity studies of standard design with groups of unsynchronized or unstaged intact female animals, it is generally not recommended to add reproductive hormone measurements as retrospective, exploratory, or post hoc assessments (e.g., as part of a clinical pathology blood sample). Instead, to further investigate a finding potentially related to changes of reproductive endocrine profile, a separate targeted (investigative) study properly designed to address the question will reduce the risk of false-positive results and offer the highest probability of an accurate evaluation of mechanism. Targeted hormone measurements may be selected based on general pathologic observations and/or specific reproductive toxicity studies (Table 1).
Reproductive hormone analyses in intact cycling female animals require sampling according to stage (e.g., proestrus, estrus, met/diestrus, or anestrus), and, particularly for rodents, time of day. In rats and mice, the timing of ovulation, and consequently the entire estrous cycle, is strictly regulated by circadian rhythm (Figure 1). In rodents and canines, vaginal smear cytology enables identification of estrous cycle stage for a particular individual (Goldman, Murr, and Cooper 2007; Li and Davis 2007). In nonhuman primates, long-term monitoring for menstrual bleeding patterns in combination with vaginal cytology can be used to estimate cycle stage (e.g., Stute et al. 2004). However, it is important to remember that even when stage is characterized, a subset of nonhuman primate females may not cycle appropriately due to stress, age, and other factors (Kaplan and Manuck 2004), and any additional hormonal disturbance is likely to further increase the number of animals needed for adequate statistical power.
Several other factors may influence reproductive hormone measurements. In the bitch, analyzing reproductive hormones is complicated by relatively rare estrous cycles (1–2 per year) and variation in ovarian hormones during stages of the estrous cycle other than anestrus. The most stable period for analyzing reproductive hormones (E2, P4, PRL, FSH, and LH) in the bitch is during anestrus (Rehm, Stanislaus, and Williams 2007), although any suppression of the HPG axis can be difficult to detect at this stage because of low basal levels. Assessment of reproductive hormones in female macaques housed in social groups is complicated by social status hierarchies. In females, social subordination may lead to decreased menstrual cyclicity and circulating ovarian hormones (Kaplan et al. 2010) and introduce additional variability across groups in both hormone measures and potential secondary morphologic changes. Diet may also affect female reproductive hormone measures, particularly in rodents and primates fed chow diets high in soy phytoestrogens (e.g., Owens et al. 2003; Stroud et al. 2006). In general, it is recommended that phytoestrogen-free diets be used for all studies evaluating reproductive hormones or related effects.
One useful strategy to assess an effect on a particular stage of hormone physiology is through artificial models, which reduce variability and allow for study of specific mechanisms or parts of a hormone axis. Here, the endocrine physiology is altered to induce a compensatory increase or decrease in the hormone in question, and the effect of a treatment upon this responsive level is examined. For example, cycle suppression may be caused by either a central (hypothalamic or pituitary) suppression of gonadotropins, in which case LH, FSH, E2, and P4 should be low, or a direct suppression of ovarian steroidogenesis, in which case E2 and P4 should be decreased and LH and FSH should be increased in an attempt to compensate. The use of ovariectomized animals is beneficial when investigating compounds that may suppress the HPG axis as it minimizes potential effects of ovarian cycle on treatment outcomes, allowing effects on LH or FSH that are not dependent on steroid feedback to be studied. In the case of ovariectomy, circulating E2 and P4 will be low (or undetectable) while LH and FSH will be high. This background hormone milieu is critical when evaluating effects of different compounds, particularly endocrine agents. For example, selective estrogen receptor modulators (SERMs) such as tamoxifen may increase LH and FSH in a premenopausal context while decreasing gonadotropins in a postmenopausal context (e.g., Bianco et al. 1985; Rossi et al. 2009). Another model used for investigation of reproductive neuroendocrine effects is to restore negative feedback using an estrogen implant in ovariectomized animals, which lowers LH and FSH to physiological levels while providing more stable (noncyclical) E2 concentrations (Sajapitak et al. 2008; Walker 1983).
Giving an exogenous dose of a stimulating agent (e.g., a particular hormone, a hormone analog, or other compound that causes activation of the endocrine system) and measuring the response from the pituitary and/or gonads can also reveal hormonal disturbances and help determine what part of the HPG axis is being directly affected. For example, GnRH agonists (e.g., LH–RH analogs) and products from the Kiss1 gene (e.g., kisspeptin-10) can be used to study the response of LH and sex steroids (e.g., Hoffmann and Schneider 1993; Kawakami, Hori, and Tsutsui 1997; Wahab et al. 2008). However, this evaluation is more easily performed in males (measuring testosterone), and using males should be considered if the hormonal disturbance is occurring in both sexes. If not, staged or ovariectomized females can be used. In immature female rats, ovulation can be induced by human chorionic gonadotropin (hCG) injection (LH surge effect) after priming with pregnant mare’s serum gonadotropin (PMSG)/equine chorionic gonadotropin (eCG; emulating both FSH and LH effects) and used to study effects of compounds on reproductive hormones (Espey et al. 1990, 1992).
Gonadotropins
Because LH is secreted in discrete pulses, informative LH measurements are always difficult to perform, especially in rodents. Precise dissection of LH profiles is more reliably accomplished with frequent sampling at 10- to 20-min intervals over a 4- to 8-hr window, yielding at least 20 samples per animal. This “window approach” usually requires indwelling venous catheters. When multiple sampling within a limited time frame is used in rodents, replacement of plasma volume (by physiological saline solution or resuspended blood cells) may be needed. This approach allows analysis of different parameters, including number of pulses (pulse analysis), basal levels (between peaks), and amplitude of peaks, all of which can give an informative picture of the effects on LH secretion. An alternative approach for evaluating effects on gonadotropin levels (without pulse details) is described below. This method is most useful when there is a clear dose level with morphological changes that could be used to identify/confirm an underlying hormonal mechanism. This design (with some modifications) can also be used to measure other hormones like FSH, PRL, E2, and P4. FSH has higher basal levels and less pronounced pulsatility than LH, and therefore fewer sample points are needed to determine changes.
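When planning a frequent-sampling window, it can help to check how many samples a given interval and duration actually yield. The helper below is a minimal sketch; the function name and the convention of including a time-zero sample are assumptions made for illustration.

```python
def sampling_times_min(interval_min, window_hr):
    """Return sampling times (minutes from start) for a frequent-sampling window.

    Assumes a sample at time zero and at every `interval_min` minutes up to
    and including the end of the window (an illustrative convention).
    """
    if interval_min <= 0 or window_hr <= 0:
        raise ValueError("interval and window must be positive")
    end_min = int(window_hr * 60)
    return list(range(0, end_min + 1, interval_min))


# Number of samples per animal for the corner cases of the quoted ranges.
for interval, window in [(10, 4), (20, 4), (10, 8), (20, 8)]:
    n = len(sampling_times_min(interval, window))
    print(f"{interval}-min interval over {window} hr: {n} samples")
```

Under these assumptions, a 10-min interval over 4 hr gives 25 samples, whereas a 20-min interval over 4 hr gives only 13, so the longer interval needs a window toward the upper end of the range to reach 20 samples per animal.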
Detection of Changes in Gonadotropins in Rodents
Choose females with regular 4- or 5-day estrous cycles, typically with a minimum of 3 consecutive regular cycles. Plan to start dosing with the test article on a specific day of the estrous cycle (typically estrus). In female rodents, the best time to detect a decrease in gonadotropins is during the gonadotropin surge. Therefore, timing the dosing of the test article such that the time of maximum blood concentration (Tmax) precedes the naturally occurring gonadotropin surge on proestrus allows detection of any inhibitory effect on gonadotropins. The surge typically occurs in the afternoon of proestrus (Figure 1). To detect an increase in gonadotropins, it is more appropriate to collect blood when hormone concentrations are low and more stable. In rodents, the most appropriate time for this is diestrus-1. To capture elevations in gonadotropins, ensure the number of animals in each dose group is large enough to allow collection of blood samples at a minimum of two time points about 2 hr apart. Blood collection should be via remote sampling by indwelling catheters or decapitation.
Prolactin
PRL is a peptide hormone with various endocrine functions, including regulation of the female reproductive cycle. In rodents, PRL sustains P4 secretion and supports luteolysis (Ben-Jonathan, LaPensee, and LaPensee 2008; Freeman et al. 2000; Grattan and Kokay 2008). PRL is circadian-gated, altered by stress, and shows a distinct pattern of peaks during the ovulatory cycle; therefore, careful study design and a targeted sampling strategy are needed to evaluate treatment effects on PRL concentrations. Similar to gonadotropins, the stage of the cycle is important when measuring PRL. Furthermore, PRL is affected by stress (Gala 1990), and the blood sampling method should be taken into consideration when evaluating results. Blood collection via remote sampling by indwelling catheters is preferred, and of the commonly used euthanasia methods, decapitation is considered least stressful for rodents (Cohen et al. 1983; Döhler et al. 1977). Typically, anesthetics or CO2 are not used as they may affect PRL concentrations in rodents (Lawson and Gala 1974). When sampling for PRL in large animals, careful handling with minimal stress is imperative when not using remote sampling by indwelling catheters. Use of anesthetics such as ketamine may also affect PRL concentrations in macaques and should be considered when evaluating potential treatment effects (Rizvi et al. 2001).
In cycling female rodents, PRL concentrations are elevated only during proestrus (Figure 1). To detect elevations in PRL, the study design described above for gonadotropins can be used with some modification. Elevated PRL concentrations in the rat will halt cycling or lengthen the individual cycles and produce vaginal cytology typical of diestrus. Therefore, the study design needs to be adapted so that when persistent diestrus is observed, the animals are euthanized and blood is collected, typically after about 5 days of persistent diestrus. The controls would be allowed 2 to 3 cycles and then be euthanized on diestrus-1.
Since PRL concentrations are lower during most stages of the estrous cycle in rodents, detecting decreases in PRL may be challenging. To best capture a decrease in PRL concentrations, one needs to time the collection for a day of proestrus (typically in the afternoon) when elevated PRL levels are typically observed. A rodent study design similar to the one described to examine decreased gonadotropins could be employed to evaluate lowered PRL concentrations.
PRL in female rodents is under positive regulation by E2, and ovariectomized rats and mice thus have low concentrations of PRL. In ovariectomized animals, an E2 injection will cause a marked release of PRL (after a delay of several hours), and any inhibitory effect on this system can be studied by remote sampling in cannulated animals or by blood sampling following decapitation (Brott et al. 2012; Murai and Ben-Jonathan 1990).
Ovarian Steroid Hormones
The primary sex steroids produced by the ovary are E2 and P4. When measuring these hormones in cycling female animals, both hormones should be measured whenever possible. Physiologically, these two hormones can have antagonistic effects (Gambrell, Bagnell, and Greenblatt 1983). For example, P4 inhibits E2 effects in the endometrium and downregulates estrogen receptors in some tissues. As these two hormones change dramatically across the ovarian cycle and in response to hypothalamic-mediated factors, care should be taken when measuring these hormones to control for estrous/menstrual cycle stage and potential sources of stress.
In rodents, elevations of P4 and E2 are best evaluated with a study design similar to that described above for detecting increased PRL. However, when elevated E2 is present in most instances the vaginal cytology exhibits a predominance of superficial cornified cells reflecting extended stages of estrus (Li and Davis 2007). Thus, the more appropriate time to sample blood is at diestrus-1. When measuring elevated E2 concentrations, dosing should commence on a given stage of the estrous cycle (e.g., diestrus or estrus) and continue for a predefined period (e.g., 2–3 estrous cycles). Blood should then be collected by decapitation on diestrus-1 after completion of 2 or 3 estrous cycles.
In cycling female rodents, decreases in E2 could be evaluated by employing an experimental design similar to that for elevated E2 (i.e., analyzing hormone levels at diestrus-1) or using a design similar to the one employed to measure decreased gonadotropins. Decreases in P4 could be measured at early metestrus or in pregnant animals, when P4 concentrations are elevated and thus more likely to show a detectable decrease. It should be noted that, as with other hormones, measuring decreases in E2 and P4 when they are naturally at lower levels is technically challenging due to assay detection limits.
In cycling canines and primates, potential elevations in E2 and P4 are best detected during cycle stages when circulating concentrations are typically low. In the bitch, this time would be during anestrus for P4 and during early to mid-anestrus for E2. In macaques, P4 and E2 are both low during the early follicular phase (days 1–7 after onset of menses; Weinbauer et al. 2008). Identifying decreases in E2 and P4 is more technically challenging, given the greater fluctuation and variability in these hormones (particularly E2) during periods of higher concentration. Luteal insufficiency (decreased luteal phase P4) is best detected in mid-metestrus in the bitch and mid-luteal phase in the macaque.
Hormone Assays and Data Analysis
Standard methods for serum hormone analysis include radioimmunoassay (RIA), enzyme-linked immunosorbent assay (ELISA), and multiplex immunoassay. Liquid chromatography/mass spectrometry (LC/MS)-based methods are also becoming more widely used (as cost and sample size requirements decrease), particularly for measurement of estrogens and estrogen metabolites (Xu et al. 2007). Proper assay validation is necessary for all of these techniques at physiologically high and low concentrations (see Stanislaus et al. 2012). For P4 and E2, rodent-specific immunoassays are commercially available and relatively straightforward to use but may have different limits of detection and produce different values based on the kit (Haisenleder et al. 2011; Ström, Theodorsson, and Theodorsson 2008). For primates, human E2 and P4 immunoassays are generally used with minor modifications (Pazol et al. 2004). As noted above, circulating E2 concentrations vary widely across the normal cycle, increasing by more than 20-fold from early to late follicular phase in some species. Concentrations of P4 are higher than E2 in cycling animals but with less range across the cycle. This cyclical variation and range, and the potential for low values, highlight the need for sensitive and high performance assays to correctly detect treatment effects. Hormone assay specificity should also be considered (Stanislaus et al. 2012). Occasionally, the test article (or downstream metabolites) can interfere with the assay. The package insert may give some indications of known cross-reactive species, but the only way to rule out cross-reactivity is to assess reactivity of the test article in the assay in advance. Sulfate and glucuronide hormone conjugates (which are typically present in serum at much higher concentrations than corresponding unconjugated steroids) may also cross-react with hormone antibodies used in RIA and ELISA assays. 
To reduce conjugate reactivity (or other potential matrix effects), lipid extraction of serum with diethyl ether is often used (e.g., Ankarberg-Lindgren and Norjavaara 2008). Extracted serum may provide different results from unextracted serum and thus proper steps should be taken to standardize and validate protocols using this procedure.
For rodents, species-specific LH, FSH, and PRL immunoassays (single or multiplex) are commercially available. For macaques, human pituitary hormone immunoassays have been used, but appropriate method validation is needed to confirm this within individual laboratories. For dogs (and many other larger nonprimate mammals), the commercial availability of immunoassays for LH, FSH, and PRL has historically been inconsistent. Pituitary hormone immunoassays may be either heterologous assays based on antisera to LH, FSH, or PRL of species other than the target species, or homologous assays, which use purified hormone preparations of the target species as standards. A canine pituitary hormone multiplex bead-based assay is also now available (Millipore, Billerica, MA), offering an efficient way to measure 7 pituitary hormones in concert, although validation of this multiplex assay has not been published to date. It should also be emphasized that establishing in-house immunoassays is laborious, and regular use of the assay is needed to maintain quality.
Variability in hormone results caused by pulsatility, cyclicity, and frequent occurrence of values at or below the limit of detection of the assay (especially for LH between pulses and E2 in early diestrus or early follicular/late luteal phase) may require transformation of raw values (e.g., log transformation) to reduce heterogeneity of variance and improve the distribution of the data before statistical analysis. When frequent multiple sampling during the cycle is employed to capture the dynamic pattern of P4 and E2 and surges of LH and/or PRL, calculated area under the curve (AUC) values may be suitable estimates to use for statistical analysis. Using AUC also provides an effective way to statistically analyze the hormonal response following a stimulation challenge.
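The two data-handling steps mentioned here, transformation of raw values and AUC calculation, can be sketched as follows. This is a minimal illustration, not a validated analysis pipeline: the function names are hypothetical, the AUC uses the linear trapezoidal rule, and replacing values below the limit of detection (LOD) with LOD/2 before taking logarithms is one common convention assumed here rather than a recommendation from this article.

```python
import math


def trapezoid_auc(times_hr, concentrations):
    """Area under the concentration-time curve by the linear trapezoidal rule."""
    if len(times_hr) != len(concentrations) or len(times_hr) < 2:
        raise ValueError("need at least two paired time/concentration points")
    auc = 0.0
    for i in range(1, len(times_hr)):
        dt = times_hr[i] - times_hr[i - 1]
        if dt <= 0:
            raise ValueError("sampling times must be strictly increasing")
        # Area of one trapezoid: interval width times mean of the two heights.
        auc += dt * (concentrations[i] + concentrations[i - 1]) / 2.0
    return auc


def log10_transform(values, lod):
    """log10-transform hormone values before statistical analysis.

    Values below the assay LOD are replaced by LOD/2 (an assumed convention;
    other substitution or censored-data methods may be more appropriate).
    """
    if lod <= 0:
        raise ValueError("lod must be positive")
    return [math.log10(v if v >= lod else lod / 2.0) for v in values]
```

For instance, `trapezoid_auc([0, 1, 2], [0, 2, 0])` gives 2.0 (in concentration units multiplied by hours), and the per-animal AUC values or log-transformed concentrations can then be passed to the chosen statistical test.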
Screening Programs
Recommendations provided in this article relate generally to the evaluation of pharmacologic agents. For environmental agent toxicity assessments, a comprehensive battery of standardized assays has been established to identify potential endocrine-disrupting compounds (EDCs). These assays include in vitro and in vivo tests designed to detect changes in hormone receptor and steroidogenic activity. Detailed information on EDC screening tests, including guideline requirements, is provided elsewhere in reports from the U.S. Environmental Protection Agency (USEPA 2009a), OECD (2002), World Health Organization (WHO 2003), and others (O’Connor et al. 2002). Screening guidelines at the USEPA, for example, use a two-tiered approach to determine whether an environmental compound may pose a risk to human health or wildlife due to estrogen, androgen, or thyroid hormone activity (USEPA 2009b). The current tier 1 battery includes 12 assays, none of which requires female reproductive hormone assessments. Evidence from tier 1 assays (along with other relevant data) is then used in a weight-of-evidence approach to determine which, if any, tier 2 tests are necessary. The goal of tier 2 testing is to identify adverse endocrine-related effects and establish dose relationships. While tier 2 assay guidelines are still in the process of validation and peer review, female reproductive hormone assessments may be included in select assays using a targeted approach. Female reproductive hormones may also be measured as part of specific mode-of-action evaluations of environmental compounds for which a potential endocrine-related tumorigenic effect has been observed. Current programs such as the USEPA Endocrine Disruptor Screening Program for the twenty-first century are now applying computational and molecular technologies to increase efficiency and accuracy of EDC screening. 
In the future, these types of tools should provide additional power to identify and prioritize testing of compounds that may affect female reproductive hormones.
Conclusions
Identifying effects of pharmaceutical, chemical, and environmental agents on female reproductive hormones can be complicated by many factors, most notably estrous/menstrual cyclicity. Due to this complexity, hormonal measurements are not generally recommended to be included in nonclinical toxicity studies of conventional design unless followed longitudinally (e.g., in female macaques with menstrual cycle evaluations) and in properly staged animals. For some endocrine imbalances, longer periods of time may be required before any downstream pathologic changes appear. Study duration is thus an important consideration when relating hormonal changes and reproductive tissue effects. However, the underlying endocrine mechanism may be possible to investigate in shorter term studies in the appropriate model.
Alterations in sex hormone concentrations should always be considered in context of other information related to endocrine changes. This information includes in vivo changes in weight and morphology of hormone-responsive tissues, reproductive function, estrous/menstrual cyclicity, development, and behavior, as well as biochemical and in vitro data, structure–activity relationships, sex steroid receptor activation profiles, and molecular signals induced by the compound of interest. In most cases, investigations into female reproductive endocrine disturbances should be targeted studies, and this article outlines some approaches to assist in the design of these studies. The most important considerations are staging of animals with respect to their reproductive cycle, minimizing sources of individual variation, proper determination of animal numbers, use of dose levels that do not adversely impact food consumption or body weight, appropriate assay validation, and understanding species differences to ensure accurate interpretation of results.
Footnotes
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
The authors declared receipt of the following financial support for the research, authorship, and/or publication of this article: The recommendations in this article are endorsed and supported by the STP.
Acknowledgment
The authors would like to thank Gunnar Nordahl for his statistical expertise and assistance with the power calculations.
Authors’ Note
This review series is a product of the Society of Toxicologic Pathology (STP) Working Group and has been reviewed and approved by the Scientific and Regulatory Policy Committee and Executive Committee of the Society. The article does not represent a formal best practice recommendation of the Society but provides expert guidance on key principles to consider in designing regulated toxicity studies. This article has been reviewed by the U.S. Environmental Protection Agency and approved for publication. Approval does not signify that the contents reflect the views of the agency, and mention of trade names or commercial products does not constitute endorsement or recommendation for use.
