Abstract
Normal reproductive aging
Women possess the maximum number of oocytes in their life, roughly a total of 6–7 million, at 20 weeks gestation. From this point forward there is an irreversible attrition in the number of germ cells. The prevailing dogma for decades has been that the mammalian ovary is incapable of producing new germ cells after birth. Recent work in the mouse casts doubt upon this belief [1]; however, clear proof of neo-oogenesis in adult humans has yet to be produced [2,3]. At birth, the follicle number is already reduced to 2 million and declines to 300,000 at puberty (Figure 1). During her reproductive years a woman will only release approximately 400–500 oocytes (i.e., ~12/year from menarche at the average age of 12 years until menopause at an average age of 51–52 years) [4].
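The 400–500 figure follows from simple arithmetic on the ages quoted above. A minimal sketch, assuming about 12 ovulations per year with no interruptions (pregnancy, lactation and anovulatory cycles are ignored); the function name is illustrative only:

```python
def estimated_ovulations(menarche_age, menopause_age, cycles_per_year=12):
    """Rough count of oocytes released between menarche and menopause.

    Assumes one ovulation per cycle and no interruptions, which is a
    simplification made for this illustration.
    """
    return (menopause_age - menarche_age) * cycles_per_year


# Ages quoted in the text: menarche at 12, menopause at 51-52 years.
low = estimated_ovulations(12, 51)
high = estimated_ovulations(12, 52)
print(low, high)
```

Both estimates fall inside the quoted 400–500 range, a tiny fraction of the roughly 300,000 follicles present at puberty.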

Figure 1. Decline in follicle number after birth.
There is an accelerated rate of follicular atresia, which begins in most women in their late thirties. However, this more rapid decline is initiated at a different age, and occurs at a variable rate for each woman. It is extremely difficult to predict how this will manifest in a specific individual. In most women, fertility is optimal in their twenties, starts to decline after 30 years of age, and decreases dramatically after the early forties. This is caused by progressive decline in the quantity of their oocyte pool as a result of atresia, particularly after 35 years of age [5,6].
Age & fecundity
As the follicle pool declines, parallel decreases in fecundity and a rise in the incidence of aneuploid conceptions and resultant miscarriages also occur. Navot and colleagues established that it is ovarian aging, not uterine, that impacts a woman's ability to conceive [7]. They studied the transfer of embryos from young healthy oocyte donors into two groups of women. Group one included recipients younger than 40 years, and group two included only women over 40 years of age. They reported equal pregnancy and miscarriage rates in both groups. It has since been noted that women can have successful pregnancies well into their fifties using donor oocytes.
An age-related decline in fecundity has been well established. A woman's age remains a rough estimate of how many oocytes remain in her follicular pool. Compared with women 20–24 years old, fertility is reduced on average by 6% for women between 25–30, 14% in those 30–34, and 31% in those women 35–39 [8].
Abundant data and clinical experience have proven that female age is an important predictor of the success of infertility treatments of various types. One large, well-designed study removed the confounding effect of other infertility factors by only studying women whose husbands were completely azoospermic (thus the women themselves were felt to be most likely of normal fertility). The Centre d’Etude et de Conservation des Oeufs et du Sperme study of donor sperm insemination was a large multicenter trial in over 2000 nulliparous women with azoospermic husbands. Their results indicated that fertility fell significantly with female age greater than 30 years. In women over 35 years the cycle fecundity per insemination was half that of younger women [9].
Various investigators have shown that female age is also a strong predictor of how a patient will respond to more aggressive fertility treatments, such as IVF and other assisted reproductive technologies, which utilize injectable gonadotropins [10,11]. Tan et al. looked at 5055 consecutive IVF cycles in a single IVF center [12]. Both conception and live birth rates per cycle declined with age (p < 0.001). Cumulative live birth rates after five treatment cycles were about 45% at 20–34 years, compared with 28.9% at 35–39 years and 14.4% at greater than or equal to 40 years.
Templeton and colleagues reviewed 36,961 IVF cycles from 1991–1994 in the UK to examine rates of live birth per cycle [13]. The highest live birth rates were in those under 30 years of age, followed by 17% at 30 years, 7% at 40 years and 2% at 45 years. In Australia, Jansen et al. examined the effect of female age on the likelihood of a live birth from a single IVF treatment [14]. In women aged 35 years or younger, the chance of a live birth was 52.4%. For women aged 35–44 years, there was a linear decline in the live birth rate, and no babies were born in women over 45 years. There was also an age-dependent rise in the frequency of miscarriages, from 10.5% for women under 35 years, to 16.1% for those 35–39 years and 42.9% for those over 40 years of age (p < 0.001).
This illustrates that there is not only an age-related decline in fertility, but also an increase in the incidence of spontaneous pregnancy loss (Figure 2). Based upon karyotypic analysis of tissue obtained from products of conception, this elevated risk is mostly caused by a rise in aneuploid conceptions. In natural conception cycles, prior to 30 years of age, the clinically observed miscarriage rate is 7–10%, from 30–34 years it is 8–21%, from 35–39 years it is 17–28% and at 40 years and older it is 34–52% [4].

Figure 2. Age and reproduction in women.
Although it has been clearly proven that there is an age-related decline in female fecundity in both the fertile and infertile population, extensive clinical experience has shown that age alone is far from exact in predicting pregnancy. At any age there is tremendous individual variability in fecundity, and most clinicians are aware of the ‘change of life baby’ phenomenon and the difficulty in determining at what age women may safely discontinue contraception. The limited value of age alone has led to the quest for tests that may help to determine oocyte quality and quantity and predict fecundity for an individual woman, namely ovarian reserve testing. These tests have attempted to measure the functional health of the follicular pool. Most of this work has been performed in the infertile population about to undergo IVF. These tests have been used to predict a woman's response to gonadotropin stimulation and her prognosis for pregnancy success prior to undergoing IVF. Women with an abnormal test may have a very low chance for pregnancy and are often counseled to pursue oocyte donation or adoption instead of infertility treatments utilizing their own oocytes.
The general obstetrician and gynecologist should be familiar with these tests and understand their proper performance and interpretation, including their limitations. This will allow them to properly counsel patients regarding their prognosis for infertility treatment success, prior to initiating time consuming and expensive interventions. In this paper we will review the correct timing and interpretation of the most commonly accepted tests of ovarian reserve.
Tests of ovarian screening
Basal follicle-stimulating hormone
The most widely used endocrine marker for ovarian reserve is the early follicular phase or basal follicle-stimulating hormone (FSH) level. Blood for an FSH determination is drawn on cycle day 3 (can be done on days 2–4) of the menstrual cycle. It is important that this level is obtained in the early follicular phase as a result of variations in the FSH levels throughout the menstrual cycle. The physiological mechanism behind this test is as follows. As the follicular pool diminishes, there is a decreased production of estradiol (E2) and inhibin B by the ovary. This leads to a lack of negative feedback at the level of the pituitary. This lack of suppression leads to a rise in pituitary FSH production.
Follicle-stimulating hormone levels are not generalizable; the normal range is assay and laboratory dependent [15]. It is important for each lab to establish a critical FSH threshold value above which there is a clear difference in treatment outcome. In general, FSH levels ± 25% of the upper range of normal in the follicular phase represent the borderline group, while patients with values greater than 25% above the upper limit have diminished ovarian reserve. In our lab, a solid phase, two-site chemiluminescent immunometric assay (Immulite®, DPC, New Jersey, USA) is used. Levels of 4–13 mIU/ml are considered to be the normal follicular phase range. Therefore, basal FSH levels of 10.5–15.5 mIU/ml are considered borderline (± 25%), and patients with levels of 16 mIU/ml or greater are considered to have significantly diminished ovarian reserve and little or no chance for success with infertility treatments [McCulloh DH: University Reproductive Associates, Internal Data Review 2006, Unpublished Data].
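The general ± 25% rule can be sketched in a few lines. This is an illustration under stated assumptions, not the laboratory's actual procedure: the function name and category labels are hypothetical, and the cutoffs are computed directly from the upper limit of normal. A laboratory's own published cutoffs, such as those quoted above, are presumably derived from outcome data and need not match this raw arithmetic exactly, so a real implementation would take them as inputs.

```python
def classify_basal_fsh(fsh, upper_normal, band=0.25):
    """Classify a basal (cycle day 2-4) FSH value against a lab's upper limit.

    fsh and upper_normal must be in the same units (e.g. mIU/ml).
    band is the fractional width of the borderline zone (25% in the text).
    """
    lower_cut = upper_normal * (1 - band)  # start of the borderline zone
    upper_cut = upper_normal * (1 + band)  # end of the borderline zone
    if fsh > upper_cut:
        return "diminished ovarian reserve"
    if fsh >= lower_cut:
        return "borderline"
    return "within normal range"


# Example with an upper limit of normal of 13 mIU/ml, as in the assay above.
for value in (8.0, 12.0, 18.0):
    print(value, classify_basal_fsh(value, upper_normal=13.0))
```

Passing the upper limit as a parameter keeps the sketch assay-independent, which matters because FSH ranges are not generalizable across laboratories.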
Many studies have demonstrated that basal FSH level is a better marker for ovarian reserve than chronologic age. Toner et al. were among the first to report that, in women undergoing IVF, their basal FSH level was a much better predictor of in vitro fertilization performance than age [16]. Patients with elevated basal FSH levels had a higher chance of cancellation for inadequate response to gonadotropin stimulation, produced fewer oocytes and achieved significantly lower pregnancy rates independent of age. Cahall et al. also found a significant negative correlation between FSH levels and ovarian follicular responsiveness to maximal exogenous stimulation using gonadotropins in IVF cycles, and that these effects were independent of chronologic age [17].
It has been suggested that the chances of a false-positive screening FSH test may be greater in younger women, who clearly have a lower pretest probability of diminished ovarian reserve than their older counterparts.
There are data to support the reduced reliability of basal FSH testing in this younger population. Chaung et al. observed that, among women below 35 years of age undergoing IVF, those with a basal FSH of less than 10 mIU/ml had better IVF performance measures (higher peak serum E2 levels, more follicles retrieved, fewer cancelled cycles and higher clinical pregnancy rate) than those with a higher basal FSH [18]. Younger women with an elevated basal FSH may have fewer follicles, but the quality of their remaining follicles may remain somewhat preserved as compared with older women with similar numbers of follicles.
Akande et al. also confirmed that FSH has particular importance in older women [19]. Although both age and basal FSH were independently associated with risk of treatment cancellation, this risk was lower in women under 35 years irrespective of serum FSH. In a prospective observational study, Van Rooij et al. demonstrated that age remains the critical factor associated with poor response and embryo quality. Women aged over 41 years who had normal FSH levels fared much worse than women aged below 41 years who had elevated FSH levels [20].
In summary, basal FSH screening is a useful measure of ovarian reserve, but results must be interpreted with caution in women younger than 35 years. In these women, an elevated basal FSH may represent a smaller follicular pool, but those that remain may be of sufficient quality to result in live birth [19,21].
Estradiol levels
Combining day 3 FSH and E2 improves on the prognostic ability of either of these hormones used alone; therefore, an E2 level should be drawn along with the FSH level in the basal state [22,23]. Owing to negative feedback, elevated E2 levels may lead to falsely low or normal FSH levels, even in women nearing menopause. Thus, poor ovarian function may be masked by an elevated E2 level. The bottom line for interpretation of these values is that you cannot completely rely upon an FSH level for prognostic purposes if the E2 level drawn at the same time is over 80 pg/ml. The patient must be instructed to return in another menstrual cycle, usually a day earlier (e.g., on cycle day 2 if the high E2 was seen on cycle day 3). An elevated cycle day 3 E2 concentration (greater than 80 pg/ml) may be a marker for diminished ovarian reserve, low fecundability and poor response to ovarian stimulation. The physiologic mechanism behind this is as follows. This earlier acute rise in E2 levels results from advanced follicular development at the beginning of the cycle and earlier selection of the dominant follicle. FSH rises in response to low inhibin B levels, and this in turn increases E2 production in the early follicular phase [22].
It may also represent poor timing of the test (i.e., the woman is not in the basal state, but is actually mid-cycle having periovulatory bleeding) or the normal state of a woman with chronic anovulation/polycystic ovarian syndrome (PCOS). A concomitant ultrasound will easily differentiate among these three possibilities: the mid-cycle patient will have a dominant follicle visualized, the woman with PCOS will have more than the usual number of small (antral) follicles (a normal antral follicle count [AFC] in a reproductive aged woman would be a total of 10–20 on both ovaries), and the woman with diminished ovarian reserve will have a low AFC (usually under 10).
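The three-way distinction made on ultrasound can be written as a short decision rule. This is only a sketch of the reasoning in the paragraph above: the function name and return labels are hypothetical, the AFC limits (under 10, over 20) are the approximate figures quoted in the text, and a finding that fits none of the three patterns is returned as indeterminate.

```python
def explain_elevated_day3_e2(dominant_follicle_seen, afc):
    """Suggest why a 'day 3' E2 is elevated, using concomitant ultrasound.

    dominant_follicle_seen: True if a dominant follicle is visualized.
    afc: total antral follicle count on both ovaries.
    """
    if dominant_follicle_seen:
        return "mistimed test (patient is mid-cycle)"
    if afc > 20:
        return "chronic anovulation/PCOS pattern"
    if afc < 10:
        return "diminished ovarian reserve pattern"
    return "indeterminate"  # normal AFC (10-20), no dominant follicle


print(explain_elevated_day3_e2(False, 6))
```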
Measurement of both FSH and E2 on cycle day 3 may, thus, help to decrease the incidence of false-negative tests based on measurement of FSH alone. When both FSH and E2 are elevated on day 3, ovarian response to stimulation is likely to be very poor [23,24].
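The way an E2 level modifies the reading of a basal FSH can be summarized in code. A minimal sketch, assuming the 80 pg/ml E2 limit given in this section and a laboratory-specific FSH cutoff supplied by the caller; the function name and the wording of the returned messages are hypothetical.

```python
def interpret_day3_screen(fsh, e2, fsh_cutoff, e2_cutoff=80.0):
    """Combine basal FSH (mIU/ml) and E2 (pg/ml) drawn on the same day.

    fsh_cutoff is assay and laboratory dependent, so it has no default.
    """
    fsh_high = fsh > fsh_cutoff
    e2_high = e2 > e2_cutoff
    if fsh_high and e2_high:
        return "both elevated: very poor response to stimulation likely"
    if fsh_high:
        return "elevated FSH: suggests diminished ovarian reserve"
    if e2_high:
        # Negative feedback from E2 can hold FSH falsely low or normal.
        return "FSH not reliable: repeat testing in another cycle"
    return "normal screen"


print(interpret_day3_screen(fsh=7.0, e2=95.0, fsh_cutoff=15.5))
```

The third branch is the case that FSH alone would miss, which is why drawing E2 at the same visit reduces false-negative screens.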
Antral follicle counts
The AFC is a noninvasive, easily performed test. The number of follicles measuring 2–10 mm in both ovaries is counted using transvaginal ultrasonography during the follicular phase of the menstrual cycle, usually on day 3. The total number of antral follicles reflects the size of the resting follicular pool, and is usually 10–20 for a normal ovulatory woman during the reproductive years. The more antral follicles there are, the better a woman's ovarian reserve. The AFC pool declines at 3.8% per year in reproductive age women with proven fertility [25]. It has been well established that a higher AFC is associated with a younger age [26–29], lower basal FSH levels [27–29], a decreased amount and duration of gonadotropins used in a controlled ovarian hyperstimulation [25,28,29], and an increase in number of oocytes retrieved [25,28,29].
The antral follicle count has been found to be a valuable predictor of poor response to ovarian stimulation and to be superior to FSH [30,31], inhibin B [31,33], ovarian volume [30,32,33], the clomiphene citrate challenge test (CCCT) [32] and patient age [34]. A higher AFC has also been correlated with a higher clinical pregnancy rate [26,28,35,36] and a lower miscarriage rate [37]. Of course, these indicators are all surrogate markers for the outcome of major interest, namely live birth. Our group has recently demonstrated that the AFC is also significantly predictive of live birth. In IVF patients, an AFC less than or equal to ten was a strong negative predictor of live birth [38].
Clomiphene citrate challenge test
Clomiphene citrate (CC) is an orally administered antiestrogen. The CCCT was first described in 1987 by Navot and colleagues as a dynamic ovarian reserve screen in infertile women over the age of 35 years [39]. It has been described as a ‘stress test’ for the ovaries. The CCCT is administered in the following manner. FSH and E2 levels are drawn on day 3 of the menstrual cycle. On days 5 through 9 of the cycle, CC 100 mg is taken daily. The patient returns on day 10 for another E2 and FSH measurement. The test is designed to predict how women would respond to ovarian stimulation.
Stimulation of ovarian function is elicited by raised pituitary FSH secretion as a result of blockage of E2 feedback by CC. In a woman with normal ovarian reserve, the ovaries will respond to the elevated gonadotropins with follicular development (perhaps multiple follicles) and increased E2. This increased E2 will overcome the CC blockage enough to cause the same or lower FSH level as seen on day 3. Conversely, in a woman with poor ovarian reserve, the ovaries will be unable to respond to the elevated gonadotropins with follicular development (perhaps multiple follicles) and increased E2. Thus, the elevated gonadotropins produced by the CC will remain, and the FSH level on cycle day 10 will be even higher than the day 3 FSH level. An elevated FSH level on either day 3 or 10 constitutes an abnormal test.
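The CCCT protocol and its pass/fail rule can be laid out compactly. A minimal sketch, assuming a single laboratory-specific FSH cutoff applied to both draws; the names `CCCT_SCHEDULE` and `ccct_is_abnormal` are hypothetical, and the 15.5 value in the usage line is the cutoff the authors quote for their own assay, used here purely as an example.

```python
# Cycle day -> action, following the protocol described in the text.
CCCT_SCHEDULE = {3: "draw basal FSH and E2"}
CCCT_SCHEDULE.update({day: "take CC 100 mg" for day in range(5, 10)})
CCCT_SCHEDULE[10] = "draw FSH and E2 again"


def ccct_is_abnormal(day3_fsh, day10_fsh, fsh_cutoff):
    """Return True if the CCCT is abnormal.

    An elevated FSH on either day 3 or day 10 constitutes an abnormal test.
    fsh_cutoff is assay and laboratory dependent.
    """
    return day3_fsh > fsh_cutoff or day10_fsh > fsh_cutoff


# A normal basal FSH with a raised day 10 value is still an abnormal test.
print(ccct_is_abnormal(day3_fsh=8.0, day10_fsh=18.0, fsh_cutoff=15.5))
```

The usage line shows the case the test is designed to unmask: a woman whose day 3 value alone would have passed a basal FSH screen.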
Navot found that 35% of women had an elevated FSH level after CC administration [39]. In this group, even with extensive assisted reproductive treatments, there was only a 6% clinical pregnancy rate versus a 42% rate in the normal response group. Scott tested the CCCT in a group of general infertility patients [40]. He reported that an abnormal test predicted lower pregnancy rates: pregnancy occurred in 92 of 213 (43%) patients with normal results, but in only two of 23 (9%) patients with abnormal results (p < 0.004).
A total of 85% of women with values above this level will respond poorly to ovarian stimulation. The strength of this test is that it may unmask women with diminished ovarian reserve who would otherwise have had a normal basal day 3 FSH level. Compared with the basal FSH screen alone, a CCCT will pick up two to three times more patients with diminished ovarian reserve [41]. Another advantage of the CCCT is that it has been tested in a more heterogeneous infertility population, while the basal FSH screen has predominantly been tested only in patients undergoing superovulation, usually with IVF [42].
On the negative side, the CCCT is more than twice the cost of a single basal FSH test. It is more invasive and associated with an increased inconvenience. CC can be associated with numerous and troublesome side effects, such as multiple pregnancy (7–10%), vasomotor flushes (10.4%), abdominal/pelvic discomfort or bloating (5.5%), nausea/vomiting (2.2%), breast discomfort (2.1%), visual disturbances (1.5%), headache (1.3%) and abnormal uterine bleeding (1.3%).
Another possible disadvantage of the CCCT is that some studies have reported that it may not be as sensitive in women under 35 years of age, and that more false-positives may occur in this younger age group. However, other studies have found that a failed CCCT is a poor prognostic factor at any age [43]. In general, the CCCT alone has a false-positive rate of 5%. This essentially means that there is a 5% pregnancy rate in women who have failed a CCCT [42]. Therefore, the CCCT alone should not be used as a basis for exclusion from assisted reproductive techniques and progression to oocyte donation or adoption as the only option.
In our practice we recommend screening most women over 35 years, as well as younger patients with suspected diminished ovarian reserve, with the CCCT. In general, patients with abnormal day 3 or day 10 FSH levels (greater than 15.5 IU/l in our assay) are not candidates for IVF. For these patients, we suggest oocyte donation or adoption.
Other ovarian reserve testing
Numerous other methods for measuring ovarian reserve have been investigated including: inhibin-B levels, anti-Müllerian hormone (AMH) levels, ovarian volume, the gonadotropin-releasing hormone (GnRH)-agonist test (GAST), and the exogenous FSH ovarian reserve test (EFORT). To date, the literature suggests that these tests do not provide any more sensitive or reliable information than the previously described tests.
Inhibin B is a hormone produced by the granulosa cells of the ovary. When produced, it exerts negative feedback at the level of the pituitary gland, in turn decreasing FSH levels. During the regular menstrual cycle inhibin B rises in the early follicular phase, then decreases, peaks briefly after the LH surge, and declines in the luteal phase [44]. Since it is directly produced by developing early antral follicles, a decreased basal (day 3) inhibin-B level is thought to precede an elevation in FSH levels.
Seifer et al. described the utility of basal inhibin-B levels in patients undergoing ovarian stimulation for IVF [45]. They found that when day 3 inhibin-B levels were less than 45 pg/ml, the women had fewer follicles retrieved, a higher cycle cancellation rate, and a lower pregnancy rate than women with inhibin-B levels above this threshold.
Corson et al. demonstrated that the CCCT was superior to basal inhibin-B levels in predicting pregnancy rates [46]. The pregnancy rate for women with a low basal inhibin-B level (<45 pg/ml) was 31.8%, while those with normal inhibin-B levels had a pregnancy rate of 34.5% (p < 0.05). However, women who failed the CCCT, with FSH levels ranging between 11–15 IU/l on either day, had a pregnancy rate of 13.6%. Those with a normal CCCT (FSH less than or equal to 9 mIU/ml) had a pregnancy rate of 38.4% (p = 0.03). This study validates the usefulness of the CCCT and fails to find a clinical value for inhibin B testing.
Anti-Müllerian hormone has also been studied as a marker of ovarian aging. AMH is produced by ovarian granulosa cells from approximately 36 weeks' gestation to menopause, and is highest in granulosa cells of growing preantral and small antral follicles. Since growing follicles produce AMH, it has been proposed as a marker of ovarian reserve. It has been demonstrated that serum concentrations of AMH decrease over time in young fertile women.
A possible advantage of this ovarian reserve test, in contrast to inhibin B and FSH, is that serum AMH levels are relatively constant throughout the menstrual cycle [47,48]. Unfortunately, in the USA, this test is currently only available in reference laboratories. Many clinical studies have found high correlations between low AMH levels and poor ovarian response to gonadotropin stimulation [49–51]. A few studies demonstrated that AMH, in combination with FSH and inhibin B, improves the prediction of ovarian response over any one test individually [51,52]. Hazout reported that serum AMH levels have a greater prognostic value than age, serum FSH, inhibin B or E2 levels [53]. Nelson et al. found plasma AMH to be a superior predictor of live birth and oocyte yield compared with FSH in first time IVF/intracytoplasmic sperm injection cycles in 340 patients [47]. AMH, when commercially available, may prove to be the standard marker of ovarian reserve in the future.
The GAST and the EFORT are two dynamic markers of ovarian reserve. Briefly, the GAST test is administered with a high dose of a GnRH agonist. The change in E2 concentration from day 2 to 3 is measured. A higher E2 is associated with better ovarian reserve [54]. In the EFORT test, on day 3 of the menstrual cycle, a basal FSH and E2 are drawn and 300 IU of FSH is given [55]. E2 is checked 24 h later. It has not been studied in prediction of pregnancy for an IVF population or in the general subfertile population. Neither of these tests has been found to be a better predictor of ovarian reserve than the less invasive and inexpensive tests available.
Ovarian volume, which partly reflects the number of ovarian follicles, has also been shown to decrease with age. Numerous studies have suggested a role for this parameter as a marker of ovarian reserve. A volume less than 3 cm³ in women undergoing IVF has been associated with an increased rate of cycle cancellation [56,57]. Comparative studies of multiple tests for ovarian reserve suggest that ovarian volume adds little to the AFC.
Conclusion
In the case of ovarian reserve screening we need tests with high specificity to minimize the number of patients who are falsely counseled that they have no chance of achieving pregnancy with treatment. The trade-off is to sacrifice sensitivity by letting some patients go through treatment, even though they too may have a poor chance of success. Jain and colleagues, in a meta-analysis, found that both basal FSH and the CCCT have low sensitivities, but high specificities [58]. For basal FSH and the CCCT, the sensitivities were 6.6 and 25.9%, respectively, and specificities were 99.6 and 98.1%, respectively.
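What these sensitivities and specificities mean for an individual result depends on the pretest probability of the outcome the test is meant to flag. The sketch below applies Bayes' rule to the figures quoted above; the pretest probability of 0.5 is a purely hypothetical value chosen for illustration and does not come from the meta-analysis, and the function name is likewise an assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a screening test via Bayes' rule.

    All three arguments are fractions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


ASSUMED_PRETEST_PROBABILITY = 0.5  # hypothetical, for illustration only

# Sensitivity and specificity as reported in the text [58].
for name, sens, spec in [("basal FSH", 0.066, 0.996), ("CCCT", 0.259, 0.981)]:
    ppv, npv = predictive_values(sens, spec, ASSUMED_PRETEST_PROBABILITY)
    print(f"{name}: PPV={ppv:.2f}, NPV={npv:.2f}")
```

With high specificity and low sensitivity, an abnormal result carries far more information than a normal one, which is the trade-off described above.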
This illustrates that these tests are generally reliable but certainly not infallible. They are merely screening tests and are not absolute in their results. A comprehensive study reviewed much of the literature on ovarian reserve testing and determined that there is only modest-to-poor predictive value of the tests that are available today [59]. The results must be interpreted correctly and applied with caution. Inappropriate recommendations for treatment or for no treatment must be avoided. Patients who have failed the screening process can become pregnant, and practitioners should be warned not to advise patients that their likelihood of pregnancy is zero based on these tests. The probability of pregnancy may be low with an abnormal screen, but one cannot accurately predict who will become pregnant within that group. The purpose of the testing is to provide information that can help to guide the choice of treatment and best use of available resources.
Although these tests are far from perfect in their predictive values, multiple studies have correlated abnormal ovarian reserve with decreased pregnancy and live birth rates, increased miscarriage rates, increased amount and duration of gonadotropins used, and fewer oocytes retrieved. In addition, the cost of an IVF cycle is not trivial and it is an invasive procedure, not without risk. Therefore, the patient should be informed if she has ovarian failure and/or diminished ovarian reserve so she can make an educated decision regarding her treatment options. The fact that ovarian reserve screening tests are relatively inexpensive, safe, and easily performed argues strongly that these tests should be among the initial steps taken in women who present with infertility. We suggest that the generalist should perform a basal FSH with an E2 level and AFC in women under 35 years, and these tests plus a CCCT in women 35 years and over.
Executive summary
Owing to a progressive decline in quantity of the oocyte pool in women, fertility is optimal during their twenties, starts to decline after age 30 years and decreases dramatically after the early forties.
There is an age-related decline in fertility owing to decreased quality and quantity of oocytes. Younger women may have an accelerated decline in quantity, but will have a higher live birth rate, presumably as a result of better quality oocytes remaining.
Ovarian reserve tests are an indirect measurement of a woman's remaining follicular pool and give an estimate of her sensitivity to ovarian stimulation and her prognosis for success with fertility treatments.
False-positive screens do occur. There is no cut-off at which there is absolutely no possibility of pregnancy and individual patient characteristics and age must still be considered.
Ovarian reserve tests cannot be used to predict future fertility or the exact timing of the decline or cessation of fertility. They cannot be used to determine a ‘safe’ time to discontinue contraception for the perimenopausal woman.
Basal follicle-stimulating hormone (FSH) levels, drawn on cycle day 3 (can be done on days 2-4) of the menstrual cycle, are the most widely used single measure for ovarian reserve testing.
FSH levels ± 25% of the upper range of normal in the follicular phase represent a borderline group, while patients with values greater than 25% above the upper limit have diminished ovarian reserve.
Always obtain an estradiol level when performing a basal FSH level (day 3) to limit false-negative screening.
For women under 35 years of age with a normal cycle, a basal measurement of FSH and estradiol level (cycle days 2, 3 or 4), along with a transvaginal ultrasound for antral follicle count (AFC), should provide adequate assessment of ovarian reserve.
For women over age 35 years, or in those with a high clinical suspicion of diminished ovarian reserve (e.g., recent onset of shorter or irregular menstrual cycles, poor response to fertility drugs) basal FSH and estradiol levels and a transvaginal ultrasound for AFC should be performed along with a clomiphene citrate challenge test for better detection of poor ovarian reserve.
Inhibin B, ovarian volume, gonadotrophin-releasing hormone agonist test, exogenous FSH ovarian reserve test or anti-Müllerian hormone testing can also be used to assess ovarian reserve. At this point they do not seem to provide better information than basal FSH, estradiol, AFC and the clomiphene citrate challenge test, and some are not widely available.
The average age of childbearing has risen over the last few decades. Women are purposefully deferring childbearing into their thirties and forties. We, as physicians, are therefore faced more frequently with the challenge of helping these women conceive when the time is right. It is important to remember that studies regarding ovarian reserve screening have been performed primarily in order to assess the prognosis for success of IVF and other fertility treatments in an infertile population, and approximate the ovary's ability to respond to ovarian stimulation. This testing cannot be used to evaluate the future fertility prognosis of the aging woman who is not trying to conceive but wants to know how fast her biological clock is ticking. In the future, new tests may be helpful in counseling women about the rapidity of their decline of reproductive potential.
Future perspective
Longitudinal studies are currently underway, looking at changes in both serum FSH levels and AFC over time. This information should hopefully improve our understanding of the expected rate of follicle atresia in the normal state. We need much better normative data for the ‘fertile’ population. We need to be able to predict the impact of known accelerators of the rate of follicle loss (e.g., smoking and ovarian surgery) upon the individual woman. Perhaps then we can develop the test that women actually want: a simple screening test, which will predict how long they will remain fertile.
Footnotes
The authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties. No writing assistance was utilized in the production of this manuscript.
