Abstract
This article reviews four important screening principles applicable to screening mammography in order to facilitate informed choice. The first principle is that screening may help, hurt, or have no effect. In order to reduce mortality and mastectomy rates, screening must reduce the rate of advanced disease, which likely has not happened. Through overdiagnosis, screening produces substantial harm by increasing both lumpectomy and mastectomy rates, which offsets the often-promised benefit of less invasive therapy. Next, all-cause mortality is the most reliable way to measure the efficacy of a screening intervention. Disease-specific mortality is biased due to difficulties in attribution of cause of death and to increased mortality due to overdiagnosis and the resulting overtreatment with radiotherapy and chemotherapy. To enhance participation, the benefit from screening is often presented in relative instead of absolute terms. Third, some screening statistics must be interpreted with caution. Increased survival time and the percentage of early-stage tumors at detection sound plausible, but are affected by lead-time and length biases. In addition, analyses that only include women who attend screening cannot reliably correct for selection bias. The final principle is that accounting for tumor biology is important for accurate estimates of lead time, and the potential benefit from screening. Since “early detection” is actually late in a tumor's lifetime, the time window when screen detection might extend a woman's life is narrow, as many tumors that can form metastases will already have done so. Instead of encouraging screening mammography, physicians should help women make an informed decision as with any medical intervention.
Introduction
For informed choice to occur, the estimates of harms and benefit from screening mammography, along with their uncertainty, must be effectively communicated to women. 8,9 One challenge is that important screening harms are often not mentioned or misrepresented in promotional campaigns for screening and in the medical literature. 10 Primary care physicians are a trusted source of information about cancer screening, but many are not trained in interpreting and presenting cancer screening statistics, which are often complex or even counterintuitive. 1 Finally, the effect of tumor biology on screening effectiveness is rarely discussed. The goal of this review is to inform primary care physicians about four central screening principles applicable to mammography, so that they can help women make an informed decision on whether or not to participate.
Principle 1: Screening May Help, Hurt, or Have No Effect
Breast cancer screening is a type of secondary prevention that is justified if the screening technology can identify and advance the diagnosis of otherwise lethal tumors and if earlier treatment of this asymptomatic disease is more effective than usual treatment. In addition, this benefit must outweigh any harms from the screening intervention. For screening to reduce mortality and mastectomy rates, it must reduce the incidence rate of advanced disease. In measuring advanced disease, stage is a less robust measure than size, as detection methods and stage definitions change over time. Autier et al. have shown that there has been no decrease in the incidence rate of tumors over 2 cm in six Western countries with many years of population screening, including the United States. 11 Kalager et al. found identical decreases in advanced stage incidence in screened and unscreened areas of Norway where breast screening was introduced in a staggered fashion, creating a “control group”. 12 In the U.S. there has been a slight reduction in the incidence rate of regional metastatic cancers, but none in the rate of distant (stage 4) metastatic disease, even though screening mammography has existed for 30 years (Fig. 1). 13 The findings from Norway 12 suggest caution in attributing the reduction in regional metastatic cancer in the U.S. to screening, as such a reduction can occur for reasons unrelated to screening.

Incidence rate of breast cancer in the United States since screening mammography was introduced. Surveillance, Epidemiology, and End Results (SEER) data are shown for invasive breast cancers without metastases, with regional metastases, and with distant metastases involving other organs at diagnosis, as well as for ductal carcinoma in situ (DCIS) lesions, for women aged 40–84 years at diagnosis (50+ years for DCIS). There was no decline in the occurrence of cancers with distant metastases over the observation period, a slight decline for cancers with regional metastases, and vast increases for both localized and in situ cancers.
Since screening likely has little or no impact on the incidence rate of advanced disease, the most plausible explanation for the large and persistent increase in both in situ lesions and localized breast cancers wherever screening has been implemented is overdiagnosis. Overdiagnosed tumors are those screen-detected cases that would never present clinically or cause any problems in a woman's lifetime. Overdiagnosis means that many healthy women will receive unnecessary surgery, chemotherapy, endocrine therapy, and radiation, and this overtreatment can only cause harm. Nevertheless, it is often asserted that screening yields “more treatable” tumors, 14 but this claim ignores extra treatment caused by overdiagnosis and is based on what seems intuitively correct rather than on evidence. 15 Even if some individual women with earlier detected disease receive less invasive therapy, overall the number of women with surgeries increases. The randomized trials showed that screening participation increases lumpectomy and mastectomy rates by 30% and 20%, respectively. 16,17 In situ lesions were rare before screening, and now most cases are detected as microcalcifications with mammography (Fig. 1). In both the U.S. and the United Kingdom, the proportion of in situ cases treated by mastectomy is 29%, and these new cases likely account for much of the increase in mastectomy rates due to screening. 18,19 In the U.S., about 20% of all diagnosed breast cancers and one-third of screen-detected cancers are in situ cases. 2 Newer screening technologies such as digital mammography, computer-aided detection, and magnetic resonance imaging likely increase overdiagnosis. 17,20
Earlier detection does not necessarily improve prognosis either, because adjuvant therapy and chemotherapy are effective in all prognostic groups, even for metastatic disease. 17 In fact, improved treatment has caused the majority of the impressive declines in breast cancer mortality over the past 20 years. 21 Countries that introduced screening late have experienced declines in breast cancer mortality that are equal to, and sometimes larger than, countries that introduced breast screening early. 22,23 This conclusion is supported by the fact that reductions in breast cancer mortality across Europe have been about 50% in women who are below the screening age (<50 years), which is considerably more than in age groups most often invited (about 30% in women aged 50–69 years). 23 With better medical treatment, the absolute benefit of screening declines as there are fewer breast cancer deaths to prevent, a development that will continue with further advances in treatment. 24 The U.S. breast cancer death risk without screening for women age 50 is now 9/1000 over 15 years, a decrease from 11/1000 in 1978–1980 due to better treatment. 2 Breast cancer awareness means women are more likely to notice a smaller symptomatic tumor than in the past, adding to this development. 17
Principle 2: The Most Reliable Statistic for Evaluating the Benefit of Breast Screening Is All-Cause Mortality
Screening statistics must be interpreted with caution, since they are subject to several biases. 25 Many have difficulty distinguishing the most reliable effect estimate for a screening intervention, reduced all-cause mortality rates in randomized trials, from two misleading but often cited screening statistics: increased survival time and stage distribution percentages. 26 Disease-specific mortality was chosen over all-cause mortality as the outcome in the breast cancer screening trials because many fewer trial participants were required, but it is biased by uncertain attribution of cause of death, and it may be misleading because overdiagnosis increases mortality from other causes. Overdiagnosis leads to overtreatment with radiotherapy and chemotherapy and thus increased mortality from heart disease and other malignancies that may entirely outweigh the benefit in terms of reduced breast cancer mortality. 16,27 Only all-cause mortality avoids these biases and measures both the major benefit and harm, but the benefit of breast screening on all-cause mortality is so small that hundreds of thousands of participants are required to test whether the effect is there. The randomized trials, including >600,000 women, did not show an effect on total mortality, 17 which tells us that the absolute benefit of the intervention must be quite small, if it exists at all.
The relative risk reduction (RRR) for disease-specific mortality in the screening mammography trials was 10%–20%, 17,28 but this number can be misleading and is not informative on its own. For informed choice, a woman needs to know the absolute benefit or absolute risk reduction (ARR), which is the RRR times the base rate for the breast cancer death risk without screening. For U.S. women at age 50 years, this base rate is about 5/1000 over 10 years. 29 The base rate over 10 years for women in the Swedish trials, which excluded women with a preexisting breast cancer diagnosis, was 3.4/1000. 16 The Cochrane Review of all the screening trials calculated an ARR of 0.05 percentage points, which means that 2000 women need an invitation to screening to avoid one breast cancer death over a 10-year period. 16
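The conversion from relative to absolute benefit described above is simple arithmetic, sketched here for illustration. The function names are our own, and the 15% RRR is an assumed midpoint of the 10%–20% range reported for the trials, not a figure from any single study.

```python
def absolute_risk_reduction(rrr, base_rate):
    """ARR = RRR x baseline risk of breast cancer death without screening."""
    return rrr * base_rate


def number_needed_to_invite(arr):
    """Women who must be invited to screening to avoid one breast cancer death."""
    return 1.0 / arr


# Assumption: midpoint of the 10%-20% RRR range quoted for the trials.
ASSUMED_RRR = 0.15

# 10-year baseline breast cancer death risks quoted in the text.
BASE_RATES = {
    "U.S. women at age 50": 5 / 1000,
    "Swedish trials": 3.4 / 1000,
}

for label, base_rate in BASE_RATES.items():
    arr = absolute_risk_reduction(ASSUMED_RRR, base_rate)
    nni = number_needed_to_invite(arr)
    print(f"{label}: ARR = {arr * 1000:.2f} per 1000, NNI = {nni:.0f}")
```

Under these assumptions the same 15% RRR corresponds to fewer than one death avoided per 1000 women invited over 10 years, which is why the RRR alone gives little sense of the size of the benefit.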
A different perspective is obtained by presenting the mortality benefit in relation to the higher base rate of all-cause death risk. 30 Routine screening starting at age 50 with an ARR of (0.15 × 5/1000) or 0.8/1000 is equivalent to an RRR of (0.8/37) or 2.2% using all-cause mortality (Fig. 2). Routine screening increases a nonsmoking woman's overall survival chance from 96.3% (100−3.7) to 96.4% over 10 years. 2 These statistics disregard deaths from overdiagnosis. Women's inaccurate and exaggerated perceptions of the benefit of screening are a challenge in promoting informed choice. In a 2003 survey, more than half of U.S. women thought that screening mammography can prevent or reduce the risk of contracting breast cancer. 31 Most believed that the RRR was over 50%, and the ARR was greater than 80/1000. This is 100 times the actual benefit of 0.8/1000, and twice the all-cause death risk. 32
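The all-cause framing can be reproduced with the figures quoted in this paragraph. The sketch below is illustrative: the function and variable names are ours, the 15% RRR is the assumed value used in the text, and deaths from overdiagnosis are ignored, as they are in the text.

```python
def all_cause_perspective(rrr, breast_cancer_risk, all_cause_risk):
    """Re-express a breast cancer mortality benefit against all-cause risk.

    Both risks are 10-year death risks per 1000 women.
    """
    arr = rrr * breast_cancer_risk  # breast cancer deaths avoided per 1000
    return {
        "arr_per_1000": arr,
        "rrr_all_cause": arr / all_cause_risk,
        "survival_without_pct": (1000 - all_cause_risk) / 10,
        "survival_with_pct": (1000 - all_cause_risk + arr) / 10,
    }


# Figures from the text: nonsmoking U.S. woman aged 50, 10-year horizon.
view = all_cause_perspective(rrr=0.15, breast_cancer_risk=5.0, all_cause_risk=37.0)

print(f"ARR: {view['arr_per_1000']:.2f} per 1000")
print(f"RRR against all-cause mortality: {view['rrr_all_cause']:.1%}")
print(f"10-year survival: {view['survival_without_pct']:.1f}% -> "
      f"{view['survival_with_pct']:.2f}%")

# Perceived benefit in the 2003 survey (80 per 1000) versus all-cause risk.
PERCEIVED_ARR = 80.0
print(f"Perceived ARR / all-cause risk: {PERCEIVED_ARR / 37.0:.1f}")
```

The unrounded ARR is 0.75 per 1000; the text rounds it to 0.8 per 1000 before dividing by the all-cause risk, which accounts for the small difference from the 2.2% quoted above.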

Ten-year all-cause death risk for U.S. women by smoking status (age 40 and 50 years) compared with the breast cancer death risk and lives extended through screening. Starting at age 50, 7–13 women die from something else for every breast cancer death. The absolute risk reduction (ARR) or benefit from routine screening mammography is the assumed relative risk reduction (RRR) times the breast cancer death risk without screening. The ARR in absolute numbers is the same as lives extended through inviting 1000 women to routine screening.
Principle 3: Some Plausible Screening Statistics Can Be Misleading
Instead of using the ARR, some patient organizations like the Susan G. Komen foundation argue that screening improves survival time. 33 Survival time with screening is unreliable due to two biases. 34 –36 Lead-time bias means that earlier diagnosis through screening will increase the measured survival time from diagnosis, as one will live longer knowing about the cancer, regardless of whether screening makes people live to an older age. Length bias is less intuitive. Because there is more time available to detect slow-growing rather than fast-growing tumors, routine screening will preferentially detect slow-growing and more indolent tumors, and is less likely to catch fast-growing, more lethal tumors. These biases mean that women with screen-detected tumors will inevitably have a longer survival time and a more favorable prognosis than women with clinically detected tumors, even if screening has no real benefit. Overdiagnosis can be regarded as the extreme case of length bias. Overdiagnosed tumors all have a very favorable prognosis and survival time, and they therefore contribute to making survival time noninformative.
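Both biases can be illustrated with a toy calculation. Every number in the sketch below is hypothetical and chosen only to make the mechanism visible; the length-bias part assumes perfect test sensitivity and tumor onset spread evenly over the screening interval.

```python
def survival_years(age_at_diagnosis, age_at_death):
    """Survival time as usually reported: measured from diagnosis."""
    return age_at_death - age_at_diagnosis


def screen_detection_probability(sojourn_years, interval_years):
    """Chance that a screen falls inside the detectable preclinical phase.

    Simplification: perfect sensitivity, onset uniform over the interval.
    """
    return min(1.0, sojourn_years / interval_years)


# Lead-time bias: a hypothetical woman who dies at age 70 either way.
AGE_AT_DEATH = 70
clinical = survival_years(age_at_diagnosis=67, age_at_death=AGE_AT_DEATH)
screened = survival_years(age_at_diagnosis=65, age_at_death=AGE_AT_DEATH)
print(f"Survival from diagnosis: {clinical} y (symptoms) vs {screened} y (screen)")
print(f"Counted as 5-year survivor? symptoms: {clinical >= 5}, screen: {screened >= 5}")

# Length bias: equal numbers of fast and slow tumors, screening every 2 years.
INTERVAL_YEARS = 2.0
SOJOURN_YEARS = {"fast": 0.5, "slow": 4.0}  # hypothetical detectable phases
caught = {
    kind: screen_detection_probability(sojourn, INTERVAL_YEARS)
    for kind, sojourn in SOJOURN_YEARS.items()
}
slow_share = caught["slow"] / sum(caught.values())
print(f"Slow-growing share of screen-detected tumors: {slow_share:.0%}")
```

In the first example the woman's age at death is identical in both cases, yet only the screen-detected case counts as a 5-year survivor. In the second, slow-growing tumors make up half of all tumors but most of the screen-detected ones.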
The second misleading statistic is the change in cancer stage distribution observed after screening, which is also affected by lead-time and length biases. The percentage of localized cancers as a fraction of total cancers (accounting for size, affected lymph nodes, and metastases) is higher among screen-detected than among symptomatic cancers (Fig. 3). 37 As is the case with survival time, overdiagnosis distorts this statistic. 34 The correct measure is the rate (say, per 100,000) of advanced tumors in a population. 38 As Esserman et al. have shown, the best-case screening scenario has a declining advanced cancer rate with a stable total cancer rate, because every case of early detection of an invasive cancer or in situ lesion causes one less case of advanced cancer (true stage shift). The worst-case screening scenario has a stable advanced cancer rate with a growing total cancer rate, because all early detection is overdiagnosis (no stage shift). In summary, an increase in the percentage of localized cancers always occurs, regardless of whether screening is effective, just like improved survival time. 39
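The contrast between the best-case and worst-case scenarios can be made concrete with invented incidence rates. The numbers below are hypothetical and serve only to show that the percentage of localized cancers rises in both scenarios, while the advanced cancer rate falls only when there is a true stage shift.

```python
def summarize(localized_rate, advanced_rate):
    """Summarize incidence rates given per 100,000 women."""
    total = localized_rate + advanced_rate
    return {
        "total_rate": total,
        "advanced_rate": advanced_rate,
        "pct_localized": 100.0 * localized_rate / total,
    }


# Hypothetical rates per 100,000 before screening.
before = summarize(localized_rate=100, advanced_rate=100)

# Best case: 40 would-be advanced cancers are caught while localized.
best_case = summarize(localized_rate=100 + 40, advanced_rate=100 - 40)

# Worst case: 100 extra localized diagnoses, all of them overdiagnosis.
worst_case = summarize(localized_rate=100 + 100, advanced_rate=100)

for name, s in [("before", before), ("best case", best_case), ("worst case", worst_case)]:
    print(f"{name:>10}: total {s['total_rate']}, advanced {s['advanced_rate']}, "
          f"localized {s['pct_localized']:.0f}%")
```

Only the advanced cancer rate separates the two scenarios; the percentage of localized cancers cannot.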

Misleading statistics about screening benefits. The increased percentage of localized breast cancers (stages 0 and 1, <2 cm) after screen-detection does not prove screening is beneficial. This result is due to lead-time bias and length bias (overdiagnosis). The relevant statistic is the incidence rate of advanced cancers. Likewise, the increased percentage of smaller invasive tumors after screen-detection does not prove a morbidity or mortality benefit. Data are from the U.S. Breast Cancer Surveillance Consortium.
Finally, case-control and cohort studies that measure exposure or screening participation regularly estimate double or triple the benefit, with RRRs of 30%–40%, or more. 14,40 However, we know from recalculations of results from the randomized trials that the case-control design can yield effect estimates of more than a 50% reduction in breast cancer mortality when in fact there is none. 41 The apparent, but spurious, effect is caused by selection or volunteer bias; those who choose to go for screening are generally healthier and less likely to die from breast cancer, as well as from many other causes. Some researchers claim to avoid this bias when they use statistics that count only women who attend screening, but reliably adjusting for selection bias, the size of which varies and is unknown in a given setting, is likely not possible. 15 A systematic review of seventeen relevant population studies concluded that the benefit of screening mammography today is probably smaller than in the randomized trials. 42
Principle 4: Breast Cancer Biology Limits the Screening Window
There is substantial overlap in tumor sizes for U.S. women between screen-detected and symptomatic breast cancers (Fig. 3). A 10-mm diameter tumor has about half a billion cells after 29 doublings and, assuming a median volume doubling time of 260 days, is about 20.7 years old. 38 In the United States, the median symptomatic invasive tumor is about 21 mm (20 mm = 32 doublings, 22.8 years), and the median screen-detected invasive tumor is 14 mm (13 mm = 30 doublings or 1 cm³, 21.4 years). 37 The difference in median size would therefore seem to be about 7 mm or 1.4 years, but in reality it is smaller because the many small, overdiagnosed tumors exaggerate the difference. In the randomized trials, the mean diameter of tumors in the screened groups was 16 mm (16 mm = 31 doublings), and in the control groups it was 21 mm. 38
The 5 mm size difference, including the overdiagnosed tumors, represents just over one volume doubling, or 340 days, which is 4% of a tumor's lifetime. 43 Again disregarding overdiagnosis, the time difference between U.S. screen-detected (21.4 years old) and clinically detected (22.8 years old) tumors (or two doublings) would be at most 520 days. Screening mammography therefore likely advances the time of diagnosis (lead-time) by far less than the 2–7 years that is usually assumed in studies of overdiagnosis that attempt to compensate for this lead-time effect. 44 These studies consequently produce very low overdiagnosis estimates of 1%–11%. 14,40 Mean estimates of age-specific doubling times for U.S. women range from 179 to 288 days. 45 Zahl et al. calculated a lead-time of less than one year when excluding overdiagnosed cases. 43
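The doubling counts and tumor ages quoted above follow from simple geometry. The sketch below assumes spherical tumors, a conventional density of about 10^9 cells per cm³, growth from a single cell, and a constant 260-day volume doubling time; real tumors need not follow any of these assumptions, and the function names are ours.

```python
import math

CELLS_PER_CM3 = 1e9        # assumption: conventional cell density
DOUBLING_TIME_DAYS = 260   # median volume doubling time used in the text


def doublings(diameter_mm):
    """Volume doublings needed to grow from one cell to a sphere of this size."""
    radius_cm = diameter_mm / 20.0
    volume_cm3 = (4.0 / 3.0) * math.pi * radius_cm ** 3
    return math.log2(volume_cm3 * CELLS_PER_CM3)


def age_years(n_doublings, doubling_time_days=DOUBLING_TIME_DAYS):
    """Tumor age if every doubling takes the same time."""
    return n_doublings * doubling_time_days / 365.0


def lead_time_days(small_mm, large_mm, doubling_time_days=DOUBLING_TIME_DAYS):
    """Time for a tumor to grow from the smaller to the larger diameter."""
    return (doublings(large_mm) - doublings(small_mm)) * doubling_time_days


for diameter in (10, 13, 16, 20):
    n = round(doublings(diameter))
    print(f"{diameter} mm: about {n} doublings, {age_years(n):.1f} years old")

extra = doublings(21) - doublings(16)
print(f"16 mm -> 21 mm: {extra:.2f} doublings")
```

Because volume grows with the cube of diameter, the 5-mm gap between the trial arms amounts to little more than one doubling. The corresponding time in days scales directly with whatever doubling time is assumed.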
Tumor biology has implications for the success or failure of screening mammography (Fig. 4). Breast cancer is not uniformly progressive, but is a collection of heterogeneous diseases. 46 Women with breast cancer do not die from the primary tumor, but from metastatic disease. Only tumors that metastasize during the part of the tumor's lifetime when it can be detected by mammography but is not yet symptomatic may have their prognosis improved. 35,38 We know that some very small tumors send out micrometastases below the mammogram threshold, while some large symptomatic tumors have late occurring or no metastases. Both cases occur outside the window where screening can benefit. 47 The window narrows even more because in order to be successful, the screen-detection must occur before the tumor metastasizes. For every 5 mm of tumor growth, node-positivity increases by about 5%. Based on the mean 16-mm screened group tumor size from the trials, 35% of tumors would have already metastasized before being detectable. Given the mean size in the control group of 21mm, which is also the median size of U.S. symptomatic tumors today, 40% of tumors would have metastasized. Therefore, 5% would metastasize during the screening window, yielding a plausible RRR of breast cancer mortality of 12% due to screening. 38 In summary, a 1- to 2-year window is too narrow for screening mammography to have a large effect.
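The plausible RRR derived in this paragraph can be reproduced as follows. This is a sketch under strong simplifying assumptions: only tumors that metastasize are lethal, and every tumor detected before it metastasizes is cured. The variable names are ours.

```python
# Shares of tumors already metastasized at the mean sizes from the trials.
METASTASIZED_AT_SCREEN_SIZE = 0.35    # 16 mm, screened groups
METASTASIZED_AT_CLINICAL_SIZE = 0.40  # 21 mm, control groups

# Tumors that metastasize inside the screening window.
window_share = METASTASIZED_AT_CLINICAL_SIZE - METASTASIZED_AT_SCREEN_SIZE

# Best case: all of these are cured by earlier detection, so the relative
# reduction is their share of all tumors that would have metastasized.
plausible_rrr = window_share / METASTASIZED_AT_CLINICAL_SIZE

print(f"Metastasize during the window: {window_share:.0%} of tumors")
print(f"Plausible RRR in breast cancer mortality: {plausible_rrr:.1%}")
```

The result, 5/40 or about 12%, lies within the 10%–20% RRR range reported for the randomized trials.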

“Early” detection is late. On the left, each concentric circle depicts one volume doubling of an invasive breast tumor (time scale). The right graph shows the tumor size correlating to each volume doubling. Each shade codes for a specific step in tumor development important to screening mammography. The difference in median tumor size between screen-detected and clinically detected tumors (5–7 mm) is one to two of 32 tumor-doubling times. The tumor size difference is overestimated due to overdiagnosis.
Weighing the benefit against the harms
We provide a decision model with a summary of possible outcomes for women considering screening (Fig. 5). 48 About 55% of breast cancers diagnosed in the U.S. are screen-detected; the remaining symptomatic tumors are diagnosed in women who did not screen, or as interval cancers between screening rounds. 2 A U.S. woman with a screen-detected in situ lesion or invasive cancer will have a true positive mammogram. Using optimistic and pessimistic screening scenarios as limits, she may be helped, hurt, or her prognosis unchanged (Table 1). 49 In the baseline scenario of a single screening round for women aged 50–59 years, for every 100 women with screen-detected cancer, 8 will have their lives extended over 15 years. This estimate of “lives saved” is optimistic; the true number is probably fewer than 5. 50,51 Since 21 women die despite screen-detection, 79 survive; of these survivors, 10% (8/79) are cured by earlier treatment, but over half (42/79) are overdiagnosed and harmed. These findings are contrary to the popular belief that screen-detection is synonymous with cure, reflecting the lack of balanced messaging regarding screening mammography. 52 –54
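The baseline figures in this paragraph can be cross-checked with a short calculation. The sketch below assumes the baseline scenario of 30% overdiagnosis, that 55% of diagnosed cancers are screen-detected, and that every overdiagnosed case is screen-detected, which follows from the definition of overdiagnosis given earlier. The function names are ours.

```python
def overdiagnosed_share_of_diagnosed(excess_incidence):
    """Convert excess incidence (e.g. 0.30) into a share of all diagnosed cancers."""
    return excess_incidence / (1.0 + excess_incidence)


def overdiagnosed_per_100_screen_detected(excess_incidence, screen_detected_share):
    """Overdiagnosed cases per 100 screen-detected cancers."""
    share = overdiagnosed_share_of_diagnosed(excess_incidence)
    return 100.0 * share / screen_detected_share


# Baseline figures from the text, per 100 women with screen-detected cancer.
SCREEN_DETECTED = 100
DIE_DESPITE_DETECTION = 21
LIVES_EXTENDED = 8

overdiagnosed = round(overdiagnosed_per_100_screen_detected(0.30, 0.55))
survivors = SCREEN_DETECTED - DIE_DESPITE_DETECTION
unchanged = survivors - LIVES_EXTENDED - overdiagnosed

print(f"Overdiagnosed share of all diagnosed cancers: "
      f"{overdiagnosed_share_of_diagnosed(0.30):.0%}")
print(f"Survivors: {survivors}")
print(f"  cured by earlier treatment: {LIVES_EXTENDED} ({LIVES_EXTENDED / survivors:.0%})")
print(f"  overdiagnosed: {overdiagnosed} ({overdiagnosed / survivors:.0%})")
print(f"  would have survived without screening: {unchanged}")
```

The last group, survivors who are neither overdiagnosed nor cured by screening, is not stated in the text but follows by subtraction under these assumptions.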

Simplified screening mammography decision model with benefit and harms. The top branch, or no screening decision, shows that some early cancers do not progress to cause symptoms. With screening (bottom branch), many of these lesions are detected earlier, resulting in the harm of overdiagnosis. Although some otherwise lethal cases can be cured from earlier treatment, most cases that are not overdiagnosis remain curable or remain lethal despite screen-detection. Since mammography is imperfect, some cancers are not detected (false negatives), and some healthy women are recalled for additional testing and biopsy (false positives). Color images available online at
Equivalent to true positive mammograms in Figure 5. Includes invasive cancer and in situ lesions.
The baseline scenario is 15% relative risk reduction from screening and 30% overdiagnosis (1.30/1.0 or 23% (0.3/1.30) of diagnosed cancers). The ranges correspond to optimistic (20%/10%) and pessimistic (10%/50%) screening scenarios.
The baseline scenario is 30% relative risk reduction from screening and 30% overdiagnosis. The ranges correspond to optimistic (35%/10%) and pessimistic (25%/50%) screening scenarios.
Besides overdiagnosis of screen-detected cancers, false-positive radiologist interpretations are another important screening harm in healthy women (Fig. 5). The associated diagnostic imaging recalls can lead to negative (benign) core-needle and open surgical biopsies in up to 19% of women after 10 mammograms. 55 False-positive recalls can also cause long-term psychological harm. 56 These recalls are much more common than overdiagnosis and happen at least once to 40%–60% of U.S. women who attend breast screening over a 10-year period, depending on screening frequency. 57 We estimated single-round screening events associated with 100 screen-detected cases by age (Table 2). 2 Recent insurance data reveal that recall and biopsy rates are much higher than the data used in 2009 by the U.S. Preventive Services Task Force. 58 –60 Besides direct patient costs, 61 there are other indirect and intangible harms from false-positive recalls that require further investigation. 60,62
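The dependence of cumulative false-positive risk on screening frequency can be sketched with a simple model. The model assumes that each screening round carries the same false-positive risk and that rounds are independent, which is a simplification. Pairing the 40% figure with biennial screening (5 rounds in 10 years) and the 60% figure with annual screening (10 rounds) is also our assumption for illustration, not a statement from the cited source.

```python
def cumulative_false_positive_risk(per_round_rate, rounds):
    """Chance of at least one false-positive recall over several rounds."""
    return 1.0 - (1.0 - per_round_rate) ** rounds


def implied_per_round_rate(cumulative_risk, rounds):
    """Per-round rate that would produce the given cumulative risk."""
    return 1.0 - (1.0 - cumulative_risk) ** (1.0 / rounds)


# Assumed pairing of the 10-year figures with screening frequency.
biennial_rate = implied_per_round_rate(cumulative_risk=0.40, rounds=5)
annual_rate = implied_per_round_rate(cumulative_risk=0.60, rounds=10)

print(f"Implied per-round rate, biennial screening: {biennial_rate:.1%}")
print(f"Implied per-round rate, annual screening: {annual_rate:.1%}")
```

Under this model a similar per-round recall risk produces a much higher cumulative risk when screening is annual rather than biennial, simply because there are twice as many rounds.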
Performance measures are Breast Cancer Surveillance Consortium 2000–2005 data from Nelson et al., 2009 58 unless specified.
Equivalent to true positive mammograms in Figure 5. Includes invasive cancer and in situ lesions.
False positive mammograms are recall exams minus 100.
Insurance claims data based on 2009–2011 from Fitch et al., 2014 59 with third row ages 60–64 years.
Insurance claims data based on 2010–2013 from Alcusky et al., 2014 60 , second row ages 50–64, third row ages 65–75 years.
Surgical breast biopsies are 18% of the total.
Scenarios from Table 1.
A recent survey of U.S. women showed that less than 10% have been informed by their physicians of the possibility of overdiagnosis and overtreatment after cancer screening. 63 This creates an ethical problem, especially if 1.3 million U.S. breast cancer patients may have been overtreated because of screening mammography over the last 30 years, or 31% of diagnosed cancers in the screened age group (70,000 women annually). 64,65 Extensive overdiagnosis of invasive breast cancer and in situ lesions has been confirmed by the 25-year follow-up of the randomized Canadian screening trial. 66 –68 As the physical, psychological and economic consequences of unnecessary cancer diagnoses and treatment are substantial, primary care physicians have an ethical duty to discuss overdiagnosis before ordering or recommending a screening mammogram. 69 Once a cancer is detected, it is currently not possible to distinguish life-threatening from indolent cases. Therefore, overdiagnosis can only be avoided by abstaining from breast screening. The Nordic Cochrane Centre has produced an evidence-based leaflet on screening mammography, available in 17 different languages, to facilitate discussion with patients. 70
Conclusion
The principle of informed choice and the harm of overdiagnosis are well recognized in prostate cancer screening, and both are applicable to screening mammography. Screening mammography is a complex topic, and many seemingly plausible screening statistics are misleading. Tumor biology limits the potential benefit of screening. Overdiagnosis distorts screening statistics and can change the delicate balance between benefit and harms from positive to negative. Primary care physicians can use the principles explained here to help women make their own informed decisions.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
