Abstract
Objective
Although early detection of cancer through screening can prevent cancer deaths, a drawback of screening is overdiagnosis. Overdiagnosis has been much debated in breast cancer screening, but less so in cervical cancer screening. We examined the impact of overdiagnosis by comparing two screening programmes in the Netherlands.
Methods
We estimated overdiagnosis rates by microsimulation for breast cancer screening and cervical cancer screening, using a cohort of women born in 1982 with lifelong follow-up. Overdiagnosis estimates were made analogous to two definitions formed by the UK 2012 breast screening review. Pre-invasive disease was included in both definitions.
Results
Screening prevented 921 cervical cancers (−55%) and 378 cervical cancer deaths (−59%), and 169 (−1.3%) breast cancer cases and 970 breast cancer deaths (−21%). The cervical cancer overdiagnosis rate was 74.8% (including pre-invasive disease). Breast cancer overdiagnosis was estimated at 2.5% (including pre-invasive disease). For women of all ages in breast cancer screening, an excess of 207 diagnoses/100,000 women was found, compared with an excess of 3999 diagnoses/100,000 women in cervical cancer screening.
Conclusions
For breast cancer, the frequency of overdiagnosis in screening is relatively low, but consequences are evident. For cervical cancer, the frequency of overdiagnosis in screening is high, because of detection of pre-invasive disease, but the consequences per case are relatively small due to less invasive treatment. This illustrates that it is necessary to present overdiagnosis in relation to disease stage and consequences.
Introduction
The purpose of cancer screening is to prevent cancer death by detecting cancerous lesions, and for some cancers pre-cancerous lesions, early, when treatment is still a viable option and more effective, or cancer may be prevented altogether. 1 Screening advances the diagnosis of disease to an earlier age, resulting in a higher incidence just after the initiation of screening. After the upper age limit of screening, the incidence rate will drop. 2
Breast cancer screening detects invasive breast cancer and ductal carcinoma in situ (DCIS), which are both considered a cancer diagnosis.3,4 The number of breast cancer diagnoses has increased since the introduction of screening, due to both lead time and changes in underlying risk. In a mature cervical cancer screening programme, the screen-detection of invasive cancer is rare, due to the higher frequency of detection of precursor lesions, thus altering the natural history of those lesions that are progressive. Screening for cervical cancer mostly detects cervical intra-epithelial neoplasia (CIN). CIN is not regarded as a cancer diagnosis. The incidence rate of cervical cancer had been decreasing prior to the introduction of screening and has continued to decrease due to screening.5,6 To the degree that a colorectal cancer screening programme focuses on the detection of adenomatous polyps and cancer, incidence also is expected to decline along with mortality after screening is introduced. 6
A downside of early detection is the possibility of detecting abnormalities that would never have become clinically apparent in the absence of screening. 7 This may occur because abnormalities spontaneously regress, as is described for cervical cancer,8–10 or because they remain indolent, as is described for breast cancer.11,12 Although there is little evidence to support the possibility of regression of breast cancer, it has been shown in vitro. 13 The detection of such an abnormality is called overdiagnosis, and most overdiagnoses lead to overtreatment. Overdiagnosis has been the topic of a fierce debate in breast cancer screening. 7 In cervical cancer screening, overdiagnosis is usually quantified as a lack of specificity for clinically significant disease.
The impact of overdiagnosis depends on its frequency and its consequences. In breast cancer screening, the overdiagnosis rate is relatively low.7,14,15 In cervical cancer screening, the overdiagnosis rate is usually not established. The consequence of overdiagnosis is unnecessary treatment, which is inherently harmful. The consequences of overdiagnosis in breast cancer screening are more severe than those of overdiagnosed non-progressive CIN in cervical cancer screening. For a patient, the name of the disease also carries weight.
We aimed to exemplify the impact of overdiagnosis by comparing these two screening programmes, which have been implemented for several decades in the Netherlands; for cervical cancer, since 1985 for women aged 30–60 every 5 years, and for breast cancer, since 1990 for women aged 50–74 every 2 years.16,17 Estimates of the measure of overdiagnosis in breast cancer screening in the literature vary from 4 to 54%.7,14,18–20 The proper estimate of overdiagnosis has been the topic of many debates, and the cause of many misunderstandings. We chose to use the definitions put forward by the UK independent review panel. 21 This is the first simulation study aiming to compare different screening programmes by addressing the potential amount and composition of overdiagnosed cases in the same overdiagnosis framework.
Methods
We used microsimulation MISCAN models to simulate all individual life histories in a population for breast cancer screening (MISCAN-Breast) and cervical cancer screening (MISCAN-Cervix).7,22 To obtain a representative population, the models are fitted with a birth table and a life table. Each life history has its own probability of developing a (pre-)cancerous lesion. In MISCAN-Breast, this probability is determined by fitting the model parameters hazard, onset, and incidence, to data on incidence without screening from the Netherlands Cancer Registry. In MISCAN-Cervix, the model is fitted to incidence data from the Dutch Cancer Registry and data on detection from the Dutch pathological anatomy national automated archive (PALGA/PALEBA).23,24 From each state, the disease may progress to the next stage by a semi-Markov progression model (Figure 1). In MISCAN-Breast, screening is implemented in the model using data on gradual roll-out, attendance rate, and re-attendance rate in the Dutch screening programme. Sensitivity, stage distribution, and distribution of sojourn-time were estimated by fitting these parameters to data on incidence and stage distribution with screening (1991–2010) and without screening (1990). MISCAN-Breast assumes a 1.4% annual percentage change in underlying incidence. 25 Mortality reduction in the breast cancer model is based on the results of the Swedish trials. 26 The mortality reduction in the cervical cancer model is based on observational data, provided by the Dutch Cancer Registry and PALGA in the years 1998–2007.
Figure 1. Progression in the MISCAN model. Every woman starts at the top left, where she has no cancer. From there, she may progress through the different stages of cancer. If the cancer is detected by screening, the woman moves to the bottom of the graph (screen-detected). If the cancer is clinically detected, she moves to the far right of the graph (clinically detected).
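The progression scheme can be illustrated with a toy simulation. The following is a minimal sketch, not the MISCAN implementation: the state names, the Weibull dwell-time parameters, the uniform age at other-cause death, and the assumption of perfectly sensitive screening are all hypothetical, and regression of lesions is left out. Each woman is simulated from the same random seed with and without screening, so the two scenarios share one underlying life history.

```python
import random

# Hypothetical, simplified state order; the MISCAN state spaces are richer.
STATES = ["no_disease", "preclinical_pre_invasive", "preclinical_invasive", "clinical"]

# Hypothetical Weibull scale parameters (years) for the dwell time in each state.
SOJOURN_SCALE = {
    "no_disease": 60.0,
    "preclinical_pre_invasive": 12.0,
    "preclinical_invasive": 4.0,
}
SOJOURN_SHAPE = 2.0  # hypothetical; a shape other than 1 gives non-exponential dwell times


def simulate_life_history(seed, screen_ages=()):
    """Simulate one woman and return (outcome, age at that outcome)."""
    rng = random.Random(seed)
    age_at_death = rng.uniform(60.0, 100.0)  # hypothetical age at other-cause death
    age = 0.0
    for state in STATES[:-1]:
        # Semi-Markov step: the dwell time is drawn from a state-specific distribution.
        exit_age = age + rng.weibullvariate(SOJOURN_SCALE[state], SOJOURN_SHAPE)
        if state != "no_disease":
            # Assumed perfect sensitivity: any screen during preclinical disease detects it.
            for screen_age in sorted(screen_ages):
                if age <= screen_age < min(exit_age, age_at_death):
                    return ("screen_detected", float(screen_age))
        if exit_age >= age_at_death:
            return ("never_diagnosed", age_at_death)
        age = exit_age
    return ("clinically_detected", age)


def count_diagnoses(n_women, screen_ages=()):
    """Count women with any diagnosis (screen-detected or clinical)."""
    outcomes = [simulate_life_history(seed, screen_ages)[0] for seed in range(n_women)]
    return sum(outcome != "never_diagnosed" for outcome in outcomes)
```

Calling `count_diagnoses(2000)` and `count_diagnoses(2000, screen_ages=range(30, 61, 5))` gives the number of diagnoses without and with five-yearly screening; because the seeds are shared, the difference is the number of women who are diagnosed only because they were screened.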
The impact of screening on an individual life history is illustrated in Figure 2, in which five different women each have a scenario without (A) and with (B) screening. The black areas are the negative effects of screening (life years with lower quality due to diagnosis and treatment), and dark grey areas are the positive effects of screening (healthy life years gained). Woman number 1 will benefit from screening. In scenario 1A, there is no mass screening. She will have an onset of cancer; this cancer will grow and become symptomatic, will be clinically diagnosed, and she will die from this cancer. In scenario 1B, there is mass screening. The woman will have the same onset and the same preclinical disease phase, but in this instance, mass screening will detect her cancer before she develops symptoms. Therefore, the disease is in a less advanced state, and treatment is successful. She has gained life-years, and will die of causes other than cancer. Woman number 2 does not benefit from screening. Like woman number 1, she has an onset of cancer, followed by a preclinical disease phase. This phase, however, would extend beyond her lifespan. She will never be diagnosed with cancer in the scenario without screening (2A). In the scenario with screening (2B), the cancer will be detected by screening and she will be treated accordingly. She will still die at the same time, but she will have lost several quality-adjusted life-years because she had a cancer diagnosed. Woman number 3 develops a pre-invasive disease that will progress to a clinically detected cancer, but she will not die from this cancer (3A). She will also not gain any life-years by screening (3B). Woman number 4 has a type of cancer with an obvious pre-invasive precursor state (i.e. CIN in cervical cancer). In this case, the preclinical phase has two stages, one with preclinical pre-invasive disease, and one with preclinical cancer.
The preclinical-pre-invasive state will progress to preclinical cancer, which becomes clinically detected and leads to cancer-related death in the situation without screening (4A). When this woman is screened (4B) while her disease is in the pre-invasive phase and her condition is detected, she may be cured completely, and thus cancer has been prevented; she benefits from screening. Woman number 5 does not benefit from screening; she has a preclinical-pre-invasive disease that will not progress, or may even regress to normal without screening (5A). Screening (5B) will give her a diagnosis of pre-invasive disease, but she will not gain any life-years.
Figure 2. Life histories of women affected differently by screening. The numbers indicate different women, each of them having a life history without screening (A) and with screening (B). Black areas represent negative effects of screening (overdiagnosis), dark grey areas represent the positive effects of screening (LY gained).
MISCAN-Breast assumes a regression rate of 2%, and a progression rate of 11%, for DCIS. 27 MISCAN-Cervix has six different disease paths: five assume regression, and one assumes progression from onset to invasive disease. Each woman has an age-dependent probability of ending up in one of the disease paths.
We performed a cohort run, applying our breast cancer and cervical cancer models to a cohort of 10,000,000 women born in 1982; 1982 was chosen so all women were aged 30 and invited for cervical cancer screening in 2012, the most recent year with complete data. The number of cohort women alive in 2012 was also chosen as the denominator to convert raw data to rates. Between 2012 and 2032 (the year all women would be first invited to breast cancer screening), approximately 2% of the women die of all-cause mortality (including cancer). Follow-up was completed for ages 30–100. Output measures were number of diagnoses during entire follow-up in the scenarios without and with screening, and the number of diagnoses during the screening ages in the scenario without and with screening. All results are presented per 100,000 women aged 30 in 2012, and stratified by pre-cancer (DCIS for breast cancer and CIN grades I, II, and III for cervical cancer) and invasive cancer.
We used the overdiagnosis definitions from the UK Independent review panel: (a) “from the population perspective, the proportion of all cancers ever diagnosed in women invited to screening that are overdiagnosed” and (b) “from the perspective of a woman invited to screening, the probability that a cancer diagnosed during the screening period represents overdiagnosis”. 21
To be able to address all diagnoses in the programme, we extended the definitions above to include pre-invasive lesions, such as CIN I, II, and III. These definitions translate into the following calculations:
(a) From the population perspective: number of extra diagnoses with screening/total number of diagnoses in a population with screening. For the purpose of comparison, we used ages 30–100. No significant number of cancers occurs before the age of 30.
(b) From an individual perspective: number of extra diagnoses with screening/total number of diagnoses in women of screening age. For breast cancer screening, this age range is 49–75, and for cervical cancer screening, 29–60, but we used 29–64, because the diagnostic process in cervical cancer screening may take some time due to follow-up. This definition corresponds to the risk of having an overdiagnosed cancer in the lifetime of screening.
The number of extra diagnoses with screening is the difference between the total number of diagnoses in women aged 0–100 without screening and the total number of diagnoses in women aged 0–100 with screening. When considering overdiagnosis, we included pre-invasive disease. If we had not included pre-invasive disease, overdiagnosis measures would not have applied.
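The two calculations share a numerator (extra diagnoses with screening) and differ only in the denominator. The short sketch below makes this explicit; the function names and the input counts are hypothetical placeholders chosen for illustration, not model output.

```python
def excess_diagnoses(total_with_screening, total_without_screening):
    """Extra diagnoses attributable to screening (lifetime totals)."""
    return total_with_screening - total_without_screening


def overdiagnosis_rate(excess, diagnoses_in_denominator):
    """Excess diagnoses as a fraction of the chosen denominator."""
    if diagnoses_in_denominator <= 0:
        raise ValueError("denominator must be a positive number of diagnoses")
    return excess / diagnoses_in_denominator


# Hypothetical counts per 100,000 women, for illustration only.
lifetime_without = 1000    # all diagnoses, scenario without screening
lifetime_with = 1200       # all diagnoses, scenario with screening
screening_age_with = 800   # diagnoses at screening ages, scenario with screening

excess = excess_diagnoses(lifetime_with, lifetime_without)

# Population perspective: denominator is all diagnoses with screening.
population_rate = overdiagnosis_rate(excess, lifetime_with)

# Individual perspective: denominator is diagnoses at screening ages.
individual_rate = overdiagnosis_rate(excess, screening_age_with)

print(f"population perspective: {population_rate:.1%}")   # 16.7%
print(f"individual perspective: {individual_rate:.1%}")   # 25.0%
```

Because the screening-age denominator is a subset of the lifetime denominator, the individual-perspective rate is never lower than the population-perspective rate for the same excess.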
Results
Cervical cancer: Number of cases by stage and overdiagnosis rate per 100,000 women aged 30 in 2012.
The total of screen detected and clinically detected do not add up as a result of rounding.
CIN: cervical intra-epithelial neoplasia.
Breast cancer: Number of cases by stage and overdiagnosis rate per 100,000 women aged 30 in 2012.
The total of screen detected and clinically detected do not add up as a result of rounding.
DCIS: ductal carcinoma in situ.
Overdiagnosis in cervical cancer and breast cancer.
Number of cases by stage and overdiagnosis rate per 100,000 women aged 30 in 2012. Excess diagnoses were calculated by subtracting all diagnoses in women aged 30–100 in the situation without screening from all diagnoses in women aged 30–100 in the situation with screening. Lifetime diagnoses are all diagnoses in women aged 30–100. Screening age diagnoses are all diagnoses in women aged 30–64 for cervical cancer, and in women aged 49–75 for breast cancer.
CIN: cervical intra-epithelial neoplasia; DCIS: ductal carcinoma in situ.
For women aged 30–100, we predicted 266 cervical cancer deaths with screening, and 644 without screening, a mortality reduction of 59%. For women aged 30–100, we predicted 3668 breast cancer deaths with screening, and 4637 without screening, a mortality reduction of 21%.
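The reported reductions follow directly from the predicted death counts. As a quick arithmetic check (the function name is a hypothetical helper), the sketch below recomputes them from the rounded figures given above. For breast cancer the rounded counts differ by 969, one fewer than the 970 quoted in the abstract, which presumably reflects unrounded model output.

```python
def mortality_reduction(deaths_without, deaths_with):
    """Return (deaths prevented, relative reduction) for one programme."""
    prevented = deaths_without - deaths_with
    return prevented, prevented / deaths_without


# Deaths per 100,000 women aged 30-100, as reported in the text.
cervical_prevented, cervical_reduction = mortality_reduction(644, 266)
breast_prevented, breast_reduction = mortality_reduction(4637, 3668)

print(cervical_prevented, f"{cervical_reduction:.0%}")  # 378 59%
print(breast_prevented, f"{breast_reduction:.0%}")      # 969 21%
```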
Discussion
Our comparison of the burden of breast cancer screening with that of cervical cancer screening shows that screening prevents cancer-specific mortality, but when also including the detection of pre-invasive lesions in this equation, both types of screening generate overdiagnosis. The burden of overdiagnosis depends on its frequency and its consequences. Although the overdiagnosis frequency is high in cervical cancer screening relative to breast cancer screening, the impact is limited, because treatment is minimally invasive. For CIN I, most often no treatment is necessary, and for CIN II or III, a loop excision or conization may be performed in an out-patient setting. 28 These procedures have relatively limited risks, and no apparent cosmetic impact. However, cold knife conization and large loop excision may be associated with preterm delivery, low birth weight, caesarean section, and preterm rupture of the membranes in future pregnancies.29–31 For breast cancer screening, the frequency is low relative to cervical cancer screening, but the impact is higher due to more invasive treatment. The treatment of DCIS is lumpectomy or even mastectomy, in some cases followed by radiation therapy.32,33 The risks of these treatments include (rare) standard operation risks (haemorrhage or infection), and the risk of general anaesthesia. Additionally, the cosmetic result of these procedures has significant impact. 33 The perception of the individual should also be taken into account. The information provided with each diagnosis, whether it is cancer or pre-invasive disease, is crucial to the impact of this event. The decision to count a diagnosis as overdiagnosis must be related to its severity, the treatment warranted, and the impact of the information provided at diagnosis.
Our estimates for overdiagnosis of breast cancer were different from those previously published using the MISCAN model. This is a direct result of using cohort runs instead of simulating a realistic population. If we run our model with a population aged 0–100, we obtain an overdiagnosis rate directly comparable with that of de Gelder et al. 7 For all diagnoses, this rate is 4.6% from a population perspective and 8.1% from an individual perspective.7,14,18–20 For cervical cancer, no comparable numbers were published.
Our analysis for cervical cancer screening was performed on the current situation (i.e. primary conventional cytology testing with cytology triage) in the Netherlands, but in recent years, most laboratories have added human papillomavirus (HPV) testing in the triage phase, which slightly increases CIN I and CIN II detection. 34 In addition, most laboratories processing primary screening tests have switched from using conventional cytology to liquid-based cytology tests (SurePath and ThinPrep). Rozemeijer et al. 35 showed that CIN2+ detection rates increased by using SurePath, while they were unaffected by using ThinPrep, meaning that overdiagnosis rates are probably somewhat higher in the current Dutch situation than estimated in our study. Also, it is expected that from 2016, primary cytology will be replaced by primary HPV screening with cytology triage, and that women will be invited for screening five times in their lifetime in the Dutch cervical cancer screening programme. 36 While this may mean a risk of increasing overdiagnosis by detecting disease at a yet earlier stage, overdiagnosis may decrease due to fewer screening examinations in a lifetime.
Screening practices for breast and cervical cancer vary widely between countries. In the United States, there is no national screening programme, though the American Cancer Society recommends annual mammography screening for women aged 45–54 and biennial screening after 55. 37 Cervical cancer screening in the United States is carried out by many practitioners with shorter intervals than guidelines indicate, despite the recommendation made by the United States Preventive Services Task Force (USPSTF).38,39 Although the Affordable Care Act now ties coverage to the USPSTF recommendations, more doctors follow the American Cancer Society guidelines in breast cancer screening than the USPSTF recommendations. The NHS Breast cancer screening programme in the United Kingdom invites women aged 50–70 every 3 years, and is currently extending to include women aged 47–73. 40 Cervical cancer screening guidelines in the United States are similar to those in the Netherlands,41,42 the cervical cancer screening programme in Finland is comparable with that of the Netherlands (but with considerably more opportunistic screening), while in the United Kingdom, Sweden, and Denmark, women are screened 12 and 13 times a lifetime, starting at the ages 25, 23, and 23 respectively.43–45 Therefore, our estimates may differ from those in other countries. Overdiagnosis estimates are expected to increase as screening programmes undertake increasing numbers of screening examinations, and with a younger age at first screening. Every early diagnosis can lead to overdiagnosis, because other cause mortality may occur before the benefits of early treatment are realized. This is most likely in older women, but may also occur in younger women, especially with indolent disease, as more non-progressive CIN is found in younger women than in older women. 46
If we were to analyse the data for colorectal cancer screening, we would expect results in between those of breast cancer and cervical cancer screening, depending on the screening test being used. Not only the fecal occult blood tests, especially the older guaiac tests, but also the newer (e.g. immunochemical) tests, have a lower sensitivity for early, pre-invasive disease, than endoscopy. The most sensitive test will find more pre-invasive disease, which will need less invasive treatment, but also more often would not have developed into clinical disease, so the frequency of overdiagnosis would be high, but the per case consequences would be low. As more types of cancer become eligible for screening, we hope that in the future balanced reports will elucidate the impact of cancer screening on the advanced cancer rate and disease-specific mortality, while also publishing the properly estimated extent of overdiagnosis.
Study limitations
To compare the two programmes, which offer screening at different ages, we performed a cohort run. Although this results in a lifetime estimate of harms and benefits, it remains hypothetical, as the homogeneity of a cohort never resembles a real population. Mathematical modelling requires assumptions on the natural history of cancer. The mean duration of sojourn time and the probability of progression are interchangeable in the model, and the assumptions used have influenced the overdiagnosis estimate. 47 We have extended the definition of overdiagnosis somewhat by including precursor lesions that commonly are not judged to be cancers. This is not the case so much with DCIS of the breast, but neither precursor lesions of the cervix nor adenomas have commonly been included in discussions of overdiagnosis. In the case of each, the fraction of overdiagnosis will be harder to estimate because the fraction of treated lesions that were progressive would need to be estimated, something that is quite uncertain.
Conclusion
We have compared the burden of screening for two of the population screening programmes for cancer currently in use in the Netherlands. For breast cancer, overdiagnosis estimates are relatively low, but the consequences for overdiagnosed women are significant. For the programme overall, however, these consequences are quite small. For cervical cancer, overdiagnosis estimates of pre-invasive disease are high, but the consequences are relatively small due to less invasive treatment. Informing women about the potential harms of screening should include the consequences of finding the different lesions, invasive or pre-invasive.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the Dutch National Institute for Public Health and the Environment (007/12 V&Z NvdV/EM), a non-profit organization with no involvement in the study design, data collection, data analysis, interpretation of the data, writing of the report, or the decision to submit the article for publication.
