Abstract
Background
Overdiagnosis in breast cancer screening is a topic of debate. Researchers often estimate trends in incidence prior to screening and project these to predict incidence during the screening epoch.
Methods
Data were obtained from the Cancer Registry of Norway and the Norwegian Breast Cancer Screening Programme. Using breast cancer incidence prior to screening in Norway (1976–1995), incidence trends were estimated from age-period and age-cohort models. These estimates were used to predict the incidence of breast cancer in five-year age and period groups in the screening epoch (1996–2009).
Results
Excess numbers of cancers in the screening age range (6,876 cancers) and deficits in women above and below the screening age range (1,947 cancers) were observed. However, only part of the difference between observed and expected incidence can be explained by screening, as evidenced by numbers of excess cancers exceeding the numbers of screen-detected cancers in some age groups and time periods.
Conclusion
There are potential errors in estimation of overdiagnosis from screening if individual data on screening exposure and detection mode are not taken into account. For reliable estimates of overdiagnosis, it is necessary to compare excess incidence in the screening period in those actually screened with the corresponding excess in those not screened. This is the subject of ongoing research.
Introduction
Concern is frequently expressed about the risk of overdiagnosis from cancer screening, in particular breast cancer screening.1,2 In this context, overdiagnosis is defined as the diagnosis, as a result of screening, of a cancer which would not have been diagnosed in the patient’s lifetime had screening not taken place.3 Implicit in this definition is a very long time frame. Breast screening, for example, usually takes place in middle-aged women in developed countries, who have many future years of expected life in which a tumour could potentially progress to symptomatic diagnosis. Long-term observation is one way of distinguishing excess incidence due to overdiagnosis from that due to screening lead time.4 Ideally, overdiagnosis could be estimated from a randomized trial of screening in which the control group was never screened, and for which there is a long period of observation after screening ceased in the intervention group. This is what the recent UK review of breast cancer screening aimed to do,3 although reservations have been expressed about the review’s choice of data sources and insufficient follow-up time.4–6
In the absence of appropriate trial data, many attempts have been made to estimate overdiagnosis from trends in observational data on national or regional incidence of breast cancer, in conjunction with the time of introduction of screening.1,7–10 Commonly, researchers estimate trends in incidence prior to screening, and project these to predict incidence during the screening epoch. An excess of observed incidence over that predicted may be partly attributable to overdiagnosis. Such an excess will also be partly due to lead time, that is, the diagnosis as a result of screening of cancers which would otherwise have been diagnosed symptomatically some years later. This may be evidenced by a ‘compensatory drop’ in cancer incidence above the upper age limit for screening. A recent review by Puliti et al. noted that studies which adequately adjusted both for lead time (either by prolonged follow-up after screening ceases or by adjustment for external estimates of lead time) and for changes in incidence unrelated to screening estimated modest rates of overdiagnosis, of the order of 10% or less, whereas those which did not derived much higher estimates.11
Although not invariably acted upon, the significance of lead time and of trends in incidence not attributable to screening is well known. What is less appreciated is the value of individual data on exposure to screening, and on whether cancers were screen-detected or symptomatic.12 Clearly, a cancer which was symptomatic rather than screen-detected cannot be overdiagnosed under the definition generally used. In this paper, we use data from the Cancer Registry of Norway and the Norwegian Breast Cancer Screening Programme (NBCSP) to demonstrate the extent to which individual screening data qualify the interpretation of incidence trends. The aim was not to estimate overdiagnosis at this stage, but to assess the potential for such trend analyses to do so.
Data and methods
The NBCSP was initiated in November 1995 (although only 956 screens took place, and there were only three screen-detected cancers in 1995), offering biennial 2-view mammography to women aged 50–69. The programme began in four counties, and achieved nationwide coverage in 2005.
We obtained data on invasive breast cancers from the Cancer Registry of Norway, including age at and date of diagnosis, from 1953 to 2009. Data on ductal carcinoma in situ (DCIS) were available from 1993 to 2009. The NBCSP provided data on detection mode (outside of the screening cohort, screen-detected, interval cancer, non-attender, not invited due to upper age limit, and not invited as opted out). From the NBCSP, we had data on all screening invitations and attendances from November 1995 to December 2009. We also had tabular data on the resident female population in Norway by age and calendar year, as estimated in January every year. Age was calculated by subtracting the date of birth from the relevant calendar time.
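The age calculation and the rate denominators described above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and it is assumed that age is counted in completed years and that rates are cases per 100,000 person-years.

```python
from datetime import date


def age_at(event: date, birth: date) -> int:
    """Completed years of age at the event date (assumed convention)."""
    years = event.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the event year.
    if (event.month, event.day) < (birth.month, birth.day):
        years -= 1
    return years


def five_year_group(age: int) -> str:
    """Label the five-year age band containing `age`, e.g. 52 -> '50-54'."""
    lower = (age // 5) * 5
    return f"{lower}-{lower + 4}"


def rate_per_100k(cases: float, person_years: float) -> float:
    """Incidence rate per 100,000 person-years."""
    return 100_000 * cases / person_years
```

For example, a woman born on 15 June 1948 and diagnosed on 1 March 2000 would be assigned to the 50–54 age group under this convention.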
We considered the years up to 1995 as the pre-screening epoch, and 1996 onwards as the screening epoch. Using Poisson regression, we estimated three models using the whole country’s incidence data from 1976 to 1995, as we were not confident of projecting trends from before 1976 through to 2009. The three models were:
1. A discrete age-cohort model using five-year age groups (30–34, 35–39, …, 85–89) and time periods (1976–80, 1981–85, 1986–90, 1991–95);
2. A discrete age, continuous period trend model; and
3. A separate period trend for each five-year age group.
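One conventional way to write the three models is given below, assuming a log link with the logarithm of person-years as an offset. The symbols are illustrative and not taken from the source: $\mu_{ap}$ is the expected number of cases in age group $a$ and period $p$, $N_{ap}$ the corresponding person-years, $c(a,p)$ the birth cohort implied by age group and period, and $t_p$ the period index in five-year units.

```latex
\begin{align*}
\text{Model 1 (age-cohort):}\quad
  & \log \mu_{ap} = \log N_{ap} + \alpha_a + \gamma_{c(a,p)} \\
\text{Model 2 (age, common period trend):}\quad
  & \log \mu_{ap} = \log N_{ap} + \alpha_a + \beta\, t_p \\
\text{Model 3 (age-specific period trend):}\quad
  & \log \mu_{ap} = \log N_{ap} + \alpha_a + \beta_a\, t_p
\end{align*}
```

Under this parametrization, $\exp(\beta)$ and $\exp(\beta_a)$ are relative risks per five-year period, and projection into the screening epoch amounts to evaluating the fitted linear predictor at later values of $t_p$ (or, for model 1, at cohorts for which an effect has been estimated).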
We then used the estimates from each model to predict the incidence of breast cancer in five-year age groups and periods in the screening epoch from 1996 onwards (the last period being of four years’ duration, 2006–09). We compared the predicted numbers of cases with the breast cancer cases actually observed. The observed cases were then classified as screen-detected or symptomatic, and we also calculated the number of screening episodes for each age group and period, to establish bounds on the extent to which any excess in the screening epoch was actually due to screening.
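The projection step can be sketched for a single age group as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it fits a log-linear Poisson trend with a person-years offset (the form of the age-specific trend model for one age group) by Newton-Raphson, and all counts and person-years are hypothetical numbers, not registry data.

```python
import math


def fit_poisson_trend(cases, person_years, t, iters=50):
    """Fit log(mu_i) = log(N_i) + a + b * t_i by Newton-Raphson; return (a, b)."""
    # Start from the overall log rate with no trend.
    a = math.log(sum(cases) / sum(person_years))
    b = 0.0
    for _ in range(iters):
        mu = [n * math.exp(a + b * ti) for n, ti in zip(person_years, t)]
        # Score vector of the Poisson log-likelihood.
        ga = sum(y - m for y, m in zip(cases, mu))
        gb = sum((y - m) * ti for y, m, ti in zip(cases, mu, t))
        # Fisher information matrix (2 x 2, symmetric).
        haa = sum(mu)
        hab = sum(m * ti for m, ti in zip(mu, t))
        hbb = sum(m * ti * ti for m, ti in zip(mu, t))
        det = haa * hbb - hab * hab
        # Newton step: solve the 2 x 2 system explicitly.
        da = (hbb * ga - hab * gb) / det
        db = (haa * gb - hab * ga) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b


def expected_cases(a, b, person_years, t):
    """Expected count at period index t if the fitted trend continued unchanged."""
    return person_years * math.exp(a + b * t)


# Hypothetical pre-screening data for one age group (periods indexed 0..3).
pre_cases = [410, 455, 498, 552]
pre_py = [210_000.0, 215_000.0, 221_000.0, 228_000.0]
a_hat, b_hat = fit_poisson_trend(pre_cases, pre_py, [0, 1, 2, 3])

# Hypothetical first screening-epoch period (index 4).
observed = 760
expected = expected_cases(a_hat, b_hat, 236_000.0, 4)
excess = observed - expected  # negative values would indicate a deficit
print(f"RR per period: {math.exp(b_hat):.3f}, expected: {expected:.1f}, excess: {excess:.1f}")
```

Repeating this for each age group and each screening-epoch period would give a table of expected counts and excesses of the kind compared with screen-detected cancers in the analysis.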
Results
Table 1. Cases, person-years and incidence rates per 100,000 (in that order) by five-year age group and calendar period. The last period is of four years.
Table 2. Relative risks and 95% confidence intervals in the age-cohort model applied to incidence data from 1976–95. The midpoint year of each cohort is given.
Table 3. Age-adjusted overall trend in incidence with calendar time, and age-specific trends during 1976–95, expressed as a relative risk per five-year period.
Table 4. Observed incidence per 100,000 and projected incidence by the three models, by age and period, from 1996 onwards.
Table 5. Observed absolute number of cases, expected numbers from model 3, excess cancers (negative numbers indicate a deficit) and number of observed screen-detected cancers, by age and period, from 1996 onwards.
Discussion
Using breast cancer incidence in the two decades prior to screening initiation in Norway, 1976–95, we estimated incidence trends from age–period and age-cohort models. These were then extrapolated to the screening period, 1996–2009, to give expected incidence if the trends continued unchanged. In the screening period, we observed an excess of 6,876 cancers in the screening age range, and a deficit of 1,947 cancers at all other ages. Several observations prohibit use of these figures to estimate overdiagnosis due to screening. These include:
1. For some age groups and periods, the excess incidence is greater than the number of screen-detected cancers, for example, in ages 50–59 in 1996–2000;
2. A substantial proportion of the deficit observed in the non-screening ages occurs in women who were never screened, including women too young for the screening programme and women already past the upper age limit for screening at the programme’s initiation, for example, women aged 75 or older in 1996–2000.
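The first observation rests on a simple bound, sketched below under the assumption that only screen-detected cancers can contribute to a screening-related excess. The function name and the numbers in the usage note are hypothetical and do not come from the tables.

```python
def screening_attribution_bounds(observed, expected, screen_detected):
    """Bound how much of an excess could be due to screen detection.

    Returns (excess, max_due_to_screening, min_not_due_to_screening).
    Assumes that symptomatic cancers cannot be part of a screening-related excess.
    """
    excess = observed - expected
    # At most every screen-detected cancer contributes to the excess.
    max_due_to_screening = max(0.0, min(excess, screen_detected))
    # Any excess beyond the screen-detected count must have another cause.
    min_not_due_to_screening = max(0.0, excess - screen_detected)
    return excess, max_due_to_screening, min_not_due_to_screening
```

With 1,000 observed cases, 700 expected and 200 screen-detected, the excess is 300, of which at most 200 could be due to screening and at least 100 must reflect changes in incidence unrelated to screen detection.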
The first of these implies that some of the observed excess in the screening period is due to symptomatic tumours, and therefore that there are changes in the incidence trends between the pre-screening and screening periods which are not due to screening, and which are not captured by estimation from the pre-screening period alone. The Poisson regression models used provide a reasonable fit to the period of estimation (data available from the authors), but cannot be extrapolated to the screening period because of these unattributed changes in incidence. Further, they cannot provide an estimate of overdiagnosis. This casts doubt on similar past estimates of overdiagnosis from trends in the absence of screening exposure data.1,7,9,13 Hofvind et al.14 found that increases in incidence in the screening period in Norway were consistent with changes in use of hormone replacement therapy. There may also be changes in other risk factors, or in breast cancer awareness, which in turn may lead to greater diagnostic or non-programme screening activity. The second implies that deficits observed in the non-screening ages were partly due to changes in incidence trends other than those induced by a compensatory drop in incidence due to lead time.8,11 As has been previously noted, in order to observe the full compensatory drop, there must be sufficient observation time, of the order of ten years or more, of screened cohorts after they have ceased to be screened.4,11
Although Table 5 only shows the extrapolation and excess estimates for the age-specific period effect model, qualitatively similar results were obtained for the age-adjusted common period effect, and the discrete age-cohort model (results available from the authors). It could be argued that our excesses are overestimates in any case due to the unavailability of DCIS data in the early pre-screening years. However, restriction of analysis to invasive cancers only yielded the same qualitative results.
There are several conclusions from these results. First, there are potential errors in estimation of overdiagnosis from screening in the absence of individual data on screening exposure and detection mode. It is possible that we, and others, have overestimated overdiagnosis in the past as a result of the absence of information on individual exposure to screening and screen detection of cancers.1,7,8,13 It would be unwise to present overdiagnosis estimates based on the balance between observed and expected incidence, unstratified by screening exposure. It may also be necessary to obtain other covariates of risk for reliable prediction of incidence rates from pre-screening data. For reliable estimates, at the very least it will be necessary to compare excess incidence in the screening period in those actually screened with the corresponding excess in those not screened.
Acknowledgements
This work was funded by the Research Council of Norway (project number 189520/V50). The funding body had no role in the drafting of this manuscript. No writing assistance was used.
