
Abstract

Objective

We aimed to introduce a concept of overdiagnosis with actual formulae. In screening for invasive cancer, the detection proportion to incidence rate (D/I) ratio is used as an intermediate indicator of effect. D is the detection rate (proportion) at screen and I is the background risk (i.e. the expected incidence rate if there were no screening). We sought to determine a proper use of the D/I ratio in the estimation of overdiagnosis in screening.

Methods

We propose the difference between the D/I ratio per year and incidence-based sensitivity as an estimate of overdiagnosis. As an example to evaluate this, we used screening for colorectal cancer, run as a randomized public health policy in Finland.

Conclusion

The use of the D/I ratio is inappropriate in predicting effect, and we recommend that the D/I ratio be used as a component in the estimation of overdiagnosis in screening for invasive cancer.

Keywords

Overdiagnosis, cancer screening, D/I ratio

Introduction

Mortality reduction is the ultimate outcome goal of cancer screening. Process indicators such as detection proportion (D) to incidence rate (I) ratio (D/I) are used to predict the outcome, because they are available before any final evaluation.^1,2 In the traditional D/I ratio, D is the number of screen-detected cancers divided by the number of screening attenders. The terms detection rate and (especially for the first round) prevalence proportion or rate are also used. Incidence (I) is the hypothetical expected incidence rate of cancer in the attenders during the screening interval in the absence of screening.

Screen-detected cancers counted in D include cases in which screening prevented death, cases in which the outcome was unaffected by screening, and cases that were diagnosed only because of screening (i.e. overdiagnosed cases). Because of this mixture of benefit and harm, the D/I ratio is valid for predicting outcome only if the harm of overdiagnosis is small compared with the benefit of preventing death through screening. In this paper, we use the D/I ratio to estimate overdiagnosis, rather than to indicate any benefit.

Methods

We consider the framework of a screening programme with a fixed length of screening round. The duration of a screening round (T) is the time period from one screening test to the next subsequent test (Figure 1). The screening round is divided into a period of diagnostic assessment (t), called the episode, and a screening interval (T–t). The episode is the period during which the result of the test is confirmed as positive (i.e. cancer) or negative (i.e. no cancer). The interval is the period from the end of the episode to the time of the next test. The round includes the point in time of taking the first but not the subsequent test.

Figure 1. Time periods in estimates of detection and incidence rates.

Screen-detected cancers are those detected within the screening episode. They were unrecognized and in the detectable preclinical phase (DPCP) at the time of the test. Screening for invasive cancer does not, by definition, affect the true cancer incidence; it affects only the timing of the diagnosis due to early detection. Therefore, the expected interval cancer incidence (I) equals the incidence of cancers that enter into the DPCP during the screening interval. A proportion of these cancers defined by sensitivity will be screen detected at the subsequent screen. Therefore, the D/I ratio divided by the length of the interval equals the episode sensitivity and will be called detection-based sensitivity (S_d):

S_{d} = D / (I (T - t))

The D/I-ratio can alternatively be described as the time period after the episode during which screen-detected cancers would have surfaced clinically in the absence of screening. Both interpretations assume that no overdiagnosis exists in screening.

The incidence method offers an independent estimate of episode sensitivity (S_i) that does not suffer from overdiagnosis:

S_{i} = 1 - I_{int} / I

where I_int is the observed incidence rate over the interval with a length of (T–t). We propose the difference of the detection-based sensitivity and incidence-based sensitivity as an indicator of overdiagnosis (O).

O = (D / (T - t) - (I - I_{int})) / I
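As a check on the algebra, the expression for O can be written out as the difference of the two sensitivity estimates. The block below is only a restatement of the formulas already given, using the same symbols (D, I, I_int, T, t); nothing new is assumed.

```latex
\begin{align*}
O &= S_{d} - S_{i} \\
  &= \frac{D}{I\,(T - t)} - \left(1 - \frac{I_{int}}{I}\right) \\
  &= \frac{D/(T - t) - (I - I_{int})}{I}
\end{align*}
```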

The expected incidence (I) refers to those actually screened. It is not the same as the incidence in those invited to screening, because non-attenders are a self-selected group with respect to risk, and hence incidence.

A relatively good estimate of the expected incidence (I) can be found from a randomized screening study:

I = (I_{con} - (1 - α) I_{non}) / α

where

I_con is the incidence rate of cancer in the control arm in the interval

I_non is the incidence rate of cancer in non-attenders in the interval

α is the proportion of person years of attenders out of all person years in the invitees in the interval.
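The correction for selective attendance can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `expected_incidence` and its argument names are assumptions, and the input values are the rounded rates reported in the Results (per 100,000 person years).

```python
def expected_incidence(i_con: float, i_non: float, alpha: float) -> float:
    """Expected incidence rate in attenders in the absence of screening.

    i_con -- interval incidence rate in the control arm
    i_non -- interval incidence rate in non-attenders
    alpha -- attenders' share of person years among invitees, 0 < alpha <= 1
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    # The control-arm rate is a weighted mix of attender-like and
    # non-attender-like people; solve that mix for the attender-like rate.
    return (i_con - (1.0 - alpha) * i_non) / alpha


# Rounded rates from the Results section, per 100,000 person years.
i_expected = expected_incidence(i_con=130.0, i_non=146.6, alpha=0.93)
print(round(i_expected))
```

With full attendance (alpha = 1) the function simply returns the control-arm rate, which is the behaviour the formula implies.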

We assume a fixed length T of screening round, t of screening episode, and a population-based invitation system as in a randomised screening trial or in a randomised health services study of screening. Therefore, cancers and person years are known for interval cancer incidence in the controls (I_con) and in the non-attenders (I_non).

As an empirical example, we present data from the randomised health services study on screening for colorectal cancer in Finland. The programme consisted of biennial screening at ages 60 to 68 with guaiac FOB test. We used a fixed six-month episode, and therefore the interval was 18 months.³

Results

Of the 63,958 individuals who attended the first round in 2005–2008, 61,689 were invited for the second round and 57,142 attended (Table 1). In the attenders, 61 cancers were diagnosed during the episode and 64 during the interval. Therefore, the detection proportion (D) was 107 per 100,000 persons, and the interval cancer incidence (I_int) was 75 per 100,000 person years. During the interval, 10 cancers were diagnosed in non-attenders and 175 in controls, giving incidence rates of I_non = 147 and I_con = 130 per 100,000 person years. Among those who had attended the first round, second-round attenders accounted for 93% of person years (α = 0.93). Therefore, the expected incidence rate (I) was 129 per 100,000 person years. These results imply a detection-based sensitivity estimate of 55% (107/(1.5 × 129)), an incidence-based sensitivity estimate of 42% ((129 − 75)/129) and, therefore, an overdiagnosis estimate of 13% (55 − 42).
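The rates quoted in this paragraph can be approximately reproduced from the counts. The sketch below rests on a simplifying assumption that is not the authors' method: person years are taken as the number of persons multiplied by the 1.5-year interval, whereas the study used observed person years. The control group size (89,789) is taken from Table 1, and the helper name `rate_per_100k` is hypothetical.

```python
def rate_per_100k(cases: int, persons: int, years: float) -> float:
    """Cases per 100,000 person years, approximating person years as persons * years."""
    return 1e5 * cases / (persons * years)


INTERVAL_YEARS = 1.5  # 18-month screening interval

attenders = 57_142
non_attenders = 61_689 - 57_142  # invited minus attended
controls = 89_789                # control group size as listed in Table 1

# Detection proportion: screen-detected cancers per 100,000 attenders.
d = 1e5 * 61 / attenders

# Interval incidence rates per 100,000 person years.
i_int = rate_per_100k(64, attenders, INTERVAL_YEARS)
i_non = rate_per_100k(10, non_attenders, INTERVAL_YEARS)
i_con = rate_per_100k(175, controls, INTERVAL_YEARS)

print(f"D={d:.0f}  I_int={i_int:.0f}  I_non={i_non:.0f}  I_con={i_con:.0f}")
```

Under this approximation the rounded values agree with those reported in the text.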

Table 1. Second round of the Finnish randomised screening programme for colorectal cancer.

	Second round in attenders to first round
	Number	Rate/10⁵ person years
Invited	61,689
Attenders
Number	57,142
Cancers in episode (screen-detected)	61	213.5
Cancers in interval	64	74.7
Non-attenders
Number	4,547
Cancers in episode	4	175.9
Cancers in interval	10	146.6
Control group
Number	89,789
Cancers in episode	47	104.7
Cancers in interval	175	130.0

Note: Numbers of invitees, attenders, controls and screen detected cancers and interval cancers in 2005–2010.

D = 100,000 × 61/57,142 = 107.

α = 57,142/61,689 = 0.93.

I = (130.0 − 0.07 × 146.6)/0.93 = 129.

I_int = 75.

O = (107/1.5 − (129 − 75))/129 = 17/129 = 0.13.
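The arithmetic in these notes can be traced with a short Python sketch. It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, and the inputs are the rounded values D = 107, I = 129 and I_int = 75 with an interval of 1.5 years.

```python
def detection_sensitivity(d: float, i: float, interval_years: float) -> float:
    """S_d: D/I ratio divided by the length of the screening interval."""
    return d / (i * interval_years)


def incidence_sensitivity(i_int: float, i: float) -> float:
    """S_i: one minus the ratio of observed to expected interval incidence."""
    return 1.0 - i_int / i


def overdiagnosis(d: float, i: float, i_int: float, interval_years: float) -> float:
    """O: detection-based minus incidence-based sensitivity."""
    return detection_sensitivity(d, i, interval_years) - incidence_sensitivity(i_int, i)


# Rounded values from the table notes (rates per 100,000).
D, I, I_INT, INTERVAL = 107.0, 129.0, 75.0, 1.5

s_d = detection_sensitivity(D, I, INTERVAL)
s_i = incidence_sensitivity(I_INT, I)
o = overdiagnosis(D, I, I_INT, INTERVAL)

# Same quantity via the single-expression form of O.
o_closed_form = (D / INTERVAL - (I - I_INT)) / I

print(f"S_d={s_d:.0%}  S_i={s_i:.0%}  O={o:.0%}")
```

Both routes to O give the same number, which is a quick way to confirm that the single-expression form and the difference of sensitivities agree.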

Discussion

The objective of cancer screening is to reduce cancer-specific mortality by identifying unrecognized cancer in the DPCP.^4,5 Screen-detected cases are an intermediate indication of screening meeting the final objective. They are measured by detection proportion at screening (D). It depends on the risk of cancer (background incidence), on the length of the DPCP (sojourn time), the lengths of the screening episode and of the screening interval, on sensitivity of the test and of the diagnostic assessment phase and on overdiagnosis in detection of cancers in the DPCP. Because of these several determinants of D, it does not have a straightforward interpretation.

We developed an estimate of overdiagnosis from the screen-detected cases through controlling for the determinants other than overdiagnosis. The use of D/I-ratio eliminated the effect of background incidence. The restriction to a single screening round with a fixed length eliminated the effect of the screening interval. The comparison of detection-based sensitivity (S_d) with episode sensitivity estimated by the incidence method (S_i) eliminated the effects of sensitivity of the test and lengths of the episode and DPCP. Hence, the difference between the two sensitivities, one based on detection (after taking the length of interval into consideration) and the other based on incidence, is a measure of overdiagnosis. The limitations are the same as the limitations of the incidence-based sensitivity S_i to indicate the ultimate effect.

Theoretically, sensitivity is the ability to detect cancer in the DPCP. The proportion of these cancers detected at screen out of all cancers in the DPCP was estimated under assumptions, e.g. on the distribution of sojourn time of cancers in the DPCP.^6–10 Here, we took a pragmatic attitude. The expected incidence rate in the incidence-based sensitivity estimate is restricted to those cases in DPCP which would surface within one screening interval. After the first round, the same is true for the detection-based sensitivity estimate.

The aim of our paper is to introduce a concept of overdiagnosis with actual formulae. Because this is conceptually different from proposals based on growth rates and other assumptions, the numerical values will not be comparable with the other estimates, and our result on colorectal cancer serves only as an example. Also, application to screening for cancers in other primary sites remains for future evaluation.

Our estimate of overdiagnosis does not apply to the first screening round because the sojourn times are undefined, and the mean expected incidence is unknown.

Our method applies to a screening programme run as a public health policy, including repeated screening rounds. The limitation of our method is that some screen-detected progressive cancers are misclassified as overdiagnosed cases. They would be cancers diagnosed after the screening round in the absence of screening.

Screen-detected cancers are generally slow growing.^11,12 The screen-detected but misclassified cancers would surface clinically only after a period that is longer than the screening round. Hence, they represent the slow-growing end of the distribution of sojourn times. If the screen were postponed by one round, all these misclassified cancers would still be in the preclinical phase, and these patients would escape the physical and mental harms of screening. Therefore, including the misclassified patients in the overdiagnosis group is justified for the screening round in question.

Agreement on which cancers are screen detected requires a definition of the length of the episode. The same length of episode should be applied to both test positives and test negatives in the estimation of any indicator describing screening.¹³ It is, however, commonplace to count person years for interval cancer rates from the time of the initial test to the next subsequent test in screen negatives, and from the time of final assessment to the time of the next subsequent test in screen positives. This method results in biased estimates, as it underestimates the interval cancer rate (I_int) and overestimates the episode sensitivity (S_i). If the screening round is short, including the episode in the follow-up of screen negatives results in a large bias. In our example, the bias in interval cancer rates would be more than 30%, because the biased method adds the length of the episode (six months) to the follow-up time of the interval (1.5 years) in almost all screens, i.e. in the screen negatives.
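The size of this bias can be illustrated with a rough Python sketch. It assumes, for simplicity, that every attender is a screen negative followed for the full period, and it approximates person years as persons multiplied by follow-up time; both are simplifications of the situation described above, not the authors' calculation. The expected incidence I = 129 per 100,000 person years is the value reported in the Results.

```python
CASES = 64            # interval cancers in attenders
ATTENDERS = 57_142
EPISODE_YEARS = 0.5   # six-month episode
INTERVAL_YEARS = 1.5  # 18-month interval
I_EXPECTED = 129.0    # expected incidence per 100,000 person years (Results)

# Follow-up restricted to the interval only.
rate_interval_only = 1e5 * CASES / (ATTENDERS * INTERVAL_YEARS)

# Follow-up that also counts the episode, as in the biased method.
rate_with_episode = 1e5 * CASES / (ATTENDERS * (EPISODE_YEARS + INTERVAL_YEARS))

# How much higher the interval-only rate is relative to the biased rate.
ratio = rate_interval_only / rate_with_episode

# Knock-on effect on incidence-based sensitivity.
s_i_interval_only = 1.0 - rate_interval_only / I_EXPECTED
s_i_with_episode = 1.0 - rate_with_episode / I_EXPECTED

print(f"rates: {rate_interval_only:.1f} vs {rate_with_episode:.1f}, ratio {ratio:.2f}")
print(f"S_i:   {s_i_interval_only:.2f} vs {s_i_with_episode:.2f}")
```

Under these assumptions the interval-only rate is about one third higher than the rate that includes the episode, and the biased rate inflates the incidence-based sensitivity accordingly.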

The estimates of S_i and O depend on the length of the episode and on how screening is actually organized, i.e. whether diagnostic assessments are concentrated early or late in the episode. If it is justified to shorten the episode, the estimate of overdiagnosis is likely to be smaller than with a longer episode. This limitation is shared with incidence-based sensitivity.

The expected background incidence in absence of screening, the denominator in the D/I-ratio and in the incidence-based sensitivity (S_i), is unknown. It has been estimated as the incidence in non-attenders to screening, or as the historical incidence for the same area (based on e.g. cancer registry data). Both these methods are likely to be biased. The incidence in controls of a randomised screening trial is also biased and should be corrected for selective attendance using data on proportion of person years of attenders and incidence in non-responders. In non-experimental studies, the separation of these groups of non-invitees and non-attendees is not possible, and bias occurs. Further, the results from trials seldom apply to service screening. Therefore, any public health policy should preferably be introduced with a randomised health services study.¹⁴

The use of the D/I ratio as an intermediate indicator of screening efficacy¹ has been recommended in guidelines,² and major screening programmes for breast cancer have reported the D/I ratio before the evaluation of effectiveness.^15–18 Its use is no longer as common as previously, and more emphasis is placed on risk of interval cancers,^19,20 which is a component of sensitivity by the incidence method. The D/I ratio is a combination of overdiagnosis and sensitivity, i.e. a mixture of benefit and harm. We propose that the D/I ratio still has a use as a component in estimation of overdiagnosis when evaluating screening for invasive cancer.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

1. Day, Williams, Khaw KT. Breast cancer screening programmes: the development of a monitoring and evaluation system. Br J Cancer 1989; 59: 954–958.

2. European Commission. European guidelines for quality assurance in breast cancer screening and diagnosis. 4th ed. Luxemburg: European Commission, 2006.

3. Malila, Palva, Malminiemi, et al. Coverage and performance of colorectal cancer screening with the faecal occult blood test in Finland. J Med Screen 2011; 18: 18–23.

4. National Cancer Control Programmes. Policies and managerial guidelines. Geneva: World Health Organization, 1995.

5. Cole, Morrison. Basic issues in population screening for cancer. J Natl Cancer Inst 1980; 64: 1263–1272.

6. IARC. Handbooks of cancer prevention, breast cancer screening. Lyon: IARC Press, 2002.

7. Day NE. Estimating the sensitivity of a screening test. J Epidemiol Community Health 1985; 39: 364–366.

8. Hofvind, Román, Sebuødegård, et al. Balancing the benefits and detriments among women targeted by the Norwegian Breast Cancer Screening Program. J Med Screen 2016; 23: 203–209.

9. Michalopoulos, Duffy. Estimation of overdiagnosis using short-term trends and lead-time estimates uncontaminated by overdiagnosed cases: results from the Norwegian Breast Screening Programme. J Med Screen 2016; 23: 192–202.

10. Beckmann, Lynch, Hiller, et al. A novel case-control design to estimate the extent of overdiagnosis of breast cancer due to organized population-based mammography screening. Int J Cancer 2015; 136: 1411–1421.

11. Duffy, Parmar. Overdiagnosis in breast cancer screening: the importance of length of observation period and lead time. Breast Cancer Research 2013; 15: R41.

12. Zelen, Feinleib. On the theory of screening for chronic diseases. Biometrika 1969; 56: 601–614.

13. Hakama, Pokhrel, Malila, et al. Sensitivity, effect and overdiagnosis in screening for cancers with detectable pre-invasive phase. Int J Cancer 2015; 136: 928–935.

14. Hakama, Malila, Dillner. Randomised health services studies. Int J Cancer 2012; 131: 2898–2902.

15. Fracheboud, de Koning, Boer, et al. Nationwide breast cancer screening programme fully implemented in The Netherlands. Breast 2001; 10: 6–11.

16. Vejborg, Olsen, Jensen M-B, et al. Early outcome of mammography screening in Copenhagen 1991-99. J Med Screen 2002; 9: 115–2.

17. Hofvind, Geller, Vacek, et al. Using the European guidelines to evaluate the Norwegian Breast Cancer Screening Program. Eur J Epidemiol 2007; 22: 447–255.

18. Sarkeala. Performance and effectiveness of organised breast cancer screening in Finland: Acta universitatis tamperensis. Vol. 1309. Tampere: Tampere University Press, 2008.

19. IARC Breast Cancer Screening. IARC Handbooks of cancer prevention. Vol. 15. Lyon: WHO press, 2016.

20. European Commission. European guidelines for quality assurance in colorectal cancer screening and diagnosis. 1st ed. Luxemburg: European Commission, 2010.

Estimation of overdiagnosis by detection to incidence ratio in cancer screening
