Abstract
Objectives
We compared calculations of relative risks of cancer death in Swedish mammography trials and in other cancer screening trials.
Participants
Men and women from 30 to 74 years of age.
Setting
Randomised trials on cancer screening.
Design
For each trial, we identified the intervention period, when screening was offered to screening groups and not to control groups, and the post-intervention period, when screening (or absence of screening) was the same in screening and control groups. We then examined which cancer deaths had been used for the computation of relative risk of cancer death.
Main outcome measures
Relative risk of cancer death.
Results
In 17 non-breast screening trials, deaths due to cancers diagnosed during the intervention and post-intervention periods were used for relative risk calculations. In the five Swedish trials, relative risk calculations used deaths due to breast cancers found during intervention periods, but deaths due to breast cancer found at first screening of control groups were added to these groups. After reallocation of the added breast cancer deaths to post-intervention periods of control groups, relative risks of 0.86 (0.76; 0.97) were obtained for cancers found during intervention periods and 0.83 (0.71; 0.97) for cancers found during post-intervention periods, indicating constant reduction in the risk of breast cancer death during follow-up, irrespective of screening.
Conclusions
The use of unconventional statistical methods in Swedish trials has led to overestimation of risk reduction in breast cancer death attributable to mammography screening. The constant risk reduction observed in screening groups was probably due to the trial design that optimised awareness and medical management of women allocated to screening groups.
Introduction
Between 1977 and 1996, five randomised trials on mammography screening were conducted in Sweden. An overview of these trials published in 2002 reported that two to four rounds of mammography screening could decrease the risk of breast cancer death by 21%. 1
Mammography screening works through finding non-clinically detectable breast cancer before progression into advanced cancer with metastatic spread in lymph nodes and distant organs. Since reduction in cancer deaths due to reduction in the incidence of advanced cancer is not influenced by treatment efficacy, it was concluded from Swedish trials that decreases in the incidence of advanced breast cancer after screening introduction would provide the best indication that mammography screening reduces breast cancer mortality. 2
However, in communities where screening participation was high for more than 10 years, only modest or no declines in the incidence of advanced breast cancer were observed.3–5 This situation is in sharp contrast with that of colorectal and cervical cancer screening, because in communities where screening for cervical and colorectal cancers is widespread, marked declines in the incidence of these types of cancers at an advanced stage have been observed, which indicates a substantial contribution of these screening modalities.6,7
Breast screening trials were initiated at a time when there was limited experience in designing, conducting and analysing cancer screening trials. We therefore postulate that the contrasts between breast and cervical or colorectal cancers could be due to differences in the way randomised trials were conducted and analysed. In this study, we re-examine the mortality data used and the way risks of breast cancer death were computed in the Swedish trials, in the light of the study designs and statistical analyses used in screening trials on cancers other than breast cancer.
Designs of randomised trials for the evaluation of cancer screening tests
These trials are typically composed of two successive periods (Figure 1(a)): the intervention period, which extends from randomisation to termination of the last screening round in the screening group, and the post-intervention period, which extends from the end of the last screening round in the screening group to the date of the last check of vital status of subjects included in the trial. The follow-up period is the total of the intervention and post-intervention periods. Depending on the number of screening rounds and the length of follow-up, intervention and post-intervention periods may be of variable duration. Randomised trials evaluating cancer screening methods may consist of a single intervention of short duration including invitation to screening, the screening test itself and possible work-up procedures in case of a suspicious screening result. In other trials, the intervention period lasts for several years because the screening test is repeated every year or every two years. After the last screening round in the screening group, screening may be interrupted. Alternatively, screening may be pursued in the screening group and implemented in the control group when, for instance, a decision is taken to launch a population screening programme.
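The period definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_period`, its parameters and the dates in the usage lines are hypothetical and do not come from any of the trials discussed here.

```python
from datetime import date


def classify_period(diagnosis: date,
                    randomisation: date,
                    last_round_end: date,
                    last_vital_check: date) -> str:
    """Assign a cancer diagnosis date to a trial period.

    Intervention period: randomisation up to the end of the last
    screening round in the screening group.
    Post-intervention period: after the end of the last screening round,
    up to the last check of vital status.
    """
    if diagnosis < randomisation or diagnosis > last_vital_check:
        return "outside follow-up"
    if diagnosis <= last_round_end:
        return "intervention"
    return "post-intervention"


# Hypothetical trial calendar (assumed dates, for illustration only).
RANDOMISATION = date(1977, 1, 1)
LAST_ROUND_END = date(1984, 12, 31)
LAST_VITAL_CHECK = date(1996, 12, 31)

for d in (date(1980, 6, 1), date(1990, 6, 1), date(1999, 1, 1)):
    print(d, classify_period(d, RANDOMISATION, LAST_ROUND_END, LAST_VITAL_CHECK))
```

Under this sketch, the follow-up period is simply the union of the two labelled periods; any date outside it is flagged rather than silently assigned.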
Figure 1. Design of randomised trials for the evaluation of cancer screening methods (R: screening round). Intervention periods are the continuous lines and the post-intervention periods are the dashed lines. (a) Typical design, (b) design specific to Swedish trials on breast cancer screening.
Box 1. Computation of relative risk (RR) of cancer death in randomised trials on cancer screening.
Cause of death assessment and statistical analysis in trials on screening for cancer other than breast cancer
We retrieved publications on 17 cancer screening trials other than breast cancer in which main trial results were presented (see eTable in the Supplement). In 14 trials, cause of death was assessed by committees that were unaware of the screening status of subjects and that decided on likely causes of death using all available information. In all 17 trials, the relative risk of cancer-specific death associated with screening was calculated using deaths due to target cancers found during follow-up periods (follow-up method).
Cause of death assessment and statistical analysis in breast cancer screening trials
Table 1. Data used for relative risk calculation in randomised trials on breast cancer screening.
BCE: Breast physical examination; MMS: Mammography screening; RR: Relative risk; NBSS: National Breast Screening Study.
The most recent publication reporting on main trial results is displayed in the Table.
This trial was done in the counties of Dalarna (formerly Kopparberg) and Östergötland.
The Joint Review Committee included Two-County trial investigators (Holmberg et al., 2009 11 ) and has to be distinguished from the Independent Endpoint Committee set up by Swedish trial overviews (Nyström et al., 1993). 29
The Östergötland county trial was part of the Two-County trial, but results specific to the Östergötland trial were published in Nyström et al. 1
All breast screening trials calculated relative risks of breast cancer death associated with screening using deaths due to breast cancers found during the intervention period of the screening and of the control groups (evaluation method) (Table 1). However, the Swedish trials and their overview used a different selection of breast cancer deaths for control groups, as one sentence in the statistical section of the 2002 overview makes clear: ‘The evaluation [method] ignores breast cancer deaths among women whose breast cancer diagnosis was made after the first screening round of the control group was completed’. 1 This means that the breast cancer deaths in the control group that were used for calculating the relative risk included breast cancer deaths related to cancer cases found at first screening of this group (RC1 in Figure 1(b)). This first screening of the control group generally took place in the years following the last screening round in the screening group.13–16 Hence, if screening of the control group had not taken place, these cancers would have been diagnosed during the post-intervention period. This incorporation approach was thus equivalent to transferring to the intervention period a number of cancers and associated deaths that were part of the post-intervention period. It is important to note that this approach was applied to the control group only. As a consequence, publications reported more cancers per woman in control groups than per woman in screening groups.16–18 Translating this incorporation approach into the equations displayed in Box 1 gives:
RREM/ST = (DSI/NS)/[(DCI + DRC1)/NC], where RREM/ST stands for the relative risk obtained with the evaluation method specific to Swedish trials. DRC1 are deaths due to breast cancers found at first screening of the control group; these deaths pertain to the post-intervention period (i.e. DCP in Box 1) and not to the intervention period (i.e. DCI in Box 1).
The Two-County and the Stockholm trials reported numbers and stage of cancers found at first screening of control groups, showing that the incorporation approach resulted in adding 72 advanced (i.e. 20 mm size or more) cancers to the 434 advanced cancers diagnosed in the control group during the intervention period of the Two-County trial 13 and 30 advanced cancers (i.e. stage 2 or more) to the 173 advanced cancers diagnosed in the control group during the intervention period of the Stockholm trial. 19 Because of their high fatality rate, these extra advanced cancers led to a substantial number of extra cancer deaths, i.e. DRC1. Thus, the greater the value of DRC1, the smaller the value of RREM/ST and thus the greater the apparent reduction in the risk of breast cancer death associated with mammography screening.
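The effect of DRC1 on RREM/ST can be illustrated numerically. The sketch below assumes that DSI and NS denote breast cancer deaths and number of women in the screening group, and that DCI, DRC1 and NC are the corresponding control-group quantities, as suggested by the equation above. All counts are hypothetical and chosen only to show the direction of the effect; they are not trial data.

```python
def relative_risk(deaths_screen: float, n_screen: float,
                  deaths_control: float, n_control: float) -> float:
    """Ratio of cumulative death risks, screening group over control group."""
    return (deaths_screen / n_screen) / (deaths_control / n_control)


def rr_swedish_evaluation(d_si: float, n_s: float,
                          d_ci: float, d_rc1: float, n_c: float) -> float:
    """RREM/ST = (DSI/NS) / [(DCI + DRC1)/NC].

    d_rc1 counts deaths from cancers found at the first screening of the
    control group, which are added to the control-group numerator.
    """
    return relative_risk(d_si, n_s, d_ci + d_rc1, n_c)


# Hypothetical counts (assumptions for illustration, not trial data).
N_S = N_C = 100_000
D_SI = 430
D_CI = 500

for d_rc1 in (0, 25, 50):
    rr = rr_swedish_evaluation(D_SI, N_S, D_CI, d_rc1, N_C)
    print(f"DRC1 = {d_rc1:>2}: RR = {rr:.3f}")
```

With DRC1 set to zero the function reduces to the usual evaluation method restricted to the intervention period; each additional death counted in DRC1 enlarges the control-group numerator and therefore lowers the relative risk.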
Alternative calculation of results of Swedish trials
We estimated a relative risk according to the evaluation method that would not incorporate deaths due to cancers found at first screening of control groups, that is, we estimated DCI and DRC1 of the RREM/ST equation. In Swedish trials, the ratio between breast cancer mortality rates in the screening and control groups remained roughly constant after 10–12 years of follow-up.1,20 Furthermore, the Two-County trial reported that after 29 years of follow-up, 10% of breast cancer deaths in the control group were associated with cancers found during the first screening of control women. 20 The 10% figure is plausible because follow-up of the additional cancers was shorter than for cancers found during intervention periods. We thus inferred that 10% represented a valid estimate of the proportion of extra deaths added to intervention periods of control groups in the overview of 2002.
Table 2. Breast cancer deaths in the Swedish trials included in the 2002 overview.*
BC: Breast cancer; RR: Relative risk.
Data from Table 4 of Nyström et al. 1
§RR computed using No. of women 40–74 as denominator.
See Table 3 for computation of BC deaths in the control group.
Table 3. Breast cancer deaths in Swedish mammography trials.
BC: Breast cancer.
From Table 4 of Nyström et al. 1
For Malmö I, the hypothesis was 4.5% and for Malmö II, the hypothesis was 7.5%.
Numbers of BCs in each trial during the post-intervention period were not provided.
We then reworked results of the overview of 2002 22 in Table 2 using numbers of breast cancer deaths in control groups we estimated in Table 3. The relative risk of breast cancer death over the follow-up period remained unchanged, but the relative risk of breast cancer death for the evaluation method was 0.86 instead of 0.79. For breast cancers diagnosed during the post-intervention period, the relative risk of breast cancer death dropped to 0.83. Sensitivity analysis using 8% or 12% for reworking numbers of breast cancer deaths in control groups of the Östergötland, Goteborg and Stockholm trials did not change the corrected relative risk estimates much (data not shown).
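The reallocation and its sensitivity analysis can be sketched as follows. The sketch assumes that the chosen fraction (8%, 10% or 12%) is applied to the control-group deaths reported under the Swedish evaluation method, and that the deaths removed from the intervention period are moved to the post-intervention period so that the follow-up total stays the same. The function name and all counts are hypothetical; they are not the figures of Tables 2 and 3.

```python
def reallocate_control_deaths(reported_eval_deaths: float,
                              followup_deaths: float,
                              fraction_rc1: float) -> tuple[float, float]:
    """Split control-group deaths between the two trial periods.

    reported_eval_deaths: DCI + DRC1 as reported under the Swedish method.
    followup_deaths: all control-group deaths over the follow-up period.
    fraction_rc1: assumed share of reported deaths that belongs to DRC1.
    Returns (intervention deaths, post-intervention deaths).
    """
    d_rc1 = fraction_rc1 * reported_eval_deaths
    d_ci = reported_eval_deaths - d_rc1
    d_cp = followup_deaths - d_ci  # DRC1 now counted post-intervention
    return d_ci, d_cp


# Hypothetical counts (assumptions for illustration, not trial data).
N_S = N_C = 100_000
D_SI, D_SP = 400, 200           # screening group: intervention, post-intervention
REPORTED_EVAL, FOLLOWUP = 500, 720  # control group

for fraction in (0.08, 0.10, 0.12):
    d_ci, d_cp = reallocate_control_deaths(REPORTED_EVAL, FOLLOWUP, fraction)
    rr_intervention = (D_SI / N_S) / (d_ci / N_C)
    rr_post = (D_SP / N_S) / (d_cp / N_C)
    rr_followup = ((D_SI + D_SP) / N_S) / (FOLLOWUP / N_C)
    print(f"{fraction:.0%}: intervention {rr_intervention:.3f}, "
          f"post-intervention {rr_post:.3f}, follow-up {rr_followup:.3f}")
```

Because deaths are only moved between periods, the follow-up relative risk is identical for every fraction, while the intervention and post-intervention estimates shift in opposite directions.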
So, proper allocation of breast cancer deaths to the intervention and post-intervention periods led to an equalisation of relative risks found for the intervention, post-intervention and follow-up periods, with a risk of breast cancer death that remained about 15% lower in the screening group throughout the entire trial duration.
Discussion
Computations performed by the overview of Swedish mammography trials incorporated deaths of breast cancers found at first screening of the control group as if these cancers were part of intervention periods. 1 The consequence of this incorporation approach was the overestimation of rates of breast cancer death in the control groups, which ended up in the overestimation of the protection conferred by mammography screening against breast cancer death. Other authors raised similar concerns, estimating that the evaluation method adopted by Swedish trials resulted in including in the control groups many cancers that would not have been found in the screening group, which biased results in favour of screening. 23
Non-Swedish breast screening trials and trials on screening for cancer other than breast cancer never used the incorporation approach, and we found practically no methodological justification for this approach. The Goteborg trial investigators argued that there was a need to compensate for the extra number of cancers found by screening that are included for follow-up to death in the screening group.16,24 However, all extra screen-detected invasive cancers in screening groups were early cancers, i.e. tumours less than 20 mm in diameter or stage 1.13,17,19,25 Hence, the conceivable need to compensate for screen detection of extra numbers of early cancers could not justify the transfer to intervention periods of substantial numbers of advanced cancers found at first screening of control groups. Substantial numbers of extra cancers were also found in screening groups of trials of prostate and lung cancer screening. However, none of these trials resorted to screening the control group after termination of the intervention and transferring these cancers to the intervention period. The compensation argument invoked by Swedish trial investigators16,24 is thus not tenable.
Our recalculations of the Swedish trial results revealed that risks of breast cancer death were similar for cancers found during the intervention and the post-intervention periods, indicating that reductions in the risk of breast cancer death also applied to cancer cases diagnosed when screening (or absence of screening) was the same in both screening and control groups. Such a result is compatible with an effect of being allocated to the screening or to the control group on the risk of breast cancer death (allocation effect), but not with an effect of mammography screening (screening effect) on that risk.
Two reasons could explain a lower risk of breast cancer death independent of mammography screening. First, the Health Insurance Plan trial, 26 the Age trial 12 and all Swedish trials1,16,18,20,25,27 that found a decreased risk of breast cancer death associated with mammography screening adopted a ‘left-to-nature’ design. Typically, parallel group randomised trials first recruit a group of eligible subjects who are informed about trial objectives, potential health benefits and probable side effects. Subjects agreeing to participate must first sign an informed consent form, after which they are randomised to an intervention or a control group. In left-to-nature trials, only women invited to participate in breast screening knew they were part of a clinical trial. Women allocated to control groups were never contacted, did not sign an informed consent form and were completely unaware that they were part of a trial. Health professionals knew or could detect which women were invited to screening but did not know which women were allocated to control groups. This imbalance between the two groups probably led to increased awareness, better information (e.g. on early breast symptoms) and better medical management of women in screening groups. Women invited to screening probably had quicker access to specialised care than women in control groups.
The Two-County trial provides the best evidence that factors other than mammography screening influenced breast cancer mortality. Besides mammography screening, the intervention also encompassed enhancing breast cancer awareness, breast self-examination and rapid referral of women presenting at screening with breast symptoms, all factors that would have, according to investigators, reduced patient delay and led to earlier detection of interval cancers and their treatment. 28 In addition, the Two-County trial randomised women by geographical cluster, each cluster comprising about 2700 women in Dalarna (Kopparberg) county and about 3200 women in Östergötland county. 13 This large cluster randomisation scheme is likely to have exacerbated differences between screening and control groups with respect to information, awareness and medical management. Finally, some data indicate different management of breast cancer patients according to randomisation group: the histological grade of cancers found during the Two-County trial was unknown for 19% of patients in the control group vs. 10% in the screening group (p < 0.0001). 13 Lymph node status was missing for 5.0% of patients in the screening group and 7.3% of patients in the control group (p = 0.0396). 13
It seems likely that Swedish trials have departed from the ‘ceteris paribus’ principle by which an experiment evaluating the effect of one action must make sure that all other things remain equal and will not interfere with study results.
In contrast, the Canadian trials, which found no reduction in the risk of breast cancer death associated with mammography screening, adopted the typical parallel group randomised trial design. All enrolled women were volunteers who signed an informed consent form before randomisation and received the same information and medical attention. 10
A second reason for the persistent lower risk of breast cancer death for cancers found in the intervention and post-intervention periods could be biased attribution of causes of death. Of the eight major breast screening trials, only the Health Insurance Plan and the Canadian trial implemented endpoint committees unaware of the screening status of deceased women. In left-to-nature trials, health professionals completing death certificates or sitting on local endpoint committees may have known or guessed which women had been invited to screening but had no idea which women had been allocated to control groups. To circumvent this problem, the overview of 2002 used death certificates for cause of death assessment because the overview of 1993 found that causes reported on certificates correlated well with causes established by an independent endpoint committee that had access to all medical and necropsy information. 1 However, in the 2002 overview, there were nearly twice as many breast cancer deaths for the Malmö, Östergötland, Stockholm and Goteborg trials as in the 1993 overview, 29 and it is unknown to what extent the reliability of death certificates was maintained over time.
In conclusion, unconventional computation of the relative risk of breast cancer death affected the reported results of the Swedish trials on mammography screening and introduced an intrinsic bias in favour of screening. If relative risks had been calculated using methodological approaches similar to those of other, more recent cancer screening trials, the Swedish trials would not have found a 20% reduction in breast cancer death due to mammography screening. This conclusion can be verified through a reanalysis of the original Swedish trial data according to the methods used in other cancer screening trials.
