Abstract
Objective
To assess the impact of different study designs on outcome data within the European Randomized Study of Screening for Prostate Cancer (ERSPC).
Methods
Observed data from the Gothenburg centre (effectiveness trial with upfront randomization before informed consent) and the Rotterdam centre (efficacy trial with randomization after informed consent) were compared with expected data, which were retrieved from national cancer registries and life tables. Endpoints were 11-year cumulative prostate cancer (PC) incidence, overall mortality and PC-specific mortality.
Results
In Gothenburg, the 11-year PC incidence was higher than predicted (5.8%) in both the intervention (12.4%) and control arms (7.3%). The observed overall mortality was higher than predicted (15.9%) in both the intervention (17.8%) and control arms (18.5%). The observed PC-specific mortality in the intervention arm was 0.56% versus 0.83% in the control arm, while the expected mortality was 0.83%. In Rotterdam, the observed PC incidence in the intervention arm (10.4%) was higher than expected (4.4%). The incidence in the control arm was 4.6%. The observed overall mortality was lower than expected: 13.6% in the intervention arm and 14.0% in the control arm versus an expected mortality of 16.1%. The observed PC-specific mortality was lower than expected (0.65%) in both the intervention (0.27%) and control arms (0.41%).
Conclusions
Our results suggest that an efficacy trial with informed consent prior to randomization may have introduced a ‘healthy screenee bias’. Therefore, an effectiveness trial with consent after randomization may more accurately estimate the PC-specific mortality reduction if population-based screening is introduced.
INTRODUCTION
Randomized controlled trials (RCTs) are the most reliable method of determining the effects of medical interventions. 1,2 Depending on the study design and the aspects of the interventions that the trial aims to evaluate, RCTs can either be classified as efficacy or effectiveness trials. 3–5
Efficacy trials are designed to determine whether an intervention produces the expected results under ideal circumstances. 3–5 Efficacy trials have some common features. Participants of efficacy trials are often selected (poorly adherent participants and those with conditions which might dilute the effect are often excluded), and the outcomes are indirectly generalizable to the clinical setting in routine practice, meaning that it is necessary to evaluate whether the effect in the general population will be similar to that in the selected population.
Effectiveness trials measure the degree of beneficial effect when the intervention is used in routine practice. 3–5 Participants of effectiveness trials are in principle representative of the general population and the interventions are applied as they would be in common practice. The outcomes are, therefore, directly generalizable to routine practice.
The European Randomized Study of Screening for Prostate Cancer (ERSPC, ISRCTN49127736) was initiated to evaluate the effect of screening with prostate-specific antigen (PSA) testing on prostate cancer (PC)-specific mortality. 6 The ERSPC study is conducted in eight countries and each centre adheres to a common core study protocol. Due to different legal requirements for running randomized studies, randomization of men into the trial differed among the participating countries. In three centres men underwent upfront randomization; no written informed consent was necessary before invitation (effectiveness trial). In the other centres, only those who provided written informed consent upon invitation underwent randomization (efficacy trial). 7,8
Because of this difference in randomization, the estimated benefit of screening might differ between an efficacy and effectiveness trial. We therefore aimed to evaluate potential differences between the Rotterdam (the Netherlands) and the Gothenburg (Sweden) branches of the ERSPC (efficacy and effectiveness study design, respectively), by comparing the observed and expected PC incidence, overall mortality and PC-specific mortality in each centre. The findings may contribute to the interpretation of the outcome data of the ERSPC study.
MATERIALS AND METHODS
In both Gothenburg (effectiveness trial) and Rotterdam (efficacy trial), observed data were compared with expected data. The median follow-up was 14.0 years in Gothenburg (interquartile range [IQR] 10.2–14.0) and 11.1 years (IQR 9.4–12.4) in Rotterdam. In order to present robust data and on the basis of the shortest median follow-up, the endpoints were 11-year cumulative PC incidence, overall mortality and PC-specific mortality.
Observed data
Observed data were retrieved from men participating in the Gothenburg and Rotterdam centres of the ERSPC. The screening algorithm used in Gothenburg has been described previously. 9 In brief, the study population comprised men from Gothenburg aged 50–64 years at 31 December 1994. A total of 10,000 men were randomized directly from the population registry to the intervention arm, and 10,000 to the control arm. After randomization, men in the intervention arm were invited (with written information about the study) to biennial PSA-screening. Men with a prior diagnosis of PC at randomization (identified through linkage with the regional cancer registry) were not invited, nor were those who died or emigrated before the randomization date. Men allocated to the intervention arm were re-invited every second year until they had reached the upper age limit (67–71 years), died, emigrated or were diagnosed with PC. Men in the control arm were not contacted because no informed consent was needed, but PC incidence and mortality were ascertained by linkage with the cancer registry.
The screening algorithm used in Rotterdam has also been described previously. 10 In summary, men living in Rotterdam and the surrounding area aged 55–74 years between 1 December 1993 and 31 December 1999 were identified from population registries, and invited to the study. Men with a prior diagnosis of PC were excluded. First, men received an invitation letter together with an information leaflet, providing information about the design and purpose of the study, as well as the screening procedures. After written informed consent was obtained, randomization was carried out. A total of 21,210 men were randomized to the intervention arm, and 21,166 to the control arm. Men in the intervention arm were invited for PSA-screening at four-year intervals until the age of 74 years. Men who refused participation at the first screening round were not re-invited at the second or following screening rounds. Mortality data of all participants who died in the period up to 31 December 2008 were obtained by linking the trial database with Statistics Netherlands.
In both centres, the PC incidence was routinely checked by linkage to the national or regional cancer registry. The cause of death among those men with PC was determined by an independent national causes-of-death committee using predefined flow charts, or by an international committee if no consensus was reached. 11
To achieve a similar age distribution between the Gothenburg and Rotterdam study populations, only men aged 55–64 years at randomization were included in this study.
Expected data
The expected data were based on men who were similar to the study population with respect to gender, calendar age, calendar year and country of participation. The expected Swedish data on PC incidence and PC-specific mortality were obtained from the Swedish Cancer Registry at the National Board of Health and Welfare in Sweden. This registry has a coverage close to 100%. 12 Expected overall mortality was calculated from national as well as Gothenburg city statistics, also available from the National Board of Health and Welfare in Sweden.
The expected Dutch data with respect to PC incidence and PC-specific mortality were retrieved from the Dutch Cancer Registry, which has a completeness of 98%. 13 The expected overall mortality was obtained from nationwide life tables from the Human Mortality Database. 14 This database contains original calculations of death rates and life tables for national populations, as well as the input data used in constructing those tables. A detailed description of the methodology is described here (
Statistical analysis
The observed cumulative incidence of the three endpoints was calculated as 1 minus the observed survival, which was estimated according to the life table method. 15 The distribution of survival times is divided into a number of intervals. For each interval, the number of subjects who entered the respective interval alive, and the number of events that occurred in that interval were computed.
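The life table calculation described above can be sketched as follows. This is a minimal illustration and not the study's Stata code: the `Interval` class, the function name and the counts are hypothetical, and counting subjects withdrawn during an interval as at risk for half of it is the usual actuarial assumption rather than something stated in the text.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    entered: int    # subjects alive and event-free at the start of the interval
    events: int     # events (e.g. PC diagnoses or deaths) during the interval
    withdrawn: int  # censored during the interval (emigration, end of follow-up)


def life_table_cumulative_incidence(intervals):
    """Return the cumulative incidence (1 - survival) at the end of each interval."""
    survival = 1.0
    result = []
    for iv in intervals:
        # Assumed actuarial convention: withdrawals count as at risk
        # for half of the interval.
        at_risk = iv.entered - iv.withdrawn / 2.0
        if at_risk > 0:
            survival *= 1.0 - iv.events / at_risk
        result.append(1.0 - survival)
    return result


# Hypothetical counts for three annual intervals, not ERSPC data.
example = [
    Interval(entered=1000, events=10, withdrawn=20),
    Interval(entered=970, events=12, withdrawn=30),
    Interval(entered=928, events=15, withdrawn=40),
]
cumulative = life_table_cumulative_incidence(example)
```

The last element of `cumulative` corresponds to the quantity reported as an endpoint, evaluated here at 3 years instead of 11.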
The expected mortality of all endpoints was calculated as 1 minus the expected survival. The expected survival rates were calculated according to the established Ederer II method. 16,17 This method is widely used for estimating expected survival, for the purpose of estimating relative survival. The Ederer II approach controls for heterogeneous observed follow-up times by accounting for when the matched individuals are at risk. Confidence intervals were calculated on the log cumulative hazard scale.
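As a rough sketch of the two ideas in this paragraph, the code below computes an Ederer II expected survival curve over annual intervals and a confidence interval built on the log cumulative hazard scale. It is not the ‘strs’ implementation used in the study. The function names and input layout are hypothetical, the matched population survival probabilities are assumed to have been looked up from a life table beforehand, and the standard error of the survival estimate is assumed to be available (for example from Greenwood's formula).

```python
import math


def ederer2_expected_survival(followup_years, pop_survival):
    """Cumulative expected survival by annual interval (Ederer II).

    followup_years[i]  : observed follow-up of subject i, in years
    pop_survival[i][j] : matched population probability that subject i
                         survives annual interval j (from a life table)
    """
    n_intervals = len(pop_survival[0])
    cumulative, curve = 1.0, []
    for j in range(n_intervals):
        # Only subjects still under observation at the start of interval j
        # contribute; this is how heterogeneous follow-up is accounted for.
        at_risk = [i for i, t in enumerate(followup_years) if t > j]
        if not at_risk:
            break
        mean_p = sum(pop_survival[i][j] for i in at_risk) / len(at_risk)
        cumulative *= mean_p
        curve.append(cumulative)
    return curve


def loglog_ci(survival, se_survival, z=1.96):
    """Confidence interval for a survival proportion on the log(-log S) scale."""
    # Delta method: the standard error of log(-log S) is se(S) / (S * |log S|).
    se_loglog = se_survival / (survival * abs(math.log(survival)))
    lower = survival ** math.exp(z * se_loglog)
    upper = survival ** math.exp(-z * se_loglog)
    return lower, upper


# Hypothetical inputs: three subjects, three annual intervals.
followup = [3.0, 1.5, 2.2]
matched = [
    [0.99, 0.98, 0.97],
    [0.98, 0.97, 0.96],
    [0.97, 0.96, 0.95],
]
expected_curve = ederer2_expected_survival(followup, matched)
ci_low, ci_high = loglog_ci(0.90, 0.01)
```

The expected cumulative mortality at each time point is then 1 minus the corresponding entry of `expected_curve`. Working on the log(-log S) scale keeps both interval limits between 0 and 1, which matters for proportions as small as the PC-specific mortality reported here.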
All analyses were performed from the time of randomization until the event (diagnosis of PC, death from PC or overall death), emigration or last follow-up, whichever occurred first. Statistical analyses were carried out with STATA Statistical Software, release 11 (StataCorp LP, College Station, TX, USA), subroutine ‘strs’ (
RESULTS
After age selection, 5896 men in the intervention arm and 5950 men in the control arm were included in Gothenburg. In Rotterdam, 12,422 men were included in the intervention arm and 12,308 in the control arm (Figure 1). Overall, 76.0% of the men allocated to the intervention arm participated in at least one screening round in Gothenburg. In Rotterdam, where consent was obtained before randomization, the response rate for participation and thus randomization was 48.1%. Of all men randomized to the intervention arm in Rotterdam, 94.2% participated in the first screening round, resulting in a net participation rate of 45.3%. Figures 2–4 illustrate the observed and expected cumulative incidences. It should be noted that the median follow-up is 11 years in Rotterdam versus 14 years in Gothenburg.
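The net participation rate in Rotterdam is the product of the two proportions given above, which can be checked in a few lines (the variable names are illustrative only):

```python
response_rate = 0.481           # consented and were randomized (Rotterdam)
first_round_attendance = 0.942  # of men randomized to the intervention arm

# Share of all invited men who were actually screened in the first round.
net_participation = response_rate * first_round_attendance
print(f"{net_participation:.1%}")
```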
Figure 1. Flowchart of the observed data
PC incidence
PC incidence in both intervention and control arms is presented in Figures 2a (Gothenburg) and b (Rotterdam). In the Gothenburg study, the observed 11-year cumulative PC incidence was 12.4% (95% confidence interval [CI] 11.5–13.3%) in the intervention arm and 7.3% (95% CI 6.6–8.0%) in the control arm; both of these are higher than the expected cumulative PC incidence of 5.8%. In the Rotterdam study, the observed 11-year cumulative PC incidence in the intervention arm (10.4%, 95% CI 9.9–11.0%) was also higher than the expected cumulative incidence (4.4%). The incidence in the control arm was 4.6% (95% CI 4.2–5.0%).
Figure 2. Observed and expected prostate cancer incidence: (a) Gothenburg; (b) Rotterdam

Overall mortality
Figures 3a (Gothenburg) and b (Rotterdam) show the observed and expected overall mortality in both centres. In the Gothenburg study, the observed 11-year cumulative overall mortality after randomization was 17.8% (95% CI 16.6–18.8%) in the intervention arm and 18.5% (95% CI 17.6–19.5%) in the control arm. Both rates were higher than the expected cumulative mortality based on the Swedish general population (15.9%), but similar to the expected cumulative mortality based on the Gothenburg city statistics (17.9%). In the Rotterdam study, the observed 11-year cumulative overall mortality was lower than expected: 13.6% (95% CI 13.0–14.2%) in the intervention arm and 14.0% (95% CI 13.4–14.7%) in the control arm, versus an expected cumulative overall mortality of 16.1%.
Figure 3. Observed and expected overall mortality: (a) Gothenburg; (b) Rotterdam

PC-specific mortality
Figures 4a (Gothenburg) and b (Rotterdam) present the observed and expected PC-specific mortality in both centres. In the Gothenburg study, the observed 11-year cumulative PC-specific mortality was 0.56% (95% CI 0.39–0.81%) in the intervention arm, and 0.83% (95% CI 0.61–1.12%) in the control arm compared with the expected cumulative incidence of 0.83%. In the Rotterdam study, the observed 11-year cumulative PC-specific mortality was 0.27% (95% CI 0.19–0.39%) in the intervention arm and 0.41% (95% CI 0.30–0.55%) in the control arm, whereas the expected cumulative incidence was 0.65%.
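To make the comparison with the expected values easier to read, the point estimates above can be expressed as crude observed-to-expected ratios. This is only an arithmetic illustration of the reported percentages: the ratios are not results reported in this study, they carry no confidence intervals, and the function and variable names are hypothetical.

```python
# 11-year cumulative PC-specific mortality (%), taken from the text above.
pc_mortality = {
    "Gothenburg": {"intervention": 0.56, "control": 0.83, "expected": 0.83},
    "Rotterdam": {"intervention": 0.27, "control": 0.41, "expected": 0.65},
}


def observed_to_expected(centre, arm):
    """Crude ratio of observed to expected cumulative PC-specific mortality."""
    values = pc_mortality[centre]
    return values[arm] / values["expected"]


ratios = {
    centre: {
        arm: round(observed_to_expected(centre, arm), 2)
        for arm in ("intervention", "control")
    }
    for centre in pc_mortality
}
```

On these figures the control-arm ratio is 1.0 in Gothenburg and about 0.63 in Rotterdam.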
Figure 4. Observed and expected prostate cancer-specific mortality: (a) Gothenburg; (b) Rotterdam

DISCUSSION
The United States Preventive Services Task Force recently reviewed the literature on PC screening and released an updated draft recommendation against PSA screening. 18 There is insufficient evidence that the benefits outweigh the harms. One of the reasons is that some screening trials did not show benefit in terms of mortality reduction. Therefore, it is critical to understand the mechanisms behind the different outcomes within the trials.
We here provide a unique opportunity to interpret the outcome data of the ERSPC study by investigating the impact of different study designs within the multicentre screening trial.
Gothenburg
In Gothenburg, the observed overall mortality in both the intervention and control arms was higher than the expected Swedish data, but similar to the expected Gothenburg city data (Figure 3a). This ‘city effect’ is in line with observations from previous studies, reporting that men living in urban areas have a worse health status and shorter life-expectancy than those in rural areas. 19,20
A second important finding is that the observed PC incidence in the control arm is higher than expected. This may be due to the higher rate of contamination (i.e. use of PSA testing in the control arm) in the study cohort than anticipated in the general population, although this would be somewhat unexpected because men in the control arm were not contacted about the study. Recent studies, however, have shown that approximately one-third of all Swedish men aged 50–75 years had a PSA test between the years 2000 and 2007, but with large geographical differences. There was a relatively high rate of contamination in the Gothenburg area. 21,22
The observed and predicted PC-specific mortality in the control arm were similar in Gothenburg despite a higher observed PC incidence than expected. This indicates that the unorganised PSA testing that has taken place in the control arm has not influenced PC-specific mortality so far. Also, at a national level there have been no signs of decreased PC-specific mortality in Sweden, despite a steadily increasing PC incidence. 21 However, in the present study the excess incidence was rather low in the first three years (Figure 2a). Therefore, the effect of contamination on PC-specific mortality may become apparent with longer follow-up.
Rotterdam
In Rotterdam, the overall mortality for men in both the intervention and control arms was lower than expected. This can be explained by the so-called ‘healthy screenee bias’ which has been introduced in the Rotterdam cohort. 23 Previous studies have shown that men who chose to participate were healthier than men in the general population. 24–26
Another finding in Rotterdam is that the observed and the expected PC incidence in the control arm were almost identical. This seems to be unexpected because of a peak in contamination in the control arm within the first months of randomization, and an estimated overall contamination rate between 25% and 40%. 27,28 It is possible that a similar trend in PSA-testing and contamination has taken place in the general Dutch population, and has resulted in the similar observed and expected PC incidence.
Men in the control arm of the Rotterdam centre were at a lower risk of dying from PC than expected, despite similar observed and expected PC incidence. An important explanation for this outcome is the healthy screenee bias, which occurred as a result of the applied efficacy study design. Approximately 11% of men who signed the informed consent and thus participated in the Rotterdam centre underwent PSA-testing within four years before study entry. 29 This may have led to ‘pre-selection’ of the study population at baseline, which has been suggested as one of the possible explanations why no screening benefit was demonstrated in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO). 30 Furthermore, the PC-specific mortality may have been lower in the control arm because they were in better health than men in the general population. Although debatable, there is some evidence that co-morbidity (e.g. obesity and metabolic syndrome) is associated with increased risk of dying from PC. 31,32 In addition, early contamination in the control arm 27 is likely to have resulted in the detection of cancers with more favourable prognostic factors, 33,34 which would also contribute to the lower observed PC-specific mortality than expected. However, it is unlikely to be possible to determine the precise effect of each of these explanations on the lower observed PC-specific mortality. An alternative explanation could be an as yet undetermined environmental factor. Therefore, the exact mechanism behind the difference between observed and expected PC-specific mortality has yet to be established. To our knowledge, this difference in observed versus expected cancer-specific mortality has not been described in randomized studies for other cancers.
Notably, the observed PC mortality in Rotterdam appears to have reached a plateau after 11 years, especially in the control arm. However, it is difficult to determine at this point whether this is an effect of the study design or due to the incomplete follow-up. Nevertheless, these few events late in the study will not change the overall results.
Implications
As a consequence of the efficacy design, the study cohort in Rotterdam appears to be a selected, ‘healthier’ group when compared with the general population. This has been reflected in lower overall mortality and, more importantly, lower PC-specific mortality in the control arm than expected. In turn, this may lead to underestimation of the PC-specific mortality reduction and the ‘true’ screening effect. The same is probably true for studies with a similar design, such as the PLCO trial. Although the PLCO study population was nearly four times larger and the average age at randomization was higher, the number of men dying from PC in the PLCO trial (n = 158) 35 was only about twice as high as in the Gothenburg trial (n = 78). 36 Also, the PLCO trial took place in the USA, where PSA testing was already widespread; the ERSPC was conducted in Europe, where background rates of PSA testing were very low. Moreover, 40% of men in the control arm in the PLCO trial underwent PSA testing in the first year, with contamination reaching 52% by year 6. Contamination in the ERSPC in the early years was no more than 15%. 30
Because of the effectiveness design, the trial in Gothenburg was carried out in a population-based cohort. Men from the control arm were not aware of their participation in the trial, and the PC-specific mortality in the control arm was similar to that in the general population. Therefore, although in both Rotterdam and Gothenburg a lower PC-specific mortality was observed in the intervention arm compared with the control arm, the achieved reduction of PC-specific mortality in Gothenburg may be more representative of the ‘true’ effect of PSA-testing when a population-based screening programme is introduced. The difference in randomization may also partly explain the observations made in the main reports of the ERSPC, 7,8 showing that Gothenburg strongly contributes to the achieved mortality reduction.
Limitations
Some limitations of the present study should be discussed. First, we did not compare the observed outcomes between Gothenburg and Rotterdam. Clearly, this would be interesting, but the purpose of this study was to compare the observed and predicted data in each centre, and not to make a head-to-head comparison between the two branches of the ERSPC study. Such comparison would necessitate an analysis of all other existing differences, such as background risk, 37 screening algorithms, 38 contamination rate and treatment. Secondly, unlike the expected data, the observed data do not include men with prevalent PC. This may have led to an underestimation of the PC-specific mortality in the observed data. Thirdly, in Gothenburg we compared the observed data on overall mortality with both Swedish and Gothenburg-specific expected data. It would be interesting to also make the same comparisons in Rotterdam, but the Rotterdam-specific expected data were not available. Should the mortality in Rotterdam be lower than in the general Dutch population, the interpretation of the results might differ. Finally, the results are based on a screening trial with a single PSA threshold for all participants. Therefore, the conclusions are probably not applicable when an individualized risk-based screening strategy is implemented.
CONCLUSION
The difference in study design is likely to have contributed to discrepancies in the observed data and the expected data, in terms of PC incidence, overall mortality and, more importantly, PC-specific mortality. The observed PC-specific mortality in the control arm in a screening trial with randomization after informed consent (efficacy trial) appears to be lower than expected when compared with a trial with randomization prior to informed consent (effectiveness trial). This may result in underestimation of the PC mortality reduction. Our results suggest that an effectiveness trial may more accurately estimate PC-specific mortality reduction if population-based screening is introduced. Other factors such as false-negative screening test results and unnecessary biopsies, overdiagnosis, quality of life and costs must also be considered before a screening programme can be launched.
ACKNOWLEDGEMENTS
This study has been supported by the Dutch Cancer Society (KWF 94–869, 98–1657, 2002–277, 2006–3518), The Netherlands Organization for Health Research and Development (002822820, 22000106, 50–50110–98–311), 6th Framework Program of the EU: P-Mark: LSHC-CT-2004–503011, Beckman Coulter Hybritech Inc., Europe against cancer (SOC 95 35109, SOC 96 201869 05F02, SOC 97 201329, SOC 98 32241); The Swedish Cancer Society (3792-B96–01XAB), Wallach Oy. Hybritech Inc., Schering-Plough Sweden and Abbot Pharmaceuticals Sweden.
