Abstract
Objectives
Overdiagnosis in low-dose computed tomography randomized screening trials varies from 0 to 67%. The National Lung Screening Trial (extended follow-up) and ITALUNG (Italian Lung Cancer Screening Trial) have reported cumulative incidence estimates at long-term follow-up showing low or no overdiagnosis. The Danish Lung Cancer Screening Trial attributed its high overdiagnosis estimate to a likely selection of higher-risk subjects into the active arm. Here, we applied a method already used in benefit and overdiagnosis assessments to compute the long-term survival rates in the ITALUNG arms, in order to corroborate the assessment obtained with the excess-incidence method.
Methods
Subjects in the active arm were invited for four screening rounds, while controls were in usual care. Follow-up was extended to 11.3 years. Kaplan-Meier 5- and 10-year survival rates of “resected and early” (stage I or II and resected) and “unresected or late” (stage III or IV or not resected or unclassified) lung cancer cases were compared between arms.
Results
The updated ITALUNG cumulative incidence rate was lower in the active arm than in the control arm, but the difference was not statistically significant (RR: 0.89; 95% CI: 0.67–1.18). A compensatory drop of late cases was observed after baseline screening. The proportion of “resected and early” cases was 38% and 19% in the active and control arms, respectively. The 10-year survival rates for “resected and early” cases were 64% and 60% in the active and control arms, respectively (p = 0.689). The five-year survival rates for “unresected or late” cases were 10% and 7% in the active and control arms, respectively (p = 0.679).
Conclusions
This long-term survival analysis, by prognostic categories, argues against a long-term risk of overdiagnosis and helps to clarify how screening works.
Introduction
Lung cancer (LC) is the most common cancer and the leading cause of cancer deaths, globally. In Western countries, LC age-adjusted incidence and mortality have decreased for males but are still increasing for females, in parallel with the trend in smoking habits by gender. 1 Recent five-year survival rates for cases in the USA are still under 20%. 2 Chest X-ray (CXR) and sputum cytology randomized screening trials (RSTs) have failed to show a reduction in cause-specific mortality and, in the 1990s, evaluation of low-dose computed tomography (LDCT) screening for LC was begun in one-arm and RSTs. 3 Following the positive outcome of the National Lung Screening Trial (NLST) in the USA in 2011, the USPSTF (United States Preventive Services Task Force) recommended LDCT screening. 4 , 5 In Europe, recommendations were not changed while evaluators waited for the full publication of the results of the NELSON (Dutch-Belgian Randomized Lung Cancer Screening Trial) and other European RSTs. 6 There were critical issues to be reflected on before the implementation of organized screening programmes: the selection of high-risk subjects, assessment procedures, the screening interval and the estimation of overdiagnosis. 7 , 8
“Excess of cumulative incidence” approach and estimation of overdiagnosis
Overdiagnosis is defined as the proportion of confirmed cancer cases (invasive and in situ) diagnosed during screening rounds which would not have come to clinical attention if screening had not taken place. 9 Although alternative approaches have been suggested (such as non-calcified nodule growth measurement), the prospective stop-screen trial, i.e. screening with a limited number of rounds and no screening of the control group, is considered the most informative type of evidence. 10 , 11 In these trials, the risk of overdiagnosis has been quantified as the excess of cumulative incidence of LC in the active arm, using different denominators. 12 A “prevalence peak” of lesions is expected in the screening period, reflecting the lead time gained by screen-detected cases, followed by a compensatory drop of incidence rates after screening closure. Apart from failures or selection biases at randomization, there are many factors that can affect the estimates: the intensity of the screening protocol, the uptake in the active arm and contamination of the control arm by opportunistic screening or other interventions (such as CXR). 13 In the absence of overdiagnosis in a stop-screen trial, at least five years of follow-up after screening closure are considered necessary for the cumulative incidence of the active and control arms to equalize (the so-called “catch-up”), a duration conditioned by the natural history of different cancers. 14 However, lead time and its distribution are difficult to estimate across studies due to differences in the intensity of the screening regimen and in follow-up duration after screening closure.
Screening works by allowing diagnosis and effective treatment at earlier stages. If screening has been effective, a smaller number of advanced cases should result in a reduced number of deaths from cancer. Statistical modelling has been used to adjust the excess of incidence estimate for lead time, but modelling is often based on uncertain and unsubstantiated assumptions. 11 Estimates from stop-screen, randomized trials are not immediately generalizable to population-based screening programmes; however, RSTs are considered the best setting to quantify overdiagnosis risk. 15
Long-term LC survival analysis and prognostic selection
The survival rates for screen-detected (SD) cases are expected to be higher than for those that were not screen-detected (NSD), i.e. those diagnosed clinically in the screening rounds, in non-participants or in the control arm. As Alan Morrison wrote in 1985, it is not possible to distinguish the effect of lead time or pseudodisease (as he called overdiagnosis) on survival from a true reduction in mortality … Had the data on pathologic stage at diagnosis been used to predict survival, this also would have been improved in the screened group since survival is better for localized compared to non-localized cancers (and the observed survival higher for cases in the screened group). 16
The Mayo Lung Project (MPL), the most important of the 1970s LC RSTs, evaluated intensive versus non-intensive screening and did not show, prospectively, any reduction in LC mortality. A total of 9211 male smokers were invited for CXR and sputum cytology every four months for six years in the active arm, while subjects in the control arm were advised to take the same tests annually. The MPL was a stop-screen trial with an initial short follow-up of three years after screening closure. An excess of early cases was found at the end of follow-up in 1983 without any difference in the absolute number of advanced cases. Marcus et al. published an extended follow-up which did not show any change in the absolute excess of cases. 17 The long-term, 10-year survival rate of early cases was still higher in the active arm than in the control arm (59.9% vs. 41.6%), an indicator of the detection of indolent LC cases or of a high competing risk of dying over a long follow-up. They stated that “true improvement in case survival must be accompanied by a reduction in mortality,” and “statistically significant longer survival for patients in the active arm” suggested “screening biases were responsible for the discordance and that overdiagnosis (i.e. prevalently, the detection of indolent cases) was more likely than lead-time bias or length bias.” 18
In this paper, we present an updated evaluation of the prospective excess of cumulative incidence in ITALUNG, a contributory study to the European evaluation of LDCT screening for LC, and a secondary analysis of the study based on a long-term survival method. 19
Methods
The enrolment, screening protocol, participation rates, tumour characteristics and outcomes of the ITALUNG randomized trial were previously presented in detail. 19 , 20 The trial was approved by the Local Ethics Committee of each participating institution (approval number 29–30 of 30 September 2003; number 23 of 27 October 2003; and number 00028543 of 13 May 2004). Briefly, ITALUNG recruitment was population-based and, after informed consent and randomization, subjects were allocated to an active (N = 1613) or control (usual care) (N = 1593) arm. Subjects of both sexes were aged 55 to 69 years at enrolment, with a smoking history of at least 20 pack-years during the last 10 years (former smokers who had quit more than 10 years previously were excluded). Subjects in the active arm were invited to be screened with four annual LDCT tests. The risk profiles of the two arms were comparable. The excess of cumulative incidence was estimated at screening closure (year 4) and at the end of the total follow-up period. LC incidence data – and mortality data – have been updated, adding 24 and 29 LC cases to the active and control arms, respectively, over a median of 11.3 years of follow-up. 21
Data on histology and staging were collected from clinical/pathological records, and LC cases were subdivided according to the modality of detection and trial arm. Based on staging and whether or not resected, as in the MPL extended follow-up analysis, we defined as “resected and early” (“R and E”) those cases surgically resected and diagnosed as stage I or II. Cases staged III or IV or unresected were classified as “unresected or late” (“U or L”). We compared survival of the “resected and early” cases between the active and control groups. If there was substantial overdiagnosis of indolent cases, we would expect to see a major survival advantage for the active group among these early-stage cases. Kaplan-Meier 5- and 10-year survival rates and Cox hazard ratios were estimated using Stata. 22
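As an illustration of the survival estimate named above, the following is a minimal sketch of the Kaplan-Meier (product-limit) estimator in plain Python. It is not the analysis code of the trial, which used a statistical package; the function names `kaplan_meier` and `survival_at` and the follow-up data are hypothetical and serve only to show how a 5- or 10-year survival rate is read from the curve.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each death time.

    times  -- follow-up time of each case (e.g. years since diagnosis)
    events -- 1 if the case died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # deaths plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # remove deaths and censored cases
    return curve


def survival_at(curve, horizon):
    """Survival probability at a given horizon (step function)."""
    s = 1.0
    for t, surv in curve:
        if t > horizon:
            break
        s = surv
    return s


# Hypothetical follow-up data (years, death indicator), not trial data.
times = [0.8, 2.1, 3.5, 4.0, 6.2, 7.5, 9.0, 10.5, 11.0, 11.3]
events = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
curve = kaplan_meier(times, events)
print(f"5-year survival: {survival_at(curve, 5):.2f}")
print(f"10-year survival: {survival_at(curve, 10):.2f}")
```

Censored cases leave the risk set without lowering the curve, which is why survival at a long horizon can still be estimated when some cases have shorter follow-up.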
Results
At a median of 11.3 years of follow-up, the incidence rates of LC were 52.8 and 59.4 (per 10,000 person-years) for the active and control arms, respectively (RR: 0.89; 95% CI: 0.67–1.18). There were 38 (42%) SD and 53 NSD cases in the active arm. The latter included cases clinically detected in the screening interval (N = 6). There were 100 NSD cases in the control arm.
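The rate ratio and its confidence interval can be approximated from the figures in this paragraph. The sketch below assumes a Wald interval on the log scale with Poisson-distributed case counts; the source does not state which method was used, and the function name `rate_ratio_ci` is hypothetical.

```python
import math


def rate_ratio_ci(rate_active, cases_active, rate_control, cases_control, z=1.96):
    """Rate ratio with a log-scale Wald confidence interval.

    Assumes Poisson case counts, so the standard error of log(RR)
    is sqrt(1/cases_active + 1/cases_control).
    """
    rr = rate_active / rate_control
    se_log = math.sqrt(1 / cases_active + 1 / cases_control)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Rates per 10,000 person-years and case counts as reported in the text:
# 38 SD + 53 NSD cases in the active arm, 100 cases in the control arm.
rr, lower, upper = rate_ratio_ci(52.8, 38 + 53, 59.4, 100)
print(f"RR = {rr:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

Under these assumptions the sketch gives values that agree with the reported RR of 0.89 (95% CI: 0.67–1.18).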
The histology, staging and surgical resection status of cases by study arm and detection modality (SD or NSD) are detailed in Table 1. In the active arm, 82% of the SD cases were resected, 61% were stage I and 61% adenocarcinomas. In the control arm, 28% of cases were resected, 12% were stage I and 30% adenocarcinomas. The NSD cases of the active arm, including the interval cancers, showed a histology and resection distribution similar to the control arm. Among the NSD cases of both arms, there was a higher proportion of unclassified histology. The bronchioloalveolar adenocarcinoma (BAC) component was pathologically diagnosed in nine SD cases: four detected at baseline screening, four at repeat rounds and one in the screening interval. Of these, six and three were staged IA and IIIA, respectively. One patient with BAC staged IA-1 (N0M0) died from malignant neoplasm of an unspecified part of the bronchus (C349). One patient with BAC staged IIIA (T2N2M0) at diagnosis died of LC (C162.9). Only one LC patient was classified as pure BAC. No BAC cases were reported in the control arm.
Table 1. Lung cancers detected in the ITALUNG randomized screening trial (updated follow-up) by detection modality, stage, histological type and surgery.
SD: screen-detected (four rounds); NSD: clinically detected (including six interval cases); control arm was followed up in usual care.
a Pathological or clinical.
b Bronchioloalveolar carcinomas (BAC) included.
Figure 1(a) to (c) shows the cumulative excess of incidence of the active arm for total, “resected and early” and “unresected or late” cases in the screening period (year 0–4) and the post-screen phase (5+). After the baseline screening and until the end of the follow-up, a compensatory drop in late LC cases was observed, with 25 fewer late cases in the active arm than in the control arm. The incidence of “unresected or late” cases in the active arm, after the peak observed at baseline screening, remained at a lower level until the end of the follow-up time. The proportions of “resected and early” cases were 38% and 19% in the active and control arms, respectively (p = 0.003). The active arm “resected and early” cases showed a peak of incidence at baseline screening and then converged to the level of the control arm.
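The comparison of the proportions of “resected and early” cases can be checked with a simple test on the 2×2 table of case counts (35 of 91 cases in the active arm and 19 of 100 in the control arm, as given with the survival results). The sketch below assumes a Pearson chi-square test without continuity correction; the source does not say which test produced the reported p-value, and the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table

        [[a, b],
         [c, d]]

    Returns the statistic and the two-sided p-value.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: active arm, control arm; columns: "R and E", "U or L".
chi2, p_value = chi_square_2x2(35, 91 - 35, 19, 100 - 19)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

With these counts the sketch yields a p-value of about 0.003, in line with the value reported above.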

Figure 1. (a, b, c) Cumulative ITALUNG lung cancer incidence rates (per 10,000) by study arm and prognostic characteristics.
The numbers of cases and deaths from LC or other causes at the end of follow-up are presented by study arm and state at diagnosis in Table 2. The total deaths due to LC were 58 and 74 in the active and control arms, respectively; a difference of 16 LC deaths. There were six deaths from other causes among the LC cases. In both arms, the proportion of deaths in “unresected or late” cases was high (81% and 95%, respectively). Out of 38 SD cases, 10 were “unresected or late” and 9 of them died of LC. In “resected and early” cases, a stage shift attributable to screening diagnostic anticipation in the absence of mortality benefit is possible, as suggested by the seven LC deaths in excess in the active group. The proportion of LC deaths within “resected and early” cases was 31% in the active versus 21% in the control group.
Table 2. Prognostic characteristics at diagnosis and causes of death in the ITALUNG randomized screening trial by study arm and diagnostic modality.
Note: Absolute difference between the active and control arm in parentheses; median follow-up duration: 11.3 years.
LC: lung cancer; “R and E”: resected and early LC; “U or L”: unresected or late LC.
a 20 LC screen-detected cases at baseline and 18 at repeated rounds.
The survival curves, with details of the number of LC cases and deaths, by state at diagnosis are shown in Figure 2. The “resected and early” 10-year survival rate was 64% in the active (N = 35) and 60% in the control arm (N = 19) (+4%; p = 0.689). The “unresected or late” five-year survival rates were 10% and 7% for the active (N = 56) and control arms (N = 81), respectively (p = 0.679), with a maximum of 8.5 years’ follow-up. At the end of follow-up, there were 16 more “resected and early” cases and 25 fewer “unresected or late” cases in the active arm than in the control arm. Cox regression comparing the active with the control arm estimated a 22% reduction in the hazard (HR = 0.78; 95% CI: 0.55–1.11). After adjustment for resection and stage at diagnosis, the HR was 1.1 (95% CI: 0.77–1.56).

Figure 2. ITALUNG lung cancer survival by study arm and prognostic characteristics. At the end of follow-up, LC “resected and early” cases and deaths were 35 and 11 for the active and 19 and 4 for the control groups, respectively. Five-year “resected and early” survival rates were 72% (95% CI: 53%–84%) and 75% (95% CI: 39%–91%) and at 10 years 64% (95% CI: 44%–79%) and 60% (95% CI: 22%–84%), for the active and control groups, respectively. LC “unresected or late” cases and deaths were 56 and 47 for the active and 81 and 69 for the control groups, respectively. LC survival rate of “unresected or late” cases at five years was 10% (95% CI: 3%–21%) and 7% (95% CI: 3%–16%) for the active and control groups, respectively. Last observation in the control arm was at 8.5 years.
Discussion
Randomized screening trials are a controlled setting to assess overdiagnosis risk; however, characteristics of the study design, for example balance in randomization, type of cancer and screening intervention, have an impact on outcomes. In LDCT screening for LC, the RSTs were stop-screen trials, with a limited but varying number of screening rounds, usually with a one-year interval and no screening of the control arm at screening closure. Most of them have now published data with a follow-up of about 10 years and three have formally reported longitudinal estimates of overdiagnosis risk: the NLST, the largest study globally, and ITALUNG and the Danish LC Trial (DLCST), both small trials with large statistical uncertainties in their estimates. Heleno et al. have published a secondary analysis of the DLCST at about 11 years of follow-up estimating a high overdiagnosis risk (67%). Patz et al. estimated overdiagnosis at 18.5% in the NLST at 6.4 years of follow-up, recognizing limitations due to the short follow-up duration and the offer of CXR to the control arm. 23 , 24 Recently, the NLST Research Team published an extended follow-up of the NLST at 11.3 years from randomization, showing a minimal excess (RR = 1.01; 95% CI: 0.95–1.09). They estimated a risk of overdiagnosis of 3.1%, using the SD LC cases as denominator. 25 The NELSON trial, the largest trial in Europe, estimated 8.9% (95% CI: −18.2 to 32.4) excess-incidence overdiagnosis at 11 years of follow-up for SD cases, noting that additional time since the last test is required to settle the estimate. 26 In the pooled analysis of the DANTE and MILD trials, SD and interval cases were reported, but in the absence of a formal estimation of the excess of incidence. 27 In their recent publication of the LUSI trial, Becker et al. considered the trial follow-up time after screening closure still too short to estimate overdiagnosis. 28
In the current secondary analysis of the ITALUNG RST, “resected and early” and “unresected or late” cases, after the peak at prevalent screening, showed the expected trend. However, the total cumulative incidence rate of LC was, at the end of follow-up, lower in the active arm (−4%), which, although not statistically significant, is an indicator of no – or limited – overdiagnosis.
NLST, DLCST and ITALUNG data are compared in Table 3. NLST and ITALUNG were different in several aspects, primarily in study size, screening protocols and risk profile (sex, age groups, smoking habit, pack-years). The incidence rates were 62.9, 57.8 and 27.1 per 10,000 person-years in the NLST, ITALUNG and DLCST control arms, respectively. The corresponding estimates in the active arms were 63.9, 53.9 and 49.4, respectively. The lower incidence rates in the control group of the DLCST have been attributed to the younger age at randomization (50 years of age in DLCST, 55 in NLST and ITALUNG) and the higher rates to the higher risk profiles of its active arm subjects. In NLST, CXR screening was offered to the control arm and, therefore, a higher incidence might be expected in that arm. However, Oken et al. compared the Prostate, Lung, Colorectal and Ovarian (PLCO) NLST-eligible cohort with the NLST control (CXR) arm, concluding “the NLST results are likely a good approximation of the mortality benefit that must have been observed of low dose CT vs. usual care.” 29 In conclusion, the NLST (extended follow-up) and ITALUNG studies showed a comparable small or absent level of overdiagnosis at long-term follow-up, whereas the DLCST’s high estimate might be attributable to imbalances in randomisation, as suggested by Wille et al. 30
Table 3. Excess of incidence and overdiagnosis estimates in NLST, DLCST and ITALUNG randomized trials.
LC: lung cancer; LDCT: low-dose computed tomography; UC: usual care; na: not available; NLST: National Lung Screening Trial; DLCST: Danish LC Trial; ITALUNG: Italian Lung Cancer Screening Trial; CXR: Chest X-ray.
a Overdiagnosis I: the total LDCT arm LC cases are the denominator.
b Overdiagnosis II: the screen-detected LC cases are the denominator.
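The two denominators in the table notes can be illustrated with a short sketch. It assumes that the excess is the number of LC cases in the LDCT arm minus the number expected from the control arm, scaled by arm size; a person-years-based version would be analogous. The function name `overdiagnosis_estimates` and all input numbers are hypothetical and are not taken from any of the trials.

```python
def overdiagnosis_estimates(cases_active, n_active, cases_control, n_control,
                            screen_detected):
    """Excess-incidence overdiagnosis with two denominators.

    Overdiagnosis I  : excess / all LC cases in the LDCT arm
    Overdiagnosis II : excess / screen-detected LC cases
    A negative or zero excess indicates no evidence of overdiagnosis.
    """
    # Cases expected in the active arm if it behaved like the control arm
    # (simple scaling by arm size; an assumption of this sketch).
    expected = cases_control * n_active / n_control
    excess = cases_active - expected
    return excess / cases_active, excess / screen_detected


# Hypothetical trial: 120 vs 100 LC cases in equal arms, 80 screen-detected.
od_total, od_screen = overdiagnosis_estimates(120, 2000, 100, 2000, 80)
print(f"Overdiagnosis I:  {od_total:.1%}")
print(f"Overdiagnosis II: {od_screen:.1%}")
```

Because screen-detected cases are a subset of all cases in the LDCT arm, the second estimate is always the larger of the two in absolute value, which is one reason why published figures should be compared only when the denominator is stated.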
LC screening reduces mortality if diagnostic anticipation and earlier treatment are effective in changing the prognosis of the tumour. Although the main limitation of this secondary analysis of the ITALUNG study is its small sample size, the long-term survival analysis showed how prognostic selection due to screening can work. In MPL and other old studies of CXR and cytology LC screening, better survival rates for SD cases were reported in the absence of LC mortality reduction. 31 Screening had prognostically selected early cases that remained indolent to the end of the extended follow-up. In ITALUNG, the long-term survival was similar in the two arms, in the presence of a reduction in the number of advanced cases. In the setting of a stop-screen trial with adequate follow-up time, we confirmed that overdiagnosis can be assessed by combining the excess of incidence approach with information on prognostic selection related to screening and long-term survival rates. The potential competing mortality from other causes among LC cases had, in ITALUNG, a minimal impact.
The ITALUNG data support the conclusion that overdiagnosis is not a major problem in LDCT screening, as the NLST extended follow-up also suggested. In DLCST, the excess of incidence is real and based on the detection of a large number of stage I tumours. What happened in DLCST is an open question which needs to be addressed by a detailed re-analysis of the screening process. We agree with Heleno et al. that the higher risk of subjects is not a plausible explanation of the excess of incidence.
Survival comparisons of large datasets, such as NELSON, NLST and pooled smaller RSTs, might be usefully assessed by hazard analysis, with tumour characteristics, modality of diagnosis and stage included as covariates. The long-term, prognostic selection of LC cases and state-specific survival approach are not only useful but indispensable for an understanding of how screening works and should be a complementary analysis in the evaluation of overdiagnosis (and in benefit–harm evaluation). A comparative analysis of the RSTs, considering differences and comparability in trial risk profiles and study design, might confirm these findings.
Smoking cessation has been established as effective in the reduction of smoking-related mortality – including from LC – as after-quitting trends in mortality of high-risk smokers showed many years ago. 32 A multimodal intervention (smoking cessation plus LDCT screening) could change at the same time incidence trend and natural history of the disease by means of early detection. This complex impact will need in the future a statistical modelling approach, using assumptions derived from RSTs and long-term survival analysis, to understand the possible contribution of each intervention and treatment changes, in LC mortality reduction.33,34
ITALUNG Working Group
Francesca Maria Carozzi (PI), BSc, Cristina Maddau, BSc, Simonetta Bisanzi, BSc: Regional Prevention Laboratory Unit, ISPRO – Institute for Cancer Research, Prevention and Clinical Network, Florence, Italy. Eugenio Paci, MD (retired), Donella Puliti, Msc, Marco Zappa, MD, Gianfranco Manneschi, Bsc, Leonardo Ventura Bsc, Carmen Visioli, MD, Giovanna Cordopatri, BSc, Francesco Giusti, Bsc Cristina Ocello, Bsc, PhD: Clinical Epidemiology Unit, ISPRO – Institute for Cancer Research, Prevention and Clinical Network, Florence, Italy. Andrea Martini, MSc, Environmental and Occupational Epidemiology Unit, ISPRO – Institute for Cancer Research, Prevention and Clinical Network, Florence, Italy. Giulia Picozzi, MD: Radiodiagnostic Unit, ISPRO – Institute for Cancer Research, Prevention and Clinical Network, Florence, Italy. Mario Mascalchi, MD, PhD, Department of Experimental and Clinical Biomedical Sciences “Mario Serio”, University of Florence, Italy. Andrea Lopes Pegna (retired), MD, Roberto Bianchi, MD, and Cristina Ronchi, MD: Pneumonology Department, Careggi Hospital, Florence Italy. Maurizio Bartolucci, MD, Elena Crisci, MD, Agostino De Francisci, MD, Massimo Falchini, MD, Silvia Gabbrielli, MD, Giuliana Roselli, MD, and Andrea Masi, MD: Radiology Department, Careggi Hospital, University of Florence, Italy. Camilla Comin, MD: Pathology Department, Careggi Hospital, University of Florence, Italy. Luca Vaggelli, MD: Nuclear Medicine Department, Careggi Hospital, Florence, Italy. Alberto Janni, MD: Thoracic Surgery Department, Careggi Hospital, Florence, Italy. Laura Carrozzi, MD, Ferruccio Aquilini, BSc, Stella Cini, MD, Mariella De Santis, MSc, Francesco Pistelli, MD, PhD, Filomena Baliva, BSc, Antonio Chella, MD, and Laura Tavanti, MD, PhD: Cardiothoracic and Vascular Department, University Hospital of Pisa, Italy. Fabio Falaschi, MD, Luigi Battola, MD, Annalisa De Liperi, MD, and Cheti Spinelli, MD: Radiology Department, University Hospital of Pisa, Italy. 
Alfredo Mussi, MD and Marco Lucchi, MD: Thoracic Surgery Unit, Cardiothoracic and Vascular Department, University Hospital of Pisa, Italy. Gabriella Fontanini, MD, Adele Renza Tognetti, MD, Pathology Department, University Hospital of Pisa, Italy. Michela Grazzini, MD, Florio Innocenti, MD, and Ilaria Natali, BSc: Pneumonology Department, Hospital of Pistoia, Italy. Letizia Vannucchi, MD, Alessia Petruzzelli, MD, Davide Gadda, MD, Anna Talina Neri, MD, and Franco Niccolai, MD: Radiology Department, Hospital of Pistoia, Italy. Alessandra Vella, MD, PhD: Nuclear Medicine Department, Le Scotte University Hospital, Siena, Italy. Members of the Cause of Death Review Panel: Adele Caldarella, MD, Alessandro Barchielli (retired), MD, Tuscany Cancer Registry, ISPRO – Oncological network, prevention and research institute; Carlo Alberto Goldoni, Epidemiology unit, Local Health Unit Modena, Italy.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The ITALUNG study was completely funded by the local government of Tuscany (Decree N. 1014 of 02.25.2004) and by a Research Grant (PRIN 2003) to Professor Mario Mascalchi of the Italian Ministry of Education, University and Research.
