Assessing the effectiveness of a cancer screening test in the presence of another screening modality

Abstract

Objectives

Analysis of cancer screening effectiveness is challenging in part because of competing tests, which are additional screening tests that identify the condition of interest. For example, studies investigating screening with faecal occult blood tests to prevent colorectal cancer mortality need to consider the occurrence of screening colonoscopy. This paper compares analytic approaches to accounting for competing tests in analyses of cancer screening data.

Methods

We used simulations to quantify and compare the bias and efficiency of these approaches across different scenarios, and to make recommendations for analyzing the effectiveness of a screening test in the presence of competing tests.

Results

Under all scenarios, the best performing approach for accommodating competing screening tests was censoring at the time of the competing screening test (range in bias across scenarios: −7.6% to 1.6%). Bias from other approaches ranged from 23.9% to 652.1%.

Conclusions

Censoring at the competing screening exam is the recommended approach for studying cancer screening effectiveness in the presence of competing tests. Censoring avoids confounding by prior competing test results and selection bias resulting from analyzing data on participants after they received a competing screening exam. Results from this study are broadly applicable to screening studies for other conditions, including other types of cancer.

Keywords

cancer screening, observational study, bias, epidemiologic methods

Introduction

Much of what we know about cancer screening effectiveness comes from retrospective observational studies. Yet important methodological questions remain about how to use observational data to assess the effect of screening on cancer mortality. One particular analytic challenge is created by the availability of multiple screening modalities for most screen-detectable cancers. For example, colorectal cancer (CRC) is detectable by flexible sigmoidoscopy, faecal occult blood test (FOBT), or colonoscopy, and national clinical guidelines endorse all these preventive methods.¹ Breast cancer can be detected by digital and film mammography and magnetic resonance imaging. Cervical cancer screening methods include human papillomavirus and cytology testing. Many other conditions, including hearing loss and mild cognitive impairment, can be identified using more than one screening modality.

When multiple screening modalities are available for a particular cancer, the different tests are considered to be “competing”, because once a person has been screened with one modality, they are unlikely to be screened with another, at least for a period of time. Another factor that complicates the study of screening effectiveness is that certain test results (eg. cancer diagnosis) may be available in administrative databases, while others (eg. FOBT results, high-risk adenomas detected on colonoscopy) may not. This is a noted challenge in large population-based studies of screening.² Thus, analytic strategies need to be tailored to this limited set of data.

This paper examines observational data analysis approaches that account for competing screening tests and can be used with datasets that contain limited test results. We use CRC screening with the competing tests of FOBT and colonoscopy as an example to quantify and compare bias among approaches. This work adds to our understanding of the benefits and drawbacks of different analytic approaches to analyzing screening tests as they are used in real-world healthcare settings.

Methods

Our approach was to define several realistic but simple screening patterns and several possible analytic approaches for estimating the effectiveness of a screening test for reducing cancer mortality. More complex patterns can be extrapolated as combinations of the simple patterns we have investigated. We evaluated the bias and precision of the different approaches using a simulation study.

Screening patterns

Our example was a CRC screening study on FOBT effectiveness that considered colonoscopy to be a competing screening examination. We identified five possible simple screening patterns (Figure 1): (A) no screening at all; (B) receipt of only FOBT; (C) receipt of only screening colonoscopy; (D) receipt of FOBT and then screening colonoscopy; and (E) receipt of screening colonoscopy and then FOBT.

Figure 1.

Possible screening patterns for two competing screening modalities. FOBT, faecal occult blood test; T_i₁ is the observed time of FOBT, the screening test of interest, and T_i₂ is the observed time of colonoscopy (a competing screening test for the same disease). X_i describes the observed screening pattern for person i, with X_i = (1(observed to use FOBT), 1(observed to use screening colonoscopy)), where 1(·) is the indicator function, which takes a value of 1 if the condition is satisfied and 0 otherwise.

Analytic approaches

We compared five analytic options for studying the effectiveness of screening FOBT for reducing the rate of cancer mortality in the presence of competing screening colonoscopy when screening test results are unknown. Table 1 shows these options and how they could be implemented in cohort and case-control studies.

Table 1.

Approaches for Studying the Effectiveness of Screening Faecal Occult Blood Test (FOBT) in the Presence of Screening Colonoscopy.

Approach	Description	Implementation in cohort study	Implementation in case-control study
Pool	Ignore use of screening colonoscopy	Do not collect or incorporate screening colonoscopy information
Censor	Exclude person-time after screening colonoscopy	Censor at screening colonoscopy	Exclude person from pool of cases and controls after occurrence of screening colonoscopy
Stratify	Estimate effectiveness of FOBT stratified by prior screening colonoscopy	Compute effectiveness estimates stratified by prior screening colonoscopy or include covariate for prior screening colonoscopy with an interaction between FOBT and screening colonoscopy in model
Adjust	Weighting or regression adjustment for screening colonoscopy	Compute effectiveness estimates stratified by prior screening colonoscopy use and combine weighted estimates or include covariate for screening colonoscopy in analytic model
Exclude	Exclude for any screening colonoscopy	Exclude person for any screening colonoscopy	Exclude person from pool of cases and controls for any screening colonoscopy

The first analytic decision is whether to account for the competing screening test at all. Not accounting for the competing test results in pooling people with and without screening colonoscopy when estimating FOBT effectiveness. Accounting for the competing screening test requires an approach for handling screening colonoscopy. In one approach, person-time following exposure to screening colonoscopy is excluded from estimates of FOBT effectiveness. In this approach, individuals are censored at the time of receipt of screening colonoscopy. In another approach, individuals are stratified by prior receipt of screening colonoscopy and data on the effectiveness of FOBT in the presence of competing screening colonoscopy are reported separately. Alternatively, standardized estimates are used to adjust for prior receipt of screening colonoscopy. Standardization is achieved by weighting the estimates of FOBT effectiveness in the presence and absence of screening colonoscopy by the proportion of people who receive and do not receive screening colonoscopy. A final approach is excluding data on people with screening colonoscopy.
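The standardization step in the adjustment approach can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study; the function name `standardized_effect` and the input values are hypothetical.

```python
def standardized_effect(effect_with_cspy, effect_without_cspy, prop_with_cspy):
    """Weight stratum-specific FOBT effectiveness estimates by the
    proportion of people with and without screening colonoscopy."""
    if not 0.0 <= prop_with_cspy <= 1.0:
        raise ValueError("prop_with_cspy must be a proportion in [0, 1]")
    return (prop_with_cspy * effect_with_cspy
            + (1.0 - prop_with_cspy) * effect_without_cspy)


# Hypothetical stratum estimates (rate differences per 100,000 person-years):
# -70 among people with screening colonoscopy, -10 among those without,
# with 25% of people receiving screening colonoscopy.
example = standardized_effect(-70.0, -10.0, 0.25)
```

With these made-up inputs the weighted estimate is −25 per 100,000 person-years. The standardized value is pulled toward the colonoscopy stratum in proportion to how common colonoscopy is.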

To illustrate how the five analytic options could be implemented in practice, we describe estimators for the effectiveness of screening FOBT to reduce CRC mortality, assuming constant mortality rates within the effectiveness window (Appendix). Effectiveness was defined as the additive difference in mortality rate for screened compared with unscreened individuals. We defined the effectiveness window to be the period of person-time when a screening test might reduce mortality from the cancer of interest; this period might differ among tests for the same condition. We made the simplifying assumption that the effectiveness window was approximately equal to the time until a person was next due for screening (ie. the recommended screening interval), although this assumption probably resulted in an underestimate of the length of the effectiveness window. A longer interval would have increased the frequency with which competing tests could occur during the effectiveness window. The estimators illustrate how the approaches differ in events (the numerators) and persons or periods of person-time (the denominators). All approaches correspond to maximum likelihood estimators (MLEs) for the rates of exponentially distributed random variables over some period of person-time, except for the “exclude” approach (Table 1). This approach excluded both person-time and events using future information on use of the competing screening test, and does not correspond to an MLE. The exclude approach is therefore not necessarily a consistent estimator for the event rate. Moreover, although all other approaches corresponded to MLEs for event rates, these rates were not necessarily the rates in the general population—the target of inference—and thus may lead to biased effectiveness estimates.
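As a sketch of the general form such estimators take (the Appendix is not reproduced here, and the symbols D and T are illustrative notation assumed for this example), the MLE of a constant rate for exponentially distributed event times is the number of events divided by the person-time at risk, and effectiveness is the difference between two such rates:

```latex
\hat{\lambda}_{g} = \frac{D_{g}}{T_{g}}, \qquad
\widehat{\mathrm{AR}} = \hat{\lambda}_{\mathrm{FOBT}} - \hat{\lambda}_{\mathrm{unscreened}}
```

Here D_g is the number of CRC deaths and T_g the person-time contributed by group g. The approaches differ only in which deaths and which person-time are counted. Under censoring, for example, both stop accruing at the time of screening colonoscopy.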

Simulation study

Simulation study design

We conducted a simulation study to compare the bias and efficiency of alternative estimators of screening test effectiveness in the presence of a competing screening test. In simulations, the objective was to estimate the effectiveness of FOBT relative to no FOBT. Our measure of effectiveness was the difference in risk of cancer mortality between screened and unscreened individuals during the effectiveness window. We simulated screening data over a period of 10 years, assuming that the times to first use of screening FOBT and of screening colonoscopy were exponentially distributed.

Simulations varied the rates of initial uptake of screening colonoscopy and uptake following FOBT to determine the performance of different analytic approaches for different rates of occurrence of a competing test (Table 2). For each of four scenarios, we simulated 1,000 cohorts of size 100,000. The four scenarios were: (1) low initial rate of screening colonoscopy uptake, high rate of screening colonoscopy after FOBT; (2) high initial rate of screening colonoscopy uptake, high rate of screening colonoscopy after FOBT; (3) low initial rate of screening colonoscopy uptake, low rate of screening colonoscopy after FOBT; and (4) high initial rate of screening colonoscopy uptake, low rate of screening colonoscopy after FOBT.

Table 2.

Colonoscopy Screening Rates in Simulated Data.

Scenario	Screening colonoscopy uptake rate (exams/year) among unscreened	Screening colonoscopy uptake rate (exams/year) within effectiveness window of FOBT
Scenario 1	0.036 (low)	0.036 (high)
Scenario 2	0.16 (high)	0.036 (high)
Scenario 3	0.036 (low)	0.005 (low)
Scenario 4	0.16 (high)	0.005 (low)

FOBT = faecal occult blood test.

The flow of simulated participants through the screening process is shown in Figure 2. Each oval represents a subset of the population with its CRC mortality rate indicated by λ. Arrows indicate the movement from one subgroup of the population into another, with rates of transition specified in exams per year or denoted by η. The entire population starts unscreened. Mortality from CRC in the absence of any screening was assumed to be 75 per 100,000 person-years for age ≥50 based on Surveillance, Epidemiology, and End Results programme (SEER) data from 1973 to 1993.³ Our assumptions for screening test utilization rates,^4–6 mortality rates,^7–11 and probabilities for screening test results^12,13 are in Figure 2. We assumed that, in the absence of other screening tests, biennial FOBT reduced mortality by 15% annually, and colonoscopy reduced the risk of CRC mortality by 31% annually. Mortality rates used in simulations were chosen to correspond to these mortality reductions, assuming exponentially distributed times to event. The CRC mortality rate in the population with negative colonoscopies was assumed to be very low based on reported CRC incidence following negative exams.^10,11
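As a rough check on how the stated reductions translate into rates, the sketch below applies each proportional reduction directly to the baseline rate. This is a simplifying assumption made for illustration; the exact rates used in the simulations are those shown in Figure 2, and the names below are hypothetical.

```python
BASELINE_RATE = 75.0  # CRC deaths per 100,000 person-years without screening


def reduced_rate(baseline, proportional_reduction):
    """Mortality rate after applying a proportional reduction (assumed form)."""
    return baseline * (1.0 - proportional_reduction)


fobt_rate = reduced_rate(BASELINE_RATE, 0.15)  # biennial FOBT, 15% reduction
cspy_rate = reduced_rate(BASELINE_RATE, 0.31)  # colonoscopy, 31% reduction
fobt_rate_difference = fobt_rate - BASELINE_RATE
```

Under this assumption the FOBT rate is about 63.75 per 100,000 person-years, a difference of about −11.25 from baseline. That is close to the true value of −11 per 100,000 person-years used when assessing bias.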

Figure 2.

Simulation of colorectal cancer screening and mortality outcomes. FOBT, faecal occult blood test; CSPY, colonoscopy; p-yrs, person-years; λ, mortality rate; η, rate of transition in exams per year; numbers in italics, probability; +, positive test findings (CRC or high risk adenoma); −, negative findings (no CRC or high risk adenoma); gray, information used to generate the simulated data but assumed unavailable when conducting analysis using observational administrative data.

We assumed screening and choice of test were unrelated to CRC risk (ie. no confounding). For colonoscopy, screening test results were assumed to be definitive, and were therefore simulated as positive (CRC or high risk adenoma) or negative (no CRC or high risk adenoma) from a Bernoulli distribution. For FOBT, we assumed false-positive screening test results were possible, and hence simulated results (true positive, false positive, negative) from a multinomial distribution. Negative results were assumed to contain both true negatives and false negatives. We simulated subsequent mortality conditional on screening test results. Screening tests were assumed to reduce risk by lowering the mortality rate among individuals with true-positive test results. We computed overall mortality rates following a screening test, by averaging the mortality rate across individuals with differing screening test results, weighted by the frequency of screening test results. The conferred decrease in mortality risk was assumed to persist for two years following FOBT and 10 years following colonoscopy. All patients with false-positive FOBT results were assumed to receive diagnostic colonoscopy and experience a lower mortality rate associated with negative colonoscopy for 10 years following the examination. We assumed no further screening following a positive colonoscopy. Following negative colonoscopy, negative FOBT, or false-positive FOBT, we assumed that people could be screened with the other screening modality before being due again for screening. Time to death and time to uptake of the competing screening test were assumed to be exponentially distributed.

Simulated life histories were followed until the earliest of death, 10 years of follow-up, or two years after receipt of FOBT. This ensured that the data resembled what would be available in a cohort study in which participants were followed until death, for 10 years, or through the effective interval following a screening FOBT (ie. until the next recommended screening), whichever occurred first. Our simulation study thus focused on the effect of a single round of FOBT screening and not the effect of repeated screening.
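The follow-up and censoring rules can be illustrated with a deliberately reduced sketch. It covers unscreened people only, includes just two exponential event times (CRC death and screening colonoscopy), and omits FOBT and test results altogether, so it is not the simulation described above. The function name and seed are hypothetical; the rates are the baseline mortality and the "low" colonoscopy uptake from Table 2.

```python
import random


def simulate_censored_rate(n_people, death_rate, cspy_rate,
                           max_follow_up=10.0, seed=0):
    """Estimate a mortality rate (deaths / person-years) when follow-up
    ends at death, screening colonoscopy, or max_follow_up years."""
    rng = random.Random(seed)
    deaths = 0
    person_years = 0.0
    for _ in range(n_people):
        t_death = rng.expovariate(death_rate)  # exponential time to CRC death
        t_cspy = rng.expovariate(cspy_rate)    # exponential time to colonoscopy
        t_end = min(t_death, t_cspy, max_follow_up)
        person_years += t_end
        if t_end == t_death:                   # death seen before any censoring
            deaths += 1
    return deaths / person_years


# 75 deaths per 100,000 person-years; colonoscopy uptake of 0.036 exams/year.
estimated = simulate_censored_rate(100_000, 75e-5, 0.036)
estimated_per_100k = estimated * 100_000
```

Because colonoscopy uptake is generated independently of death in this sketch, censoring at colonoscopy should leave the estimated rate close to the simulated 75 per 100,000 person-years, apart from sampling variation.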

Measures of comparison: Bias and standard errors

For each simulated dataset generated under the scenarios and assumptions described above, we estimated FOBT effectiveness using the five analytic approaches in Table 1. We computed absolute risk (AR) reduction as the difference between estimated mortality rates with and without FOBT screening. For each approach we summarized our findings using the average and standard error of the AR across all 1,000 simulated datasets; bias was estimated as the difference between the average AR and the true value (−11 per 100,000 person-years). Percent bias was computed as 100 times the bias divided by the true AR. Empirical standard errors were computed as the standard deviation of the estimated AR across the simulated datasets to estimate the variability of each approach.
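These summary measures can be written out as a short Python sketch. The function name `summarize` and the three input estimates are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify the denominator.

```python
import statistics

TRUE_AR = -11.0  # per 100,000 person-years, as stated in the text


def summarize(estimates, true_value=TRUE_AR):
    """Return (bias, percent bias, empirical SE) for a list of AR estimates."""
    mean_estimate = statistics.fmean(estimates)
    bias = mean_estimate - true_value
    percent_bias = 100.0 * bias / true_value
    # Sample standard deviation across datasets (assumed n - 1 denominator).
    empirical_se = statistics.stdev(estimates)
    return bias, percent_bias, empirical_se


# Hypothetical AR estimates from three simulated datasets.
bias, pct, se = summarize([-20.0, -14.75, -9.5])
```

With these inputs the bias is −3.75 and the percent bias is about 34.1%. Because both the bias and the true AR are negative, a positive percent bias corresponds to an estimate that is more negative than the truth, that is, an overestimate of benefit.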

Results

We examined five approaches—pooling, censoring, stratifying, adjusting, or excluding data—for estimating the effectiveness of a screening method (FOBT) in the presence of a competing screening method (colonoscopy). Bias, empirical standard errors, and percent bias were different for each of the approaches (Table 3). The censoring approach had the lowest bias. Of the other methods, the pooled approach was the least biased, particularly when the initial rate of screening colonoscopy uptake was low (scenarios 1 and 3). The other approaches overestimated the effectiveness of FOBT, particularly when the initial rate of screening colonoscopy was high (scenarios 2 and 4). The magnitude of the bias was similar to or larger than the empirical standard errors. Bias of this magnitude makes erroneous inference particularly likely because the true value will often not be covered by confidence intervals. Variability of all estimates was similar, with the exception of the stratification approach, which computed estimates within restricted subsets of the data.

Table 3.

Bias, Empirical Standard Errors, and Percent Bias of Estimates for Attributable Risk of Mortality per 100,000 Person-years for Individuals Screened by FOBT Compared with Unscreened Individuals Based on Four Simulated Scenarios, Assuming a True Attributable Risk of -11 per 100,000 Person-years.

	Scenario 1^a		Scenario 2^a		Scenario 3^a		Scenario 4^a
Analytic approach	Bias (SE)	% bias^b	Bias (SE)	% bias^b	Bias (SE)	% bias^b	Bias (SE)	% bias^b
Pool	−3.75 (7.87)	34.1	−11.06 (7.05)	100.6	−2.63 (7.69)	23.9	−9.45 (7.07)	85.9
Censor	0.60 (8.64)	−5.5	0.44 (9.27)	−4.0	−0.17 (8.20)	1.6	0.84 (9.06)	−7.6
Stratify^c	−60.79 (12.90)	552.6	−60.53 (6.69)	550.3	−60.03 (13.43)	545.7	−60.64 (6.87)	551.3
Stratify (Test 1 before Test 2)	−60.89 (13.25)	553.6	−60.37 (8.22)	548.8	−59.55 (18.22)	541.4	−60.38 (15.31)	548.9
Stratify (Test 2 before Test 1)	−71.73 (13.09)	652.1	−71.55 (6.71)	650.5	−71.08 (13.44)	646.2	−71.65 (6.87)	651.4
Adjust	−8.36 (7.55)	76.0	−21.08 (6.38)	191.6	−6.01 (7.51)	54.6	−18.55 (6.54)	168.6
Adjust (Accounting for test order)	−8.37 (7.55)	76.1	−21.08 (6.38)	191.6	−6.00 (7.51)	54.6	−18.55 (6.54)	168.6
Exclude	−8.92 (9.42)	81.1	−36.74 (11.88)	334.0	−7.41 (8.60)	67.4	−32.09 (11.13)	291.7

FOBT = faecal occult blood test; SE = standard error.

^a Scenario 1: Low initial rate of colonoscopy uptake; high rate of colonoscopy after FOBT; Scenario 2: High initial rate of colonoscopy uptake; high rate of colonoscopy after FOBT; Scenario 3: Low initial rate of colonoscopy uptake; low rate of colonoscopy after FOBT; Scenario 4: High initial rate of colonoscopy uptake; low rate of colonoscopy after FOBT.

^b Positive % bias indicates overestimation of the benefit of FOBT relative to the true AR of −11 per 100,000 person-years.

^c Results are for FOBT in the stratum with colonoscopy; results in the stratum without colonoscopy are equal to those in the Censor row.

Discussion

We explored analytic approaches for evaluating the effectiveness of a screening test when a competing screening modality might be used during the test’s effectiveness window. The best-performing approach was censoring at the time of the competing test; our simulations showed that substantial bias occurred when other approaches were used. Based on these findings, we recommend censoring at the time of the competing screening test. In case-control studies, the censoring approach is equivalent to risk-set sampling, in which people are eligible to be cases or controls until they have a competing screening exam. While stratification by the competing screening exam may be intuitively appealing, it produces biased estimates for the reasons described below.

Results of a prior competing screening test act as a confounder because people who have a positive screening test result are not eligible to be screened again, so only those with a prior negative screening test result go on to receive the screening test of interest. If results of the prior competing screening test are available, traditional methods for handling confounding (eg. stratification or adjustment by prior test results) should be sufficient to eliminate bias. We focused on identifying the best analytic approach when data sources contain the occurrence of a competing screening test, but not results. For instance, in administrative data, information on cancer incidence and cancer mortality, but not individual test results, may be available. This is often the case in studies of CRC screening, where colonoscopy results are not available in administrative data.² In this scenario, stratification by the true confounder (results of a prior competing screening test) is impossible.

The occurrence of a competing screening test after the exam of interest (and during the effectiveness window of the test of interest) causes selection bias, because receipt of the competing test during the effectiveness window of the test of interest is related to the results of the test of interest. Only individuals with a negative result on the screening test of interest can subsequently receive the competing test. So, for example, FOBT will look beneficial when comparison is made between individuals with a negative FOBT result (low risk) and unscreened (average risk) individuals. Because this bias arises from stratification based on results of the test of interest, it is not remediable, even if results of the competing screening test (eg. colonoscopy) are known. In summary, biases that result from competing screening tests before and after the exam of interest occur because the only people screened with both tests are those whose first screening test result was negative; these people are at lower risk of mortality.
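The selection effect described here can be made concrete with a small numerical sketch. All values below are hypothetical, chosen only so that the population-average rates match the 75 and roughly 63.75 per 100,000 person-years implied by a 15% reduction. The sketch also ignores false-positive and false-negative results.

```python
def mixture_rate(prop_positive, rate_positive, rate_negative):
    """Average mortality rate over people with positive and negative results."""
    return prop_positive * rate_positive + (1.0 - prop_positive) * rate_negative


# Hypothetical inputs, per 100,000 person-years.
P_POS = 0.05                 # proportion who would test positive
RATE_NEG = 30.0              # mortality among people with negative results
RATE_POS_UNSCREENED = 930.0  # mortality among undetected positives
RATE_POS_SCREENED = 705.0    # mortality among detected, treated positives

unscreened = mixture_rate(P_POS, RATE_POS_UNSCREENED, RATE_NEG)
screened = mixture_rate(P_POS, RATE_POS_SCREENED, RATE_NEG)

true_difference = screened - unscreened           # whole screened group
negative_only_difference = RATE_NEG - unscreened  # negatives vs unscreened
```

In this made-up example the true difference is about −11.25, whereas comparing only FOBT-negative people with the unscreened gives about −45. The apparent benefit is exaggerated because the comparison group is selected on a negative result, not because screening helped those individuals.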

Our findings apply to competing screening tests, not diagnostic tests that are performed in response to signs or symptoms of disease. Censoring at diagnostic exams is not recommended, because these exams are often events along a causal pathway in which people are diagnosed with cancer before dying from it. Administrative data algorithms can help to identify test indication.¹⁴ The question of how to study people who change screening regimens when they become due again for screening is different, and is not considered in this paper, which focuses on single rounds of screening.

Estimates of bias and precision are based on simulations of five straightforward screening patterns, with simplifying assumptions. This approach has several limitations. First, we investigated only two tests with five screening patterns. Real-world applications involve more complex combinations of tests. However, by considering only the test of interest and the first competing test to occur, more complicated strategies can be reduced to fit into our patterns. Second, mortality rates were assumed to be constant, and the effect of screening tests was assumed to act on mortality via a step function, with risk immediately decreased following screening, then returning to its pre-screening value at the end of the effectiveness window. In reality, mortality rates and the effect of screening over the course of the effectiveness window are likely to be non-constant and the effectiveness window is likely to be longer than we assumed. Bias in situations with more complex mortality rates and screening effects might differ in magnitude from our findings. However, the basic pattern of bias and efficiency is expected to be the same across the five approaches. In spite of the simplicity of our simulations, our results provide guidance to researchers for estimating the comparative effectiveness of screening tests in the presence of competing screening tests, by describing the type of bias that may arise in different analytic approaches.

Our findings are broadly relevant to screening studies. However, understanding how to analyze real-world screening data is especially important for CRC studies because multiple screening modalities are common, and the comparative effectiveness of alternative regimens has primarily been examined using modeling.^15,16 Nonetheless, similar issues are arising for studies of breast cancer screening with screening magnetic resonance imaging, ultrasound, and mammography all competing. The issue of analyzing data in the presence of competing screening tests is likely to become increasingly important with new emerging technologies and an emphasis on real-world comparative effectiveness studies using observational data. Our recommended analytic approach of censoring at the time of the competing screening test is straightforward, can be employed in both cohort and case-control studies, and is applicable to a variety of conditions that are detected by multiple screening modalities.

Footnotes

Acknowledgments

We thank Noel S Weiss (MD, DrPH) and V Paul Doria-Rose (DVM, PhD) for comments on early versions of the manuscript.

Declaration of conflicting interests

The authors have no conflicting interests to declare.

Funding

This work was supported by grants from the National Cancer Institute at the National Institutes of Health (UC2CA148576 to Buist and Doubeni; U54CA163261 to Rutter). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.

References

1. U.S. Preventive Services Task Force. Screening for colorectal cancer: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med 2008; 149(9): 627–37.

2. Tiro JA, Kamineni A, Levin TR, et al. The colorectal cancer (CRC) screening process in community settings: A conceptual model for the Population-based Research Optimizing Screening through Personalized Regimens (PROSPR) Consortium. Cancer Epidemiol Biomarkers Prev 2014; 23(7): 1147–58.

3. National Cancer Institute. Table VI: colon and rectum cancer SEER Cancer Statistics Review, 1973–1995, National Institutes of Health (Bethesda, MD). (http://seer.cancer.gov/csr/1973_1993/colorect.pdf).

4. Joseph, King, Miller, et al. Prevalence of colorectal cancer screening among adults - behavioral risk factor surveillance system, United States, 2010. MMWR Surveill Summ 2012; 61(2): 51–6.

5. Bandi P, Cokkinides V, Smith RA, et al. Trends in colorectal cancer screening with home-based fecal occult blood tests in adults ages 50 to 64 years, 2000 to 2008. Cancer 2012.

6. Powell, Burgess, Vernon, et al. Colorectal cancer screening mode preferences among US veterans. Preventive Medicine 2009; 49(5): 442–8.

7. Singh, Nugent, Demers, et al. The reduction in colorectal cancer mortality after colonoscopy varies by site of the cancer. Gastroenterology 2010; 139(4): 1128–37.

8. Whitlock EP, Lin JS, Liles E, et al. Screening for colorectal cancer: An updated systematic review. Rockville, Maryland, 2008.

9. Scholefield, Moss, Mangham, et al. Nottingham trial of faecal occult blood testing for colorectal cancer: a 20-year follow-up. Gut 2012; 61(7): 1036–40.

10. Rex, Cummings, Helper, et al. 5-year incidence of adenomas after negative colonoscopy in asymptomatic average-risk persons [see comment]. Gastroenterology 1996; 111(5): 1178–81.

11. Brenner, Haug, Arndt, et al. Low risk of colorectal cancer and advanced adenomas more than 10 years after negative colonoscopy. Gastroenterology 2010; 138(3): 870–6.

12. Lieberman, Weiss, Harford, et al. Five-year colon surveillance after screening colonoscopy. Gastroenterology 2007; 133(4): 1077–85.

13. Smith, Young, Cole, et al. Comparison of a brush-sampling fecal immunochemical test for hemoglobin with a sensitive guaiac-based fecal occult blood test in detection of colorectal neoplasia. Cancer 2006; 107(9): 2152–9.

14. Fisher, Grubber, Castor, et al. Ascertainment of colonoscopy indication using administrative data. Dig Dis Sci 2010; 55(6): 1721–5.

15. Whitlock, Lin, Liles, et al. Screening for colorectal cancer: a targeted, updated systematic review for the U.S. Preventive Services Task Force. Ann Intern Med 2008; 149(9): 638–58.

16. Zauber, Lansdorp-Vogelaar, Knudsen, et al. Evaluating test strategies for colorectal cancer screening: A decision analysis for the U.S. Preventive Services Task Force. Annals of Internal Medicine 2008; 149(9): 659–69.
