Abstract
Objective
To estimate the expected magnitude of error produced by uncontrolled confounding from health behaviours in observational medical record-based studies evaluating effectiveness of screening colonoscopy.
Methods
We used data from the prospective National Institutes of Health American Association of Retired Persons (NIH-AARP) Diet and Health Study to assess the impact of health behaviour-related factors (lifestyle, education, and use of non-steroidal anti-inflammatory drugs [NSAIDs]) on the association between colonoscopy and colorectal cancer (CRC) mortality. We first examined the difference between adjusted and unadjusted results within the cohort data, and then estimated a broader range of likely confounding errors using the Breslow-Day approach, which is based on the prevalence of confounders among persons with and without exposure and on the rate ratio reflecting the association between these confounders and the outcome of interest. As dietary factors and habits are often inter-correlated, we combined these variables (physical activity, body mass index, waist-to-hip ratio, alcohol consumption, and intakes of red meat, processed meat, fibre, milk, and calcium) into a “healthy lifestyle score” (HLS).
Results
The estimated error (a ratio of biased-to-true result) attributable to confounding by HLS was 0.959–0.997, indicating less than 5% departure from the true effect of colonoscopy on CRC mortality. The corresponding errors ranged from 0.970 to 0.996 for NSAID, and from 0.974 to 1.006 for education (all ≤3% difference). The results for other CRC screening tests were similar.
Conclusion
Health behaviour-related confounders, either alone or in combination, seem unlikely to strongly affect the association between colonoscopy and CRC mortality in observational studies of CRC screening.
Introduction
Colonoscopy is the most commonly used procedure for colorectal cancer (CRC) screening in the United States.1 Unlike other tests currently recommended for CRC screening,1,2 such as fecal occult blood testing (FOBT)3–5 and flexible sigmoidoscopy,6–8 its use is not currently supported by evidence from randomized trials. Although randomized controlled trials of screening colonoscopy are under way, their results will not be available until at least 2022–2025.9–11 Therefore, in the foreseeable future, colonoscopy screening policy will be based on evidence from observational studies, where unmeasured confounding is always a concern.12–14
Lifestyle factors, such as physical activity, diet, and alcohol consumption, are associated with CRC incidence and, to a lesser extent, CRC mortality.15–17 Studies also show that use of aspirin and other non-steroidal anti-inflammatory drugs (NSAIDs) is associated with a decrease in CRC incidence and mortality.18 Another factor demonstrably associated with CRC mortality is education.19–21 Lifestyle factors, NSAID use and education are also presumed to be related to healthier behaviours that include screening practices, and for this reason all three may act as confounders in the association between colonoscopy and mortality.22–30 However, these variables are often not available in medical records.
This study investigates to what extent unmeasured confounding by lifestyle-related risk factors, education, and the use of NSAIDs may affect the association of colonoscopy and other CRC screening methods with mortality in observational studies. The results should help future research, and may also be useful in interpreting the previously published observational studies.13,25,31
Methods
Overview
The three categories of possible confounders examined in this study are lifestyle (including diet and habits), education, and use of NSAIDs. As dietary factors and habits are often inter-correlated, we combined these variables into a “healthy lifestyle score” (HLS).
The data for analyses came from the prospective National Institutes of Health American Association of Retired Persons (NIH-AARP) Diet and Health Study. We first estimated the differences in results for crude, partially adjusted, and fully adjusted models evaluating the association between colonoscopy and mortality within the NIH-AARP cohort. This approach allowed us to assess the impact of confounding that is specific to the NIH-AARP population.
To allow generalization of results beyond the NIH-AARP cohort, we estimated a range of likely confounding errors based on the Breslow-Day method.32 With the Breslow-Day approach, the magnitude of confounding error is calculated from the difference in prevalence of the confounder among persons with and without the exposure of interest (denoted P1 and P0, respectively) and from the rate ratio (RRconfounder) reflecting the association between the confounder and the outcome of interest (in this case CRC mortality). The input parameters for these calculations were obtained from the NIH-AARP data. To provide a range of possible confounding error, we used the upper and lower 95% limits, rather than point estimates, in all calculations.
Study population
Established in 1995–1996, the prospective NIH-AARP Diet and Health Study mailed out 3.5 million questionnaires to persons between the ages of 50 and 71. The questionnaire asked about demographic characteristics, diet, and other health-related behaviours.33 A follow-up questionnaire that included questions on cancer screening practices was mailed in 1996–97 to 542,095 subjects who responded to the initial mail-out, and 334,906 valid questionnaires were returned.
We excluded cohort members if they: had questionnaires filled out by a proxy (n = 10,383); reported poor health (n = 4,983) or a history of cancer (n = 4,391); provided uninformative or contradictory responses to the CRC screening questions (n = 10,286); or had missing data on lifestyle score factors (n = 147,941) or aspirin/NSAID use (n = 1,482). Thus, the final analytic data set contained 155,440 subjects.
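The exclusion counts above can be checked with a short arithmetic sketch. The grouping of exclusions into two steps and the dictionary labels are illustrative assumptions, not part of the study protocol; the intermediate total it produces matches the total study population (n = 304,863) quoted for the HLS quartile cutoffs.

```python
# Cohort flow check using the counts reported in the text.
# The two-step grouping below is an assumption made for illustration.
returned_questionnaires = 334_906

respondent_exclusions = {
    "proxy respondent": 10_383,
    "poor health": 4_983,
    "history of cancer": 4_391,
    "uninformative or contradictory screening responses": 10_286,
}
missing_data_exclusions = {
    "missing lifestyle score factors": 147_941,
    "missing aspirin/NSAID use": 1_482,
}

# Population remaining before exclusions for missing covariate data.
total_study_population = returned_questionnaires - sum(respondent_exclusions.values())
# Final analytic data set.
analytic_cohort = total_study_population - sum(missing_data_exclusions.values())

print(total_study_population, analytic_cohort)
```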
Data on colorectal cancer screening
The NIH-AARP study collected information on whether, in the past three years, a participant underwent testing for blood in the stool and if he or she had received any of the following procedures: flexible sigmoidoscopy, colonoscopy, proctoscopy, or an endoscopy of unidentified type.
These items were used to define a dichotomous variable “colonoscopy”, and two additional derived variables categorized participants with respect to their receipt of “FOBT” and “any CRC screening” procedure. Each screening procedure was compared with the “no screening” reference category.
Colorectal cancer deaths & follow-up time
Causes of death were ascertained by linking cohort data to the National Death Index through 31 December 2008. Participants who died from CRC were identified using the International Classification of Diseases 9th edition (ICD-9) codes 153.0–154.8 or ICD-10 codes C18.0–C20.0 for underlying cause of death. The time under observation (mean = 11 years) was calculated by subtracting each subject’s date of return for the follow-up questionnaire from the date of death, date of move outside of cohort residency, or 31 December 2008, whichever was the earliest. A total of 602 cohort members died from CRC, with a cumulative 10-year mortality of 0.4%, which is in agreement with national estimates. 34
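A minimal sketch of the follow-up time calculation described above. The function name, the argument names, and the conversion of days to years using 365.25 days are assumptions for illustration; the text specifies only the censoring rule (earliest of death, move outside the cohort residency area, or 31 December 2008).

```python
from datetime import date
from typing import Optional

# End of follow-up as stated in the text.
STUDY_END = date(2008, 12, 31)

def follow_up_years(questionnaire_return: date,
                    death: Optional[date] = None,
                    moved_out: Optional[date] = None) -> float:
    """Years from return of the follow-up questionnaire to the earliest of
    death, move outside the cohort area, or the end of follow-up."""
    candidates = [d for d in (death, moved_out, STUDY_END) if d is not None]
    end = min(candidates)
    # Assumption: 365.25 days per year to convert days to years.
    return (end - questionnaire_return).days / 365.25

# Hypothetical participant who returned the questionnaire on 1 January 1997
# and died on 1 January 2000.
print(round(follow_up_years(date(1997, 1, 1), death=date(2000, 1, 1)), 2))
```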
Healthy lifestyle score
As lifestyle characteristics are often inter-correlated, we combined candidate variables into a single a priori “healthy lifestyle score” (HLS) and used the resulting score as the confounder of interest. The findings of the 2007 expert report issued jointly by the American Institute for Cancer Research (AICR) and the World Cancer Research Fund (WCRF) 15 served as the basis for selecting the candidate risk factors. The AICR/WCRF report identified ten lifestyle-related factors for which the association with CRC incidence was deemed to be “convincing” or “probable”. Nine of these factors – body mass index (BMI), waist-to-hip ratio, intakes of alcohol, red meat or processed meat products, physical activity, and consumption of milk, calcium and fibre – were incorporated into the HLS. Garlic intake was also listed in the AICR/WCRF report, but was not included in the score because it was not measured in the NIH-AARP questionnaire.
The nine variables were combined into a one-dimensional score, henceforth called the “unweighted HLS”. Each of the nine components was divided into gender-specific quartiles where the “healthiest” quartile of each variable (lowest for the detrimental factors and highest for the beneficial factors) was given 4 points, and the “unhealthiest” quartile was given 1 point. The HLS quartile cutoffs for each variable were based on the total study population (n = 304,863), not the analytical cohort (n = 155,440), but the differences between the two sets of cutoffs were minimal. The nine components were then summed to produce the overall unweighted score with a possible range from 9 through 36, where higher HLS values were hypothesized to confer a lower risk of CRC.
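The unweighted scoring rule can be sketched as follows. The component names, the toy cutoffs, and the handling of values that fall exactly on a cutoff are assumptions for illustration; in the study the cutoffs were gender-specific quartile boundaries, so the appropriate set of cutoffs would be passed for each participant.

```python
# Direction of each component: True means higher values are considered healthier.
# Directions follow the text; the key names are hypothetical.
BENEFICIAL = {
    "physical_activity": True, "milk": True, "calcium": True, "fibre": True,
    "bmi": False, "waist_hip_ratio": False, "alcohol": False,
    "red_meat": False, "processed_meat": False,
}

def quartile_points(value, cutoffs, beneficial):
    """Return 4 points for the healthiest quartile and 1 for the unhealthiest.

    `cutoffs` holds the three quartile boundaries in ascending order.
    Assumption: a value equal to a cutoff falls in the lower quartile.
    """
    quartile = 1 + sum(value > c for c in cutoffs)  # 1 = lowest, 4 = highest
    return quartile if beneficial else 5 - quartile

def unweighted_hls(person, cutoffs):
    """Sum the nine component points; the possible range is 9 to 36."""
    return sum(
        quartile_points(person[name], cutoffs[name], BENEFICIAL[name])
        for name in BENEFICIAL
    )

# Toy example with identical, made-up cutoffs for every component.
toy_cutoffs = {name: (1.0, 2.0, 3.0) for name in BENEFICIAL}
healthiest = {name: (3.5 if good else 0.5) for name, good in BENEFICIAL.items()}
unhealthiest = {name: (0.5 if good else 3.5) for name, good in BENEFICIAL.items()}
print(unweighted_hls(healthiest, toy_cutoffs), unweighted_hls(unhealthiest, toy_cutoffs))
```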
For the alternative version of the score, henceforth called the “weighted HLS”, we combined diet- and body composition-related variables into separate sub-scores and the two sub-scores were then included along with physical activity into the overall HLS. With the exception of physical activity, all other variables in the weighted HLS were expressed for each study subject as the number of standard errors between the mean and the observed value. The standardization was performed separately for men and women. The sex-specific standardized variables for BMI and waist-to-hip ratio were summed to form the body composition sub-score, multiplied by −1 to maintain consistent direction of the hypothesized association, and the resulting sub-score was divided into quartiles. A similar approach was used to develop a diet sub-score that was based on intakes of red and processed meat, milk, calcium, fibre and alcohol. The standardized values for harmful factors (red and processed meat, alcohol intake) were also multiplied by −1.
Statistical analyses
To assess the impact of confounding on the association between colonoscopy and CRC mortality specifically in the NIH-AARP cohort, we constructed Cox proportional hazards (PH) models that evaluated this association in several ways: 1) without any adjustment; 2) adjusting for a partial list of covariates that included age, sex and hormone replacement therapy (HRT) use coded as a single three-category variable (male, female with HRT, and female without HRT), education, race, diabetes, and family history of CRC; 3) by adding HLS, education, and NSAID use to the model; and 4) by including all covariates in the model. Another set of analyses was conducted for FOBT and for “any screening”.
Estimating impact of confounding using Breslow-Day method
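Breslow and Day32 mathematically describe the effect of a dichotomous confounding variable on the association between dichotomous exposure and outcome variables as:

\[
\frac{RR_{\mathrm{biased}}}{RR_{\mathrm{true}}} = \frac{P_1\,(RR_{\mathrm{confounder}} - 1) + 1}{P_0\,(RR_{\mathrm{confounder}} - 1) + 1}
\]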
If the impact of an unmeasured confounder is unknown, it is possible to calculate a plausible range of confounding error expressed as the ratio RRbiased/RRtrue or simply RRbiased (if RRtrue is assumed to be 1.0) by estimating a range of likely values for RRconfounder, P1 and P0. In this study, the exposure of interest is colonoscopy, the outcome of concern is CRC mortality, and the confounders under consideration are a composite measure of lifestyle termed the “healthy lifestyle score” (HLS) and regular (at least weekly) use of NSAIDs. All parameters and their ranges are estimated using the data from the NIH-AARP Diet and Health Study cohort.
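As an illustration, the confounding error for a dichotomous confounder can be computed with the standard Breslow-Day expression, RRbiased/RRtrue = [P1(RRconfounder − 1) + 1] / [P0(RRconfounder − 1) + 1]. The function name and input checks below are assumptions; the example values are those used in the Discussion (RRconfounder of 0.7, P0 of 10%, P1 of 75%), which correspond to an error of about 20%.

```python
def confounding_error(p1: float, p0: float, rr_confounder: float) -> float:
    """Ratio RR_biased / RR_true for a dichotomous confounder.

    p1: prevalence of the confounder among the exposed (screened)
    p0: prevalence of the confounder among the unexposed (unscreened)
    rr_confounder: rate ratio linking the confounder to the outcome
    """
    if not (0.0 <= p1 <= 1.0 and 0.0 <= p0 <= 1.0):
        raise ValueError("p1 and p0 must be proportions between 0 and 1")
    if rr_confounder <= 0.0:
        raise ValueError("rr_confounder must be positive")
    excess = rr_confounder - 1.0
    return (p1 * excess + 1.0) / (p0 * excess + 1.0)

# Discussion example: a protective confounder that is far more common
# among screened than unscreened persons.
print(round(confounding_error(p1=0.75, p0=0.10, rr_confounder=0.7), 3))
```

When P1 equals P0, or when RRconfounder equals 1.0, the ratio is exactly 1.0 and no confounding error arises.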
To obtain rate ratio estimates for the relation between HLS and CRC mortality (denoted RRconfounder in equations 1, 2 and 3), we divided the unweighted HLS into ordinal quartiles and the weighted HLS into alternative non-quartile a priori cutoffs. For each HLS score, we examined the association with CRC mortality using Cox proportional hazards (PH) models that adjusted for age, sex, HRT, education, race, diabetes, family history of CRC, aspirin/NSAID use and any CRC screening.35 Each RRconfounder estimate was accompanied by the corresponding 95% confidence interval (CI). All modelling was performed using SAS 9.2 statistical software, and all models were examined for PH assumptions, interactions, collinearity, and goodness-of-fit.
We also obtained estimates of P1 and P0 (i.e., proportions of persons in each ordinal category of HLS values among the colonoscopy screened and non-screened study subjects, respectively) and their corresponding 95% CIs using the OpenEpi statistical calculator.36 We then used the upper and lower limits of the 95% CI for RRconfounder, and P1 and P0 estimates to calculate a range of likely confounding errors introduced by HLS (equation 3) and by the dichotomous aspirin/NSAID variable (equation 2). The same analyses were also performed for FOBT and for “any screening”.
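For the dichotomous case, a range of confounding errors can be sketched by evaluating the Breslow-Day ratio at every combination of the 95% CI limits and taking the extremes. Evaluating all eight combinations is an assumption about how the limits might be combined, and the CI values below are hypothetical placeholders rather than NIH-AARP estimates. Because the ratio is monotone in each input when the other two are held fixed, its extremes over the three intervals occur at combinations of the limits.

```python
from itertools import product

def confounding_error(p1, p0, rr_confounder):
    """Breslow-Day ratio RR_biased / RR_true for a dichotomous confounder."""
    excess = rr_confounder - 1.0
    return (p1 * excess + 1.0) / (p0 * excess + 1.0)

def error_range(p1_ci, p0_ci, rr_ci):
    """Smallest and largest error over all combinations of the CI limits."""
    errors = [
        confounding_error(p1, p0, rr)
        for p1, p0, rr in product(p1_ci, p0_ci, rr_ci)
    ]
    return min(errors), max(errors)

# Hypothetical 95% CI limits, for illustration only.
low, high = error_range(p1_ci=(0.55, 0.57), p0_ci=(0.48, 0.50), rr_ci=(0.75, 0.95))
print(f"confounding error between {low:.3f} and {high:.3f}")
```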
Results
Assessment of confounding within the NIH-AARP cohort
Vital status, demographic factors and medications among NIH-AARP cohort members with and without colorectal cancer screening history.
Numbers do not add up to total due to missing values.
The observed effects of confounding by HLS and NSAID use on the association between colonoscopy and CRC mortality in the NIH-AARP cohort were small. The unadjusted RR from a Cox proportional hazards model comparing colonoscopy screening in the previous three years with no screening was 0.465 (95% CI: 0.346–0.625). Controlling for all covariates, with the exception of HLS and NSAID use, produced a partially adjusted RR of 0.402 (95% CI: 0.297–0.545), indicating that colonoscopy was associated with a 60% reduction in CRC mortality over 10 years. After including HLS in the model, the RR was 0.405 (95% CI: 0.299–0.549) and did not differ by HLS type (error 0.402/0.405 = 0.993). The results with and without adjustment for NSAID use produced an identical error estimate. When adjusting for both HLS and NSAID use, the error estimate was 0.990 (0.402/0.406). Analyses for FOBT and “any screening” produced very similar results (data not shown).
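The within-cohort error estimates quoted above are simple ratios of the partially adjusted RR to the RR obtained after further adjustment. A short sketch, with variable names chosen for illustration:

```python
# Rate ratios reported in the text for colonoscopy versus no screening.
rr_partially_adjusted = 0.402   # all covariates except HLS and NSAID use
rr_with_hls = 0.405             # after adding HLS
rr_with_hls_and_nsaid = 0.406   # after adding both HLS and NSAID use

# Error = RR without the candidate confounder / RR with it in the model.
error_hls = rr_partially_adjusted / rr_with_hls
error_hls_and_nsaid = rr_partially_adjusted / rr_with_hls_and_nsaid

print(f"{error_hls:.3f} {error_hls_and_nsaid:.3f}")
```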
Healthy Lifestyle Score (HLS) components expected to be associated with increased colorectal cancer risk among NIH-AARP cohort members with and without colonoscopy screening history.
M = males, F = females.
Lifestyle Score (LS) components expected to be associated with decreased colorectal cancer risk among NIH-AARP cohort members with colonoscopy screening history.
M = males, F = females.
The same cutoffs used for males and females.
RRconfounder estimates reflecting the relation of confounding factors of interest (Lifestyle Score and aspirin/NSAID use) to colorectal cancer mortality.
HLS = Healthy Lifestyle Score.
Adjusted for age, sex/HRT, education, race, diabetes, family history of CRC, history of screening and aspirin or NSAID use.
Adjusted for age, sex/HRT, race, diabetes, family history of CRC, history of screening, HLS, and aspirin/NSAID use.
Adjusted for age, sex/HRT, education, race, diabetes, family history of CRC, history of screening and HLS.
Assessment of confounding using Breslow-Day Method
Range of P0, P1, and confounding error estimates in the association between colorectal cancer screening and mortality.
HLS = Healthy Lifestyle Score.
Discussion
All calculated confounding error estimates in this study were small, and almost all were in the hypothesized direction. On balance, our data and the results reported by others16,17,38–40 indicate that it would take unrealistic differences between P1 and P0, and very pronounced RRconfounder values, to produce a confounding error of substantial magnitude. For example, for a dichotomous confounding factor with an RRconfounder of 0.7 to produce an error of 20% when P0 is 10%, 20% or 30%, the P1–P0 differences would need to be 65% (75%–10%), 63% (83%–20%) and 61% (91%–30%), respectively. The same calculations for a much stronger RRconfounder of 0.5 would require P1–P0 differences of 38%, 36% and 34%, respectively: lower, but still clearly outside the realistically expected range.
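The worked figures above can be reproduced by solving the dichotomous Breslow-Day expression for P1, given a target error, P0 and RRconfounder. The function name is an assumption for illustration; a 20% error corresponds to a ratio RRbiased/RRtrue of 0.80.

```python
def required_p1(target_error: float, p0: float, rr_confounder: float) -> float:
    """Prevalence among the exposed needed to yield a given confounding error.

    Solves target_error = [p1*(RR - 1) + 1] / [p0*(RR - 1) + 1] for p1.
    """
    excess = rr_confounder - 1.0
    if excess == 0.0:
        raise ValueError("no confounding is possible when rr_confounder is 1.0")
    return (target_error * (p0 * excess + 1.0) - 1.0) / excess

# Scenarios from the text: 20% error, RRconfounder of 0.7 and 0.5.
for rr in (0.7, 0.5):
    for p0 in (0.10, 0.20, 0.30):
        p1 = required_p1(0.80, p0, rr)
        print(f"RRconfounder={rr}, P0={p0:.0%}: P1={p1:.0%}, P1-P0={p1 - p0:.0%}")
```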
Our analyses also showed that confounders such as age, sex, HRT, race, and family history of CRC should always be considered because in the AARP data they changed the RR estimate by 13.5% (from 0.465 to 0.402). It is important to point out, however, that in contrast to HLS and NSAID use, these other factors are less likely to remain uncontrolled because usually they can be ascertained from the medical records and/or most questionnaires.
The main limitation of the current analysis is the large number of subjects excluded from the original cohort because of missing information due to incompletely filled out study questionnaires. It is important to note, however, that quartile cutoffs were based on the entire underlying study population, and yet for all HLS components, except BMI, the final analysis cohort was still divided into four almost equal groups, indicating that the loss of data may have occurred at random.
Ascertainment of screening practices was limited to three years prior to questionnaire administration, a feature that may have led to misclassification of screening status. Subjects who underwent a colonoscopy or sigmoidoscopy within the recommended screening timeframe but outside the three-year window, or who were screened after the date of the questionnaire, would have been classified as unscreened; this misclassification would likely have biased our RRconfounder estimates towards the null. The analysis for FOBT produced results with respect to the role of confounding almost identical to those observed for colonoscopy. On the other hand, the FOBT data in this study do not indicate whether the test was completed at home or at the physician’s office (a previously common practice that is now deemed inadequate).
Another important limitation that may have affected the observed association between colonoscopy and CRC mortality in the NIH-AARP study is exclusion of subjects with a history of cancer. This may have exaggerated the protective effect of colonoscopy on CRC mortality. It is unlikely, however, that this limitation affected the estimates of confounding error, the main parameter of interest in this study.
Conclusion
In summary, uncontrolled or unmeasured lifestyle-related confounders, either alone or in combination, are unlikely to produce a large spurious association between colonoscopy (or other methods of screening) and CRC mortality. Although limited to a single, albeit large, national cohort, our conclusions about HLS and NSAID are reasonably generalizable because we used the upper and lower 95% limits rather than point estimates in all calculations, and because measures of association in our study are similar to those reported elsewhere.14,15,23,27,28,32–34 It appears that other sources of systematic error, such as non-random selection of participants or misclassification of screening status, are of greater concern than uncontrolled confounding in observational studies of colonoscopy.
Footnotes
Funding
This work was supported by grant # U01CA151736 from the National Cancer Institute of the National Institutes of Health.
