Abstract
Objectives
Cancer risk prediction may be subject to detection bias if utilization of screening is related to cancer risk factors. We examine detection bias when predicting breast cancer risk by race/ethnicity.
Methods
We used screening and diagnosis histories from the Breast Cancer Surveillance Consortium to estimate risk of breast cancer onset and calculated relative risk of onset and diagnosis for each racial/ethnic group compared with non-Hispanic White women.
Results
Of 104,073 women aged 40–54 receiving their first screening mammogram at a Breast Cancer Surveillance Consortium facility between 2000 and 2018, 10.2% (n = 10,634) identified as Asian, 10.9% (n = 11,292) as Hispanic, and 8.4% (n = 8719) as non-Hispanic Black. Hispanic and non-Hispanic Black women had slightly lower screening frequencies but biopsy rates following a positive mammogram were similar across groups. Risk of cancer diagnosis was similar for non-Hispanic Black and White women (relative risk vs non-Hispanic White = 0.90, 95% CI 0.65 to 1.14) but was lower for Asian (relative risk = 0.70, 95% CI 0.56 to 0.97) and Hispanic women (relative risk = 0.82, 95% CI 0.62 to 1.08). Relative risks of disease onset were 0.78 (95% CI 0.68 to 0.88), 0.70 (95% CI 0.59 to 0.83), and 0.95 (95% CI 0.84 to 1.09) for Asian, Hispanic, and non-Hispanic Black women, respectively.
Conclusions
Racial/ethnic differences in mammography and biopsy utilization did not induce substantial detection bias; relative risks of disease onset were similar to or modestly different from relative risks of diagnosis. Asian and Hispanic women have lower risks of developing breast cancer than non-Hispanic Black and White women, who have similar risks.
Introduction
Screening mammograms are an important part of routine preventive care for women in many countries. In the USA, annual or biennial screening is recommended for women at average risk for breast cancer.1,2 More frequent screening may improve benefit but increases the frequency of overdiagnosis and false-positive results.3 Randomized trials are currently underway to evaluate risk-stratified screening strategies, which have, in model-based studies, shown promise in balancing screening benefits and harms.4,5 Development of effective targeted strategies is predicated on accurate assessment of a woman's breast cancer risk.
Models to predict breast cancer risk6–11 are typically informed by the incidence of breast cancer diagnosis; thus, their estimates are affected by population screening and diagnostic practices. Detection bias occurs when screening and diagnostic intensity vary across potential risk groups. The resulting models may overestimate cancer risk in subgroups of women with greater rates of screening and biopsy and underestimate it in subgroups with lower rates.12,13 In the USA, variations in screening practices have been documented across racial/ethnic groups, with lower utilization among non-Hispanic Black, Hispanic, and Asian/Pacific Islander women.14–16 Observed breast cancer incidence is highest among non-Hispanic White women and lowest among Hispanic and Asian/Pacific Islander women.17 Disparities in screening and biopsy may lead to differences in incidence of disease that are driven by practice rather than underlying risk. The extent to which observed risks of disease across racial/ethnic groups might reflect variations in screening intensity has not been documented.
While relative risks of breast cancer diagnosis may be subject to detection bias, the risk of developing disease—transitioning from a disease-free to a detectable preclinical state—is not in principle affected. Large breast cancer screening programs with individual-level screening histories and data on screen- and interval-detected incidence provide information that can be used to estimate the risk of developing disease within a model of disease natural history.18–22
In this study, we use individual-level screening and diagnosis histories from the Breast Cancer Surveillance Consortium (BCSC) to estimate the risk of developing preclinical breast cancer by race/ethnicity. We examine the extent to which relative risks of developing disease across racial/ethnic groups diverge from the corresponding relative risks of disease diagnosis to assess the magnitude of detection bias. We also conduct analyses of screening and biopsy to complement and confirm our findings. The BCSC cohort used in this study reflects a population of women who have had at least one screening mammogram. We also project the potential extent of detection bias in population settings with greater variations in screening access.
Methods
Study sample
The BCSC (https://www.bcsc-research.org) data used for natural history modeling includes screening mammograms for women aged 40–54 at their first screen. We used data from BCSC registries able to provide individual-level data for analysis by researchers who are not a part of the BCSC's Statistical Coordinating Center. Six registries contributed data for the period 2000–2018, with one additional registry contributing data for the period 2000–2009.
For the analysis of biopsy utilization in the BCSC, we used screening mammograms for women aged 40–79 from six BCSC registries. Women were excluded if they had a personal history of breast cancer, reported symptoms at the time of screening, or attended a facility that had <75% capture of biopsy results across the period 2009–2017.23 Additional details regarding the BCSC data source are provided in Supplemental Material, part A.
Definitions
For each woman, we identified screening mammograms using the standard BCSC definition (https://www.bcsc-research.org/data/bcsc_standard_definitions). We censored screening histories at the earliest of breast cancer diagnosis, death, age 89, or 2018 (or 2009 for women contributing data through 2009). We censored clinical cancer diagnoses 18 months following a woman's last screening mammogram in the data.
A cancer was considered screen detected if diagnosed within 90 days after a screening mammogram with a final Breast Imaging Reporting and Data System (BI-RADS) assessment of 4 (suspicious abnormality) or 5 (highly suspicious for malignancy) after all diagnostic work-up following a positive screening mammogram. We considered all other cancers to be clinically detected.
Analyses of screening and biopsy utilization
To assess differences in the time to next screen across race/ethnicity groups, we used a marginal Cox model with a robust sandwich estimator to account for multiple observations per woman.24 We censored each mammogram at the earliest of breast cancer diagnosis, death, age 89, 2018 (2009), or 5 years from the mammogram date.
We estimated relative risks of biopsy within 90 days following a positive screening mammogram (final BI-RADS assessment 4 or 5) by fitting a log-binomial model using generalized estimating equations,25 adjusting for age and race/ethnicity and clustering on mammography facility.
We used the cmprsk package26 in R (version 4.0.2, R Foundation for Statistical Computing, Vienna, Austria) to generate cumulative incidence curves and SAS (version 9.4, SAS Institute, Cary, NC, USA) to model screening and biopsy utilization.
Model of underlying disease
We use a multi-state modeling approach to capture a woman's underlying breast cancer natural history. The three model states are S = {1 = no cancer, 2 = preclinical disease, 3 = clinical disease} (Supplemental Figure 1). Preclinical disease is cancer that has the potential to be detected by mammography screening. Clinical disease is breast cancer diagnosed on the basis of symptoms.
Let X(t) represent the disease state at time t. We assume that X(t) follows a time-homogeneous continuous time Markov chain, parameterized via an intensity matrix Λ that describes rates of transitions between states with an initial distribution of disease states π. Time is measured as the time since a woman's first screening mammogram.
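As an illustration of this setup, the sketch below computes transition probabilities for a three-state continuous-time Markov chain in closed form. It assumes a progressive structure in which the only non-zero transition rates are from state 1 to state 2 (here `lam12`) and from state 2 to state 3 (here `lam23`), and that no woman starts in state 3. The function names and any numeric rates passed to them are hypothetical and are not estimates from this study.

```python
import math


def transition_probs(lam12, lam23, t):
    """Return the 3x3 matrix P(t) with P[i][j] = Pr(X(t) = j+1 | X(0) = i+1).

    Assumes a progressive chain: 1 -> 2 at rate lam12, 2 -> 3 at rate lam23,
    and state 3 absorbing. Rates are per unit of time since the first screen.
    """
    if lam12 <= 0 or lam23 <= 0 or t < 0:
        raise ValueError("rates must be positive and t non-negative")
    p11 = math.exp(-lam12 * t)   # still disease-free at time t
    p22 = math.exp(-lam23 * t)   # still preclinical at time t
    if math.isclose(lam12, lam23):
        # limiting form of the expression below when the two rates coincide
        p12 = lam12 * t * math.exp(-lam12 * t)
    else:
        p12 = lam12 / (lam23 - lam12) * (p11 - p22)
    p13 = 1.0 - p11 - p12        # progressed through both transitions
    return [
        [p11, p12, p13],
        [0.0, p22, 1.0 - p22],
        [0.0, 0.0, 1.0],
    ]


def state_distribution(pi2, lam12, lam23, t):
    """Distribution over states at time t, starting from (1 - pi2, pi2, 0)."""
    initial = [1.0 - pi2, pi2, 0.0]
    p = transition_probs(lam12, lam23, t)
    return [sum(initial[i] * p[i][j] for i in range(3)) for j in range(3)]
```

Calling `state_distribution(pi2, lam12, lam23, 5.0)` with illustrative inputs gives the probabilities of being disease-free, preclinical, or clinical five years after the first screen under these assumptions.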
Since no woman has clinical disease at the first screen by definition, the initial disease state distribution characterizes the woman's probability of having preclinical disease versus no cancer at the first screen. We allow
Model of observed BCSC data
While the transitions between model states are not observable, they influence the pattern of observed screen-detected and clinically detected cases. Therefore, we can learn indirectly about π2, λ12, and λ23 from the observed data on screening episode results and cancer diagnoses. Supplemental Material, part B details the statistical model for the observed data, which depends on the initial probability of having preclinical disease, the transition rates as defined above, and the sensitivity of the screening episode. A screening episode consists of a screening mammogram and all associated diagnostic work-up. We refer to a screening episode that led to a screen-detected cancer as a positive episode and define screening episode sensitivity as the probability of a positive episode in a woman with detectable preclinical disease.
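To make the link between the latent states and the observed data concrete, the following sketch works through the first screen only. It assumes that a positive episode can occur only in a woman with preclinical disease, so a negative episode arises either from no cancer or from a missed preclinical cancer. The function names and input values are hypothetical, and the full likelihood in Supplemental Material, part B is not reproduced here.

```python
def prob_positive_first_episode(pi2, sensitivity):
    """Probability that the first screening episode yields a screen-detected cancer."""
    return pi2 * sensitivity


def prob_preclinical_after_negative_first_episode(pi2, sensitivity):
    """Bayes update: Pr(preclinical disease | first episode was not positive)."""
    missed = pi2 * (1.0 - sensitivity)   # preclinical but not detected
    negative = (1.0 - pi2) + missed      # all ways to have a non-positive episode
    return missed / negative
```

With an initial preclinical probability of 0.005 and an episode sensitivity of 0.8 (both illustrative), the first function returns 0.004 and the second returns 0.001 / 0.996.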
We observe that a positive screening episode must include a positive mammogram (M+), a biopsy after the mammogram (B), and a positive biopsy (B+). The sensitivity of the episode, Pr(M+,B,B+ |cancer), can therefore be decomposed as
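By the chain rule of probability, one factorization consistent with this notation is the following (a reconstruction; the ordering of the factors is an assumption):

```latex
\[
\Pr(M^{+}, B, B^{+} \mid \text{cancer})
  = \Pr(M^{+} \mid \text{cancer})
    \times \Pr(B \mid M^{+}, \text{cancer})
    \times \Pr(B^{+} \mid M^{+}, B, \text{cancer})
\]
```

Read this way, the first factor is the sensitivity of the mammogram itself, the second is the probability of biopsy following a positive mammogram, and the third is the sensitivity of the biopsy.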
To estimate the natural history parameters, the model is fit by maximum likelihood to the observed data using the R package multistate (https://r-forge.r-project.org/projects/multistate/).
Measures of cumulative breast cancer risk and relative risk by race/ethnicity
To assess the magnitude of detection bias, we compare the cumulative risk of breast cancer diagnosis to the cumulative risk of disease onset. We quantify the risk of diagnosis as one minus the Kaplan–Meier probability of remaining free of a breast cancer diagnosis for each race/ethnicity. We similarly calculate the cumulative risk of preclinical disease separately by race/ethnicity based on the model results. Cumulative risks and relative risks (denoted RRdx and RRonset, respectively) are assessed at 5 years after the first screen, with non-Hispanic White women as the reference group. Confidence intervals for each measure are estimated by bootstrapping (bootstrap sample size = 10,000).
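The diagnosis-side calculation can be sketched as follows. This is a minimal illustration, assuming each group is supplied as a pair of lists (follow-up times in years, and indicators equal to 1 for a diagnosis and 0 for censoring) and using a simple percentile bootstrap that resamples women within each group. The function names, the default number of bootstrap replicates, and the percentile method are assumptions rather than the authors' implementation.

```python
import random


def km_cumulative_risk(times, events, horizon):
    """One minus the Kaplan-Meier survival estimate at `horizon`."""
    surv = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for et in event_times:
        at_risk = sum(1 for t in times if t >= et)
        diagnosed = sum(1 for t, e in zip(times, events) if e and t == et)
        surv *= 1.0 - diagnosed / at_risk
    return 1.0 - surv


def relative_risk(group, reference, horizon):
    """Ratio of cumulative risks at `horizon`; each argument is (times, events)."""
    return km_cumulative_risk(*group, horizon) / km_cumulative_risk(*reference, horizon)


def bootstrap_rr_ci(group, reference, horizon, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the relative risk."""
    rng = random.Random(seed)

    def resample(data):
        times, events = data
        idx = [rng.randrange(len(times)) for _ in times]
        return [times[i] for i in idx], [events[i] for i in idx]

    estimates = []
    for _ in range(n_boot):
        ref_risk = km_cumulative_risk(*resample(reference), horizon)
        grp_risk = km_cumulative_risk(*resample(group), horizon)
        if ref_risk > 0:  # skip replicates with no reference-group diagnoses
            estimates.append(grp_risk / ref_risk)
    if not estimates:
        raise ValueError("no usable bootstrap replicates")
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[min(len(estimates) - 1, int((1 - alpha / 2) * len(estimates)))]
    return lower, upper
```

The same ratio and resampling logic would apply to RRonset, with the model-based cumulative risk of preclinical disease substituted for the Kaplan–Meier estimate and the model refit on each bootstrap sample.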
Sensitivity analysis
We examined how RRonset varies under different screening episode sensitivities across racial/ethnic groups. Screening episode sensitivity may vary due to differences in the ability of the mammogram to identify latent disease or to differences in biopsy frequency following a positive test. We varied the relative sensitivity for non-Hispanic Black, Hispanic, and Asian women from 0.2 to 1.15 and re-estimated the natural history model parameters and corresponding RRonset measures.
Projection of detection bias in settings beyond the BCSC
In the BCSC sample used in this study, all women are enrolled in screening. In the general population, many women are not enrolled in screening. To explore detection bias in cancer risk prediction studies more generally, we project RRdx under differential screening attendance by group given risk of onset for each group estimated by our model. We present results for two scenarios (Supplemental Material, part C).
Results
Study sample
The BCSC study sample included 104,073 women, of whom 10.2% (n = 10,634) identified as Asian, 10.9% (n = 11,292) as Hispanic, 8.4% (n = 8719) as non-Hispanic Black, and 70.6% (n = 73,428) as non-Hispanic White (Table 1). The mean age at first mammogram was 43.2 years. Follow-up and number of screening mammograms differed by race/ethnicity. Median follow-up (years from first screen to cancer censoring date) was 3.6 years overall but was lower for Hispanic (1.5 years) and non-Hispanic Black (2.5 years) women. Overall, 64% of non-Hispanic White women had two or more mammograms compared to 50% of Hispanic women and 53% of non-Hispanic Black women.
Table 1. Description of Breast Cancer Surveillance Consortium study sample.
Combines BI-RADS density categories heterogeneously dense = c and extremely dense = d.
Screening and biopsy utilization by race/ethnicity
The median time to next screen was 1.90 years (95% CI 1.88 to 1.91 years) for all women and ranged from 1.81 years (95% CI 1.79 to 1.83 years) for non-Hispanic White women to 2.29 years (95% CI 2.23 to 2.35 years) for non-Hispanic Black women. The 5-year cumulative incidence of repeat screening ranged from 69.1% for non-Hispanic Black women to 85.1% for Asian women (Supplemental Figure 2).
After adjusting for age, Hispanic and non-Hispanic Black women had lower hazard rates for next screening exam than non-Hispanic White women (Table 2). The probability of biopsy within 90 days following a positive screening mammogram was similar across racial/ethnic groups.
Table 2. Hazard ratio for time to next screen and relative risk of biopsy within 90 days following positive screening mammogram and 95% confidence intervals by race/ethnicity.
Risks of diagnosis and onset by race/ethnicity
Figure 1(a) shows the Kaplan–Meier cumulative risk of breast cancer diagnosis after the first screen by race/ethnicity. Non-Hispanic White women have the highest risk of diagnosis, followed closely by non-Hispanic Black women, and trailed by Hispanic and Asian women. Figure 1(b) shows the estimated cumulative risk of breast cancer onset by race/ethnicity, based on the fitted natural history model (Supplemental Table 1). The curves show that non-Hispanic White and Black women have very similar risks, whereas those for Hispanic and Asian women are clearly lower. (The apparent linearity of the onset risk curves is a consequence of the distributions estimated by the model.) We note a greater divergence in preclinical breast cancer rates than in diagnosis rates, likely because a non-negligible number of women with onset remain undiagnosed at 5 years.

Figure 1. Cumulative risk of breast cancer diagnosis and preclinical breast cancer. (a) Cumulative risk of diagnosis from Kaplan–Meier curves estimated from BCSC data. (b) Cumulative risk of preclinical breast cancer estimated from the natural history model. Curves correspond to risk for the mean age of 43.2 years in the BCSC study sample.
Based on the fitted natural history model, the estimated RRonset at 5 years is quite similar to RRdx across race/ethnicity groups (Table 3). For Hispanic women, RRonset is modestly lower than RRdx, while for Asian women, RRonset is modestly higher than RRdx. The relative risks of disease onset were similar in a model that categorized age as <50 versus ≥50 (Supplemental Table 3).
Table 3. Relative risk of diagnosis and disease onset 5 years after the start of screening and 95% confidence intervals by race/ethnicity.
Sensitivity analysis
Figure 2 shows the relative risk of disease onset projected from the natural history model estimated with the BCSC sample under different assumptions about the relative sensitivity of screening episodes across racial/ethnic groups. While relative risks of onset for Asian and Hispanic women remain below one for relative sensitivities as low as 0.5, the relative risk for non-Hispanic Black women exceeds one (indicating higher risk of developing disease than non-Hispanic White women) if the relative sensitivity for non-Hispanic Black women is 0.9 or below. This suggests that if sensitivity among non-Hispanic Black women is at least 10% lower than among non-Hispanic White women, the data that show similar risks of diagnosis could actually be consistent with a higher risk of disease onset among non-Hispanic Black women.

Figure 2. Estimates of relative risk of preclinical disease onset at 5 years under varying assumptions about the screening episode sensitivity across minority racial/ethnic groups, relative to non-Hispanic White women. This reflects either differential mammography sensitivity to detect disease or probability of biopsy following a positive mammogram.
Detection bias beyond the BCSC
Projections of detection bias under more heterogeneous screening scenarios are summarized in Supplemental Material, part D.
Discussion
The problem of detection bias in cancer risk prediction has been recognized for some time,12,13,28 but broadly applicable methods for correcting it have to date been limited. When detection bias is due to differential screening and/or follow-up intensity, and individual screening and diagnosis histories are available, natural history modeling can be employed to move from empirical estimation of risk of disease diagnosis to the risk of preclinical onset.
In the present study, we used natural history modeling to evaluate the extent to which detection bias may impact breast cancer risk estimates in the BCSC. We found that Hispanic and Asian women have lower risks of developing preclinical disease, but risks among non-Hispanic White and Black women are similar. Our results remove the potential for detection bias due to differential screening and diagnostic intensity, providing credibility to observations of lower diagnosis risk in Hispanic and Asian women from the BCSC6 and other population-based studies.17
A key component of our study that both informs and supports our results is our analysis of screening and biopsy patterns. We find that biopsy utilization within 90 days after a positive mammogram is similar across racial/ethnic groups in the BCSC. While we observed some differences in time to repeat screening mammogram across groups, our model estimates that these differences are not enough to materially impact detection bias in the BCSC population. Indeed, in our projections for biennial screening, we find differences of only 5% between the relative risk of diagnosis and the relative risk of onset when screening frequency is reduced by as much as 25%. Non-Hispanic Black women in our sample had an estimated relative risk of onset that was 6% higher than the relative risk of diagnosis, as expected given reduced screening frequency. For Hispanic women, however, the results are less intuitive; the estimated relative risk of onset was 11% lower than the relative risk of diagnosis. The latter result may reflect minor misspecification of the natural history model in Hispanic women, which may be amplified by the fact that 50% of Hispanic women had only one screen.
While detection bias due to racial/ethnic differences in screening and biopsy utilization does not appear to be an issue in the BCSC data, our projections suggest the potential for bias under certain scenarios, in particular for studies with a short data collection window and studies with low screening attendance in groups other than non-Hispanic White women. In general, population-based studies may be subject to more bias given that some women in the overall population may not be screened at all. Other cancers with different sojourn times may have different levels of detection bias. A lengthy and variable sojourn time in cancers such as prostate cancer may exacerbate the detection bias problem if screening practices vary across risk factors such as race/ethnicity and family history.
Our method requires individual-level screening and diagnosis histories and specification of screening episode sensitivity to estimate the underlying natural history. Our method is thus applicable to screened cohorts like the BCSC but not to population-based registries like the Surveillance, Epidemiology, and End Results (SEER) registry that lack individual-level screening information. However, based on our analysis of the BCSC data and on our projections that considered broader population variation in access to mammography by race/ethnicity, we infer that relative risks of breast cancer based on SEER incidence data are likely accurate reflections of the relative risks of disease across racial/ethnic groups. Aleshin-Guendel et al.29 used a similar, model-based approach in a prostate cancer study.
Our study has limitations that are due both to the data and to the analytic approach employed. We limited our primary study dataset to women with a first screening mammogram in the BCSC. However, surveys of mammography use in the broader population30 have not shown variations in the frequency of mammography screening across racial/ethnic groups that differ markedly from what we observed in the BCSC. While the BCSC is an authoritative resource, Asian and Hispanic women in the BCSC may differ from Asian and Hispanic women more generally, e.g., in their country of origin and length of time in the USA, factors known to impact mammography utilization.16,31,32 In addition, in our sample, follow-up for Hispanic and non-Hispanic Black women was shorter than that for Asian and non-Hispanic White women. Two BCSC registries that contributed data for our study became inactive during the study period, which limited follow-up for some women.
We assume that the preclinical sojourn time is similar across racial/ethnic groups. If this is not the case, for example, if some groups wait longer to seek care for a breast symptom due to lack of insurance or other reasons, their estimated relative risk of disease onset might be higher than our estimates. Estimated risk for Black women may differ from our results if Black women experience faster tumor growth than White women, as suggested by their more frequent diagnoses of triple-negative and advanced-stage disease, and their younger ages at diagnosis.33 The impact of assuming a shorter sojourn time on the estimated relative risk of disease onset could vary depending on whether the data used for estimation reflect a largely unscreened or a highly screened population. In the case of the highly screened population that is the BCSC, we expect that assuming a shorter sojourn time among Black women would lead to a higher estimated relative risk of onset than we reported. However, published evidence regarding specific differences in mean sojourn time across racial and ethnic groups is scant. A recent study to identify equitable screening strategies for Black women in the USA modeled sojourn time distributions conditional on ER (estrogen receptor) and HER2 (human epidermal growth factor receptor 2) status but did not explicitly report differences in mean sojourn time for Black and White women.34,35
We treat the underlying sensitivity (probability of a positive mammogram given underlying disease) as constant across racial/ethnic groups but show in sensitivity analysis that our final inferences about relative risks of onset across racial/ethnic groups are robust. While models have been developed that accommodate variable natural histories and simultaneous estimation of screening test sensitivity, these models are prone to problems of non-identifiability, which occur when multiple sets of parameter estimates explain the observed data equally well.36
We used age at baseline as a proxy for age cohort given that the average length of follow-up in our data is fairly short. We did not include an explicit effect of menopause but categorized age as <50 versus ≥50 in a sensitivity analysis and found similar relative risks of disease onset for women below versus above age 50.
Besides age, our model does not accommodate factors known to affect sensitivity, such as breast density, that may differ by race/ethnicity.37,38 In our sample, Asian women were more likely and Hispanic and non-Hispanic Black women less likely to have dense breasts than non-Hispanic White women. Dense breasts are known to reduce mammography sensitivity and may impact detection bias in racial/ethnic subgroups with greater density. We explored this possibility in the sensitivity analysis that allowed for reduced sensitivity in Asian women. Results suggested that screening episode sensitivity in Asian women would have to be half that of non-Hispanic White women before we could infer that they actually had increased risk of developing disease relative to non-Hispanic White women. This result strongly suggests that the finding of lower risk of developing breast cancer in Asian women is robust.
In conclusion, our study provides a framework for identifying and addressing detection bias in breast cancer risk estimates by race and ethnicity and implements this framework in the BCSC. Given growing utilization of cancer risk calculators for targeting screening and other disease control approaches, we advocate for greater awareness of the potential problem of detection bias and a collective intention to ensure that risk prediction studies conducted in screened populations focus on understanding risk of disease onset rather than risk of disease diagnosis.
Supplemental Material
Supplemental material, sj-docx-1-msc-10.1177_09691413231180028 for Risk of cancer versus risk of cancer diagnosis? Accounting for diagnostic bias in predictions of breast cancer risk by race and ethnicity by Charlotte C Gard, Jane Lange, Diana L Miglioretti, Ellen S O’Meara, Christoph I Lee and Ruth Etzioni in Journal of Medical Screening
Footnotes
Declaration of conflicting interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: DLM received an honorarium from Society for Breast Imaging for keynote lecture in April 2019 and receives royalties from Elsevier. CIL received a Research Grant from GE Healthcare and receives textbook royalties from McGraw-Hill, Oxford University Press, Wolters Kluwer and payments from the American College of Radiology for JACR Deputy Editor duties.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Research reported in this publication was funded by the Partnership for the Advancement of Cancer Research, supported in part by National Cancer Institute grants U54 CA132383 (NMSU) and U54 CA132381 (Fred Hutch). Dr Lee's time was supported in part by R01CA266377.
The Breast Cancer Surveillance Consortium and its data collection activities are funded by grants from the National Cancer Institute (P01CA154292, U54CA163303), the Patient-Centered Outcomes Research Institute (PCS-1504-30370), and the Agency for Health Research and Quality (R01 HS018366-01A1). The collection of cancer and vital status data is supported in part by several state public health departments and cancer registries throughout the USA.
All statements in this report are solely those of the authors and do not necessarily represent the views of the Patient-Centered Outcomes Research Institute, its Board of Governors or Methodology Committee, or those of the National Cancer Institute, the National Institutes of Health, or the Agency for Health Research and Quality.
Data availability
The data underlying this article will be shared on reasonable request to the corresponding author, with appropriate regulatory approvals.
Supplemental material
Supplemental material for this article is available online.
References
