Abstract
Background and aims
With the exception of African Americans and Hispanics, few studies have dealt with the influence of other types of ethnicity on the prevalence of colon polyps and colorectal cancer. The present study was undertaken to compare the ethnic and socioeconomic distributions of colonic neoplasms among different ethnic groups in the United States.
Methods
A total of 813,057 patients, who underwent colonoscopy during 2008–2014, were recruited from an electronic database of histopathology reports (Miraca Life Sciences) for a cross-sectional study. Using multivariate logistic regression analyses, the presence of hyperplastic polyps, serrated adenomas, tubular adenomas, or adenocarcinomas each served as separate outcome variables. Patient ethnicity was determined using a name-based computer algorithm. Demographic (age, sex, ethnicity) and a variety of socioeconomic risk factors (associated with patients’ ZIP code) served as predictor variables.
Results
About 50% of the study population harbored adenomatous polyps, 25% hyperplastic polyps, 8% serrated adenomas, and 1.4% adenocarcinomas. Tubular adenomas and adenocarcinomas showed similar ethnic distributions, being slightly more common among Hispanics and East Asians. All four types of colonic neoplasm were relatively rare among patients of Asian-Indian descent and relatively common among patients of Japanese descent. Except for Japanese patients, serrated adenomas tended to be less prevalent among East Asians. In general, markers of high socioeconomic status showed a tendency to be negatively associated with the presence of tubular adenoma and adenocarcinoma, but positively with the presence of serrated adenoma.
Conclusion
Ethnicity and socioeconomic factors affect different histology types of polyps differently. Genetic as well as environmental factors interact in the development of colorectal cancer and its precursor lesions.
Introduction
The variation in the occurrence of benign and malignant colonic neoplasms among different ethnic subgroups in the United States (US) population has continued to capture the interest of gastroenterologists and epidemiologists alike. Previous studies have suggested that colon polyps and colorectal cancer are more common among African Americans than Whites.1–5 While some studies have shown a decreased incidence and mortality of colorectal cancer among Hispanics, 6 other studies have found decreased, 4 similar,3,7 or even increased 1 prevalence rates of polyps among Hispanics when compared to Whites. Data from the Surveillance, Epidemiology, and End Results (SEER) program and the Vital Statistics of the US suggested a decreased incidence and mortality, respectively, of colorectal cancer among US residents of Asian descent. 5 Previous studies have been partly limited by small study populations and lack of stratification by histopathology.
The Miraca Life Sciences Database is an electronic repository of histopathologic patient records. Biopsy specimens are submitted to Miraca Life Sciences by approximately 1500 gastroenterologists from private practices distributed throughout the US. Each annual file contains the records of more than 200,000 patients who have undergone upper or lower gastrointestinal endoscopy with tissue biopsies. In the recent past, this database has been used for a variety of pathoepidemiologic studies.8–14 The aim of the present analysis was to use this unique database to study the ethnic variation in the occurrence of various types of colonic neoplasms.
Material and methods
Data source
The Miraca Life Sciences database contains demographic information and a detailed list of all results of surgical pathology reports, which are coded in a pre-defined, standardized, and searchable fashion. The database was searched for the records of all colonoscopies performed between January 2008 and December 2014. If a patient had multiple colonoscopies, only data from the chronologically first procedure were included. Colonoscopies were selected irrespective of their primary indication, such as screening, surveillance of known disease, or workup of new symptoms. In patients with colonic neoplasms, the neoplasms were categorized as hyperplastic polyps, sessile serrated adenomas, tubular adenomas, or adenocarcinomas. An individual patient with simultaneous occurrence of multiple neoplasms could thus contribute to multiple categories. Social history, medication list, and results of laboratory tests are infrequently provided with the pathology specimen and were, therefore, not used for the purpose of this study.
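The selection rules above (one record per patient, taken from the chronologically first colonoscopy, with each patient allowed to count toward several neoplasm categories) can be illustrated with a minimal Python sketch. The record layout, the category codes, and the function names are assumptions made for illustration only; they do not describe the actual database schema or the authors' software.

```python
from datetime import date

# Hypothetical category codes for the four neoplasm types in this study.
CATEGORIES = ("HP", "SSA", "TA", "CRC")

def first_colonoscopy_per_patient(records):
    """Keep only the chronologically first procedure for each patient.

    Each record is an assumed tuple: (patient_id, procedure_date, diagnoses).
    """
    first = {}
    for patient_id, procedure_date, diagnoses in records:
        if patient_id not in first or procedure_date < first[patient_id][0]:
            first[patient_id] = (procedure_date, frozenset(diagnoses))
    return first

def category_prevalence(first):
    """Percentage of patients per neoplasm type; one patient may count in several."""
    n = len(first)
    return {
        c: 100.0 * sum(c in dx for _, dx in first.values()) / n
        for c in CATEGORIES
    }

# Invented toy records, not study data.
records = [
    ("p1", date(2009, 3, 1), {"TA"}),
    ("p1", date(2012, 6, 9), {"TA", "HP"}),    # later procedure, ignored
    ("p2", date(2010, 1, 15), {"HP", "SSA"}),  # counts toward both HP and SSA
    ("p3", date(2011, 7, 20), set()),          # no neoplasm found
    ("p4", date(2013, 2, 2), {"TA", "CRC"}),
]
first = first_colonoscopy_per_patient(records)
prevalence = category_prevalence(first)
```

Because categories are not mutually exclusive, the percentages in `prevalence` need not sum to 100.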
Unique patients with colonic tissue specimens were extracted from the database, and their demographic and histopathologic data were recorded. Socioeconomic information was available from census data associated with the patients’ postal address and Zone Improvement Plan (ZIP) code. These data included population size per ZIP code, average number of people per household, average house value, average annual income, and percentage of residents with college education. Small population sizes may reflect more exclusive residential areas. Using computer algorithms based on first and last names, patients were also grouped by ethnicity as follows: Hispanic, Indian (Indian subcontinent), East Asian (Chinese, Korean, Japanese, and Vietnamese), Portuguese, Jewish, Arab, and other. The latter group included US residents (mostly Caucasians and African Americans) and patients not identified with any of the above groups. The algorithm has been explained in greater detail in a recent publication. 14
All data were derived from pre-existing records. No direct contact with either patients or providers was made and no individual patient information was revealed. All patient records were de-identified before being analyzed. For these reasons, the study protocol was exempted from the need for informed consent from its participants.
Statistical analyses
Patients were stratified by ethnicity and presence or absence of various types of colonic neoplasm. The prevalence of each type of neoplasm was expressed as a percentage of all patients belonging to the same ethnic category. In four different multivariate logistic regression analyses, the presence of hyperplastic polyps, serrated adenomas, tubular adenomas, or adenocarcinomas each served as separate outcome variables. Age, sex, ethnicity, population size, people per household, house value, annual income, and college education served as predictor variables. The influences of the predictors on each outcome were expressed as odds ratios (ORs) with their 95% confidence intervals (CIs). For continuous predictor variables, such as age, population size, and house values, the OR and CI were expressed per change over the entire range.
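One way to read "per change over the entire range" is that a per-unit logistic regression coefficient is rescaled so that one unit corresponds to the difference between the largest and smallest observed value of the predictor. The following Python sketch works through that arithmetic under this assumption. The coefficient, standard error, and age range are hypothetical and are not taken from the study, and the function name is illustrative.

```python
import math

def odds_ratio_over_range(beta, se, lo, hi, z=1.96):
    """Rescale a per-unit logistic coefficient to an OR across [lo, hi].

    beta and se are the coefficient and its standard error per unit of the
    predictor; z = 1.96 gives an approximate 95% Wald confidence interval.
    """
    span = hi - lo
    or_ = math.exp(beta * span)
    ci_low = math.exp((beta - z * se) * span)
    ci_high = math.exp((beta + z * se) * span)
    return or_, ci_low, ci_high

# Hypothetical values: coefficient per year of age, observed ages 20 to 90.
or_, ci_low, ci_high = odds_ratio_over_range(beta=0.02, se=0.001, lo=20, hi=90)
```

Expressing continuous predictors this way puts age, population size, and house value on a comparable scale, since each OR then contrasts the two extremes of the observed range rather than a single year or dollar.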
Results
Patient distribution by ethnicity and type of colonic neoplasm
About 50% of the population harbored adenomatous polyps, 25% hyperplastic polyps, 8% serrated adenomas, and 1.4% adenocarcinomas. Figure 1 contains the prevalence of the four histologic types by ethnic subgroup. Several general patterns were discernible. All four types of colonic neoplasm were relatively rare among Americans of Indian and Arab descent and relatively common among Americans of Japanese descent. Tubular adenomas and adenocarcinomas showed a similar ethnic distribution. They both tended to be slightly more common among Hispanics and East Asians. Except for Japanese Americans, serrated adenomas tended to be less prevalent among all East Asians.
Figure 1. Prevalence of hyperplastic polyps (HP), sessile serrated adenomas (SSA), tubular adenomas (TA), or colorectal cancer (CRC) in different ethnic groups.
With the notable exception of serrated adenomas, all types of colon neoplasms were more common in men than women (Figure 2). The prevalence of all types of colon neoplasm showed an age-dependent rise (Figure 3). This rise was most pronounced in tubular adenoma and colorectal cancer. The age-specific prevalence of serrated adenomas and hyperplastic polyps increased between the age groups 0–9 and 50–59 years and subsequently showed a smooth decline.
Figure 2. Gender distribution of hyperplastic polyps (HP), sessile serrated adenomas (SSA), tubular adenomas (TA), and colorectal cancer (CRC).
Figure 3. Age distribution of hyperplastic polyps (HP), sessile serrated adenomas (SSA), tubular adenomas (TA), and colorectal cancer (CRC) in different ethnic groups.

Logistic fit for hyperplastic polyps
Odds ratio (OR) and 95% confidence interval (CI) per change in predictor variable over its entire range.
Logistic fit for serrated adenomas
Odds ratio (OR) and 95% confidence interval (CI) per change in predictor variable over its entire range.
Logistic fit for tubular adenomas
Odds ratio (OR) and 95% confidence interval (CI) per change in predictor variable over its entire range.
Logistic fit for adenocarcinoma
Odds ratio (OR) and 95% confidence interval (CI) per change in predictor variable over its entire range.
The relationships between the ZIP-associated socio-demographic parameters and the prevalence of colonic neoplasia revealed multiple significant associations. The average number of people per household may be interpreted as an indirect marker for poverty or low socioeconomic status, whereas average income is indicative of affluence or high socioeconomic status. A high average number of people per household was positively associated both with tubular adenoma and adenocarcinoma, whereas high income was inversely associated both with tubular adenoma and adenocarcinoma. College education is generally also indicative of affluence or high socioeconomic status, but it was inversely associated only with adenocarcinoma and not tubular adenoma. Serrated adenomas seemed to behave oppositely to tubular adenoma and adenocarcinoma in that markers of high socioeconomic status, such as annual income and college education, were both positively associated with their prevalence.
The multivariate analyses largely confirmed the ethnic associations described above, which were based on the prevalence rates. Being of Hispanic or East Asian origin (Japanese, Korean, Vietnamese, or Chinese) was associated with an increased OR for harboring tubular adenomas. Although similar relationships also applied to colonic adenocarcinoma, because of low case numbers, they failed to reach statistical significance for colorectal cancer in individual ethnic groups. With all East Asians combined, however, the OR reached statistical significance at OR = 1.15 (95% CI: 1.02–1.33). Jewish ancestry was inversely associated both with tubular adenoma and adenocarcinoma. As noted above, with the exception of Japanese Americans, serrated adenomas were inversely associated with East Asian descent. Being Hispanic was also associated with a low OR for serrated adenomas.
The ZIP-associated socio-demographic factors exerted a much weaker influence on hyperplastic polyps than on the other types of colonic neoplasia. As in other types of colonic neoplasms, the presence of hyperplastic polyps was negatively associated with Indian and positively associated with Japanese descent. Otherwise, hyperplastic polyps tended to be inversely associated with East Asian (except Japanese) and Hispanic ethnicity.
Discussion
The present study was undertaken to compare the ethnic and socioeconomic distributions of colonic neoplasms among different groups in the US. Our study showed that tubular adenomas and adenocarcinomas were characterized by similar ethnic distributions, being slightly more common among Hispanics and East Asians. Except for Japanese Americans, serrated adenomas tended to be less prevalent among all East Asians. In general, markers of high socioeconomic status showed a tendency to be negatively associated with the presence of tubular adenoma and adenocarcinoma, but positively with the presence of serrated adenoma.
There are only a few studies that have dealt with the influence of ethnicity on the prevalence of colon polyps and colorectal cancer. The majority of studies have focused on the occurrence of colonic neoplasms in African Americans and, to a lesser extent, Hispanics. Most studies have reported a higher prevalence of polyps in African Americans when compared to Whites,1–4 as well as higher incidence and mortality rates of colonic adenocarcinoma. 5 Higher incidence and mortality rates among African Americans also seem to be influenced by issues of access to and utilization of health care by the African American population.15–17 Besides ethnicity, socioeconomic factors have been shown to affect incidence, mortality, and survival of colorectal cancer patients. 18 Fewer studies have included a sufficient number of Hispanics for statistical analysis. In contrast to African Americans, the issue of colon neoplasm prevalence in Hispanics relative to Whites is less well settled. Incidence and mortality of colorectal cancer appear to be slightly lower than in Whites, 6 whereas the prevalence of polyps has variously been found to be lower, equal, or even higher in Hispanics than in Whites.1,3,4,7,19 Lastly, based on the few available sources, the prevalence of polyps among Asian Americans appears to be equal to that of Whites, whereas incidence and mortality of colorectal cancer appear to be slightly lower than those of White Americans.3,5
In general, the differences between Hispanics or Asians and Whites are smaller than the difference between African Americans and Whites. Part of the problem in establishing the relative frequency of colonic neoplasm in ethnicities other than White and African American relates to the relatively small sample sizes available for patients of different ethnic backgrounds. In addition, the terms “Asian” and “Hispanic” include a variety of ethnically diverse people with very different cultural and genetic backgrounds. In many existing studies, a population consisting of 90% Chinese and 10% Indian individuals would be listed as Asian, as would one consisting of 90% Indian and 10% Chinese individuals. In these two populations, the prevalence of many gastrointestinal conditions (e.g. preneoplastic gastric lesions 20 ) would be extremely different and the resulting data would be erroneously interpreted as discrepant. Our method of separating patients into highly specific ethnic groups largely avoids this pitfall, except in the case of Hispanics, who share a similar pool of names in Spain and all Spanish-speaking countries in South and Central America.
Lastly, the ethnic distribution may also vary in different types of colonic neoplasm. In the present study, tubular adenomas and adenocarcinomas were characterized by similar patterns, probably reflecting the fact that the majority of adenocarcinomas originate from tubular adenomas. Both were slightly more common in Hispanics and East Asians, although the positive associations between colonic adenocarcinoma and Asian ethnicity failed to reach statistical significance when each individual ethnic group was analyzed separately. As a composite group, however, East Asian ethnicity was associated with a slightly but significantly higher prevalence of adenocarcinoma. There were also notable differences within the broad category of Asian Americans. Japanese patients were characterized by consistently high prevalence rates for all colon neoplasms, whereas Indian patients were characterized by consistently low prevalence rates. To complicate matters even further, serrated adenomas behaved differently from tubular adenomas. The prevalence of tubular adenomas was relatively high among East Asians, whereas the prevalence of serrated adenomas was relatively low. In general, serrated adenomas were influenced by different risk factors from tubular adenomas. Markers of high socioeconomic status tended to be negatively associated with tubular adenoma and adenocarcinoma, but positively with serrated adenoma.
Our study has several potential limitations. Because the primary data source was a pathology database, we had little, if any, access to detailed information about many other risk factors, such as comorbidities and social or dietary habits, which have been found to influence the occurrence of colonic neoplasm. For instance, increased consumption of alcohol, animal fat and meat, and decreased consumption of milk and calcium have all been associated with an elevated risk for colonic neoplasm.21–23 Diabetes, obesity, gallstone disease, and smoking also increase the risk, whereas physical activity and the intake of nonsteroidal anti-inflammatory drugs decrease the risk.23–26
Data on socioeconomic status were based on patients’ place of residence, as evidenced by the ZIP code, rather than personal information. Many socioeconomic parameters are no longer entered into the medical record because they relate to sensitive or confidential information. To circumvent such potential limitations, it has become a common practice in epidemiologic research to use socioeconomic data available from a patient’s ZIP code as an alternative marker of socioeconomic status.27–29 However, more precise and personalized information would have probably accentuated the socioeconomic patterns observed by the present analysis.
In spite of the large patient population, in some instances, the number of cases per individual ethnic group still remained small. Because of the nature of this database, the individual ethnic groups were compared by their relative prevalence of colonic neoplasms. The true prevalence could not be calculated since patients without colonoscopy or patients without tissue samples would not be included in this histopathology database. These limitations are likely to have affected various ethnic groups alike. It cannot be ruled out, however, that the observed ethnic variations were also influenced by underlying variations in access to health care in general or colonoscopy in particular.
Computer algorithms to identify ethnicity by name have become widely used tools in health care research, anthropology, sociology, and other population studies.30–38 The technique has been shown to be characterized by high accuracy in correctly assessing ethnicity, with positive predictive values exceeding 95%. The presence of unidentified patients of Hispanic, Asian, or any other ethnicity within the comparison group of other Americans would have biased our statistical analyses toward the null hypothesis. Because our algorithms cannot be used to determine African Americans’ ancestry, we did not further stratify our comparison population.
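The bias toward the null described above can be made concrete with a small worked calculation. The Python sketch below assumes a simple mixing model in which a fraction of the comparison group actually belongs to the ethnic group of interest and shares its prevalence. All prevalences and the contamination fraction are hypothetical and are not study estimates; the function names are illustrative.

```python
def odds(p):
    """Convert a prevalence (proportion between 0 and 1) into odds."""
    return p / (1.0 - p)

def observed_odds_ratio(p_group, p_reference, contamination):
    """OR seen when a fraction of the reference group is misclassified.

    contamination is the assumed share of the reference group that truly
    belongs to the index ethnic group and carries its prevalence.
    """
    p_mixed = (1.0 - contamination) * p_reference + contamination * p_group
    return odds(p_group) / odds(p_mixed)

# Hypothetical prevalences, not taken from the study.
true_or = observed_odds_ratio(0.55, 0.50, contamination=0.0)
biased_or = observed_odds_ratio(0.55, 0.50, contamination=0.10)

# The same attenuation applies to an inverse association.
true_or_low = observed_odds_ratio(0.40, 0.50, contamination=0.0)
biased_or_low = observed_odds_ratio(0.40, 0.50, contamination=0.10)
```

In both directions the contaminated OR lies between the uncontaminated OR and 1, which is why unidentified group members in the comparison population would tend to understate, rather than create, the reported ethnic differences.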
These potential limitations of the database analysis need to be contrasted with its obvious advantages. One strength of our study relates to its large population of patients who were recruited from endoscopy practices distributed throughout the entire US. The present study included more ethnic subgroups than any previous study. Because the analysis relied on endoscopic and histopathologic findings by board-certified pathologists subspecialized in gastrointestinal pathology, the ascertainment of diagnoses can be assumed to be highly reliable. Unlike previous analyses, we were able to stratify colon polyps into different subtypes. Moreover, the size of the database provided the opportunity to include an appreciable number of patients with colorectal cancer.
In conclusion, the present study confirms that the prevalence of colonic neoplasms is influenced by patient ethnicity and socioeconomic risk factors. Ethnicity and socioeconomic factors affect different histology types of polyps differently. Genetic as well as environmental factors interact in the development of colorectal cancer and its precursor lesions.
Footnotes
Acknowledgments
Author contributions include the following: Study conception and design: RM Genta, A Sonnenberg and KO Turner; data analysis: A Sonnenberg and RM Genta; writing of manuscript: A Sonnenberg and RM Genta.
Conflict of interest
A Sonnenberg has no conflict of interest to declare. KO Turner and RM Genta are employed by Miraca Life Sciences, Irving, TX.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
