Abstract
Objective
This study aimed to explore the risk factors for colorectal sessile serrated lesions and construct a risk nomogram model.
Methods
Participants were enrolled retrospectively from the Affiliated Xuzhou Municipal Hospital of Xuzhou Medical University from January 2019 to September 2023 and randomly assigned to the training and validation sets at a ratio of 7:3. Predictors for the nomogram model were screened via univariate analysis and multivariate logistic regression analysis. The performance of the model was then evaluated.
Results
Multivariate logistic regression analysis revealed that age, history of smoking, history of alcohol consumption, and triglyceride–glucose index were independent risk factors for colorectal sessile serrated lesions (p < 0.05). The area under the curve values of the nomogram model in the training and validation sets were 0.715 (95% confidence interval: 0.676–0.753) and 0.742 (95% confidence interval: 0.669–0.815), respectively. The calibration curves showed good agreement between the predicted and observed values. Decision curve analysis showed that the nomogram model can provide a positive net clinical benefit.
Conclusions
Age, history of smoking, history of alcohol consumption, and triglyceride–glucose index are independent predictors of colorectal sessile serrated lesions. This nomogram model may predict the risk of colorectal sessile serrated lesions.
Introduction
Colorectal cancer (CRC) is a malignant tumor originating from the colorectal mucosal epithelium. According to the Global Cancer Statistics 2022, CRC ranked third in the number of new cases and second in the total number of deaths worldwide. In Asia, the proportion of malignant tumors with low survival rates, such as digestive tract tumors, is higher than that in other parts of the world. In China, approximately 517,000 new cases of CRC and 240,000 associated deaths were reported in 2022, ranking second and fourth among all malignant tumors, respectively. CRC therefore poses a substantial disease burden.1 In addition to the traditional adenoma–carcinoma pathway, the serrated pathway is an important route of carcinogenesis. Approximately 20%–30% of sporadic CRC cases have been reported to arise through the serrated pathway.2–4 Serrated polyps are lesions whose crypts show a serrated architecture under the microscope. According to the World Health Organization classification (2019), serrated polyps fall into three categories: hyperplastic polyps, sessile serrated lesions (SSLs), and traditional serrated adenomas.5
Colorectal SSLs account for approximately 5%–15% of serrated polyps. Under endoscopy, these lesions are flat and irregular with indistinct borders, and their surface is often covered with a mucus cap or has a cloud-like appearance. Approximately 75%–90% of colorectal SSLs are located in the proximal colon, which makes them difficult to detect via colonoscopy and leads to higher rates of missed diagnosis and incomplete endoscopic resection.6 Colorectal SSLs are considered the main precancerous lesion of the serrated carcinogenesis pathway and an important contributor to interval CRC. Screening of high-risk groups for SSLs may reduce the incidence of CRC, which could mitigate the adverse effects of colorectal malignancies. However, research data on the clinical risk factors for colorectal SSLs remain limited. This study aimed to explore the risk factors for colorectal SSLs by analyzing the clinical data of patients with colorectal SSLs and healthy controls and to construct and validate a nomogram model to predict the risk of colorectal SSLs.
Materials and methods
General information
Sample size was estimated using the four-step method implemented in the ‘pmsampsize’ package in R, which indicated that at least 759 participants were required. This study included 310 patients who were diagnosed with colorectal SSLs by a pathologist between January 2019 and September 2023. The inclusion criteria were as follows: (a) age ≥18 years; (b) good bowel preparation (Boston Bowel Preparation Scale ≥6 points) allowing completion of colonoscopy and pathological biopsy; and (c) complete clinical data. The exclusion criteria were as follows: (a) incomplete colonoscopy or failure to reach the ileocecal region; (b) poor bowel preparation that affected observation; (c) CRC or other intestinal diseases, such as inflammatory bowel disease, colorectal ulcer, and familial adenomatous polyposis; (d) history of colorectal surgery; (e) history of malignant tumors; and (f) concurrent infection, serious diseases of the heart, brain, liver, kidneys, or other organs, or other diseases that could affect the study outcome. A total of 626 healthy adults who underwent routine colonoscopy at the endoscopy center during the same period, and in whom colonoscopy revealed no obvious abnormalities, were selected as controls.
Methods
The following clinical data of the participants were collected retrospectively: (a) general clinical data: sex, age, height, weight, body mass index (BMI), history of chronic diseases (hypertension, diabetes), and lifestyle habits (smoking, alcohol consumption); (b) laboratory test results: white blood cell count (WBC), red blood cell count (RBC), lymphocyte count (LYM), neutrophil count (NEU), eosinophil count (EOS), basophil count (BAS), monocyte count (MON), hemoglobin (Hb) level, blood glucose level, total cholesterol (TC) level, triglyceride (TG) level, low-density lipoprotein (LDL) level, high-density lipoprotein (HDL) level, TC/HDL, TG/HDL, and LDL/HDL. The neutrophil/lymphocyte ratio (NLR) and triglyceride–glucose (TyG) index were calculated. The reporting of this study conforms to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines. 7 This study was conducted following the Helsinki Declaration of 1975 as revised in 2013. This study was approved by the Affiliated Xuzhou Municipal Hospital of Xuzhou Medical University Research Ethics Committee (approval no. 2023-KY-059) in August 2023. Written informed consent was obtained from the participants before starting colonoscopy. Patient data were not shared with third parties. All patient details have been de-identified and may not be identified in any way.
BMI is calculated by dividing the body weight (kg) by the square of the body height (m²). History of alcohol consumption is defined as current regular drinking, with an average of ≥1 time per week for >6 months. History of smoking is defined as continuous or cumulative smoking for >6 months. According to the Chinese Guidelines for the Prevention and Treatment of Hypertension (2018 Revision), history of hypertension is defined as a systolic blood pressure of ≥140 mmHg and/or a diastolic blood pressure of ≥90 mmHg, measured three times in the clinic on different days without the use of antihypertensive drugs, or as current use of antihypertensive medication. History of diabetes is defined as any of the following: (a) random venous blood glucose level of ≥11.1 mmol/L; (b) fasting blood glucose level of ≥7.0 mmol/L; (c) postprandial blood glucose level of ≥11.1 mmol/L or blood glucose level of ≥11.1 mmol/L at 2 h after an oral glucose tolerance test with 75 g anhydrous glucose; (d) glycosylated Hb level of ≥6.5%; or (e) a previous diagnosis of diabetes with ongoing medical treatment. NLR is the ratio of the neutrophil count to the lymphocyte count and reflects the systemic inflammatory response, which has been linked to the progression of colorectal tumors. Neutrophils and lymphocytes are two types of inflammatory cells in the body. Inflammatory reactions lead to abnormal levels of blood inflammatory indicators, such as neutrophils and lymphocytes. Increased numbers of neutrophils produce various cytokines and chemokines that promote tumor progression. Lymphocytes are involved in the body’s immune response. Inflammation decreases the lymphocyte count and weakens the body’s antitumor ability, which favors the proliferation of tumor cells. There are many studies on NLR in neoplastic diseases. 
Some studies have indicated that NLR has a certain diagnostic value for colorectal tumors.8–10 The TyG index is calculated using the following formula: TyG index = ln [(TG (mmol/L) × 88.545) × (fasting blood glucose (mmol/L) × 18) / 2], where the factors 88.545 and 18 convert TG and glucose, respectively, from mmol/L to mg/dL. The TyG index is widely used to assess insulin resistance and is closely related to an increased risk of cardiovascular disease.11 Moreover, studies have indicated that the TyG index is associated with an increased risk of colorectal tumors.12–14
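The derived indices defined above (BMI, NLR, and the TyG index) can be sketched in a few lines of Python. This is only an illustration of the stated formulas, not the code used in the study; the function names and the example participant are hypothetical, and the conversion factors are those given in the TyG formula in the text.

```python
import math

# Conversion factors from mmol/L to mg/dL, as given in the formula in the text.
TG_MMOL_TO_MG_DL = 88.545
GLUCOSE_MMOL_TO_MG_DL = 18.0


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def nlr(neutrophil_count: float, lymphocyte_count: float) -> float:
    """Neutrophil/lymphocyte ratio; both counts must use the same unit."""
    return neutrophil_count / lymphocyte_count


def tyg_index(tg_mmol_l: float, fasting_glucose_mmol_l: float) -> float:
    """TyG index: ln[TG (mg/dL) x fasting glucose (mg/dL) / 2]."""
    if tg_mmol_l <= 0 or fasting_glucose_mmol_l <= 0:
        raise ValueError("TG and fasting glucose must be positive")
    tg_mg_dl = tg_mmol_l * TG_MMOL_TO_MG_DL
    glucose_mg_dl = fasting_glucose_mmol_l * GLUCOSE_MMOL_TO_MG_DL
    return math.log(tg_mg_dl * glucose_mg_dl / 2.0)


# Hypothetical participant (values are illustrative, not study data).
print(round(tyg_index(1.5, 5.5), 2))
```

For a hypothetical participant with TG of 1.5 mmol/L and fasting glucose of 5.5 mmol/L, the sketch gives a TyG index of approximately 8.79.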
Statistical methods
Data processing and statistical analysis were performed using R version 4.3.1 (R packages: https://cran.r-project.org/web/packages/) and SPSS Statistics for Windows, version 25.0 (IBM Corp., Armonk, N.Y., USA). Normally distributed quantitative data were expressed as mean ± SD and compared via the independent-samples t-test. Non-normally distributed quantitative data were expressed as median (interquartile range) and compared via the Mann–Whitney U test. Qualitative data were expressed as percentages and compared via Pearson’s chi-square test or Fisher’s exact probability test. Variables that were statistically significant in the univariate analysis were further analyzed via multivariate logistic regression to identify the independent risk factors for colorectal SSLs, and a nomogram model was constructed from these factors. A receiver operating characteristic curve was plotted to assess the discriminative ability of the nomogram model. A calibration curve was constructed to assess the agreement between the predicted and observed results. Decision curve analysis (DCA) was performed to evaluate whether the nomogram model provides a net clinical benefit. p-values of <0.05 were considered to indicate statistical significance.
Results
Clinical baseline characteristics
In this study, 936 participants were randomly assigned to the training set (n = 680) and validation set (n = 256) at a sampling ratio of approximately 7:3. Table 1 compares the clinical baseline data of participants in the training and validation sets. None of the included variables differed significantly between the two sets (p > 0.05), indicating that the sets were comparable.
Clinical baseline characteristics in the training and validation sets.
BAS: basophil count; BMI: body mass index; EOS: eosinophil count; Hb: hemoglobin; HDL: high-density lipoprotein; LDL: low-density lipoprotein; LYM: lymphocyte count; MON: monocyte count; NEU: neutrophil count; NLR: neutrophil/lymphocyte ratio; RBC: red blood cell count; TC: total cholesterol; TG: triglyceride; TyG index: triglyceride–glucose index; WBC: white blood cell count.
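The random assignment to training and validation sets described above can be illustrated with a minimal sketch. The source does not state the software call, seed, or whether the split was stratified, so the function name, the seed, and the simple shuffle used here are assumptions. An exact 70% cut of 936 participants gives 655 and 281, so this sketch is not expected to reproduce the reported 680/256 split.

```python
import random


def split_train_validation(participant_ids, train_fraction=0.7, seed=2023):
    """Shuffle participant IDs and cut them into training and validation sets.

    The seed and the unstratified shuffle are illustrative assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical IDs 0..935 standing in for the 936 participants.
train_ids, validation_ids = split_train_validation(range(936))
print(len(train_ids), len(validation_ids))
```

Fixing the seed makes the assignment reproducible, which helps when the baseline comparison between the two sets has to be re-run.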
Comparison of clinical data between patients with colorectal SSLs and healthy controls in the training set
Table 2 illustrates statistically significant differences between patients with colorectal SSLs and healthy controls in the training set with regard to age, sex, BMI, history of smoking, history of alcohol consumption, WBC, NEU, NLR, MON, RBC, Hb level, blood glucose level, TG level, TC/HDL, TG/HDL, and TyG index (p < 0.05). Patients diagnosed with colorectal SSLs were older, mostly male, had a history of smoking and alcohol consumption, and had higher levels of BMI, WBC, NEU, NLR, MON, RBC, Hb, blood glucose, TG, TC/HDL, TG/HDL, and TyG index. In contrast, there were no statistically significant differences between the two groups with regard to other measures such as LYM, EOS, BAS, TC, LDL/HDL, and history of diabetes and hypertension (p > 0.05).
Comparison of baseline characteristics between patients with SSLs and healthy controls in the training set.
BAS: basophil count; BMI: body mass index; EOS: eosinophil count; Hb: hemoglobin; HDL: high-density lipoprotein; LDL: low-density lipoprotein; LYM: lymphocyte count; MON: monocyte count; NEU: neutrophil count; NLR: neutrophil/lymphocyte ratio; RBC: red blood cell count; TC: total cholesterol; TG: triglyceride; TyG index: triglyceride–glucose index; WBC: white blood cell count.
Screening of independent risk factors for the occurrence of colorectal SSLs
Variables that differed significantly in the above univariate analysis were entered into the multivariate logistic regression analysis. The Hosmer–Lemeshow test indicated a good fit (χ2 = 4.308, p = 0.828). As shown in Table 3, age, history of smoking, history of alcohol consumption, and TyG index were independent risk factors for the occurrence of colorectal SSLs (p < 0.05). The risk of colorectal SSLs increased with age (odds ratio (OR): 1.048, 95% confidence interval (CI): 1.029–1.067; p < 0.05). Compared with participants without such a history, those with a history of smoking or a history of alcohol consumption had nearly twice the odds of SSLs (OR: 1.754, 95% CI: 1.055–2.916 and OR: 1.996, 95% CI: 1.144–3.483, respectively; p < 0.05). In addition, the TyG index was an independent risk factor for the occurrence of colorectal SSLs (Figure 1).
Multivariate logistic regression analysis of SSLs in the training set.
BMI: body mass index; CI: confidence interval; Hb: hemoglobin; MON: monocyte count; NEU: neutrophil count; NLR: neutrophil/lymphocyte ratio; OR: odds ratio; RBC: red blood cell count; TG: triglyceride; TyG index: triglyceride–glucose index.

Forest plot. Multivariate logistic regression of colorectal sessile serrated lesions in the training set. Four statistically significant variables screened via multivariate logistic regression analysis were age, history of smoking, history of alcohol consumption, and TyG index. CI: confidence interval; OR: odds ratio; TyG: triglyceride–glucose.
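The relationship between a logistic regression coefficient and the ORs shown in the forest plot can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, a Wald-type interval is assumed, and age is assumed to have been entered as a continuous variable in years (the source does not state the unit).

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def coefficient_from_or(odds_ratio, ci_lower, ci_upper):
    """Recover the log-odds coefficient and its standard error from a
    reported OR and Wald-type 95% CI (the interval type is an assumption)."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * Z_95)
    return beta, se


def odds_ratio_with_ci(beta, se, units=1.0):
    """OR and 95% CI for a change of `units` in the predictor."""
    point = math.exp(beta * units)
    lower = math.exp((beta - Z_95 * se) * units)
    upper = math.exp((beta + Z_95 * se) * units)
    return point, lower, upper


# Reported OR for age: 1.048 (95% CI: 1.029-1.067).
beta_age, se_age = coefficient_from_or(1.048, 1.029, 1.067)
print(odds_ratio_with_ci(beta_age, se_age, units=10))
```

If the age coefficient is indeed per year, an OR of 1.048 corresponds to roughly 1.6-fold higher odds per 10 years of age, which is easier to interpret clinically than the per-year value.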
Construction and verification of the nomogram model
Statistically significant variables were screened using multivariate logistic regression analysis to construct a nomogram model to predict the risk of colorectal SSLs (Figure 2). The scores of each variable were added to determine the risk of colorectal SSLs corresponding to the total score.

Nomogram for the prediction of colorectal SSLs. For each variable, a vertical line was drawn from its respective axis upward to intersect the corresponding score line to determine the score associated with the variable. The individual scores of all variables were then added to obtain the total score. A vertical line was subsequently drawn from the total score axis downward to intersect the probability axis to determine the predicted probability of SSLs. TyG: triglyceride–glucose; SSL: sessile serrated lesion.
The area under the curve (AUC) values predicted by the nomogram model were 0.715 (95% CI: 0.676–0.753) in the training set and 0.742 (95% CI: 0.669–0.815) in the validation set (Figure 3).

Receiver operating characteristic curves for the probability of colorectal sessile serrated lesions in the training (a) and validation (b) sets. AUC: area under the curve.
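The AUC reported above can be read as the probability that a randomly chosen patient with an SSL receives a higher predicted risk than a randomly chosen control. A minimal sketch of that pairwise definition is given below; the function name and the toy scores are hypothetical, and the study itself presumably used an R routine rather than this code.

```python
def auc_pairwise(case_scores, control_scores):
    """AUC as the fraction of case-control pairs in which the case has the
    higher predicted risk; ties count as half a concordant pair."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Toy predicted risks (illustrative only, not study data).
print(auc_pairwise([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]))
```

In the toy example, eight of the nine case-control pairs are concordant, so the AUC is 8/9. The pairwise loop is quadratic in sample size, which is acceptable for a cohort of this size but would be replaced by a rank-based formula for larger data sets.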
Calibration curve analysis further showed that the risks predicted by the nomogram model were in good agreement with the observed results. The Hosmer–Lemeshow test (p > 0.05) indicated a good model fit (Figure 4).

Calibration curves for the risk nomogram model of colorectal sessile serrated lesions in the training (a) and validation (b) sets.
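The Hosmer–Lemeshow test that accompanies the calibration curves can be sketched as below. This is an illustration rather than the authors' implementation: the function names are hypothetical, and the number of risk groups is not stated in the source, so the conventional 10 groups (8 degrees of freedom) is an assumption.

```python
import math


def hosmer_lemeshow_statistic(probs, outcomes, groups=10):
    """Sum of (observed - expected)^2 / variance over groups of participants
    sorted by predicted risk. Groups with zero variance are skipped."""
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        variance = expected * (1.0 - expected / size)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, closed form valid for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Reported statistic from the regression section, assuming 8 degrees of freedom.
print(round(chi2_sf_even_df(4.308, 8), 3))
```

Under the 8-degrees-of-freedom assumption, the reported χ2 of 4.308 gives p ≈ 0.828, which agrees with the value reported for the regression model. A large p-value here means only that no significant miscalibration was detected, not that calibration is proven.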
DCA showed that the decision curve of the model lay above both the None line and the All line, indicating a positive net benefit within a certain range of threshold probabilities. This supports the model’s potential for risk assessment of colorectal SSLs in clinical practice (Figure 5).

Decision curve analysis for the risk nomogram model of colorectal sessile serrated lesions in the training (a) and validation (b) sets.
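The net benefit plotted in a decision curve can be sketched as follows, using the standard definition in which false positives are weighted by the odds of the threshold probability. The function names and toy data are hypothetical and are not taken from the study. The None line corresponds to a net benefit of zero at every threshold.

```python
def net_benefit(probs, outcomes, threshold):
    """Net benefit of referring everyone whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    true_pos = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)
    return true_pos / n - false_pos / n * weight


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of the All strategy (everyone is referred)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy predicted risks and outcomes (illustrative only, not study data).
toy_probs = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_outcomes = [1, 1, 1, 0, 0, 0]
print(net_benefit(toy_probs, toy_outcomes, 0.5))
print(net_benefit_treat_all(toy_outcomes, 0.5))
```

A model is useful at a given threshold only if its net benefit exceeds both zero and the All strategy, which is what a decision curve lying above both reference lines expresses.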
Discussion
Serrated colorectal polyps were first reported by Morson in 1962 and comprised metaplastic or hyperplastic polyps, which were considered at that time to be non-neoplastic lesions lacking malignant potential.15 Researchers gradually identified other similar lesions, such as SSLs, which are neoplastic lesions with malignant potential and the main precancerous lesions of serrated CRC. They are also a significant contributor to postcolonoscopy CRC.16 The natural progression of SSLs is a long-term process that usually takes 10–15 years to develop into dysplasia and eventually into malignancy.17–19 However, lesions in the serrated pathway tend to progress to cancer faster than those in the classical adenoma–carcinoma pathway. Once dysplasia appears within an SSL, the lesion may develop into CRC within a short time, with a poor overall prognosis.20,21
Early detection and treatment of colorectal precancerous lesions is currently regarded as one of the effective ways to reduce the incidence and mortality of CRC. Public education on and screening for CRC began early in European and American countries as well as in Japan and South Korea. CRC screening is therefore more widely adopted in these countries, where the 5-year survival rate is ≥64%. In China, however, CRC screening started late, public education has been inadequate, and population acceptance is generally low. As a result, the early diagnosis rate of CRC is low in China, with a 5-year survival rate of only 57.6%.22,23 Colonoscopy is currently the most important method for screening colorectal SSLs. However, colonoscopy resources in China are scarce and unevenly distributed. High-quality endoscopy requires good bowel preparation, sufficient withdrawal time, chromoendoscopy, and endoscopists with thorough theoretical knowledge. Risk stratification has not yet been achieved, which exposes some low-risk groups to the risks of unnecessary invasive examinations. Hence, endoscopic screening strategies may not be as cost-effective as expected.
A two-step strategy is more suitable for conditions in China: individuals at high risk of colorectal SSLs are first identified by non-invasive screening techniques based on questionnaires and blood, urine, or stool samples, and targeted colonoscopy is then performed.24 Therefore, this study analyzed the risk factors for colorectal SSLs by comparing the clinical data of patients with colorectal SSLs with those of healthy controls. A prediction model was constructed to assist clinicians in identifying high-risk patients as early as possible and arranging further colonoscopy. This may avoid unnecessary examinations and save medical resources to a certain extent.
This study found that age, history of smoking, history of alcohol consumption, and TyG index were independent predictors of colorectal SSLs. This result is partially consistent with those of previous studies. The risk of colorectal SSLs increases with age. Anwar et al.25 compared the results of initial follow-up colonoscopy in 6297 patients and found that older age (≥75 years) was independently associated with high-risk sessile serrated adenomas at follow-up. This finding can inform decisions on whether such patients should undergo subsequent surveillance colonoscopy. IJspeert et al.26 concluded that patients aged <50 years had a lower risk of SSLs than those aged >50 years, but the risk did not increase significantly with increasing age. The present study also found that smoking was associated with an increased risk of colorectal SSLs. Choe et al.27 suggested that smoking is the only important risk factor for the progression of colorectal SSLs (OR: 1.394, 95% CI: 1.012–1.764). Bailie et al.28 reported that smoking increased the risk of colorectal SSLs by more than 3-fold (OR: 3.40, 95% CI: 1.90–6.07). Recently, a meta-analysis29 showed that smoking increased the risk of CRC in a dose-dependent manner, with the risk rising with both the duration and intensity of smoking. Smoking may increase the risk of CRC through the microsatellite instability pathway (characterized by microsatellite instability, CpG island methylator phenotype, and BRAF mutation), which is one of the molecular characteristics of colorectal SSLs.6 Several studies have also shown an association between alcohol consumption and colorectal SSLs. High levels of alcohol consumption are associated with an increased risk of colorectal SSLs, with an OR of approximately 1.8 compared with nondrinkers.28,30 Furthermore, the results of this study suggest that the TyG index is an independent risk factor for colorectal SSLs. 
Insulin resistance (IR) is a known risk factor for CRC.31,32 The TyG index is a reliable surrogate for assessing IR.33,34 Relevant studies have also demonstrated that the elevated TyG index is a risk factor for CRC.12–14 However, there are only a few studies on the direct correlation between the TyG index and colorectal SSLs, and it remains uncertain whether there exists a correlation between them. TyG index is a surrogate index of IR that is closely related to metabolic syndrome. 35 A large number of previous studies have revealed that metabolic-related factors such as abnormal blood lipid levels, obesity, and diabetes are closely related to SSLs, which may suggest an association between the TyG index and colorectal SSLs.36–39 These results suggest that colorectal SSLs can be prevented to some extent by altering modifiable risk factors (such as improving lifestyle, reducing smoking and drinking, and maintaining a healthy metabolic state through dietary control and exercise). Future studies may further explore the mediating mechanisms of these risk factors and provide a scientific basis for the development of personalized preventive measures.
This study considered a broad range of candidate risk factors for colorectal SSLs. Multiple covariates were included in the multivariate logistic regression analysis to assess the association between each variable and colorectal SSLs more accurately. This may reduce the confounding present in the univariate analysis and thus yield results with greater explanatory power and clinical significance. A nomogram was then constructed to estimate the risk of colorectal SSLs. According to the AUC values (0.715 in the training set and 0.742 in the validation set) and the calibration curves, the nomogram showed acceptable discrimination and good calibration in predicting the risk of colorectal SSLs, supporting its potential clinical value. The decision curve further showed that the model provided a positive net benefit within a certain range of threshold probabilities. These findings highlight the potential of the nomogram as a practical tool for clinicians to identify and stratify the risk of colorectal SSLs, which may contribute to timely preventive interventions.
This study has several limitations. First, this was a retrospective study. Multivariate analysis identified history of smoking and history of alcohol consumption as independent predictors of colorectal SSLs, but these histories relied mainly on participants’ self-report, which may introduce bias. Moreover, the available data did not allow further analysis of the relationship between the duration and amount of smoking and alcohol consumption and colorectal SSLs. Second, this was a single-center study with a small sample size and no external validation at other institutions. Therefore, the predictive performance of the model in populations outside this institution remains unknown. Future multicenter prospective studies are needed to verify the effectiveness of this prediction model. Finally, although data on a large number of variables were collected, some predictors (such as dietary composition) that may be related to the occurrence and development of colorectal SSLs were not included in this study. Future studies need to collect these data to assess the risk factors for colorectal SSLs comprehensively.
In conclusion, age, history of smoking, history of alcohol consumption, and TyG index are independent predictors of colorectal SSLs. The prediction model of colorectal SSLs constructed in this study demonstrates high diagnostic efficacy, which is helpful for clinical risk assessment and decision-making.
Footnotes
Acknowledgements
We thank Professor Zhaolin Lu from the School of Information and Control Engineering at China University of Mining and Technology who shared theoretical knowledge regarding artificial intelligence.
Author contributions
Sihui Huang: Conceptualization, Methodology, Data curation, and Writing—Original draft preparation. Shiyu Liu: Methodology and Data curation. Fang Tan & Hu Chen: Methodology. Guangxia Chen: Conceptualization, Supervision, and Writing—Reviewing and editing. All authors have read and approved the final manuscript.
Consent to participate
Written informed consent was obtained from the participants before starting colonoscopy. Patient data were not shared with third parties.
Data availability statement
Not applicable.
Declaration of conflicting interest
The authors declare that there is no conflict of interest.
Ethical considerations
The study was conducted in accordance with the Declaration of Helsinki and was approved by the Affiliated Xuzhou Municipal Hospital of Xuzhou Medical University Research Ethics Committee (approval no. 2023-KY-059) in August 2023.
Funding
This study was supported by the Xuzhou Key R&D Program (Social Development) Project (No.KC22095), the Xuzhou Municipal Health and Health Commission Medical Leading Talents Training Program (No.XWRCHT20210025), and Xuzhou Key Research and Development Program General Project (No. KC23167).
