An epithelial–mesenchymal transition-related prognostic model for colorectal cancer based on weighted gene co-expression network analysis

Abstract

Objective

To identify susceptibility modules and genes for colorectal cancer (CRC) using weighted gene co-expression network analysis (WGCNA).

Methods

Four microarray datasets were downloaded from the Gene Expression Omnibus database. We divided the tumor samples into three subgroups based on consensus clustering of gene expression, and analyzed the correlations between the subgroups and clinical features. The genetic features of the subgroups were investigated by gene set enrichment analysis (GSEA). A gene expression network was constructed using WGCNA, and a protein–protein interaction (PPI) network was used to identify the key genes. Gene modules were annotated by Gene Ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analyses.

Results

We divided the cancer cases into three subgroups based on consensus clustering (subgroups I, II, III). The green module identified by WGCNA was correlated with clinical characteristics. Ten key genes were identified according to their degree of connectivity in the protein–protein interaction network: FYN, SEMA3A, AP2M1, L1CAM, NRP1, TLN1, VWF, ITGB3, ILK, and ACTN1.

Conclusion

We identified 10 hub genes as candidate biomarkers for CRC. These key genes may provide a theoretical basis for targeted therapy against CRC.

Keywords

Colorectal cancer, weighted gene co-expression network, consensus clustering, epithelial–mesenchymal transition, biomarker, prognosis

Introduction

Colorectal cancer (CRC) is one of the most prevalent malignant tumors worldwide. It is the third leading cause of cancer-related death in the United States¹ and the second leading cause worldwide, with an estimated more than 1.9 million new cases and 935,000 deaths in 2020.² Despite combined surgery, chemotherapy, and radiotherapy,³ CRC still causes substantial mortality every year.⁴ Most cancer-related deaths result from late distant metastases, underlining the importance of diagnosing tumors at an early stage.⁵ At present, the diagnosis of CRC mainly relies on colonoscopy, while other tests designed to detect early lesions, such as occult blood tests, fecal immunochemical tests, and the more recent fecal DNA testing (multitarget stool DNA, Cologuard®),⁶ still have limitations. There is thus a need to identify both effective biomarkers to distinguish cancer patients and hub genes to predict disease progression and prognosis.

Bioinformatics has played an increasingly significant role in numerous fields, and along with the fast development of high-throughput sequencing technology,^7,8 has frequently been used to identify biomarkers.^9,10 However, several studies have focused on differences in gene expression between samples, rather than exploring the underlying connections among genes.^11,12

Weighted gene co-expression network analysis (WGCNA) is used to study gene expression patterns and incorporate genes with similar expression patterns into the same modules. In this study, assuming that the gene expression network followed a scale-free distribution, we constructed a gene co-expression network using the WGCNA algorithm¹³ and then created a hierarchical clustering tree according to the dissimilarity coefficients of different nodes. Furthermore, we sorted high-similarity genes into the same modules and low-similarity genes into different modules and visualized these modules. This study supports the classification of patients with CRC into subgroups to aid individualized therapy, and identifies a key gene module correlated with clinical factors for diagnosis and treatment.

Materials and methods

Raw data

The flowchart for this study is shown in Figure 1. Gene expression data for four CRC microarray datasets (GSE106582, GSE41258, GSE44076, GSE87211) (Table 1) and the corresponding clinical information were downloaded from the Gene Expression Omnibus database (GEO, http://www.ncbi.nlm.nih.gov/geo/). The selection criteria were as follows: 1) gene expression data for CRC and normal mucosa samples (normal tissue samples from polyps were excluded); 2) arrays containing a minimum of 50 tumor and normal mucosa samples; and 3) inclusion of >5,000 genes in the GEO platform.

Figure 1.

Flow chart of data preparation and analysis.

Table 1.

Characteristics of the microarray datasets.

Accession/ID	Experimental group (n)	Control group (n)
GSE106582	Colorectal cancer tissue sample (77)	Adjacent normal mucosa tissues (117)
GSE41258	Primary colon adenocarcinomas (186)	Corresponding normal mucosa tissues (54)
GSE44076	Colorectal cancer tissue sample (98)	Paired and single normal mucosa tissues (148)
GSE87211	Colorectal cancer tissue sample (203)	Corresponding normal mucosa tissues (160)

The GEO is a public database, and ethical approval for the use of patient data in research and publication was obtained by the contributing studies. This study was based on open-source data, and there was therefore no need for additional ethics approval or informed consent.

Consensus clustering

The four datasets from the GEO database were normalized by principal component analysis (PCA) using the R package “PCA”,¹⁴ and the tumor samples in the four datasets were divided into three clusters according to their gene expression patterns using the “ConsensusClusterPlus” package.¹⁵
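The resampling idea behind consensus clustering can be illustrated with a minimal Python sketch. It is not the ConsensusClusterPlus implementation: plain k-means with a deterministic farthest-point start is swapped in as the base clusterer, and the toy coordinates, the 80% subsampling fraction, the number of repetitions, and all function names are assumptions made for illustration only.

```python
import random
from itertools import combinations

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from every centre chosen so far.
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centre if a cluster happens to be empty.
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members))
                               if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def consensus_matrix(points, k, reps=100, frac=0.8, seed=0):
    """Fraction of resampling runs in which two items share a cluster,
    counted only over the runs in which both items were drawn."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(reps):
        idx = sorted(rng.sample(range(n), max(k, int(frac * n))))
        labels = kmeans([points[i] for i in idx], k)
        for (a, i), (b, j) in combinations(list(enumerate(idx)), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[a] == labels[b]:
                together[i][j] += 1
                together[j][i] += 1
    return [[1.0 if i == j else
             (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
             for j in range(n)] for i in range(n)]

# Hypothetical toy data: three well-separated groups of five 2-D points.
toy = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5),
       (10, 0), (11, 0), (10, 1), (11, 1), (10.5, 0.5),
       (0, 10), (1, 10), (0, 11), (1, 11), (0.5, 10.5)]
M = consensus_matrix(toy, k=3)
```

In a stable partition, consensus values sit close to 1 within a cluster and close to 0 between clusters, which is what a clean block structure in a consensus matrix heatmap reflects.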

Comparison of clinical traits among consensus clusters

The clinicopathological data for the patients in the four datasets were downloaded from the GEO database, and the clinical traits were compared among the three consensus clusters using the “ggpubr” R package. P < 0.05 was considered statistically significant.

WGCNA

The R package “WGCNA” was applied to the four datasets from the GEO database to construct a gene co-expression network.¹³ The adjacency matrix was transformed into a topological overlap matrix (TOM), and genes were divided into modules according to the TOM-based dissimilarity measure. We set a soft-thresholding power of 12 (scale-free R² = 0.9), a cut height of 0.25, and a minimal module size of 10, which yielded eight modules.
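The step from correlation to adjacency to TOM can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an unsigned network is assumed (the text does not say whether a signed or unsigned network was used), the soft-thresholding power of 12 is taken from the text, and the four-gene expression matrix and function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation, clamped to [-1, 1] against rounding error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return max(-1.0, min(1.0, sxy / sqrt(sxx * syy)))

def adjacency(expr, power=12):
    """Unsigned soft-threshold adjacency a_ij = |cor(x_i, x_j)| ** power."""
    g = len(expr)
    return [[0.0 if i == j else abs(pearson(expr[i], expr[j])) ** power
             for j in range(g)] for i in range(g)]

def tom(adj):
    """Topological overlap:
    TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    k = [sum(row) for row in adj]  # connectivity; the diagonal is 0
    out = [[1.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(g):
            if i != j:
                shared = sum(adj[i][u] * adj[u][j]
                             for u in range(g) if u not in (i, j))
                out[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return out

# Hypothetical expression values: rows are genes, columns are samples.
expr = [
    [1, 2, 3, 4, 5, 6],
    [2, 4, 6, 8, 10, 13],    # rises with gene 0
    [6, 5, 4, 3, 2, 1],      # mirror image of gene 0
    [1, -1, 1, -1, 1, -1],   # weakly related to the others
]
T = tom(adjacency(expr))
# TOM-based dissimilarity, the input to hierarchical clustering of genes.
dissimilarity = [[1.0 - v for v in row] for row in T]
```

Raising the correlation to a high power suppresses weak correlations far more than strong ones, which is why the weakly related fourth gene ends up with near-zero overlap with the other three.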

Protein–protein interaction (PPI) analysis

The PPI network of genes in the green module was produced using the STRING website (https://string-db.org/) and reconstructed using Cytoscape version 3.6.1. The top 10 protein-coding genes were then selected according to their degree of connectivity in the STRING network.
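Ranking nodes by degree of connectivity amounts to counting distinct interaction partners per protein. The Python sketch below assumes the network is available as an undirected edge list; the node names are placeholders rather than real interactions, and the function name is hypothetical.

```python
from collections import Counter

def rank_by_degree(edges, top_n=10):
    """Return the top_n nodes of an undirected network ranked by degree."""
    degree = Counter()
    seen = set()
    for a, b in edges:
        key = frozenset((a, b))
        if a == b or key in seen:  # skip self-loops and duplicate edges
            continue
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name so that ties are reproducible.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Placeholder edge list; ("P3", "P2") duplicates ("P2", "P3").
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
         ("P2", "P3"), ("P3", "P2"), ("P4", "P5")]
top3 = rank_by_degree(edges, top_n=3)
```

Ties in degree are common in small networks, so a fixed tie-breaking rule is needed for a ranked list to be reproducible.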

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses

GO and KEGG enrichment analyses of the 406 genes from the green module were performed using the “clusterProfiler”,¹⁶ “enrichplot”, and “ggplot2” packages. Only terms with both P- and q-values <0.05 were considered significantly enriched; these were then visualized using the “GOplot”¹⁷ package.
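Over-representation analysis of this kind rests on a hypergeometric tail probability for each term, followed by a multiple-testing adjustment. The Python sketch below shows both steps under stated assumptions: the Benjamini–Hochberg procedure is used as one common adjustment and is not necessarily how the q-values in this study were computed, and all counts and function names are hypothetical.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n module genes are drawn from a background of N genes,
    K of which are annotated to the term, and k module genes carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, i in enumerate(order):
        rank = m - pos  # 1-based rank of this P value in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts: all 5 module genes fall in a 5-gene term, background of 10.
p_term = hypergeom_pvalue(k=5, n=5, K=5, N=10)
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

A term is retained only if both its raw and adjusted values fall below the chosen cutoff, which guards against terms that look significant only because many terms were tested.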

Gene set enrichment analysis (GSEA)

The gene sets “h.all.v2022.1.Hs.symbols.gmt”, “c2.cp.kegg.v2022.1.Hs.symbols.gmt”, and “c5.go.v2022.1.Hs.symbols.gmt” were downloaded from the Molecular Signatures Database (MSigDB, http://software.broadinstitute.org/gsea/msigdb/index.jsp) as the reference gene sets. The transcriptomic data of the three consensus clusters from the GEO database were subjected to GSEA using the “clusterProfiler” package, and only gene sets with nominal P < 0.05 and false discovery rate q < 0.06 were considered significant.¹⁸
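The quantity at the heart of GSEA is a running-sum enrichment score computed over a ranked gene list. The Python sketch below implements the standard weighted running sum for a single gene set; it is not the clusterProfiler code path, it omits the permutation-based significance step, and the gene names, scores, and function name are hypothetical.

```python
def enrichment_score(ranked, gene_set, p=1.0):
    """Weighted running-sum enrichment score.

    ranked: list of (gene, score) pairs sorted by descending score.
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    weights = [abs(s) ** p if g in gene_set else 0.0 for g, s in ranked]
    n_hit = sum(1 for g, _ in ranked if g in gene_set)
    n_miss = len(ranked) - n_hit
    total = sum(weights)
    if n_hit == 0 or n_miss == 0 or total == 0.0:
        raise ValueError("gene set must overlap the ranked list partially, "
                         "with non-zero scores for its members")
    running, best = 0.0, 0.0
    for (gene, _), w in zip(ranked, weights):
        # Step up at a set member (weighted by its score), down otherwise.
        running += w / total if gene in gene_set else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical ranked list, e.g. ordered by a differential expression statistic.
ranked = [("g1", 5.0), ("g2", 4.0), ("g3", 3.0),
          ("g4", -1.0), ("g5", -2.0), ("g6", -3.0)]
es_top = enrichment_score(ranked, {"g1", "g2"})
es_bottom = enrichment_score(ranked, {"g5", "g6"})
```

A set concentrated at the top of the ranking yields a score near +1 and one concentrated at the bottom a score near −1, which corresponds to describing a gene set as upregulated or downregulated in a subgroup.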

Statistical analysis

The data were analyzed by Wilcoxon’s rank sum and Kruskal–Wallis rank sum tests using R version 4.0.2 (https://www.r-project.org/). Correlations between gene modules and clinical factors were determined by Pearson’s correlation analysis.
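For a comparison across three subgroups, the Kruskal–Wallis statistic can be computed directly from pooled ranks. The Python sketch below is illustrative only: the values are made up, the function names are hypothetical, and the closed-form P value uses the chi-square approximation, which has a simple exponential form only when there are exactly three groups (two degrees of freedom).

```python
from collections import Counter
from math import exp

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for t in range(i, j + 1):
            ranks[order[t]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum * rank_sum / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    ties = sum(c ** 3 - c for c in Counter(pooled).values())
    return h / (1.0 - ties / (n ** 3 - n))

def p_value_three_groups(h):
    """Chi-square upper tail with 2 degrees of freedom: exp(-h / 2)."""
    return exp(-h / 2.0)

# Made-up measurements for three subgroups.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
p = p_value_three_groups(h)
```

With only two groups the same rank-based logic reduces to the Wilcoxon rank sum test.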

Results

Data normalization and consensus clustering of CRC samples

The four microarray datasets (Table 1) and the corresponding clinical information (Table 2) were downloaded from the GEO database and normalized by PCA (Figure 2). Consensus clustering was applied to the cancer transcriptomic data of the datasets, and we chose a value of k = 3 (Figure 3) according to the item consensus scores (Figure 3d) to divide the tumor patients into three subgroups (subgroups I, II, and III). The clear boundaries of the three subtypes indicated that the clustering was reliable (Figure 3a).

Table 2.

Clinical information.

	GSE106582	GSE41258	GSE44076	GSE87211
Age, year, median (IQR)	64 (56,74)	66 (57,75)	71 (65,78)	62.7 (56.9, 70.3)
Sex	194	239	98	363
Male	133 (68.6%)	125 (52.3%)	71 (72.4%)	248 (68.3%)
Female	61 (31.4%)	114 (47.7%)	27 (27.6%)	115 (31.7%)
Location	–	239	98	363
Cecum	–	38 (15.9%)	0 (0%)	0 (0%)
Colon	–	96 (40.2%)	98 (100%)	0 (0%)
Sigmoid	–	60 (25.1%)	0 (0%)	0 (0%)
Rectum	–	45 (18.8%)	0 (0%)	363 (100%)
T stage	–	186	–	203
1	–	5 (2.7%)	–	0 (0%)
2	–	33 (17.7%)	–	11 (5.4%)
3	–	135 (72.6%)	–	174 (85.7%)
4	–	13 (7.0%)	–	18 (8.9%)
N stage	–	186	–	203
0	–	93 (50.0%)	–	115 (56.7%)
1	–	46 (24.7%)	–	88 (43.3%)
2	–	47 (25.3%)	–	0 (0%)
M stage	–	186	–	203
0	–	125 (67.2%)	–	189 (93.1%)
1	–	61 (32.8%)	–	14 (6.9%)
Stage	–	186	98	–
I	–	28 (15.0%)	0 (0%)	–
II	–	44 (23.7%)	98 (100%)	–
III	–	52 (28.0%)	0 (0%)	–
IV	–	62 (33.3%)	0 (0%)	–

IQR, interquartile range.

Figure 2.

Principal component analysis for gene expression distribution. (a) Four gene arrays (GSE106582, GSE41258, GSE44076, GSE87211) from the Gene Expression Omnibus database and (b) the four arrays in (a) after normalization by principal component analysis.

Figure 3.

Consensus clustering for gene expression. (a) Tumor samples from the above four datasets were divided into three subgroups by consensus clustering (consensus matrix k = 3). (b) Cumulative distribution function (CDF) for k = 2–10. (c) Relative change in area under CDF curve for k = 2–10 and (d) Cluster-consensus plot. Item consensus (IC) is the average consensus value between the subgroup and members of a consensus cluster. Colored bars represent subgroups of certain cluster, with height corresponding to IC values.

Figure 4.

Correlation of subgroups with clinical characteristics. (a–f) Correlation of tumor location (cecum, right colon, transverse colon, left colon, sigmoid, rectum) with subgroup and (g–k) Correlation of cancer stage with subgroup. *P < 0.05; **P < 0.01; ***P < 0.001; ns, not significant.

Correlation of consensus clusters with clinical traits

We examined the correlations between the three subgroups identified by consensus clustering and various clinical traits. There were higher proportions of patients with cecum carcinoma in subgroup III, rectal cancer in subgroup I, and sigmoid tumor in subgroup II. Subgroups I and III included more late-stage (stage III, IV) samples (Figure 4).

Identification of key modules based on WGCNA

We identified the key modules of CRC by WGCNA using the four datasets and divided all the genes into eight modules (Figure 5). The heatmap of module–trait correlations indicated that the green module was closely correlated with clinical traits (Figure 5e), especially T stage (Pearson’s correlation coefficient = 0.3, P = 5 × 10⁻²⁶). The green module included 406 genes. Genes in the green module were upregulated in subgroup I (Figure 5f), which included more late-stage patients as noted above.
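The link between a modest correlation coefficient and a very small P value comes from the sample size. The Python sketch below converts a Pearson coefficient into its t statistic and, for large samples, an approximate two-sided P value using the normal distribution; the sample sizes are hypothetical and are not taken from this study, and the function names are assumptions.

```python
from math import erfc, sqrt

def t_statistic(r, n):
    """t statistic for a Pearson correlation r observed in n samples."""
    return r * sqrt((n - 2) / (1.0 - r * r))

def approx_two_sided_p(r, n):
    """Normal approximation to the two-sided P value; reasonable only for
    large n, where the t distribution is close to the standard normal."""
    return erfc(abs(t_statistic(r, n)) / sqrt(2.0))

# Hypothetical sample sizes showing how the same r becomes more significant.
p_small = approx_two_sided_p(0.3, 100)
p_large = approx_two_sided_p(0.3, 1000)
```

The same coefficient of 0.3 gives a much smaller P value at the larger sample size, so the coefficient and the P value should be read together when judging a module–trait association.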

Figure 5.

Identification of important modules correlated with clinical characteristics in samples from the Gene Expression Omnibus (GEO) database based on weighted gene co-expression network analysis. (a) Clustering dendrograms of genes based on expression data for four arrays (GSE106582, GSE41258, GSE44076, GSE87211) from the GEO database. (b) Analysis of the scale-free fit index (left) and mean connectivity (right) for various soft-thresholding powers. (c) Clustering of module eigengenes. Red line represents cut height (0.25). (d) Dendrogram of all clusters based on dissimilarity measures (1 − TOM). (e) Heatmap of correlation between module eigengenes and clinical characteristics of colorectal cancer. Each cell contains the correlation coefficient and P value. (f) Heatmap of relationship between modules and clusters.

PPI analysis of green module

The PPI network of the proteins encoded by the 406 genes in the green module is shown in Figure 6a. The top 10 core genes of the network, selected according to their degree of connectivity, are presented in Figure 6b and Table 3: FYN, SEMA3A, AP2M1, L1CAM, NRP1, TLN1, VWF, ITGB3, ILK, and ACTN1.

Figure 6.

Protein–protein interaction network of genes in green module. (a) Red boxes represent up-regulated genes (darker color indicates higher expression); blue boxes represent down-regulated genes and (b) Hub gene cluster of green module.

Table 3.

Top 10 genes in protein–protein interaction network of green module.

Rank	Gene name	Interaction score
1	FYN	14.0
2	TLN1	12.0
3	ITGB3	12.0
4	AP2M1	10.0
5	VWF	8.0
6	ILK	8.0
7	L1CAM	7.0
8	SEMA3A	7.0
9	NRP1	7.0
10	ACTN1	6.0

GO and KEGG enrichment of modules from WGCNA

GO and KEGG enrichment analyses were used to annotate the modules from WGCNA (Figure 7). GO analysis showed that the green module was enriched in genes associated with biological processes, including extracellular structure organization, extracellular matrix (ECM) organization, cell−substrate adhesion, axonogenesis, cell junction organization, and cell junction assembly. KEGG pathway enrichment analysis showed that this module was significantly enriched in cell adhesion molecules, phosphoinositide 3-kinase/Akt signaling, mitogen-activated protein kinase signaling, ECM−receptor interaction, and focal adhesion. These enriched genes and pathways were considered to be related to modification of cellular interactions.

Figure 7.

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses. (a) GO enrichment and (b) KEGG analysis of genes in modules from weighted gene co-expression network analysis. (c) Heatmap of correlation between enriched pathways in GO enrichment and consensus clusters and (d) Heatmap of correlation between enriched KEGG signaling pathways and consensus clusters.

GSEA for annotation of genes in clusters

GSEA was applied to investigate the functions of genes in the consensus clusters (Figure 8). Gene sets of calcium-independent cell–cell adhesion via plasma membrane cell adhesion molecules, collagen catabolic process, collagen fibril organization, ECM structural constituent tensile strength, acyl-CoA binding, and oxidoreductase activity acting on NADPH oxygen as acceptor were upregulated in subgroup I, while gene sets of mismatch repair were upregulated in subgroup II and positive regulation of mesenchymal cell proliferation, Th1 cell differentiation, and abnormal iron homeostasis were upregulated in subgroup III.

Figure 8.

Gene set enrichment analysis was used to annotate the characteristics of genes from clusters. Hallmark genes upregulated in (a) subgroup I, (b) subgroup II, and (c) subgroup III.

Discussion

CRC has long been a major disease burden worldwide, responsible for huge losses in terms of both lives and money; however, the molecular pathogenesis of CRC remains unclear. Tumor heterogeneity means that CRCs respond inconsistently to specific treatments, leading to various detailed classifications of the disease and stratified therapeutic strategies.¹⁹ Consensus molecular subtypes (CMS) is one of the best-known gene expression-based CRC classifications, which sorts tumors into four clusters: CMS1, microsatellite unstable and strong immune activation; CMS2, marked Wnt and Myc signaling activation; CMS3, evident metabolic dysregulation; and CMS4, prominent transforming growth factor-β activation.²⁰ However, this clustering mainly focuses on the genetic features of the tumors rather than the clinical characteristics of the patients. In this study, we divided CRCs into three subgroups by consensus clustering and annotated them with clinical factors. The results showed that subgroups I and III contained more late-stage cases and may thus have a higher tendency for local advancement, requiring strict removal of the primary site, e.g., by surgery. There were some differences in the distribution of cancer locations among the subgroups, suggesting a possible correlation between gene expression patterns and tumor location. However, the mechanism of this phenomenon needs further investigation.

GSEA was applied to the subgroups and showed that components and pathways involved in altering the extracellular microenvironment were upregulated in subgroup I. Overexpression of epithelial cell adhesion molecules, decomposition of collagen, and modification of matrix indicated that tumors in subgroup I may be more likely to metastasize, with decreased intercellular adhesion.^21–23 Oxidoreductase activity and binding of acyl-CoA were also enhanced in subgroup I, indicating the possible release of reactive oxygen species, contributing to tumor progression and chemotherapy resistance.^24,25 These results suggest that interference with the redox reaction may be a possible therapeutic strategy for patients in subgroup I.²⁶ The KEYNOTE-177 (NCT02563002) trial found that pembrolizumab could be introduced to first-line therapy for microsatellite instability-high patients,²⁷ while programmed cell death 1 (PD-1) inhibitors are not effective in patients with microsatellite-stable tumors. Mismatch repair-related genes were upregulated in subgroup II, suggesting a possible lack of response to PD-1 inhibitors in this population. Subgroup III showed upregulation of epithelial–mesenchymal transition (EMT), which is an essential factor in cancer progression, associated with aggressive behavior and drug resistance.²⁸ This group also demonstrated increased expression of genes associated with Th1 cell differentiation, indicating great potential for immune therapy,²⁹ while abnormal iron homeostasis may provide a breakthrough for novel CRC treatment strategies in these patients.²⁹

WGCNA was used to construct a gene network and eight modules were then established. The green module was significantly correlated with clinicopathological factors such as T, N, M classification and tumor stage. In addition, genes in the green module were upregulated in subgroup I following consensus clustering, which was considered to include more late-stage patients. Taken together, overexpression of the green module might indicate a more severe cancer stage and a poor prognosis. GO and KEGG analyses were used to profile the functions of the modules, and the green module was shown to be enriched in pathways involved in tumor matrix modification and intercellular adhesion transformation.

The green module included 406 genes, of which the top 10 ranked genes were FYN, SEMA3A, AP2M1, L1CAM, NRP1, TLN1, VWF, ITGB3, ILK, ACTN1. FYN encodes a non-receptor tyrosine kinase (Fyn), which plays an important part in the genesis and progression of malignancies by adjusting morphogenic transformation, cellular motility, cell growth, and cell death.^30–32 FYN is regarded as a vital oncogene in CRC and is involved in various pathways, such as AMP-activated protein kinase signaling,³³ Rho guanine nucleotide exchange factor 16 pathways,³⁴ and the signal transducer and activator of transcription 5/Notch2 axis.³⁵ In addition, Fyn is thought to be a driver of EMT, as well as promoting the metastasis of CRC.^36–38 ITGB3 encodes integrin subunit beta 3 and has been reported to influence various processes of tumor development, including reprogramming tumor metabolism, maintaining tumor stemness, promoting angiogenesis, enhancing drug resistance, modifying the tumor microenvironment, and prompting EMT.³⁹ In CRC, reactive oxygen species-mediated upregulation of ITGB3 promoted the migration and invasion capacities of tumor cells, while its inhibition significantly relieved metastasis and improved survival in an animal model.^40,41 Talin-1 (TLN1) is associated with cell adhesion and motility, making it a potential predictive biomarker for tumor metastasis and prognosis.^42,43 Semaphorin-3A (SEMA3A) is a member of the semaphorin family of membrane-bound glycoproteins,⁴⁴ and has been shown to exert an important influence on the direction of axonal growth. However, its roles in immune suppression, tumorigenesis, and angiogenesis have also been identified in recent years.^45–47 AP2M1 encodes adaptor-related protein complex 2 subunit mu 1, which is involved in chemotherapy resistance and senescence escape through the CDK4–EZH2–AP2M1 pathway.^48,49 L1CAM, encoding L1 cell adhesion molecule, is not expressed in the normal intestinal epithelium but is expressed in the regenerated epithelium after colitis and in CRC organoids. L1CAM has been shown to be required for orthotopic carcinoma propagation, liver metastatic colonization, and chemoresistance.⁵⁰ After extravasating from the primary location, metastatic clones use L1CAM to adhere and spread on the surface of capillaries and activate the mechanotransduction-sensitive transcription factors yes-associated protein 1 and myocardin-related transcription factor, which are indispensable for metastatic outgrowth in perivascular sites.^51–53 NRP1 (neuropilin-1) is a vital gene in cancer invasion, metastasis, and EMT and is upregulated in CRC,^54–56 offering a novel target for treatment.^57,58 VWF (von Willebrand factor) and ACTN1 (alpha-actinin-1) are associated with cancer angiogenesis and metastasis, which are risk factors for prognosis.^59,60 Upregulated expression of ILK (encoding integrin-linked kinase) is found in CRC and has been reported to be related to EMT, tumor progression, and drug resistance.⁶¹ These 10 core genes are thus mainly related to EMT and tumor metastasis, which may describe the traits of the green module.

In addition, most previous classifications of CRC have only considered the genetic status, whereas CRC comprises a group of complicated disorders requiring comprehensive analysis. We divided the tumor samples into three subgroups to consider possible individualized treatments, and combined the patients’ clinical characteristics with their genetic features. The 10 identified hub genes may serve as prognostic markers and potentially effective indicators for clinical and targeted therapy of CRC.

Our research also had several limitations. First, we only used data from public databases, and more real-world data are needed to confirm the results. Second, more specific models are needed to identify the different pathological and molecular subtypes of CRC, which will be addressed in our further studies. Finally, more studies are needed to validate the functions of the 10 genes identified in the current study.

Conclusion

We identified 10 hub genes (FYN, SEMA3A, AP2M1, L1CAM, NRP1, TLN1, VWF, ITGB3, ILK, ACTN1) as prognostic CRC markers and potentially effective indicators for the clinical and targeted therapy of CRC.

Footnotes

Acknowledgements

The author would like to thank the associate editor and reviewers for their useful feedback, which has improved this paper.

Data availability statement

All the data analyzed in this work were acquired from the GEO database, which is an open public database.

Declaration of conflicting interests

The author declares that they have no conflicts of interest related to this work.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Lina Zhang

References

1. Siegel, Miller, Goding Sauer, et al. Colorectal cancer statistics, 2020. CA Cancer J Clin 2020; 70: 145–164.

2. Sung, Ferlay, Siegel, et al. Global Cancer Statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2021; 71: 209–249.

3. Zhang, Chen. The current status of treatment for colorectal cancer in China. Medicine (Baltimore) 2017; 96: e8242.

4. Siegel, Miller, Fuchs, et al. Cancer statistics, 2021. CA Cancer J Clin 2021; 71: 7–33.

5. Celià-Terrassa, Kang. Distinctive properties of metastasis-initiating cells. Genes Dev 2016; 30: 892–908.

6. Maida, Macaluso, Ianiro, et al. Screening of colorectal cancer: present and future. Expert Rev Anticancer Ther 2017; 17: 1131–1146.

7. Huang AHC. Bioinformatics reveal five lineages of oleosins and the mechanism of lineage evolution related to structure/function from green algae to seed plants. Plant Physiol 2015; 169: 453–470.

8. Türei, Földvári-Nagy, Fazekas, et al. Autophagy Regulatory Network – a systems-level bioinformatics resource for studying the mechanism and regulation of autophagy. Autophagy 2015; 11: 155–165.

9. Ren, Zhao, Sun, et al. Identification of plasma biomarkers for distinguishing bipolar depression from major depressive disorder by iTRAQ-coupled LC-MS/MS and bioinformatics analysis. Psychoneuroendocrinology 2017; 86: 17–24.

10. Rong, Huang, Tian, et al. COL1A2 is a novel biomarker to improve clinical prediction in human gastric cancer: integrating bioinformatics and meta-analysis. Pathol Oncol Res 2018; 24: 129–134.

11. Song, Ren, et al. Identification of potential crucial genes associated with carcinogenesis of clear cell renal cell carcinoma. J Cell Biochem 2018; 119: 5163–5174.

12. Zhu, Renaud, Guo. Bioinformatics-based identification of miR-542-5p as a predictive biomarker in breast cancer therapy. Hereditas 2018; 155: 17.

13. Langfelder, Horvath. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 2008; 9: 559.

14. Giuliani. The application of principal component analysis to drug discovery and biomedical data. Drug Discov Today 2017; 22: 1069–1076.

15. Wilkerson, Hayes DN. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 2010; 26: 1572–1573.

16. Wang, Han, et al. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 2012; 16: 284–287.

17. Walter, Sánchez-Cabo, Ricote. GOplot: an R package for visually combining expression data with functional analysis. Bioinformatics (Oxford, England) 2015; 31: 2912–2914. Epub ahead of print 9. DOI: 10.1093/bioinformatics/btv300.

18. Hänzelmann, Castelo, Guinney. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics 2013; 14: 7.

19. Punt CJA, Koopman, Vermeulen. From tumour heterogeneity to advances in precision treatment of colorectal cancer. Nat Rev Clin Oncol 2017; 14: 235–246.

20. Guinney, Dienstmann, Wang, et al. The consensus molecular subtypes of colorectal cancer. Nat Med 2015; 21: 1350–1356.

21. Janiszewska, Primi, Izard. Cell adhesion in cancer: Beyond the migration of single cells. J Biol Chem 2020; 295: 2495–2505.

22. Liang, Huang, et al. Prognostic significance of abnormal matrix collagen remodeling in colorectal cancer based on histologic and bioinformatics analysis. Oncol Rep 2020; 44: 1671–1685.

23. Bennasroune, Langlois, et al. Functional interplay between collagen network and cell behavior within tumor microenvironment in colorectal cancer. Front Oncol 2020; 10: 527.

24. Basak, Uddin, Hancock. The role of oxidative stress and its counteractive utility in colorectal cancer (CRC). Cancers (Basel) 2020; 12: 3336.

25. Ashizawa, Shimizu, Shoda, et al. NADPH oxidase 5 has a crucial role in cellular motility of colon cancer cells. Int J Oncol 2021; 59: 63.

26. Jiang, et al. NADPH oxidase 1 is highly expressed in human large and small bowel cancers. PLoS One 2020; 15: e0233208.

27. André, Shiu, Kim, et al. Pembrolizumab in microsatellite-instability-high advanced colorectal cancer. N Engl J Med 2020; 383: 2207–2218.

28. Boesch, Spizzo, Seeber. Concise review: aggressive colorectal cancer: role of epithelial cell adhesion molecule in cancer stem cells and epithelial-to-mesenchymal transition. Stem Cells Transl Med 2018; 7: 495–501.

29. Zhang. Iron homeostasis and tumorigenesis: molecular mechanisms and therapeutic opportunities. Protein Cell 2015; 6: 88–100.

30. Saito, Jensen, Salgia, et al. Fyn: a novel molecular target in cancer. Cancer 2010; 116: 1629–1637.

31. Elias, Ditzel HJ. Fyn is an important molecule in cancer pathogenesis and drug resistance. Pharmacol Res 2015; 100: 250–254.

32. Resh MD. Fyn, a Src family tyrosine kinase. Int J Biochem Cell Biol 1998; 30: 1159–1162.

33. Zhang, Chan, et al. Fyn-phosphorylated PIKE-A binds and inhibits AMPK signaling, blocking its tumor suppressive activity. Cell Death Differ 2016; 23: 52–63.

34. Chen, et al. FYN is required for ARHGEF16 to promote proliferation and migration in colon cancer cells. Cell Death Dis 2020; 11: 652.

35. Lee, Yoo, et al. FYN promotes mesenchymal phenotypes of basal type breast cancer cells through STAT5/NOTCH2 signaling node. Oncogene 2018; 37: 1857–1868.

36. Jiang, Dong, Gong, et al. Inflammatory genes are novel prognostic biomarkers for colorectal cancer. Int J Mol Med 2018; 42: 368–380.

37. Hogan, Judge, O’Callaghan, et al. Introducing a novel and robust technique for determining lymph node status in colorectal cancer. Ann Surg 2014; 260: 94–102.

38. Gujral, Chan, Peshkin, et al. A noncanonical Frizzled2 pathway regulates epithelial-mesenchymal transition and metastasis. Cell 2014; 159: 844–856.

39. Zhu, Kong, Wang, et al. ITGB3/CD61: a hub modulator and target in the tumor microenvironment. Am J Transl Res 2019; 11: 7195–7208.

40. Reinmuth, Liu, Ahmad, et al. Alphavbeta3 integrin antagonist S247 decreases colon cancer metastasis and angiogenesis and improves survival in mice. Cancer Res 2003; 63: 2079–2087.

41. Lei, Huang, Gao, et al. Proteomics identification of ITGB3 as a key regulator in reactive oxygen species-induced migration and invasion of colorectal cancer cells. Mol Cell Proteomics 2011; 10: M110.005397.

42. Barbazán, Alonso-Alconada, Muinelo-Romay, et al. Molecular characterization of circulating tumor cells in human metastatic colorectal cancer. PLoS One 2012; 7: e40476.

43. Powner, Kopp, Monkley, et al. Tetraspanin CD9 in cell migration. Biochem Soc Trans 2011; 39: 563–567.

44. Sumi, Hirose, Yanoshita, et al. Semaphorin 3A inhibits inflammation in chondrocytes under excessive mechanical stress. Mediators Inflamm 2018; 2018: 5703651.

45. Lepelletier, Moura, Hadj-Slimane, et al. Immunosuppressive role of semaphorin-3A on T cell proliferation is mediated by inhibition of actin cytoskeleton reorganization. Eur J Immunol 2006; 36: 1782–1793.

46. Gaur, Bielenberg, Samuel, et al. Role of class 3 semaphorins and their receptors in tumor growth and angiogenesis. Clin Cancer Res 2009; 15: 6763–6770.

47. Staton CA. Class 3 semaphorins and their receptors in physiological and pathological angiogenesis. Biochem Soc Trans 2011; 39: 1565–1570.

48. Le Duff, Gouju, Jonchère, et al. Regulation of senescence escape by the cdk4-EZH2-AP2M1 pathway in response to chemotherapy. Cell Death Dis 2018; 9: 199.

49. McMahon, Boucrot. Molecular mechanism and physiological functions of clathrin-mediated endocytosis. Nat Rev Mol Cell Biol 2011; 12: 517–533.

50. Ganesh, Basnet, Kaygusuz, et al. L1CAM defines the regenerative origin of metastasis-initiating cells in colorectal cancer. Nat Cancer 2020; 1: 28–45.

51. Valiente, Ganesh, et al. Pericyte-like spreading by disseminated cancer cells activates YAP and MRTF for metastatic colonization. Nat Cell Biol 2018; 20: 966–978.

52. Tampakis, Tampaki, Nonni, et al. L1CAM expression in colorectal cancer identifies a high-risk group of patients with dismal prognosis already in early-stage disease. Acta Oncol 2020; 59: 55–59.

53. Fang, Zheng, Zhao HJ. L1CAM is involved in lymph node metastasis via ERK1/2 signaling in colorectal cancer. Am J Transl Res 2020; 12: 837–846.

54. Vivekanandhan, Mukhopadhyay. Genetic status of KRAS influences transforming growth factor-beta (TGF-β) signaling: An insight into neuropilin-1 (NRP1) mediated tumorigenesis. Semin Cancer Biol 2019; 54: 72–79.

55. Zhu, Zhai, et al. LncRNA TTN-AS1 promotes the progression of cholangiocarcinoma via the miR-320a/neuropilin-1 axis. Cell Death Dis 2020; 11: 637.

56. Chen, Wei, Wang, et al. Long non‑coding RNA 00152 functions as a competing endogenous RNA to regulate NRP1 expression by sponging with miRNA‑206 in colorectal cancer. Int J Oncol 2018; 53: 1227–1236.

57. Liu, Meng, Peng, et al. Impaired AGO2/miR-185-3p/NRP1 axis promotes colorectal cancer metastasis. Cell Death Dis 2021; 12: 390. Epub ahead of print 4. DOI: 10.1038/s41419-021-03672-1.

58. Guo, Luo, et al. Cdc42 subcellular relocation in response to VEGF/NRP1 engagement is associated with the poor prognosis of colorectal cancer. Cell Death Dis 2020; 11: 171.

59. Lackner, Jukic, Tsybrovskyy, et al. Prognostic relevance of tumour-associated macrophages and von Willebrand factor-positive microvessels in colorectal cancer. Virchows Arch 2004; 445: 160–167.

60. Fukumoto, Kurisu, Yamada, et al. α-Actinin-4 enhances colorectal cancer cell invasion by suppressing focal adhesion maturation. PLoS One 2015; 10: e0120616.

61. Tsoumas, Nikou, Giannopoulou, et al. ILK expression in colorectal cancer is associated with EMT, cancer stem cell markers and chemoresistance. Cancer Genomics Proteomics 2018; 15: 127–141.