Abstract
Long non-coding RNAs (lncRNAs) are widely transcribed in the genome, but their expression profile and roles in colorectal cancer are not well understood. The aim of this study was to investigate the long non-coding RNA expression profile in colorectal cancer and look for potential diagnostic biomarkers of colorectal cancer. Long non-coding RNA microarray was applied to investigate the global long non-coding RNA expression profile in colorectal cancer. Gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses were performed using standard enrichment computational methods. The expression levels of selected long non-coding RNAs were validated by quantitative reverse transcription polymerase chain reaction. The relationship between long non-coding RNA expression levels and clinicopathological characteristics of colorectal cancer patients was assessed. Coexpression analyses were carried out to find the coexpressed genes of differentially expressed long non-coding RNAs, followed by gene ontology analysis to predict the possible role of the selected long non-coding RNAs in colorectal carcinogenesis. In this study, a total of 1596 long non-coding RNA transcripts and 1866 messenger RNA transcripts were dysregulated in tumor tissues compared with paired normal tissues. The top upregulated long non-coding RNAs in tumor tissues were CCAT1, UCA1, RP5-881L22.5, NOS2P3, and BC005081 and the top downregulated long non-coding RNAs were AK055386, AC078941.1, RP4-800J21.3, RP11-628E19.3, and RP11-384P7.7. Long non-coding RNA UCA1 was significantly upregulated in colon cancer, and AK055386 was significantly downregulated in tumor with dimension <5 cm. Functional prediction analyses showed that both the long non-coding RNAs coexpress with cell cycle related messenger RNAs. The current long non-coding RNA study provided novel insights into expression profile in colorectal cancer and predicted the potential roles of long non-coding RNAs in colorectal carcinogenesis. Among the dysregulated long non-coding RNAs, UCA1 was found to be associated with anatomic site, and AK055386 was found associated with tumor size. Further functional investigations into the molecular mechanisms are warranted to clarify the role of long non-coding RNA in colorectal carcinogenesis.
Introduction
Colorectal cancer (CRC) is one of the most common cancers worldwide. 1 Although the incidence of CRC in China is lower than that in the Western countries, it has increased in recent decades and has become a substantial burden in China.2,3 Accumulating evidence showed that one of the most efficient way to prevent CRC is early diagnosis. 4 However, existing biomarkers, such as carcinoembryonic antigen (CEA), are not suitable for the early detection of CRC, due to their low specificity and sensitivity for CRC diagnosis. Therefore, it is necessary to find novel biomarkers for early detection of CRC.
Long non-coding RNAs (lncRNAs) are non-coding RNAs greater than 200 nucleotides in length. LncRNAs can regulate gene expression either locally or globally, and there are multiple means by which they can regulate downstream target genes. 5 Increasing evidence showed that lncRNAs play important roles in gene regulation and thus affect various aspects of cellular homeostasis, including proliferation, survival, migration, or genomic stability. 6 In recent years, several studies have implicated lncRNAs in various malignancies. For instance, HOTAIR has been reported as being overexpressed in breast cancer and7,8 CRC.9–11 CCAT2 was identified as a novel lncRNA overexpressed in CRC which promotes metastasis and chromosomal instability. 12 The expression of metastasis-associated lung adenocarcinoma transcript 1 (MALAT1) was found to be associated with CRC metastasis, 13 and the downregulation of MALAT1 by resveratrol could decrease the nuclear localization of β-catenin and attenuate Wnt/β-catenin signaling, thereby inhibiting CRC invasion and metastasis.14,15 To our knowledge, there are only few studies focusing on the global expression of lncRNAs in CRC so far. The global lncRNAs expression profile in CRC is not fully elucidated, and more evidences are needed to uncover the role of lncRNAs in CRC.
In this study, we investigated the genome-wide expression profile of lncRNAs and messenger RNAs (mRNAs) in CRC tissues and paired normal tissues using microarray to assess the relationship between the aberrantly expressed lncRNAs and CRC, to find CRC-associated lncRNAs and to predict their possible roles in CRC carcinogenesis.
Methods and materials
Subjects and specimens
Six CRC patients were randomly recruited for lncRNA microarray profile analysis from the Hangzhou First People’s Hospital. Then, other 95 CRC patients who had undergone curative surgery at Department of Gastrointestinal surgery, Hangzhou First People’s Hospital, from July 2013 to December 2013 were recruited in this study for expressional validation. All the patients were pathologically confirmed as colorectal adenocarcinoma. The tissues were immediately preserved in RNA Later® Stabilization Solution (Invitrogen, Carlsbad, CA, USA) after removal from the body and stored at −80°C. Tumors were staged according to the tumor–node–metastasis (TNM) staging system of the International Union Against Cancer (5th edition). 16 Clinicopathological features and other lifestyle factors were collected from Hospital Information System. Written informed consents were obtained from all patients, and this study was approved by the Medical Ethical Committee of Zhejiang University School of Medicine.
RNA isolation
Total cellular RNA was isolated from each sample using a homogenizer (IKA®-Works Guangzhou, China) and TRIzol reagent (Invitrogen) and then purified using the RNeasy Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s protocol. RNA from each specimen was quantified using a NanoDrop ND-1000 spectrophotometer (NanoDrop, Wilmington, DE, USA). RNA integrity was assessed using standard denaturing agarose gel electrophoresis, and the purity was estimated by the ratio of absorbance at 260 to 280 nm (A260/A280).
Microarray profiling and quantitative reverse transcription polymerase chain reaction validation
Microarray analysis was carried out using Arraystar Human LncRNA Microarray V3.0 (KangChen Bio-tech Inc., Shanghai, China), which is designed for the global profiling of 30,586 lncRNAs and 26,109 coding transcripts from public transcriptome databases (RefSeq, University of California Santa Cruz (UCSC) known genes, GENCODE, etc), as well as landmark publications. Agilent Feature Extraction software (version 11.0.1.1) was used to analyze the acquired array images. Quantile normalization and subsequent data processing were performed using the Gene Spring GX v12.1 software package (Agilent Technologies, Santa Clara, California, USA). Differentially expressed lncRNAs and mRNAs were identified through fold change filtering.
Real-time quantitative reverse transcription polymerase chain reaction (qRT-PCR) was used for expression validation. The expression levels of selected lncRNAs were first examined by qRT-PCR in the same tissues with the microarray (six tissue pairs) and then, more tissue samples (95 tissue pairs) were used for further validation. Total RNA (0.5 µg) was used as a template to prepare complementary DNA (cDNA; cat. no. A3500; Reverse Transcription System; Promega Corporation, Madison, WI, USA). The lncRNA expression level was quantified using SYBR Premix EX Taq (TaKaRa, Shanghai, China) on the ABI 7500 sequence detection system (Advanced Biosystem, Thermo Fisher Scientific, Waltham, MA, USA). All experiments were performed in triplicate (glyceraldehyde 3-phosphate dehydrogenase (GAPDH) as the internal control). Finally, the 2−ΔΔCt method was performed to calculate the relative expression. The primers were obtained from Golden Bridge International Co., Ltd. (Beijing, China). The primer sequences are shown in Table S1.
Coexpression, gene ontology, and pathway analyses
Pearson’s correlation coefficient of the selected lncRNA with each dysregulated mRNA was calculated to find its coexpressed mRNAs. Absolute value of correlation coefficient is >0.9, and the correlation p < 0.05 was considered to be statistically significant.
DAVID software was used for gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations on mRNAs. LncRNA gene functions were predicted using the hypergeometric cumulative distribution function based on the coexpression of mRNA using GO and KEGG annotations. Fisher’s exact test was used to determine whether the overlap between the differential expression list and the GO annotation list was greater than that would be expected by chance. p values were used to estimate the significance of GO term and pathway enrichment in mRNAs. The threshold of statistical significance was set as a p < 0.05. Corrections for multiple comparisons were carried out using the Benjamini–Hochberg false discovery rate (FDR) adjustment. 17 FDR-adjusted p < 0.05 was regarded as potentially important.
Statistical analysis
All statistical analyses were performed using the SAS statistical software, version 9.2 (SAS Institute, Cary, NC, USA), unless otherwise noted. p < 0.05 was considered to be statistically significant. Continuous and categorical variables were tested by the paired t test and Pearson’s χ2 test, respectively.
Results
LncRNA and mRNA expression profiles
Genome-wide profiling analysis was performed to investigate the expression of lncRNAs and mRNAs between CRC and adjacent normal tissues. Among the 30,586 lncRNAs in the microarray, 556 lncRNAs were found to be significantly upregulated in tumor tissues, and 1040 lncRNAs were found to be downregulated (fold change ≥ 2.0, p < 0.05). The top differentially expressed lncRNAs were selected for further validation using qRT-PCR. The top upregulated lncRNAs in tumor tissues were CCAT1, UCA1 (urothelial carcinoma–associated 1), RP5-881L22.5, NOS2P3, and BC005081, and the top downregulated lncRNAs were AK055386, AC078941.1, RP4-800J21.3, RP11-628E19.3, and RP11-384P7.7 (Table 1). Besides, among the 26,109 mRNAs in the microarray, 889 mRNAs were significantly upregulated in tumor tissues, and 977 were found to be downregulated (fold change ≥ 2.0, p < 0.05; Figure 1 and Tables S2 and S3).
Information of selected differential expressed lncRNAs.
LncRNA: long non-coding RNA.

Long non-coding RNA (lncRNA) and mRNA expression profiles in colorectal cancer (CRC) tissues and paired normal tissues. The scatter plot and the volcano plot illustrate the distributions of the data in the (a and c) lncRNA and (b and d) mRNA profiles. The values of the x- and y-axes in the scatter plot were the averaged normalized signal values of the group (log2 scaled). The green lines in the scatter and volcano plots show the significant fold change.
To validate the result of microarray, the expression of five upregulated lncRNAs (CCAT1, UCA1, RP5-881L22.5, NOS2P3, and BC005081) and five downregulated lncRNAs (AK055386, AC078941.1, RP4-800J21.3, RP11-628E19.3, and RP11-384P7.7) was first examined by qRT-PCR in the same tissue pairs with the microarray and then in more tissue pairs for further validation. As shown in Figure 5, four lncRNAs were upregulated in cancer tissue, and six were downregulated. Nine of the ten selected lncRNAs showed the same change patterns with those in the microarray results, though the expression fold change of qRT-PCR results was much lower than those of the microarray results. The qRT-PCR result of NO2SP3 indicated that NOS2P3 was downregulated in CRC tissues, which was inconsistent with the result of the microarray. Table 2 showed the results of qRT-PCR in 95 tissue samples. Three lncRNAs were significantly upregulated and six lncRNAs were significantly downregulated in cancer tissues (p < 0.05) compared with the paired normal tissues.
Expression levels (ΔCt) of selected lncRNAs by qRT-PCR.
LncRNA: long non-coding RNA; qRT-PCR: quantitative reverse transcription polymerase chain reaction.
GO and pathway analysis
We performed GO and pathway analyses in the upregulated and downregulated genes, respectively, to investigate the possible processes involved in CRC. In the biological process of ontology, upregulated genes enriched in the different terms. The top three enriched processes were cellular process (598 genes), metabolic process (451 genes), and cellular metabolic process (414 genes), while most of the downregulated genes were enriched in biological regulation (443 genes), regulation of biological process (414 genes), and regulation of cellular process (387 genes). In the cellular component analysis, cell part (648 genes), cell (648 genes), and intracellular (571 genes) were the top three significant cellular components associated with the upregulated genes, while cell periphery (229 genes), plasma membrane (228 genes), and extracellular part (148 genes) were the most significant cellular component terms associated the downregulated genes. The top three enriched ontology terms in the molecular function ontology are binding (525 genes), protein binding (354 genes), and catalytic activity (287 genes), while the binding (571 genes), regulation of cation binding (362 genes), and ion binding (217 genes) enriched most of the downregulated genes (Figures 2 and 3). Besides, KEGG pathway analysis showed that upregulated mRNAs were enriched in 18 pathways, of which the most significant one was cell cycle pathway (enrichment score = 4.83). Downregulated mRNAs were enriched in 34 pathways, of which the most significant one was dopaminergic synapse pathway (enrichment score = 5.96; Figure 4).

Gene ontology (GO) enrichment analyses of upregulated genes. GO analysis of upregulated genes was performed according to (a1 and a2) biological process, (b1 and b2) cell component, and (c1 and c2) molecular function. (a1, b1, and c1) Presentation of the GO terms which most genes enriched in. (a2, b2, and c2) Presentation of GO terms that have the highest enrichment fold.

Gene ontology (GO) enrichment analysis of downregulated genes. GO analysis of downregulated genes was performed according to (a1 and a2) biological process, (b1 and b2) cell component, and (c1 and c2) molecular function. (a1, b1, and c1) Presentation of the GO terms which most genes enriched in. (a2, b2, and c2) Presentation of GO terms that have the highest enrichment fold.

The top 10 enrichment scores in the pathway analysis of the (a) upregulated and (b) downregulated mRNAs.

Expression levels of lncRNAs in microarray and qRT-PCR. The y-axis represents the log-transformed median fold changes (T/N) in expression (p < 0.05). The qRT-PCR results were consistent with the microarray data except NOS2P3.
Association of UCA1 and AK055386 expression (ΔΔCt) with clinicopathological characteristics
We further investigated whether UCA1 and AK055386 expression are associated with CRC clinicopathological characteristics and cancer biomarkers (AFP (alpha-fetoprotein), CEA, CA19-9). As shown in Table 3, UCA1 expression level was associated with tumor location (p = 0.031). UCA1 was significantly upregulated in colon cancer, while it was not significant in rectal cancer. UCA1 was also significantly correlated with famous cancer biomarker AFP, the correlation coefficient was 0.349 (p < 0.001). On the contrary, AK055386 was found to be associated with the dimension of tumor. For patients whose tumor dimension ≥5 cm, AK055386 was significantly less expressed than that in patients with tumor dimension <5 cm (Table 3).
Association of UCA1 and AK055386 expression (ΔΔCt) with clinicopathological factors.
SD: standard deviation; TNM: tumor–node–metastasis; CEA: carcinoembryonic antigen; AFP: alpha-fetoprotein; lncRNA: long non-coding RNA.
Correlation coefficient with the two lncRNA expression levels was calculated and displayed in the table.
Coexpression analysis on UCA1 and AK055386
Coexpression analyses of UCA1 and AK055386 with differentially expressed mRNAs in microarray were carried out to find possible target genes and the role of UCA1 and AK055386 in CRC. Among the differential expressed genes, the expression of 746 genes were correlated with UCA1, and 899 were correlated with AK055386 (Tables S4 and S5). In GO analysis, For UCA1, 5 GO terms remained significantly enriched after FDR correction (FDR < 0.05), which were cell division, M phase of mitotic cell cycle, chromosome, centromeric region, condensed chromosome, cytoskeletal protein binding (Table 4). For AK055386, nine biological process ontology terms and two cellular component ontology terms were significantly enriched after FDR (FDR < 0.05; Table 5).
Significantly enriched gene ontology terms of UCA1.
GO: gene ontology; FDR: false discovery rate.
Significantly enriched gene ontology terms of AK055386.
GO: gene ontology; FDR: false discovery rate.
Discussion
CRC is one of the most common cancers worldwide. During the past decades, many studies have reported that early diagnosis promises better prognosis and overall survivals. However, the sensitivity and specificity of existing biomarkers are relatively low. LncRNAs have been recently reported to play an important role in various malignancies, including CRC.18–20 However, global lncRNA expression profile in CRC is not fully clear, and more evidences are needed to uncover the role of lncRNAs in CRC. In this study, we performed microarray analysis in CRC tissue and paired normal tissue to investigate the global expression profile of lncRNA and look for lncRNAs that have potential to be a CRC diagnostic biomarkers.
With lncRNA microarray, this study provided a global lncRNA expression profile in CRC. In this study, we found 556 lncRNAs were upregulated and 1040 lncRNAs were downregulated in CRC tissues. Previous studies on lncRNAs reported several aberrantly expressed lncRNAs, such as H19, 21 CRNDE, 22 HOTAIR, 9 and CCAT1 23 associated with CRC, and these lncRNAs remained aberrantly expressed in the present lncRNA expression profile (fold change ≥ 2.0, p < 0.05). GO analysis identified several biological processes that might be involved in CRC. These ontology terms implied that CRC carcinogenesis is mainly correlated with disorder of cell metabolic process and changes of the cellular part, which is consistent with previous mechanistic studies of CRC. 24
UCA1 was first reported as a promising biomarker for bladder cancer in 2006.25,26 Following studies showed that UCA1 was associated with various cancers such as breast cancer, 26 hepatocellular cancer, 27 tongue cancer, 28 and CRC. 29 In this study, UCA1 was found to be upregulated in both the microarray analysis and qRT-PCR validation, and UCA1 expression level is associated with tumor location. Together with the result of GO analysis, we suggest that UCA1 may play its role in colorectal carcinogenesis and its function may be tissue specific. Yang et al. 30 reported that UCA1 can promote CRC by regulating the activity of cyclic adenosine monophosphate (cAMP) responsive element binding protein and thus affecting cell cycle progression. In this study, we also found that the genes coexpressed with UCA1 were enriched in cell cycle ontology term. This finding again provided evidence that UCA1 was playing some role in cell cycle.
In this study, lncRNA AK055386 was identified to be significantly downregulated in CRC tissues and was associated with tumor size (p < 0.001). AK055386 is a novel lncRNA from UCSC database, which has not been studied so far. Coexpression analysis found 899 genes coexpressed with AK055386. In the GO analysis for AK055386, 11 GO terms were enriched, nine of which were related with cell cycle. Our results suggested that AK055386 may play some role in CRC carcinogenesis by regulating cell cycle, while further experimental evidence is warranted for validation.
Despite the interesting profiling of lncRNA expression in CRC, there were several limitations in our study. First of all, given the relative small sample size of microarray analysis, some biases may be raised and affect further analyses. Second, GO is only one way to predict gene functions, many other approaches could be applied to clarify the real functions of the genes. Besides, other relationships between two genes such as neighboring in chromosome location have not been taken into account yet, which may have some influence on gene–gene interactions.
In conclusion, our study provided a genome-wide lncRNA expression profile in CRC. In this study, a total of 1596 lncRNA transcripts and 1866 mRNA transcripts were deregulated in tumor tissues compared with paired normal tissues. UCA1 was found to be associated with anatomic site of CRC, and AK055386 was found to be associated with tumor dimension. Further functional study and investigation into mechanism are warranted to clarify the role of lncRNAs in colorectal carcinogenesis and the value of lncRNAs in CRC diagnosis.
Footnotes
Acknowledgements
F.J. and H.J. contributed equally to this work and should be considered as co-first authors.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by grants from National Natural Science Foundation of China (No. 81373087), National Program on Key Basic Research Project (973 Program, No. 2015CB554003), Zhejiang Provincial Natural Science Foundation of China (No. LQ15H260001), Public Projects of Zhejiang Province (No. 2014C33224), and Program of Hangzhou medical College (No. 2016B05). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
