Correct microarray analysis approaches in ‘Hsa-circRNA11783-2 in peripheral blood is correlated with coronary artery disease and type 2 diabetes mellitus’

Abstract

We read with great interest the article by Li et al. The results are illuminating for understanding the relationship among Hsa-circRNA11783-2, coronary artery disease (CAD) and type 2 diabetes mellitus (T2DM). However, in our view the bioinformatics analyses need further context, because the statistics used to identify differential expression are not fully explained. The authors appear to use unadjusted p values to detect differentially expressed circular RNAs (referred to here as differentially expressed genes, DEGs) among the control, CAD and T2DM groups. Because a large number of probes and multiple comparisons produce a high rate of false positives, microarray data need to be analysed with a statistical method that accounts for multiple testing if the results are to be reliable. Selecting circular RNAs solely on a greater than twofold change in expression with an unadjusted p value < 0.05 is neither reliable nor suitable for high-level microarray analysis.

Keywords

Bioinformatics, microarray analysis, statistics, circular RNA

We read with great interest the article by Li et al.¹ The results are illuminating for understanding the relationship among Hsa-circRNA11783-2, coronary artery disease (CAD) and type 2 diabetes mellitus (T2DM). However, in our view the bioinformatics analyses need further context, because the statistics used to identify differential expression are not fully explained. For example, the authors appear to use unadjusted p values to detect differentially expressed circular RNAs (referred to here as differentially expressed genes, DEGs) among the control, CAD and T2DM groups. Because a large number of probes and multiple comparisons produce a high rate of false positives, microarray data need to be analysed with a statistical method that accounts for multiple testing if the results are to be reliable. Selecting circular RNAs solely on a greater than twofold change in expression with an unadjusted p value < 0.05 is neither reliable nor suitable for high-level microarray analysis. In our opinion, there is nothing wrong with using a Student’s t-test on large multivariate data as long as the resulting p values are corrected for multiple testing, for example with the Benjamini and Hochberg procedure, a step that is important but easily omitted.
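
To illustrate the correction mentioned above, the following is a minimal sketch of the Benjamini and Hochberg adjustment in plain Python. The function name `benjamini_hochberg` and the ten example p values are our own illustrative assumptions; they are not taken from the article by Li et al. or from any particular software package.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values, in the input order.

    Illustrative helper (hypothetical name), not code from the original study.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that the adjusted values
    # never increase as the raw p values decrease (monotonicity step).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for ten probes (made up for illustration).
raw_p = [0.001, 0.02, 0.03, 0.04, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
adj_p = benjamini_hochberg(raw_p)

n_raw = sum(p < 0.05 for p in raw_p)  # probes passing without correction
n_adj = sum(p < 0.05 for p in adj_p)  # probes passing after correction
print(n_raw, n_adj)
```

In this made-up example fewer probes pass the 0.05 threshold after adjustment than before, which is the effect we are concerned about when thousands of probes are tested at once.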

We would suggest using a specialized high-level microarray analysis method such as limma (linear models for microarray data),² which is commonly used for statistical testing of differential expression data with linear models. In our view, requiring a more than 1.5-fold change in expression together with a false discovery rate (FDR) < 0.05 is an appropriate and conservative cutoff for obtaining DEGs. Moreover, significance analysis of microarrays (SAM)³ is another noteworthy non-parametric statistical algorithm, for which a twofold expression change and FDR < 0.1 is a reasonable cutoff for obtaining DEGs.
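
The two cutoffs suggested above can be sketched as a simple filter. This is only the final selection step under stated assumptions, not an implementation of limma or SAM: we assume each probe already has a log2 fold change and an FDR-adjusted p value from whichever method was used, and the function name `select_degs` and the probe names are hypothetical.

```python
import math


def select_degs(results, min_fold_change=1.5, max_fdr=0.05):
    """Return names of probes passing both the fold-change and FDR cutoffs.

    `results` is a list of (name, log2_fold_change, fdr) tuples.
    Illustrative helper (hypothetical name); it only filters precomputed
    statistics and does not perform any statistical test itself.
    """
    # A k-fold change in either direction corresponds to |log2 FC| > log2(k).
    threshold = math.log2(min_fold_change)
    return [
        name
        for name, log2_fc, fdr in results
        if abs(log2_fc) > threshold and fdr < max_fdr
    ]


# Hypothetical per-probe results (values made up for illustration).
results = [
    ("circ_A", 1.2, 0.01),
    ("circ_B", -0.9, 0.03),
    ("circ_C", 2.0, 0.08),
    ("circ_D", 0.3, 0.001),
]

# More than 1.5-fold change and FDR < 0.05.
strict_fdr = select_degs(results, min_fold_change=1.5, max_fdr=0.05)

# More than twofold change and FDR < 0.1.
strict_fc = select_degs(results, min_fold_change=2.0, max_fdr=0.1)
print(strict_fdr, strict_fc)
```

The two settings can select different probes from the same results, so the chosen cutoff pair should be reported alongside the method that produced the FDR values.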

Although, from our reading of the article, the authors appear to have performed additional analyses such as quantitative polymerase chain reaction (Q-PCR) or other methodology to verify the microarray results for DEGs, it is impractical to verify all DEGs by Q-PCR or other technologies. Choosing the optimal statistical approach⁴ and obtaining accurate and convincing DEG results are the basis for further data analysis. We invite the authors to offer further explanation of their data analysis and experimental approach. We suggest that data-intensive transcriptomics research would benefit from these considerations and from innovations in statistical and data analytical approaches.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship and/or publication of this article.

References

1. Zhao, Jian, et al. Hsa-circRNA11783-2 in peripheral blood is correlated with coronary artery disease and type 2 diabetes mellitus. Diab Vasc Dis Res 2017; 14(6): 510–515.

2. Law, Alhamdoosh, et al. RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR. F1000Res 2016; 5: 1408.

3. Tusher, Tibshirani, Chu. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 2001; 98: 5116–5121.

4. Chrominski, Tkacz. Comparison of high-level microarray analysis methods in the context of result consistency. PLoS ONE 2015; 10: e0128845.