Discussion on the paper ‘Statistical contributions to bioinformatics: Design, modelling, structure learning and integration’ by Jeffrey S. Morris and Veerabhadran Baladandayuthapani

Abstract

Bioinformatics is an important research area for statisticians. This discussion adds some topics to those covered in the paper, namely statistical contributions to the detection of differentially expressed genes, to protein structure prediction, and to the analysis of highly correlated features in glycomics datasets.

Keywords

derived traits, directed networks, FDR, filtering, glycomics, protein structure
