Biomarker Discovery, Disease Classification, and Similarity Query Processing on High-Throughput MS/MS Data of Inborn Errors of Metabolism

Abstract

In newborn errors of metabolism, biomarkers are urgently needed for disease screening, diagnosis, and monitoring of therapeutic interventions. This article describes a 2-step approach to discovering metabolic markers, which involves (1) the identification of marker candidates and (2) their prioritization based on expert knowledge of disease metabolism. For step 1, the authors developed a new algorithm, the biomarker identifier (BMI), to identify markers from quantified diseased versus normal tandem mass spectrometry data sets. BMI produces a ranked list of marker candidates and discards irrelevant metabolites based on a quality measure that takes into account the discriminatory performance, the discriminatory space, and the variance of metabolites' concentrations in the disease state. To determine the ability of the identified markers to classify subjects, the authors compared the discriminatory performance of several machine-learning paradigms and described a retrieval technique that searches and classifies abnormal metabolic profiles from a screening database. Seven inborn errors of metabolism were investigated: phenylketonuria (PKU), glutaric acidemia type I (GA-I), 3-methylcrotonylglycinemia deficiency (3-MCCD), methylmalonic acidemia (MMA), propionic acidemia (PA), medium-chain acyl-CoA dehydrogenase deficiency (MCADD), and 3-OH long-chain acyl-CoA dehydrogenase deficiency (LCHADD). All primarily prioritized marker candidates could be confirmed by the literature. Some novel secondary candidates were identified (i.e., C16:1 and C4DC for PKU, C4DC for GA-I, and C18:1 for MCADD), which require further validation to confirm their biochemical role during health and disease.
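The two computational ideas in the abstract, ranking marker candidates and retrieving similar profiles from a screening database, can be illustrated with a minimal sketch. This is not the published BMI quality measure, whose formula is not given here. The score below (class-mean gap divided by pooled standard deviation) is a stand-in assumption, and the function names, the `min_score` threshold, the Euclidean nearest-neighbor query, and the toy concentrations are all hypothetical.

```python
import numpy as np


def rank_marker_candidates(diseased, normal, names, min_score=1.0):
    """Rank metabolites by a stand-in score (NOT the published BMI formula).

    diseased, normal: arrays of shape (subjects, metabolites).
    Metabolites scoring below min_score (a hypothetical cutoff) are discarded.
    """
    diseased = np.asarray(diseased, dtype=float)
    normal = np.asarray(normal, dtype=float)
    # Gap between class means, scaled by the pooled standard deviation.
    mean_gap = np.abs(diseased.mean(axis=0) - normal.mean(axis=0))
    pooled_sd = np.sqrt(
        (diseased.var(axis=0, ddof=1) + normal.var(axis=0, ddof=1)) / 2.0
    )
    score = mean_gap / (pooled_sd + 1e-12)
    order = np.argsort(-score)  # highest score first
    return [(names[i], float(score[i])) for i in order if score[i] >= min_score]


def nearest_profiles(query, database, labels, k=3):
    """Return the labels of the k database profiles closest to the query.

    Uses Euclidean distance as an assumed similarity measure.
    """
    database = np.asarray(database, dtype=float)
    dist = np.linalg.norm(database - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(dist)[:k]
    return [labels[i] for i in nearest]


# Toy concentrations (invented for illustration): columns are Phe and C2.
names = ["Phe", "C2"]
diseased = [[10.0, 1.0], [12.0, 1.2], [11.0, 0.9]]
normal = [[2.0, 1.0], [3.0, 1.1], [2.5, 0.95]]

ranked = rank_marker_candidates(diseased, normal, names)
database = diseased + normal
labels = ["PKU"] * 3 + ["normal"] * 3
neighbors = nearest_profiles([11.5, 1.0], database, labels, k=3)
```

In this toy data only the first metabolite separates the two groups, so it is the only candidate that survives the cutoff, and the query profile retrieves diseased neighbors. A real screening database would call for scaling of the concentrations and a validated threshold before either step is meaningful.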

Keywords

biomarker discovery; disease classification; similarity query processing; tandem mass spectrometry; metabolic disorders
