Ideal Discrimination of Discrete Clinical Endpoints Using Multilocus Genotypes

Abstract

Multifactor Dimensionality Reduction (MDR) is a method for the classification and prediction of discrete clinical endpoints using attributes constructed from multilocus genotype data. Empirical studies with both real and simulated data suggest that MDR has good power for detecting gene-gene interactions in the absence of independent main effects. The purpose of this study is to develop an objective, theory-driven approach to evaluate the strengths and limitations of MDR. To accomplish this goal, we borrow concepts from ideal observer analysis used in visual perception to evaluate the theoretical limits of classifying and predicting discrete clinical endpoints using multilocus genotype data. We conclude that MDR ideally discriminates between low-risk and high-risk subjects using attributes constructed from multilocus genotype data. We also show that the classification approach used once a multilocus attribute is constructed is similar to that of a naïve Bayes classifier. This study provides a theoretical foundation for the continued development, evaluation, and application of MDR as a data mining tool in the domain of statistical genetics and genetic epidemiology.

Keywords

multifactor dimensionality reduction; epistasis; gene-gene interaction; ideal observer; naïve Bayes classifier