Abstract
Statistical validation of gene clusters is imperative for many important applications in comparative
genomics that depend on the identification of genomic regions that are historically
and/or functionally related. We develop the first rigorous statistical treatment of max-gap
clusters, a cluster definition frequently used in empirical studies. We present exact expressions
for the probability of observing an individual cluster of a set of marked genes in
one genome, as well as upper and lower bounds on the probability of observing a cluster
of
