Abstract
Molecular interaction data plays an important role in understanding biological processes at
a modular level by providing a framework for understanding cellular organization, functional
hierarchy, and evolutionary conservation. As the quality and quantity of network and
interaction data increases rapidly, the problem of effectively analyzing this data becomes significant.
Graph theoretic formalisms, commonly used for these analysis tasks, often lead to
computationally hard problems due to their relation to subgraph isomorphism. This paper
presents an innovative new algorithm, MULE, for detecting frequently occurring patterns and
modules in biological networks. Using an innovative graph simplification technique based
on ortholog contraction, which is ideally suited to biological networks, our algorithm renders
these problems computationally tractable and scalable to large numbers of networks.
We show, experimentally, that our algorithm can extract frequently occurring patterns in
metabolic pathways and protein interaction networks from the KEGG, DIP, and BIND
databases within seconds. When compared to existing approaches, our graph simplification
technique can be viewed either as a pruning heuristic, or a closely related, but computationally
simpler task. When used as a pruning heuristic, we show that our technique reduces
effective graph sizes significantly, accelerating existing techniques by several orders of magnitude!
Indeed, for most of the test cases, existing techniques could not even be applied
without our pruning step. When used as a stand-alone analysis technique, MULE is shown
to convey significant biological insights at near-interactive rates. The software, sample input
graphs, and detailed results for comprehensive analysis of nine eukaryotic PPI networks are
available at
Get full access to this article
View all access options for this article.
