Abstract
Dividing a protein interaction network into biologically meaningful modules can aid the automated detection of protein complexes and the prediction of biological processes, and can uncover the global organization of the cell. We propose the use of a graph summarization (GS) technique, based on graph compression, to cluster protein interaction graphs into biologically relevant modules. The method is motivated by defining a biological module as a set of proteins that have similar sets of interaction partners. We show that this definition, put into practice by a GS algorithm, reveals modules that are more biologically enriched than those found by other methods. We also apply GS to predict complex memberships, biological processes, and co-complexed pairs, and show that in most settings GS is preferable to existing methods of protein interaction graph clustering.
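The module definition above (proteins with similar sets of interaction partners) can be illustrated with a minimal sketch. This is not the GS algorithm itself, which is based on graph compression; it only scores pairs of proteins by the Jaccard overlap of their partner sets. The function names, the toy edge list, and the 0.5 threshold are assumptions made for illustration.

```python
from itertools import combinations


def neighbor_sets(edges):
    """Map each protein to the set of its interaction partners."""
    partners = {}
    for a, b in edges:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return partners


def jaccard(s, t):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def similar_pairs(edges, threshold=0.5):
    """Return (u, v, score) for protein pairs whose partner sets overlap.

    Illustrative only: the threshold is an arbitrary assumption, and the
    pairwise score is a stand-in for the notion of "similar interaction
    partners", not the compression-based GS method.
    """
    partners = neighbor_sets(edges)
    result = []
    for u, v in combinations(sorted(partners), 2):
        # Drop u and v from each other's partner sets so that a direct
        # u-v interaction does not reduce the measured overlap.
        score = jaccard(partners[u] - {v}, partners[v] - {u})
        if score >= threshold:
            result.append((u, v, score))
    return result


# Hypothetical toy network: A and B share the partners X and Y.
edges = [("A", "X"), ("A", "Y"), ("B", "X"), ("B", "Y"), ("C", "Z")]
pairs = similar_pairs(edges)
```

In this toy network, A and B have identical partner sets, as do X and Y, so those two pairs are reported; C and Z share no partners with any other protein.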
