Abstract
Functionally related genes often appear close to one another on the genome, although
the order of the genes may differ. Such groups, or clusters, of genes may have an
ancient evolutionary origin or may signify some other critical phenomenon, and they
may aid in predicting gene function. Gene clusters also help in solving the
problem of local alignment of genes. Similarly, clusters of protein domains, albeit appearing
in different orders in the protein sequence, suggest common functionality in spite of being
nonhomologous. In this paper, we address the problem of automatically discovering clusters
of entities, be they genes or domains: we formalize the abstract problem as a discovery
problem called the πpattern problem and give an algorithm that automatically discovers
the clusters of patterns in multiple data sequences. We take a model-less approach and introduce
a notation for
