Sage Journals: Discover world-class research

Abstract

ABSTRACT

Computational recognition of native-like folds of an anonymous amino acid sequence from a protein fold database is considered to be a promising approach to the three-dimensional (3D) fold prediction of the amino acid sequence. We present a new method for protein fold recognition through optimally aligning an amino acid sequence and a protein fold template (protein threading). The fitness of aligning an amino acid sequence with a fold template is measured by (1) the singleton fitness, representing the compatibility of substituting one amino acid by another and the combined preference of secondary structure and solvent accessibility for a particular amino acid, (2) the pairwise interaction, representing the contact preference between a pair of amino acids, and (3) alignment gap penalties. Though a protein threading problem so defined is known to be NP-hard in the most general sense, our algorithm runs efficiently if we place a cutoff distance on the pairwise interactions, as many of the existing threading programs do. For an amino acid sequence of size n and a fold template of size m with M core secondary structures, the algorithm finds an optimal alignment in O(Mn^15C+1 + mn^C+1) time and O(Mn^C+1) space, where C is a (small) nonnegative integer, determined by a particular mathematical property of the pairwise interactions. As a case study, we have demonstrated that C is less than or equal to 4 for about 75% of the 293 unique folds in our protein database, when pairwise interactions are restricted to amino acids ≤ 7Å apart (measured between their beta carbon atoms). An approximation scheme is developed for fold templates with C > 4, when threading requires too much memory and time to be practical on a typical workstation.

Get full access to this article

View all access options for this article.

An Efficient Computational Method for Globally Optimal Threading1

Abstract

ABSTRACT

Get full access to this article