Abstract
Many researchers have shown that both single amino acid composition and dipeptide composition can influence protein thermostability. We use a ν-support vector machine (ν-SVM) approach to classify proteins as hyperthermophilic, thermophilic, or mesophilic from single amino acid composition, dipeptide composition, and the combination of the two. Based on the prediction accuracies, we conclude that single amino acid composition is suitable for predicting mesophilic proteins, whereas dipeptide composition is suitable for predicting hyperthermophilic and thermophilic proteins. When the two feature sets are combined, the prediction accuracy is 84.1% for hyperthermophilic proteins, 83.4% for thermophilic proteins, and 84.4% for mesophilic proteins, with an average accuracy of 84.0%. These results show that protein thermostability can be predicted well from the combination of single amino acid composition and dipeptide composition. The prediction accuracies also indicate that dipeptide composition is significantly correlated with protein thermostability.
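As an illustration of the two feature sets named in the abstract, the following minimal Python sketch computes a 20-dimensional single amino acid composition vector, a 400-dimensional dipeptide composition vector, and their 420-dimensional concatenation. It rests on assumptions not stated in the abstract: compositions are taken as relative frequencies, dipeptides are counted over overlapping adjacent residue pairs, and residues outside the 20 standard amino acids are skipped. The function names are hypothetical.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of standard amino acids.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Relative frequency of each standard amino acid (assumed normalization)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Relative frequency of each overlapping adjacent residue pair.

    Pairs containing a non-standard residue are skipped (an assumption).
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def combined_features(seq):
    """Concatenate both compositions into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)
```

Vectors of this kind could then be passed to any ν-SVM implementation for three-class training; the abstract does not say which implementation or kernel was used.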
