Abstract
Protein subcellular localization prediction is receiving considerable attention in protein research. Much effort has gone into single-site protein subcellular localization, but experimental data show that many proteins occur in two or more subcellular locations, which motivates the study of multisite protein subcellular localization. This study used the Gpos-mPLOC data set and three feature extraction methods: pseudo amino acid composition, physicochemical properties of amino acid composition, and entropy density. The extracted features were then fed to a multi-label k-nearest neighbor classifier to predict protein subcellular locations. Under the jackknife test, this approach achieved a localization precision of 66.73%.
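As a rough illustration of the pipeline summarized above, the following minimal sketch combines a feature vector, a multi-label nearest-neighbor vote, and a jackknife (leave-one-out) loop. It is not the implementation used in this study. It assumes plain amino acid composition as a stand-in for the three feature sets, and it replaces the multi-label k-nearest neighbor classifier's estimate of label probabilities with a simple majority vote among neighbors. All function names and the default k are hypothetical.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the 20-dimensional residue frequency vector of a sequence.

    Stand-in feature only (assumption): the study itself uses pseudo amino
    acid composition, physicochemical properties, and entropy density.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def predict_labels(query, train_features, train_labels, k=3):
    """Simplified multi-label k-NN (assumption: majority vote, not full ML-kNN).

    A location is predicted when more than half of the k nearest
    training samples carry it.
    """
    order = sorted(
        range(len(train_features)),
        key=lambda i: math.dist(query, train_features[i]),
    )
    neighbors = order[:k]
    votes = Counter(label for i in neighbors for label in train_labels[i])
    return {label for label, n in votes.items() if n > len(neighbors) / 2}


def jackknife_predictions(features, labels, k=3):
    """Leave-one-out: predict each sample from all remaining samples."""
    predictions = []
    for i in range(len(features)):
        rest_features = features[:i] + features[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        predictions.append(predict_labels(features[i], rest_features, rest_labels, k))
    return predictions
```

Here each protein's locations are held as a Python set, so a multisite protein simply carries more than one label. The majority-vote threshold is a design shortcut chosen for brevity. A faithful multi-label k-nearest neighbor classifier would instead estimate, for each location, how likely it is given the number of neighbors that carry it.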
