Abstract
This paper proposes a granular computing (GrC) model based on particle swarm K-means optimization (PSKO) that preprocesses skewed class distributions in order to improve classification accuracy on the class imbalance problem. The GrC model acquires knowledge from information granules rather than from numerical data. It also handles multi-dimensional and sparse data by using singular value decomposition and latent semantic indexing (LSI). Data that are both multi-dimensional and sparse can be preprocessed with LSI to reduce the number of dimensions as well as the number of records. Ten benchmark data sets are used to demonstrate the effectiveness of the proposed model. Experimental results indicate that the proposed model achieves better classification performance on both imbalanced and balanced data. In addition, the computational results for prostate cancer prognosis show that the proposed model can support physicians in judging the condition of prostate cancer patients by providing a more accurate survival rate estimate.
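As a rough illustration of the LSI-style reduction mentioned above, the following minimal sketch projects a small sparse matrix onto its leading singular directions with a truncated singular value decomposition. It is not the paper's implementation: the function name `lsi_reduce`, the toy matrix `X`, and the choice `k = 2` are assumptions made only for this example, and the sketch covers the reduction of dimensions, not the reduction of records.

```python
import numpy as np

def lsi_reduce(X, k):
    """Return a k-dimensional representation of the rows of X.

    Hypothetical helper for illustration: computes the thin SVD
    X = U @ diag(s) @ Vt and keeps the k largest singular values.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Scaling the first k left singular vectors by their singular values
    # equals projecting X onto the first k right singular vectors.
    return U[:, :k] * s[:k]

# Toy sparse data (assumed): 5 records, 4 features, mostly zeros.
X = np.array([
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 3.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 4.0, 0.0],
    [0.0, 3.0, 0.0, 0.0],
])

Z = lsi_reduce(X, k=2)  # 5 records, now described by 2 latent dimensions
```

Records with identical feature vectors receive identical reduced coordinates, so a subsequent clustering step could group them into the same granule; how the proposed model forms granules in practice is not specified in this abstract.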
