Abstract
Microarray data on gene expression profiles provide valuable answers to a variety of problems and contribute to advances in clinical medicine. Gene expression data typically have high dimensionality and a small sample size, and generally only a relatively small number of genes are strongly correlated with a given phenotype. Feature (gene) selection is therefore crucial for analyzing gene expression profiles and classifying them correctly. Its advantages include effective extraction of the genes that influence classification, elimination of irrelevant genes, and improved classification accuracy. In this paper, we propose a two-stage feature selection method that uses information gain to rank genes, and then combines an improved particle swarm optimization with K-nearest neighbor and support vector machine classifiers to evaluate the classification accuracy of candidate gene subsets. The experimental results show that the proposed method effectively selects relevant gene subsets and achieves higher classification accuracy than methods reported in previous studies.
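
The first stage, ranking genes by information gain, can be illustrated with a minimal Python sketch. The abstract does not say how expression values are discretized, so the median split used here is an assumption, and the function names and toy data are hypothetical. The second stage (improved particle swarm optimization with K-nearest neighbor and support vector machine classifiers) is not shown.

```python
import math
from collections import Counter
from statistics import median


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Information gain from splitting the samples of one gene at `threshold`."""
    low = [y for v, y in zip(values, labels) if v <= threshold]
    high = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    # Weighted entropy of the two partitions; empty partitions contribute nothing.
    remainder = sum(len(part) / n * entropy(part) for part in (low, high) if part)
    return entropy(labels) - remainder


def rank_genes(expression, labels):
    """Return gene indices sorted by descending information gain.

    `expression` is a list of samples, each a list of per-gene values.
    Assumption of this sketch: each gene is binarized at its median.
    """
    n_genes = len(expression[0])
    gains = []
    for g in range(n_genes):
        column = [sample[g] for sample in expression]
        gains.append(information_gain(column, labels, median(column)))
    # sorted() is stable, so genes with equal gain keep their original order.
    return sorted(range(n_genes), key=lambda g: -gains[g])


# Hypothetical toy data: 4 samples, 3 genes; only gene 0 separates the classes.
expression = [
    [0.1, 5.0, 2.0],
    [0.2, 1.0, 2.1],
    [0.9, 5.1, 2.0],
    [1.0, 1.1, 2.1],
]
labels = ["A", "A", "B", "B"]
ranking = rank_genes(expression, labels)
```

In a two-stage pipeline of this kind, only the top-ranked genes from `ranking` would be passed to the search stage, which shrinks the space the optimizer has to explore.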
