Abstract
Sample classification is a critical task in microarray data analysis, but the huge search space created by thousands of genes makes it complex and difficult. Handling this problem requires both an efficient gene selection technique and an efficient classifier. In this paper, we propose a multi-criterion Pareto differential evolution technique for feature selection. The technique first applies a wrapper method, a population-based differential evolution gene selection (DEGS) algorithm. Differential evolution is chosen over other learning techniques because it attempts to assign an optimal rank to each gene, based on the probability distribution factor present in the microarray dataset, with classification error as the fitness function. We observe that the resulting selections contain relevant genes as well as some irrelevant ones. In the second phase, therefore, a bi-objective filter technique, Pareto-based optimization, is used to select a minimal number of top-ranked genes. Information gain (IG) and signal-to-noise ratio (SNR) serve as the two objective functions for the Pareto optimization. To verify the importance and relevance of the selected genes, classification is performed using K-nearest neighbour (KNN), naïve Bayesian (NB), artificial neural network (ANN) and support vector machine (SVM) classifiers. Experiments are conducted on four well-known microarray datasets. The results show that the proposed method is better than the existing search method in terms of both classification error and predicted feature sets, and that the SVM classifier performs better than KNN, NB and ANN. Overall, the method performs well in terms of both gene relevance and classification output.
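The second-phase filter can be illustrated with a minimal sketch. It covers only the bi-objective Pareto step, not the DEGS wrapper, and it rests on assumptions not stated in the abstract: a two-class problem with labels 0 and 1, the common SNR form |mu0 - mu1| / (sd0 + sd1), and information gain computed after a median split of each gene. All function names are hypothetical.

```python
import numpy as np

def snr_scores(X, y):
    """Per-gene signal-to-noise ratio, assuming two classes labelled 0 and 1."""
    X0, X1 = X[y == 0], X[y == 1]
    eps = 1e-12  # avoids division by zero for constant genes
    return np.abs(X0.mean(axis=0) - X1.mean(axis=0)) / (X0.std(axis=0) + X1.std(axis=0) + eps)

def entropy(labels):
    """Shannon entropy (in bits) of a label vector."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def info_gain_scores(X, y):
    """Per-gene information gain after a median split (an assumed discretization)."""
    base = entropy(y)
    gains = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        mask = X[:, j] > np.median(X[:, j])
        cond = 0.0
        for part in (y[mask], y[~mask]):
            if part.size:
                cond += part.size / y.size * entropy(part)
        gains[j] = base - cond
    return gains

def pareto_front(scores):
    """Indices of non-dominated rows when every column is to be maximized."""
    keep = []
    for i in range(scores.shape[0]):
        no_worse = np.all(scores >= scores[i], axis=1)
        strictly_better = np.any(scores > scores[i], axis=1)
        if not np.any(no_worse & strictly_better):
            keep.append(i)
    return np.array(keep)

def select_genes(X, y):
    """Keep the genes that are Pareto-optimal with respect to (IG, SNR)."""
    objectives = np.column_stack([info_gain_scores(X, y), snr_scores(X, y)])
    return pareto_front(objectives)
```

In this sketch, `X` would be the samples-by-genes matrix restricted to the genes that survived the first phase, and `select_genes(X, y)` returns the column indices of genes that no other gene beats on both objectives at once.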
