Abstract
Feature selection is a crucial data pre-processing step in classification problems. The wrapper approach is widely used because of its good classification performance. However, it is computationally expensive because of the cross-validation scheme used in the evaluation phase. To address this problem, this paper proposes a novel hybrid two-stage feature selection method based on differential evolution (HTSDE). In the first stage, a cluster validity index named the DB index is employed to evaluate feature subsets, and in the second stage the wrapper approach is used to improve the classification accuracy of those subsets. To find globally optimal feature subsets, different trial vector generation strategies of DE are used in the two stages: the first stage focuses on global exploration and the second stage emphasizes fast convergence. The hybrid method combines the advantages of the DB index and the wrapper approach, improving the computational efficiency of the wrapper approach while maintaining its classification performance. HTSDE is compared with several state-of-the-art feature selection methods on 12 datasets. Experimental results show that the proposed HTSDE achieves higher classification accuracy than both wrapper and filter approaches. Moreover, its computational cost is much lower than that of the wrapper approaches.
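The two-stage idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: "DB index" is read as the Davies-Bouldin index computed with class labels as clusters, candidates are real vectors thresholded at 0.5 into feature masks, stage one uses DE/rand/1/bin and stage two uses DE/best/1/bin, and a leave-one-out 1-NN error stands in for the cross-validated wrapper. All function names and parameter values are hypothetical.

```python
import numpy as np

def db_index(X, y):
    """Davies-Bouldin index with class labels as clusters (lower is better)."""
    classes = np.unique(y)
    cents = np.array([X[y == c].mean(axis=0) for c in classes])
    scatter = np.array([np.linalg.norm(X[y == c] - cents[i], axis=1).mean()
                        for i, c in enumerate(classes)])
    dist = np.linalg.norm(cents[:, None] - cents[None, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a class is never compared with itself
    ratio = (scatter[:, None] + scatter[None, :]) / np.maximum(dist, 1e-12)
    return ratio.max(axis=1).mean()

def loo_1nn_error(X, y):
    """Leave-one-out 1-NN error rate (assumed stand-in for the wrapper)."""
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    np.fill_diagonal(d, np.inf)  # exclude each sample from its own neighbours
    return np.mean(y[d.argmin(axis=1)] != y)

def fitness(vec, X, y, score):
    """Threshold a real vector into a feature mask and score the subset."""
    mask = vec > 0.5
    if not mask.any():
        return np.inf  # an empty subset is never acceptable
    return score(X[:, mask], y)

def de_stage(pop, X, y, score, strategy, gens, rng, F=0.5, CR=0.9):
    """One DE stage; strategy is 'rand/1' (exploration) or 'best/1' (convergence)."""
    fit = np.array([fitness(v, X, y, score) for v in pop])
    n, d = pop.shape
    for _ in range(gens):
        best = pop[fit.argmin()].copy()
        for i in range(n):
            others = [j for j in range(n) if j != i]
            r1, r2, r3 = rng.choice(others, 3, replace=False)
            base = pop[r1] if strategy == "rand/1" else best
            mutant = np.clip(base + F * (pop[r2] - pop[r3]), 0.0, 1.0)
            cross = rng.random(d) < CR
            cross[rng.integers(d)] = True  # keep at least one mutant gene
            trial = np.where(cross, mutant, pop[i])
            f = fitness(trial, X, y, score)
            if f <= fit[i]:  # greedy one-to-one selection
                pop[i], fit[i] = trial, f
    return pop, fit

def htsde_sketch(X, y, pop_size=12, gens1=10, gens2=5, seed=0):
    """Stage 1: cheap DB-index search. Stage 2: wrapper refinement."""
    rng = np.random.default_rng(seed)
    pop = rng.random((pop_size, X.shape[1]))
    pop, _ = de_stage(pop, X, y, db_index, "rand/1", gens1, rng)
    pop, fit = de_stage(pop, X, y, loo_1nn_error, "best/1", gens2, rng)
    return pop[fit.argmin()] > 0.5, fit.min()
```

Calling `htsde_sketch(X, y)` on a feature matrix `X` and a label vector `y` returns a boolean feature mask and its wrapper error. In this sketch the expensive wrapper evaluation runs only during the shorter second stage, which is where the efficiency gain described above would come from.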
