Abstract
Feature selection is a pre-processing method that identifies the significant features in high-dimensional data and reduces the computational cost of the learning algorithm by removing irrelevant and redundant features. It has traditionally been applied to a wide range of problems, including biological data processing, pattern recognition, and computer vision. The aim of this paper is to identify the feature subsets of benchmark datasets that best improve classifier performance. Existing filter-based feature selection approaches fail to choose the relevant features from the original feature set. To obtain a small subset of relevant features, we introduce a novel filter-based feature selection method called ReCFS. The proposed method combines feature-feature correlation with nearest-neighbor-weighted features to find an optimal subset of features with minimal correlation among them. The effectiveness of the feature subset selected by the proposed method is evaluated using two classifiers, Naïve Bayes and K-Nearest Neighbour, on real-life datasets. To assess performance across diverse settings, the experiments are conducted on eight real-life datasets of varied dimensionality and number of instances. The results demonstrate that the proposed method finds promising feature subsets that improve classification accuracy over competing feature selection methods.
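The abstract does not spell out the ReCFS procedure, so the following is only a minimal sketch of the general idea it names: weight features with a nearest-neighbor criterion, then drop features that are strongly correlated with ones already kept. It assumes Relief-style nearest-hit/nearest-miss weights and a Pearson correlation cutoff; the function names, the `max_corr` threshold, and the toy data are all hypothetical and are not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def relief_weights(X, y):
    """Relief-style weights (an assumption, not the paper's exact scheme).

    A feature gains weight when it differs on the nearest instance of another
    class (nearest miss) and loses weight when it differs on the nearest
    instance of the same class (nearest hit).
    """
    n, d = len(X), len(X[0])
    # Per-feature range, used to scale differences into [0, 1].
    span = []
    for j in range(d):
        col = [row[j] for row in X]
        span.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            w[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return w


def select_features(X, y, max_corr=0.8):
    """Greedily keep positively weighted features, skipping any whose absolute
    correlation with an already kept feature reaches max_corr (hypothetical cutoff)."""
    w = relief_weights(X, y)
    cols = [[row[j] for row in X] for j in range(len(X[0]))]
    selected = []
    for j in sorted(range(len(w)), key=lambda j: w[j], reverse=True):
        if w[j] <= 0:
            continue
        if all(abs(pearson(cols[j], cols[s])) < max_corr for s in selected):
            selected.append(j)
    return selected


# Toy data: feature 0 separates the classes, feature 1 is an exact
# multiple of feature 0 (redundant), feature 2 is unrelated noise.
X_demo = [
    [0, 0, 5], [1, 2, 1], [2, 4, 9], [3, 6, 3],
    [7, 14, 4], [8, 16, 8], [9, 18, 2], [10, 20, 6],
]
y_demo = [0, 0, 0, 0, 1, 1, 1, 1]
selected = select_features(X_demo, y_demo)
```

On the toy data, features 0 and 1 are perfectly correlated, so at most one of them survives the redundancy check. A filter of this kind never trains a classifier during selection, which is what distinguishes it from wrapper methods.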
