Abstract
Feature selection is a preprocessing technique in data analysis that reduces the number of features by removing irrelevant, noisy, and redundant data, yielding acceptable classification accuracy. Selecting the best feature subset is a global combinatorial optimization problem. This paper presents a novel optimization algorithm called complementary distribution binary particle swarm optimization (CD-BPSO). CD-BPSO uses a complementary distribution strategy to improve the search capability of binary particle swarm optimization (BPSO): complementary particles facilitate global exploration, while the original particles carry out local exploitation. The strategy introduces new "complementary particles" into the search space. They are generated from half of all particles, selected at random, and replace the selected particles when the fitness of the global best particle has not improved for a number of consecutive iterations. The K-nearest neighbor (K-NN) method with leave-one-out cross-validation (LOOCV) was used to evaluate the quality of the solutions. The proposed method was applied to ten classification problems taken from the literature and compared with other feature selection methods. Experimental results indicate that the complementary strategy improves on BPSO by preventing entrapment in a local optimum. In the feature selection problem, BPSO preserves knowledge of good feature combinations in all the particles, so the swarm can find optimal combinations by following the best particle; the proposed method either obtains higher classification accuracy or uses fewer features than other feature selection methods.
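The two mechanisms named in the abstract, the LOOCV-based fitness and the complementary replacement step, can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: a particle is a 0/1 list marking selected features, the "complement" of a particle is taken to be its bitwise inverse, K is fixed at 1, and the function names `loocv_1nn_accuracy` and `apply_complement` as well as the `stall_limit` threshold mentioned afterwards are hypothetical.

```python
import random


def loocv_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-NN classifier restricted to the
    features whose bit in `mask` is 1 (K = 1 is an assumption)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        # An empty feature subset cannot classify anything.
        return 0.0
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = None, None
        for k in range(len(X)):
            if k == i:
                continue  # leave sample i out of its own neighbor search
            dist = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, y[k]
        if best_label == y[i]:
            correct += 1
    return correct / len(X)


def apply_complement(swarm, rng):
    """Replace a randomly chosen half of the particles, in place, with
    their bitwise complements (assumed meaning of 'complementary').
    Returns the indices of the replaced particles."""
    chosen = rng.sample(range(len(swarm)), len(swarm) // 2)
    for idx in chosen:
        swarm[idx] = [1 - bit for bit in swarm[idx]]
    return chosen
```

In a full optimizer, `apply_complement` would be called only once the global best fitness has failed to improve for `stall_limit` consecutive iterations; the abstract does not state that number, so it would have to be chosen by the user.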
