Abstract
Label noise, which has received comparatively little study, is present in many machine learning problems and negatively affects both the classifier and the feature selection procedure. To address this issue, we propose a novel mutual information estimator that uses Parzen windows and is built on a probabilistic label noise model, making it robust to incorrectly labeled samples. We then use the estimator to build a feature selection algorithm that is robust to label noise. Experiments are conducted on a toy dataset and eight real-world datasets. Classification results with a kNN classifier show that the proposed approach is sound, effectively reduces the influence of label noise, and improves the performance of feature selection in the presence of label noise.
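For readers unfamiliar with Parzen-window mutual information estimation, the following is a minimal sketch of the standard (noise-unaware) estimator that such a method would presumably start from. It is not the estimator proposed in this paper: the probabilistic label noise model is not specified in the abstract and is therefore not implemented here. The function name `parzen_mutual_information`, the Gaussian kernel, and the `bandwidth` parameter are illustrative assumptions.

```python
import numpy as np

def parzen_mutual_information(X, y, bandwidth=1.0):
    """Estimate I(X; C) = H(C) - H(C | X) with a Gaussian Parzen window.

    Illustrative sketch only: this is the plain estimator that trusts
    every label, NOT the label-noise-robust variant from the paper.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                      # treat a 1-D input as one feature
    y = np.asarray(y)
    n = X.shape[0]

    # Map labels to 0..k-1 and compute the class entropy H(C).
    classes, y_idx = np.unique(y, return_inverse=True)
    priors = np.bincount(y_idx) / n
    h_c = -np.sum(priors * np.log(priors))

    # Gaussian kernel between every pair of samples (n x n).
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2))

    # Parzen estimate of p(c | x_i): kernel mass from class c, normalized.
    one_hot = np.eye(len(classes))[y_idx]   # n x k indicator matrix
    class_mass = kernel @ one_hot           # n x k
    posterior = class_mass / class_mass.sum(axis=1, keepdims=True)

    # Conditional entropy H(C | X), averaged over the samples.
    eps = 1e-12                             # avoids log(0)
    h_c_given_x = -np.mean(np.sum(posterior * np.log(posterior + eps), axis=1))

    return h_c - h_c_given_x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.repeat([0, 1], 100)
    informative = labels * 5.0 + rng.normal(0.0, 0.5, size=200)
    irrelevant = rng.normal(0.0, 1.0, size=200)
    print("informative feature:", parzen_mutual_information(informative, labels))
    print("irrelevant feature: ", parzen_mutual_information(irrelevant, labels))
```

A feature selection procedure could rank candidate features by this score and keep the highest-scoring ones. A noise-robust variant would, under the assumption that each sample carries a probability of being mislabeled, replace the hard `one_hot` indicators with soft class weights; the exact form used by the authors is given in the full article, not here.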
