Abstract
Rule learning extracts knowledge from a dataset and represents it in a form that is easy for people to understand. RIPPER (Repeated Incremental Pruning to Produce Error Reduction) and PART (Partial Decision Trees) are two well-known rule-learning schemes. However, owing to RIPPER's overpruning and PART's skew sensitivity, it is difficult to use either method to learn from imbalanced datasets. To bypass these difficulties, we propose a K-L divergence-based PART (KLPART) that uses Kullback-Leibler (K-L) divergence as the splitting criterion when building partial decision trees. An experimental study is carried out over a wide range of imbalanced datasets, comparing RIPPER, PART, and KLPART, as well as the combination of each method with SMOTE preprocessing. The results, contrasted through nonparametric statistical tests, show that KLPART is robust in the presence of class imbalance, especially when combined with SMOTE. We therefore recommend using KLPART with SMOTE when learning from imbalanced datasets.
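The abstract does not spell out how the K-L divergence criterion scores a candidate split, so the following is only an illustrative sketch under an assumption: that each child's class distribution is compared against the parent node's distribution, with children weighted by size, so that splits which separate the classes more sharply receive higher scores. All function names (`kl_divergence`, `kl_split_score`) are hypothetical, not from the paper.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """K-L divergence D(p || q) between two discrete distributions.

    A small eps guards against log(0) and division by zero.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def class_distribution(labels, classes):
    """Relative frequency of each class in a list of labels."""
    n = len(labels)
    return [labels.count(c) / n for c in classes]

def kl_split_score(parent_labels, children_labels, classes):
    """Size-weighted K-L divergence of each child's class distribution
    from the parent's distribution; higher means a sharper separation.
    (Hypothetical scoring rule, sketched for illustration only.)
    """
    parent = class_distribution(parent_labels, classes)
    n = len(parent_labels)
    return sum(
        len(child) / n * kl_divergence(class_distribution(child, classes), parent)
        for child in children_labels
        if child  # skip empty children
    )

# A pure split scores higher than one that leaves both children mixed:
parent = [0, 0, 1, 1]
pure_score = kl_split_score(parent, [[0, 0], [1, 1]], classes=[0, 1])
mixed_score = kl_split_score(parent, [[0, 1], [0, 1]], classes=[0, 1])
```

Under such a criterion, a split whose children have the same class mix as the parent scores zero, which is one plausible way a divergence-based measure could reduce the sensitivity to skewed class priors that the abstract attributes to PART.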