Abstract
The task of discovering sets of good rules from imbalanced class datasets may not come easy for existing class association rule mining algorithms. The reason is that they often generate rules belonging to the dominant classes. For example, in medical applications, some symptoms of illness are not popular, and the doctors are very interested in the rules associated with these symptoms. This paper proposes a novel approach for mining class association rules (CARs) in imbalanced class datasets. Firstly, assuming there are n given classes, the training dataset is split into n corresponding groups. For each group, the data is clustered by the k-means algorithm into k groups where the value of k is equal to the number of records of the smallest group. Secondly, we combine all records from the groups after clustering and use the CAR-Miner-Diff algorithm to mine all CARs. We also propose an iterative method to get a highly accurate classifier. From experiments, we show that the proposed approach outperforms existing algorithms while maintaining a large number of useful rules in the classifier.
Keywords
Get full access to this article
View all access options for this article.
