Abstract
This paper analyzes existing decision tree classification algorithms and finds that those based on the variable precision rough set (VPRS) model achieve comparatively better classification accuracy and tolerate noisy data. However, when constructing a decision tree, existing VPRS-based algorithms have two shortcomings: attribute selection is difficult, and the classification accuracy of the resulting tree is still not high. This paper therefore proposes an improved VPRS-based decision tree algorithm (IVPRSDT). The algorithm uses a new attribute selection criterion that jointly considers classification accuracy and the number of attribute values, namely weighted roughness and complexity. In addition, support and confidence are introduced into the conditions under which a node stops splitting, which improves the algorithm's generalization ability. To reduce the impact of noisy data and missing values, IVPRSDT predicts labels with a matching-based method. Comparative experiments on twelve data sets from the UCI Machine Learning Repository show that IVPRSDT effectively improves classification accuracy.
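The stopping condition based on support and confidence can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: support is taken as the fraction of the training set that reaches the node, confidence as the share of the node's majority class, and the function names and threshold values (`min_support`, `min_confidence`) are hypothetical rather than the authors' definitions.

```python
from collections import Counter


def node_stats(labels, n_total):
    """Return (support, confidence, majority_label) for one tree node.

    Assumed definitions (not taken from the paper):
      support    = samples at the node / samples in the training set
      confidence = samples of the majority class / samples at the node
    """
    if not labels:
        raise ValueError("node has no samples")
    counts = Counter(labels)
    majority_label, majority_count = counts.most_common(1)[0]
    support = len(labels) / n_total
    confidence = majority_count / len(labels)
    return support, confidence, majority_label


def should_stop(labels, n_total, min_support=0.05, min_confidence=0.9):
    """Stop splitting when the node is too small or already pure enough.

    Thresholds are illustrative defaults, not values from the paper.
    """
    support, confidence, _ = node_stats(labels, n_total)
    return support <= min_support or confidence >= min_confidence
```

Under these assumptions, a node holding 9 samples of one class and 1 of another out of 100 training samples would stop splitting (confidence 0.9), which lets the tree tolerate a small amount of label noise instead of growing a branch for a single outlier.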
