Abstract
Data reduction can improve the generalization ability of a learning model and shorten its learning time, which makes it particularly helpful when analyzing big data sets. This paper focuses on machine learning from examples with data reduction. Data reduction is carried out by selecting relevant instances, called prototypes. The approach is based on the assumption that prototypes are selected by a team of agents and that they are drawn from clusters of instances under the constraint that a single prototype is obtained from each cluster. A kernel-based fuzzy clustering algorithm is used for cluster initialization. The main feature of the proposed approach is the integration of data reduction with the stacking technique. Stacked generalization ensures diversification among the prototypes and, hence, among the base classifiers. To validate the proposed approach, we have carried out a computational experiment. We have also experimentally evaluated the influence of the clustering method and of the number of stacking folds on classification accuracy.
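The constraint that each cluster contributes a single prototype can be illustrated with a minimal sketch. This is not the method of the paper: plain k-means stands in for the kernel-based fuzzy clustering, a nearest-to-centroid rule stands in for the agent-based selection, clustering each class separately is an assumption, and the stacking step is not shown. The names `kmeans` and `select_prototypes` are hypothetical.

```python
import math
from collections import defaultdict


def kmeans(points, k, iters=20):
    """Plain k-means (a stand-in, NOT kernel-based fuzzy clustering).

    Initialisation is deterministic: the first k points are the seeds.
    """
    centroids = [tuple(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return assign, centroids


def select_prototypes(X, y, k):
    """Return one (instance, label) prototype per non-empty cluster.

    Assumptions for illustration only: each class is clustered
    separately, and the prototype is the member closest to the
    cluster centroid (the paper uses a team of agents instead).
    """
    by_class = defaultdict(list)
    for x, label in zip(X, y):
        by_class[label].append(tuple(x))

    prototypes = []
    for label, pts in by_class.items():
        kk = min(k, len(pts))  # never ask for more clusters than points
        assign, centroids = kmeans(pts, kk)
        for c in range(kk):
            members = [p for p, a in zip(pts, assign) if a == c]
            if members:
                proto = min(members, key=lambda p: math.dist(p, centroids[c]))
                prototypes.append((proto, label))
    return prototypes


# Toy data: two well-separated classes, one cluster per class.
X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
y = [0, 0, 0, 1, 1, 1]
print(select_prototypes(X, y, k=1))  # → [((0, 0), 0), ((10, 10), 1)]
```

The reduced training set here contains one original instance per cluster, so its size is controlled by the number of clusters rather than by the number of instances.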
