Abstract
This paper describes empirical research on a data structure named Flow Graph (FG), which can be induced from a supervised training data set. An FG can be viewed as a weighted, labeled digraph that summarizes a given supervised training set to support its analysis. FGs can also serve as a repository of the information embedded in training sets, supporting the extraction of classification rules and hence the definition of classifiers. The paper reviews FGs and related concepts as originally proposed, i.e., as a structure for modeling discrete data, and proposes a customization for dealing with continuous data. The customization consists of a pre-processing step in which discretization is carried out by a two-step hybrid approach named HFG (Hybrid Flow Graph). Several experiments focusing on the classifiers extracted from HFGs were conducted, and their results were analyzed with respect to both the values of some metrics associated with the induced digraph-based structure and the performance of the classifier extracted from that structure. The experiments used 19 diverse datasets, and the classification results were compared with those obtained by classifiers induced by four other algorithms, namely J48, Naïve Bayes, k-Nearest-Neighbor, and Support Vector Machine.
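To make the idea of an FG as a weighted, labeled digraph concrete, the following is a minimal sketch of how such a structure might be induced from a small discrete training set. It is an illustration under stated assumptions, not the construction used in the paper: the layered layout (one layer per attribute, with the class as the last layer), the use of record counts as edge weights, the function names `build_flow_graph` and `normalized`, and the toy records are all hypothetical. The two-step hybrid discretization of HFG is not specified here and is not reproduced.

```python
from collections import Counter


def build_flow_graph(records, attributes, class_name):
    """Count how many records flow between values of consecutive attributes.

    Assumption: nodes are (attribute, value) pairs arranged in layers,
    one layer per attribute, with the class attribute as the final layer.
    """
    layers = list(attributes) + [class_name]
    edges = Counter()
    for rec in records:
        # Each record traces one path with one node per layer.
        path = [(name, rec[name]) for name in layers]
        for src, dst in zip(path, path[1:]):
            edges[(src, dst)] += 1
    return edges


def normalized(edges, n_records):
    """Express each edge weight as a fraction of all training records."""
    return {edge: count / n_records for edge, count in edges.items()}


# Hypothetical toy training set with two discrete attributes and a class.
records = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
]

flow_graph = build_flow_graph(records, ["outlook", "windy"], "play")
weights = normalized(flow_graph, len(records))

for (src, dst), count in sorted(flow_graph.items()):
    print(src, "->", dst, "count:", count)
```

In this sketch, a path from a first-layer node to a class node can be read as a candidate classification rule, with the edge weights indicating how much of the training set supports it.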
