Abstract
This paper presents an approach to generating linguistic summaries of graph datasets, with particular focus on the use of ontologies. This well-known mining technique is based on fuzzy set theory, which is used to model natural language words (e.g. ‘many’, ‘tall’), and as a result it generates natural-language-like sentences describing the data. Although intensively developed, before our work this method had been applied only to relational databases, while more and more data is available in the graph model. A special case of such graph datasets is the Semantic Web, in which ontologies provide meaning, thereby enabling advanced machine learning. In this paper we analyze the problem of generating linguistic summaries for graph data with associated ontologies, a case to which the method cannot be directly applied. The key elements of ontologies are concept hierarchies, which are the core of our work. First, because of the heterogeneity of the data and the lack of a schema, we propose to use an ontological concept (including all of its sub-concepts in the hierarchy) as the subject of summaries, and to extract its attributes (neighboring vertices). Then we show that by ascending these ontological concept hierarchies (that is, by attribute-based induction) we obtain additional, generalized summaries. We show this process for both summarizers and qualifiers, and propose an extension to their respective imprecision measures -
