Abstract
The availability of Linked Data on the web offers significant opportunities to enhance data integration and interoperability across diverse systems and applications. Discovering types and building hierarchies is a key task for improving the semantic organisation of Resource Description Framework (RDF) graphs. However, this task is challenging due to entity variability, rich semantic contexts and the frequent absence of explicit type annotations. This article addresses the problem of missing entity types by developing an automatic method for predicting them, ensuring that schemas remain relevant and reflect the structure of the data. Existing solutions do not combine RDF2Vec embeddings with an architecture that captures the hierarchical relationships between entity types. Taking DBpedia, one of the largest datasets in the Linked Data Cloud, as a case study, we present a novel approach based on a sequence-to-sequence architecture, specifically an Encoder–Decoder model with an attention mechanism. This architecture effectively models structured sequences and captures contextual dependencies, both essential for accurate entity typing and for generating semantically meaningful type hierarchies. Our experiments demonstrate that a bidirectional gated recurrent unit (GRU) encoder with an attention mechanism yields the best performance, and our results across several evaluation metrics indicate that this approach provides a practical solution for schema enrichment in Linked Data.
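The attention mechanism referred to above can be illustrated with a minimal dot-product attention step: each encoder hidden state is scored against the current decoder state, the scores are normalised with a softmax, and a weighted sum forms the context vector. This is a hedged NumPy sketch of the general technique, not the authors' implementation; all dimensions and values are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_context(decoder_state, encoder_states):
    """Dot-product attention: score each encoder hidden state against
    the current decoder state, normalise the scores, and return the
    attention-weighted context vector plus the weights."""
    scores = encoder_states @ decoder_state   # shape (T,)
    weights = softmax(scores)                 # shape (T,), sums to 1
    context = weights @ encoder_states        # shape (d,)
    return context, weights

# Toy example: 4 encoder time steps, hidden size 3 (hypothetical values).
enc = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0],
                [1.0, 1.0, 0.0]])
dec = np.array([1.0, 1.0, 0.0])
ctx, w = attention_context(dec, enc)
```

In the article's setting, the encoder states would come from a bidirectional GRU run over an entity's RDF2Vec-derived input sequence, and the context vector would condition each step of the type-sequence decoder.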
