Abstract
The availability of Linked Data on the web offers significant opportunities to enhance data integration and interoperability across diverse systems and applications. Discovering types and building hierarchies is a key task for improving the semantic organisation of Resource Description Framework (RDF) graphs. However, this task is challenging due to entity variability, rich semantic contexts and the frequent absence of explicit type annotations. This article addresses the problem of missing entity types by developing an automatic method for predicting them, ensuring that schemas remain relevant and reflect the structure of the data. Existing solutions do not combine RDF2Vec embeddings with an architecture that captures the hierarchical relationships between entity types. Taking DBpedia, one of the largest datasets in the Linked Data Cloud, as a case study, we present a novel approach based on a sequence-to-sequence architecture, specifically an Encoder–Decoder model with an attention mechanism. This architecture effectively models structured sequences and captures contextual dependencies, both essential for accurate entity typing and for generating semantically meaningful type hierarchies. Our experiments demonstrate that a bidirectional gated recurrent unit (GRU) encoder with an attention mechanism yields the best performance, and our results across several evaluation metrics indicate that this approach provides a practical solution for schema enrichment in Linked Data.
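The attention mechanism referred to above can be illustrated with a minimal dot-product attention step: each encoder hidden state is scored against the current decoder state, the scores are normalised with a softmax, and a weighted sum forms the context vector. This is a hedged NumPy sketch of the general technique, not the authors' implementation; all dimensions and values are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_context(decoder_state, encoder_states):
    """Dot-product attention: score each encoder hidden state against
    the current decoder state, normalise the scores, and return the
    attention-weighted context vector plus the weights."""
    scores = encoder_states @ decoder_state   # shape (T,)
    weights = softmax(scores)                 # shape (T,), sums to 1
    context = weights @ encoder_states        # shape (d,)
    return context, weights

# Toy example: 4 encoder time steps, hidden size 3 (hypothetical values).
enc = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0],
                [1.0, 1.0, 0.0]])
dec = np.array([1.0, 1.0, 0.0])
ctx, w = attention_context(dec, enc)
```

In the article's setting, the encoder states would come from a bidirectional GRU run over an entity's RDF2Vec-derived input sequence, and the context vector would condition each step of the type-sequence decoder.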
