Schema induction from incomplete semantic data

Abstract

With the development of the Semantic Web, a growing amount of semantic data, including many useful knowledge bases, has been published on the Web. Such knowledge bases often lack expressive schema information, especially disjointness axioms and subclass axioms. This makes it difficult to perform critical Semantic Web tasks such as ontology reasoning, inconsistency handling and ontology mapping. To deal with this problem, a few approaches have been proposed to generate terminological axioms. However, they often adopt the closed world assumption, which contradicts the open world assumption underlying semantic data. This can introduce many noisy negative examples, so existing learning approaches perform poorly on such incomplete data. In this paper, a novel framework is proposed to automatically obtain disjointness axioms and subclass axioms from incomplete semantic data. The framework first obtains probabilistic type assertions by applying a type inference algorithm. A mining approach based on association rule mining is then used to learn high-quality schema information. To address the incompleteness of semantic data, the mining model introduces novel definitions of support and confidence for pruning false axioms. Our experimental evaluation on several real-life incomplete knowledge bases, such as DBpedia and LUBM, shows promising results in comparison with existing approaches.
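
To make the role of support and confidence concrete, the following is a minimal sketch of how the two measures could be computed for a candidate subclass axiom A ⊑ B over probabilistic type assertions. It is an illustration only: the paper's own definitions are not given in this abstract, so the expected-count formulation, the independence assumption, and all names (`Assertions`, `support`, `confidence`, the sample `data`) are hypothetical.

```python
from typing import Dict

# Hypothetical probabilistic type assertions: entity -> {class name: probability}.
Assertions = Dict[str, Dict[str, float]]


def support(assertions: Assertions, a: str, b: str) -> float:
    """Expected number of entities typed with both a and b.

    Assumption: the two type assertions of an entity are independent,
    so their joint probability is the product of the two probabilities.
    """
    return sum(t.get(a, 0.0) * t.get(b, 0.0) for t in assertions.values())


def confidence(assertions: Assertions, a: str, b: str) -> float:
    """Expected co-occurrence of a and b divided by the expected count of a."""
    total_a = sum(t.get(a, 0.0) for t in assertions.values())
    return support(assertions, a, b) / total_a if total_a > 0 else 0.0


# Toy data, invented for illustration.
data: Assertions = {
    "e1": {"Student": 1.0, "Person": 0.9},
    "e2": {"Student": 0.8, "Person": 1.0},
    "e3": {"Person": 1.0},
    "e4": {"City": 1.0},
}

student_person_conf = confidence(data, "Student", "Person")
person_city_support = support(data, "Person", "City")
```

Under this sketch, a high confidence for (Student, Person) would count as evidence for the subclass axiom Student ⊑ Person, while a near-zero support for (Person, City) would make the pair a candidate for a disjointness axiom. Any thresholds used for pruning would have to be chosen separately.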

Keywords

Ontology learning, knowledge bases, open world assumption, association rule mining, semantic web

