Abstract
Distant supervision is a widely applied approach in the field of relation extraction, as it can automatically generate large amounts of labeled training data with minimal manual effort. However, the generated corpus may contain many false positive instances, which hurt the performance of relation extraction. Moreover, traditional distantly supervised approaches rely on hand-crafted features and complicated natural language processing (NLP) preprocessing, which may also degrade performance. To address these two shortcomings, we propose a novel Long Short-Term Memory (LSTM) network integrated with multi-instance learning. Our approach learns and extracts features automatically from the data itself, and treats distant supervision as a multi-instance learning problem to mitigate the effect of false positive instances. Experimental results demonstrate that our proposed approach is effective and achieves better performance than traditional methods.
