Abstract
Distant supervision is a widely applied approach in the field of relation extraction, as it can automatically generate large amounts of labeled training data with minimal manual effort. However, the generated corpus may contain many false positive instances, which hurt the performance of relation extraction. Moreover, traditional distantly supervised approaches rely on hand-crafted features and complicated natural language processing (NLP) preprocessing, which may also degrade performance. To address these two shortcomings, we propose a novel Long Short-Term Memory (LSTM) network integrated with multi-instance learning. Our approach learns and extracts features automatically from the data itself, and treats distant supervision as a multi-instance learning problem to mitigate the effect of false positive instances. Experimental results demonstrate that our proposed approach is effective and achieves better performance than traditional methods.
