Abstract
We present a method for incorporating global features in named entity recognizers using reranking techniques and the combination of two state-of-the-art NER learning algorithms: conditional random fields (CRFs) and support vector machines (SVMs). The reranker employs two kinds of features: flat and structured. The former are encoded by a polynomial kernel over entity features, whereas the latter are modeled by tree kernels that capture dependencies among tagged candidate examples. Experiments on two standard corpora in two languages, namely the Italian EVALITA 2009 and the English CoNLL 2003 datasets, show a large improvement over the CRF baseline in F-measure: from 80.34% to 84.33% and from 84.86% to 87.99%, respectively. Our analysis reveals that (i) the two kernels provide comparable improvements over the CRF baseline; and (ii) their combination improves on CRFs by much more than the sum of the individual contributions, suggesting an interesting synergy.
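As an illustration only, the idea of reranking candidate taggings with a combination of a flat kernel and a structured kernel can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the two kernels are combined, so a plain sum is assumed here (a sum of two valid kernels is itself a valid kernel); `fragment_kernel` is a simplified stand-in for a tree kernel that merely counts shared fragments; and all names, feature layouts, and weights (`flat`, `frags`, `alphas`) are hypothetical.

```python
from collections import Counter


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree over flat feature vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def fragment_kernel(frags_a, frags_b):
    """Simplified stand-in for a tree kernel (assumption): counts matching
    fragment pairs between two candidates, with multiplicity."""
    count_a, count_b = Counter(frags_a), Counter(frags_b)
    return sum(count_a[f] * count_b[f] for f in count_a if f in count_b)


def combined_kernel(a, b):
    """Assumed combination: sum of the flat and the structured kernel."""
    return (poly_kernel(a["flat"], b["flat"])
            + fragment_kernel(a["frags"], b["frags"]))


def rerank(candidates, support, alphas):
    """Return the candidate with the highest kernel-machine score
    sum_i alpha_i * K(support_i, candidate)."""
    def score(candidate):
        return sum(alpha * combined_kernel(s, candidate)
                   for s, alpha in zip(support, alphas))
    return max(candidates, key=score)


# Hypothetical toy data: one support example and two candidate taggings
# of the same sentence, as a base tagger such as a CRF might propose.
support = [{"flat": [1.0, 0.0], "frags": ["PER", "LOC"]}]
alphas = [1.0]
candidates = [
    {"flat": [0.0, 1.0], "frags": ["ORG"]},
    {"flat": [1.0, 0.0], "frags": ["PER", "LOC"]},
]
best = rerank(candidates, support, alphas)
```

In a trained reranker the support examples and their weights would come from SVM training over candidate taggings produced by the base tagger; here they are fixed by hand purely to show the scoring step.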
