Abstract
Cross-document person profiling aims to identify and link person entities across Web pages and to extract relevant structured information about them. In this paper, we focus on the core task of the person profiling problem, namely attribute extraction. Existing attribute extraction approaches face several challenges, two of the most important being (i) syntactic and structural variation, and (ii) cross-sentence and cross-document information extraction. To address these challenges and improve on the performance of existing methods, we propose a semantic attribute extraction approach that relies on probabilistic reasoning. Our approach produces structured, meaningful profiles in which the extracted textual facts are linked to their possible actual meanings in a distant ontology. We evaluate our approach on standard profile extraction datasets. Experimental results demonstrate that it outperforms several baselines and state-of-the-art counterparts, indicating that it is a promising solution to the problem of person profiling.
