Abstract
Most extractive multi-document summarization (MDS) methods rely on extracting content-relevant sentences while ignoring the relationships between sentences. In this work, we propose a unified framework for extractive MDS that also considers sentence relationships. We argue that adding a sentence to the summary increases the summary score by the relevance score of the new sentence plus an additional score that depends on the relationships of the new sentence with the other summary sentences. This additional score is quantified by how coherent the new sentence is with respect to the sentences already in the summary. At the same time, the summary score is reduced by a redundancy penalty that depends on the overlap between the new sentence and the existing summary sentences. To find the exact solution, the sentence extraction problem is modeled as an integer linear program. The sentence relevance score is computed from content and surface features of the sentence using a topic model and a regression framework. The relative coherence score is derived from transition probabilities in the entity grid model. Redundancy between sentences is estimated using support vector regression over sentence overlap features. The proposed method is evaluated on DUC datasets for the query-based multi-document summarization task. The DUC 2006 dataset is used as the training and development set for tuning parameters. Experimental results show ROUGE scores comparable to those of state-of-the-art methods, demonstrating the effectiveness of the proposed method.
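The scoring idea described above (relevance plus a coherence bonus minus a redundancy penalty, maximized under a length limit) can be illustrated with a minimal sketch. This is not the paper's formulation: the function names, the pairwise and order-independent treatment of coherence, the length budget, and the exhaustive search used in place of an integer linear programming solver are all assumptions made for illustration, and the scores are taken as given rather than learned.

```python
from itertools import combinations

def summary_score(selected, relevance, coherence, redundancy):
    """Sum of relevance, plus pairwise coherence, minus pairwise redundancy.

    coherence[i][j] and redundancy[i][j] are read only for i < j
    (assumption: both are treated as symmetric pairwise scores).
    """
    score = sum(relevance[i] for i in selected)
    for i, j in combinations(sorted(selected), 2):
        score += coherence[i][j] - redundancy[i][j]
    return score

def best_summary(relevance, coherence, redundancy, lengths, budget):
    """Exhaustive search over all subsets that fit the length budget.

    A toy stand-in for an exact ILP solver; only feasible for a handful
    of sentences because the number of subsets grows exponentially.
    """
    n = len(relevance)
    best, best_score = (), 0.0
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            if sum(lengths[i] for i in subset) > budget:
                continue  # subset exceeds the summary length limit
            s = summary_score(subset, relevance, coherence, redundancy)
            if s > best_score:
                best, best_score = subset, s
    return best, best_score
```

With hypothetical scores in which sentences 0 and 1 are both highly relevant but strongly overlapping, while sentence 2 is less relevant but coherent with sentence 0, the search prefers the pair (0, 2) over the two most relevant sentences. That is the behavior the abstract argues for: relationships between sentences can change which sentences are worth extracting.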
