Text summarization based on transformer using sentences ranking and time penalty

Abstract

Text summarization systems often struggle with selecting salient content, avoiding repetition, and handling out-of-vocabulary entities. We address these issues with a two-stage approach: a supervised sentence-ranking head (SRM-head) first selects the top-$N$ sentences, and a Transformer generator then produces the summary. The generator is augmented with a time penalty in encoder–decoder attention to discourage re-attending to recently focused source positions, and with a pointer mechanism that copies salient spans, thereby improving entity and number fidelity. Experiments on CNN/DailyMail and WikiHow, plus an additional evaluation on XSum, show that our model attains competitive ROUGE scores against recent pretrained systems while using lightweight, modular components.
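The three components named above can be illustrated with a minimal sketch in plain Python. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the function names, the decayed-history form of the time penalty, the `strength` and `decay` parameters, and the pointer-generator style mixture (in the spirit of See et al.) are hypothetical stand-ins, not the authors' implementation.

```python
import math


def select_top_n(scores, n):
    """Return indices of the n highest-scoring sentences, in document order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n])


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def penalized_attention(logits, history, strength=2.0, decay=0.5):
    """Attend over source positions, penalizing recently attended ones.

    history[j] is a decayed sum of past attention on position j.
    This penalty form is an assumption, not the paper's definition.
    """
    weights = softmax([l - strength * h for l, h in zip(logits, history)])
    new_history = [decay * h + w for h, w in zip(history, weights)]
    return weights, new_history


def mix_with_pointer(p_gen, vocab_dist, attention, source_tokens):
    """Mix a vocabulary distribution with a copy distribution over source tokens."""
    final = {tok: p_gen * p for tok, p in vocab_dist.items()}
    for tok, a in zip(source_tokens, attention):
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * a
    return final


if __name__ == "__main__":
    # Stage 1: keep the two best-ranked sentences (scores are made up).
    print(select_top_n([0.1, 0.9, 0.3, 0.7], 2))

    # Stage 2: the same logits attend less to position 0 on the second step.
    logits, history = [2.0, 1.0, 0.5], [0.0, 0.0, 0.0]
    step1, history = penalized_attention(logits, history)
    step2, history = penalized_attention(logits, history)
    print(step1[0], step2[0])

    # Copying lets an out-of-vocabulary source token receive probability mass.
    dist = mix_with_pointer(0.5, {"the": 0.6, "<unk>": 0.4}, step1,
                            ["Zurich", "the", "2024"])
    print(dist["Zurich"])
```

In this sketch the penalty is subtracted from the attention logits before the softmax, so a position that received most of the attention on one decoding step is less likely to dominate the next step, and the copy term gives source-only tokens such as names and numbers a nonzero output probability.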

Keywords

text summarization, encoder–decoder, Seq–Seq, time penalty mechanism, pointer network

