Abstract
The widespread adoption of technologies such as ChatGPT has transformed information accessibility in Artificial Intelligence (AI), particularly in natural language processing (NLP). This paper explores human-directed AI solutions with particular emphasis on newspaper summarization, a useful tool in today's busy world. Using large language models (LLMs), we examine both extractive and abstractive summarization methods. To maximize LLM performance, our strategy entails creating a customized news dataset enriched with human-centric summaries and applying state-of-the-art data preparation techniques. To improve accuracy assessment, we present a modified evaluation metric called Sem-rouge, which augments established metrics. A comparative analysis shows that the proposed metric captures both syntactic and semantic similarity, making it suitable for extractive and abstractive summarization alike. Through rigorous comparative analysis, we highlight the significance of dataset selection, data processing methods, and assessment criteria in fine-tuning auto-generated summaries. Future work will focus on improving semantic similarity techniques, integrating advanced models such as BERT or the Generative Pre-trained Transformer (GPT), and addressing challenges such as overfitting. Finally, our study emphasizes the importance of meticulously training models and updating them frequently to enhance automated summarization capabilities.
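The abstract does not give Sem-rouge's exact formulation; the following is a minimal sketch of how a metric blending lexical and semantic similarity could look, assuming a simple weighted combination of ROUGE-1 F1 and a cosine-similarity term. All function names and the weighting parameter `alpha` are illustrative, and the bag-of-words cosine here is only a stand-in for a real semantic component such as sentence embeddings.

```python
from collections import Counter
import math

def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 score (ROUGE-1), the lexical component."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

def cosine_similarity(vec_a: Counter, vec_b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(vec_a[k] * vec_b[k] for k in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def sem_rouge(reference: str, candidate: str, alpha: float = 0.5) -> float:
    """Hypothetical blended metric: weighted sum of lexical (ROUGE-1)
    and "semantic" similarity. The semantic term below is a bag-of-words
    cosine placeholder; a real implementation would swap in sentence
    embeddings (e.g. from a BERT-style encoder)."""
    lexical = rouge1_f1(reference, candidate)
    semantic = cosine_similarity(Counter(reference.lower().split()),
                                 Counter(candidate.lower().split()))
    return alpha * lexical + (1 - alpha) * semantic
```

Because an abstractive summary can paraphrase the reference with little word overlap, the lexical term alone under-rewards it; the semantic term is what lets a metric of this shape serve both extractive and abstractive settings.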
