WAGRank: A word ranking model based on word attention graph for keyphrase extraction

Abstract

Keyphrase extraction, the task of identifying representative words or phrases, is essential in document processing. Most traditional models rely on the frequency of each word in a document and its associated corpus. Word frequency methods have two major limitations: first, as bag-of-words methods, they fail to fully exploit the semantic information in the document; second, when the associated corpus is unavailable or incomplete, they tend to be skewed by local word frequency in the short current text. This paper proposes WAGRank, a novel unsupervised ranking model built on a word attention graph, in which nodes are words and edges are semantic relations between words. To assign edge weights, two interpretable statistical methods for assessing the correlation strength between words are designed using the attention mechanism. Rather than depending only on word frequency in the current text, WAGRank draws on word semantics, using external knowledge stored in a pre-trained language model. WAGRank was evaluated on two publicly available datasets against twelve baselines, demonstrating its effectiveness and robustness. In addition, a Granger causality test showed that word attention has a statistically significant predictive effect on word frequency, providing a more reasonable explanation for word frequency analysis.
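
As a rough illustration of ranking words on an attention-weighted graph, the following minimal sketch runs a weighted PageRank-style iteration over a small word-to-word matrix. It is not the authors' implementation: the abstract does not specify the two edge-weighting methods or the ranking update, so the function name `rank_words`, the damping factor, and the hand-written toy attention values are all assumptions. In practice the matrix would be derived from the attention weights of a pre-trained language model.

```python
import numpy as np


def rank_words(words, attention, damping=0.85, iterations=100, tol=1e-9):
    """Rank words by a weighted PageRank-style score on an attention graph.

    attention[i][j] is an assumed, non-negative weight for how strongly
    word i attends to word j (hypothetical input, not the paper's formula).
    """
    weights = np.asarray(attention, dtype=float).copy()
    np.fill_diagonal(weights, 0.0)  # drop self-loops
    n = len(words)

    # Row-normalise so each word distributes its score over the words it
    # attends to; rows with no outgoing weight fall back to uniform.
    row_sums = weights.sum(axis=1, keepdims=True)
    transition = np.divide(
        weights,
        row_sums,
        out=np.full_like(weights, 1.0 / n),
        where=row_sums > 0,
    )

    # Power iteration with damping, stopping once the scores settle.
    scores = np.full(n, 1.0 / n)
    for _ in range(iterations):
        updated = (1.0 - damping) / n + damping * (transition.T @ scores)
        converged = np.abs(updated - scores).sum() < tol
        scores = updated
        if converged:
            break

    order = np.argsort(-scores)
    return [(words[i], float(scores[i])) for i in order]


# Toy example with made-up attention values (assumption, for illustration).
words = ["keyphrase", "extraction", "the", "graph"]
attention = [
    [0.00, 0.60, 0.10, 0.30],
    [0.70, 0.00, 0.05, 0.25],
    [0.40, 0.30, 0.00, 0.30],
    [0.50, 0.40, 0.10, 0.00],
]
ranked = rank_words(words, attention)
for word, score in ranked:
    print(f"{word}: {score:.3f}")
```

In this toy graph the function word receives little attention from the other words and therefore ends up with the lowest score, even though a pure frequency count in a short text could rank it highly.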

Keywords

Keyphrase extraction; attention mechanism; graph-based model; pre-trained language model; semantic feature

