A novel focal-loss and class-weight-aware convolutional neural network for the classification of in-text citations

Abstract

We argue that citations serve different purposes and functions and therefore should not all be treated alike. Using a large dataset of about 10K citation contexts, annotated by human experts and extracted from the Association for Computational Linguistics repository, we present a deep learning-based architecture for citation context classification. Unlike existing state-of-the-art feature-based citation classification models, our proposed convolutional neural network (CNN) with fastText-based pre-trained embedding vectors takes only the citation context as input, yet it outperforms them in both binary (important and non-important) and multi-class (Use, Extends, CompareOrContrast, Motivation, Background, Other) citation classification tasks. Furthermore, we propose using focal-loss and class-weight functions in the CNN model to overcome the inherent class imbalance in citation classification datasets. We show that using the focal-loss function with the CNN multiplies the cross-entropy loss by a factor of $(1 - p_{t})^{\gamma}$. Our model improves on the baseline results, achieving an encouraging 90.6 F1 score with 90.7% accuracy on the binary task and a 72.3 F1 score with 72.1% accuracy on the multi-class task.
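To make the role of the factor $(1 - p_{t})^{\gamma}$ concrete, the following is a minimal sketch of a class-weighted focal loss in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `focal_loss`, the default `gamma=2.0`, and the way class weights are applied (a per-class multiplier on each example's loss) are all assumptions made for this example.

```python
import math


def focal_loss(probs, labels, gamma=2.0, class_weights=None):
    """Mean focal loss over a batch (illustrative sketch, not the paper's code).

    probs: list of per-example class-probability lists (each sums to 1).
    labels: list of integer class indices, one per example.
    gamma: focusing parameter (assumed default); gamma = 0 gives cross-entropy.
    class_weights: optional list of per-class weights (assumed weighting scheme).
    """
    total = 0.0
    for row, label in zip(probs, labels):
        # p_t is the probability assigned to the true class, clipped to avoid log(0).
        p_t = min(max(row[label], 1e-12), 1.0)
        cross_entropy = -math.log(p_t)
        # The modulating factor shrinks the loss of well-classified examples.
        loss = ((1.0 - p_t) ** gamma) * cross_entropy
        if class_weights is not None:
            loss *= class_weights[label]
        total += loss
    return total / len(labels)
```

With `gamma = 0` the function reduces to ordinary cross-entropy. For an example classified correctly with probability 0.9, `gamma = 2` scales its loss by (1 - 0.9)^2 = 0.01, so confidently classified examples from frequent classes contribute far less to the total than hard or rare ones.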

Keywords

Citation classification, class imbalance, convolutional neural network, deep learning, focal loss

