ViReader: A Wikipedia-based Vietnamese reading comprehension system using transfer learning

Abstract

Machine reading comprehension has attracted significant interest in natural language understanding research, and large-scale datasets and neural network-based methods have been developed for the task. However, most resources and methods for machine reading comprehension have been developed for two resource-rich languages, English and Chinese. This article proposes ViReader, a system for open-domain machine reading comprehension in Vietnamese that uses Wikipedia as its textual knowledge source: the answer to any question is a text span taken directly from Vietnamese Wikipedia. The system combines a sentence retriever, which uses information retrieval techniques to extract relevant sentences, with a transfer learning-based answer extractor trained to predict answers from Wikipedia texts. The sentence retriever retrieves the sentences most likely to contain the answer to the given question. The answer extractor then reads the document from which the sentences were retrieved, predicts the answer, and returns it to the user. Experiments on multiple machine reading comprehension datasets in Vietnamese and other languages demonstrate that (1) ViReader is highly competitive with prevalent machine learning-based systems, and (2) combining the sentence retriever and the answer extractor through multi-task learning yields an end-to-end reading comprehension system. ViReader achieves new state-of-the-art performance, with 70.83% EM (exact match) and 89.54% F1, outperforming the BERT-based system by 11.55% and 9.54%, respectively. It also obtains state-of-the-art performance on UIT-ViNewsQA (another Vietnamese dataset, consisting of online health-domain news) and BiPaR (a bilingual dataset of English and Chinese novel texts). Compared with the BERT-based system, our system improves F1 on the BiPaR dataset by 7.65% for English and 6.13% for Chinese. Furthermore, we build a ViReader application programming interface (API) that programmers can use in artificial intelligence applications.
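
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how exact match (EM) and token-level F1 are commonly computed for span-extraction reading comprehension. It is an illustration, not the evaluation script used for ViReader: the function names are hypothetical, and the normalization (lowercasing and whitespace tokenization) is an assumption that may differ from the normalization applied in the reported experiments.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase and split on whitespace (assumed normalization, not the paper's)."""
    return text.lower().split()


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the normalized prediction equals the normalized gold answer."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over shared tokens."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both the prediction and the gold answer.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = "thành phố Hà Nội"
    prediction = "Hà Nội"
    # The prediction is a partial span: it fails exact match but earns partial F1.
    print(exact_match(prediction, gold))
    print(round(f1_score(prediction, gold), 4))
```

In this example the predicted span covers two of the four gold tokens, so precision is 1.0 and recall is 0.5. This is why F1 scores are typically higher than EM scores for the same system. Dataset-level scores are the averages of these per-question values.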

Keywords

Machine reading comprehension, question answering, transfer learning, sentence transformer
