Nguyen Duy Nhat, Do Nguyen Thuan Phong

Abstract

The development of Industry 4.0 worldwide is creating challenges for Artificial Intelligence (AI) in general and Natural Language Processing (NLP) in particular. Machine Reading Comprehension (MRC) is an NLP task with real-world applications that requires machines to determine the correct answers to questions based on a given document. MRC systems must not only answer questions when possible but also recognize when the document supports no answer and abstain from answering. In this paper, we present our system for the VLSP 2021 shared task: Vietnamese Machine Reading Comprehension with UIT-ViQuAD 2.0. To address this task, we propose the MRC4MRC model, which is made up of two separate components that share the same machine reading function. Our MRC4MRC model, based on the XLM-RoBERTa pre-trained language model, achieves an F1-score (F1) of 79.13% and an Exact Match (EM) of 69.72% on the public test set. Our experiments also show that the XLM-RoBERTa language model outperforms the strong PhoBERT language model on UIT-ViQuAD 2.0.
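
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how EM and token-level F1 are commonly computed for extractive MRC with unanswerable questions. It assumes a SQuAD 2.0-style convention in which an empty string stands for "no answer", and it assumes a simple normalization (lowercasing and whitespace collapsing). The function names `normalize`, `exact_match`, and `f1_score` are illustrative, and the official shared-task evaluation script may differ in its details.

```python
from collections import Counter


def normalize(text: str) -> str:
    # Assumption: lowercase and collapse whitespace only; the official
    # evaluation script may apply additional normalization.
    return " ".join(text.lower().split())


def exact_match(prediction: str, gold: str) -> float:
    # 1.0 when the normalized strings are identical, otherwise 0.0.
    # An empty string represents "no answer" (abstention).
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    # If either side is "no answer", score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts tokens shared by both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a system that correctly abstains on an unanswerable question scores 1.0 on both metrics, while a system that returns any text for such a question scores 0.0. Dataset-level scores are then the averages over all questions.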