Quan Chu Quoc, Vi Ngo Van

Main Article Content

Abstract

Named entity recognition (NER) is a widely studied task in natural language processing. Recently, a growing number of studies have focused on the nested NER. The span-based methods consider the named entity recognition as span classification task, can deal with nested entities naturally. But they suffer from class imbalance problem because the number of non-entity spans accounts for the majority of total spans. To address this issue, we propose a two stage model for nested NER. We utilize an entity proposal module to filter an easy non-entity spans for efficient training. In addition, we combine all variants of the model to improve overall accuracy of our system. Our method achieves 1st place on the Vietnamese NER shared task at the 8th International Workshop on Vietnamese Language and Speech Processing (VLSP) with F1-score of 62.71 on the private test dataset. For research purposes, our source code is available at https://github.com/quancq/VLSP2021_NER