ASR - VLSP 2021: Automatic Speech Recognition with Blank Label Re-weighting

Ta Bao Thang; Dang Dinh Son; Le Dang Linh; Dang Xuan Vuong; Duong Quang Tien

doi:10.25073/2588-1086/vnucsce.321

Published Jun 30, 2022

How to Cite

THANG, Ta Bao et al. ASR - VLSP 2021: Automatic Speech Recognition with Blank Label Re-weighting. VNU Journal of Science: Computer Science and Communication Engineering, [S.l.], v. 38, n. 1, june 2022. ISSN 2588-1086. Available at: <//jcsce.vnu.edu.vn/index.php/jcsce/article/view/321>. Date accessed: 26 aug. 2025. doi: https://doi.org/10.25073/2588-1086/vnucsce.321.

Issue

Vol 38 No 1: Special Issue: The 8th International Workshop on Vietnamese Language and Speech Processing (VLSP 2021)

Section

Special Issue on Vietnamese Language and Speech Processing (VLSP2021)

Abstract

End-to-end models show significant potential across most languages and have recently proven robust in automatic speech recognition (ASR) tasks. Many robust architectures have been proposed, and among them the Recurrent Neural Network Transducer (RNN-T) has shown remarkable success. However, when spontaneous speech contains background noise or reverberation, this architecture generally suffers from a high rate of deletion errors. To address this, we propose a blank label re-weighting technique to improve the state-of-the-art Conformer transducer model. Our system also adopts Stochastic Weight Averaging to stabilize the training process. Our work ranked first in Task 2 of the VLSP 2021 Competition, with a word error rate of 4.17%.
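
The abstract names two techniques without giving their formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes blank re-weighting means scaling the blank label's probability by a factor below 1 and renormalizing, which makes a transducer less inclined to emit blank and so may reduce deletions. It also assumes the weight averaging is a plain equal-weight mean over saved checkpoints. The names `reweight_blank`, `average_weights`, `blank_id`, and `blank_weight` are hypothetical.

```python
import math


def reweight_blank(log_probs, blank_id=0, blank_weight=0.5):
    """Scale the blank probability by blank_weight and renormalize.

    ASSUMPTION: this multiplicative form is illustrative; the abstract
    does not state how the blank label is re-weighted.
    """
    adjusted = list(log_probs)
    # Multiplying a probability by w is adding log(w) in log space.
    adjusted[blank_id] += math.log(blank_weight)
    # Renormalize with a numerically stable log-sum-exp.
    peak = max(adjusted)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in adjusted))
    return [x - log_norm for x in adjusted]


def average_weights(checkpoints):
    """Equal-weight mean of parameter dicts, the core idea of
    Stochastic Weight Averaging.

    ASSUMPTION: each checkpoint is a dict mapping a parameter name to a
    float; real models hold tensors, but the arithmetic is the same.
    """
    count = len(checkpoints)
    return {
        name: sum(ckpt[name] for ckpt in checkpoints) / count
        for name in checkpoints[0]
    }


if __name__ == "__main__":
    # Toy distribution over [blank, "a", "b"].
    frame = [math.log(p) for p in (0.6, 0.3, 0.1)]
    print([round(math.exp(x), 3) for x in reweight_blank(frame)])
    print(average_weights([{"w": 1.0}, {"w": 3.0}]))
```

In this sketch a `blank_weight` below 1 shifts probability mass from blank to the non-blank labels, and a value of 1 leaves the distribution unchanged. The value that works in practice would have to be tuned on held-out data.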
