ASR - VLSP 2021: Conformer with Gradient Mask and Stochastic Weight Averaging for Vietnamese Automatic Speech Recognition

Dang Dinh Son; Le Dang Linh; Dang Xuan Vuong; Duong Quang Tien; Ta Bao Thang

Published Jun 30, 2022

How to Cite

SON, Dang Dinh et al. ASR - VLSP 2021: Conformer with Gradient Mask and Stochastic Weight Averaging for Vietnamese Automatic Speech Recognition. VNU Journal of Science: Computer Science and Communication Engineering, [S.l.], v. 38, n. 1, june 2022. ISSN 2588-1086. Available at: <//jcsce.vnu.edu.vn/index.php/jcsce/article/view/322>. Date accessed: 26 aug. 2025.

Issue

Vol 38 No 1: Special Issue: The 8th International Workshop on Vietnamese Language and Speech Processing (VLSP 2021)

Section

Special Issue on Vietnamese Language and Speech Processing (VLSP2021)

Abstract

Recent years have seen strong growth in Automatic Speech Recognition (ASR) research, driven by its wide range of applications. However, little of this effort has targeted the Vietnamese language. This paper introduces an end-to-end approach for Vietnamese ASR that uses a Conformer model and pseudo-labeling. The approach also applies Gradient Mask and Stochastic Weight Averaging to improve training performance. Experimental results show that our method achieved the best performance (8.28% Syllable Error Rate), outperforming all other competitors in Task 1 of the 2021 VLSP Competition on Vietnamese Automatic Speech Recognition.
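
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a Syllable Error Rate could be computed. It assumes that Vietnamese syllables are separated by whitespace and that the rate is the token-level edit distance divided by the number of reference syllables, by analogy with Word Error Rate. The function names `edit_distance` and `syllable_error_rate` are hypothetical, and any text normalization used by the competition's official scoring is not stated here and is not reproduced.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences
    (substitutions, insertions, and deletions each cost 1)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def syllable_error_rate(references: List[str], hypotheses: List[str]) -> float:
    """Corpus-level rate: total edit errors / total reference syllables.

    Assumption: syllables are whitespace-separated and no normalization
    (casing, punctuation) is applied.
    """
    total_errors = 0
    total_syllables = 0
    for ref, hyp in zip(references, hypotheses):
        ref_syllables = ref.split()
        hyp_syllables = hyp.split()
        total_errors += edit_distance(ref_syllables, hyp_syllables)
        total_syllables += len(ref_syllables)
    if total_syllables == 0:
        raise ValueError("references contain no syllables")
    return total_errors / total_syllables


# One deleted syllable out of four reference syllables.
print(syllable_error_rate(["xin chào các bạn"], ["xin chào bạn"]))
```

Under these assumptions, a reported 8.28% would correspond to roughly 8 syllable-level errors per 100 reference syllables.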
