vieCap4H Challenge 2021: A transformer-based method for Healthcare Image Captioning in Vietnamese

Bui Cao Doanh; Trinh Thi Thanh Truc; Nguyen Trong Thuan; Nguyen Duc Vu; Nguyen Duy Vo

doi:10.25073/2588-1086/vnucsce.371

Published Dec 16, 2022

How to Cite

DOANH, Bui Cao et al. vieCap4H Challenge 2021: A transformer-based method for Healthcare Image Captioning in Vietnamese. VNU Journal of Science: Computer Science and Communication Engineering, [S.l.], v. 38, n. 2, dec. 2022. ISSN 2588-1086. Available at: <//jcsce.vnu.edu.vn/index.php/jcsce/article/view/371>. Date accessed: 21 oct. 2025. doi: https://doi.org/10.25073/2588-1086/vnucsce.371.

Issue

Vol 38 No 2: Special Issue: The 8th International Workshop on Vietnamese Language and Speech Processing (VLSP 2021)

Section

Special Issue on Vietnamese Language and Speech Processing (VLSP2021)

Abstract

Automatic image captioning attracts both the Computer Vision and Natural Language Processing research communities because it sits at the intersection of the two fields. We participated in the vieCap4H contest organized by VLSP 2021 and present a Transformer-based solution for image captioning in the healthcare domain. Specifically, we use grid features as the visual representation and pre-train a BERT-based language model, initialized from the PhoBERT-base pre-trained model, to obtain the language representation used by the Adaptive Decoder module of the RSTNet model. We also identify a suitable training schedule for the self-critical sequence training (SCST) technique to achieve the best results. In our experiments, we achieve an average BLEU score of 30.3% on the public-test round and 28.9% on the private-test round, ranking 3rd and 4th, respectively. Source code is available at https://github.com/caodoanh2001/uit-vlsp-viecap4h-solution.
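For readers unfamiliar with SCST, the following is a minimal sketch of the self-critical loss in its generic form, not the authors' implementation. It assumes sentence-level rewards have already been computed for a sampled caption and for a greedy-decoded baseline caption of each image; the function name `scst_loss` and its parameters are hypothetical, and the abstract does not state which reward metric or schedule the authors used.

```python
from typing import Sequence


def scst_loss(
    sample_logprobs: Sequence[Sequence[float]],
    sample_rewards: Sequence[float],
    greedy_rewards: Sequence[float],
) -> float:
    """Generic self-critical loss, averaged over a batch (illustrative only).

    sample_logprobs[i][t] is the log-probability the model assigned to the
    t-th token of the caption it sampled for image i. The rewards are
    sentence-level scores (assumed precomputed, e.g. by a captioning metric)
    for the sampled caption and for the greedy-decoded baseline caption.
    """
    if not (len(sample_logprobs) == len(sample_rewards) == len(greedy_rewards)):
        raise ValueError("all inputs must have the same batch size")
    if len(sample_logprobs) == 0:
        raise ValueError("batch must not be empty")

    total = 0.0
    for logprobs, r_sample, r_greedy in zip(
        sample_logprobs, sample_rewards, greedy_rewards
    ):
        # The greedy caption's reward is the baseline: a sampled caption that
        # beats it has a positive advantage and is reinforced.
        advantage = r_sample - r_greedy
        # REINFORCE-style term: minimizing it raises the log-probability of
        # captions with positive advantage and lowers it otherwise.
        total += -advantage * sum(logprobs)
    return total / len(sample_logprobs)
```

In an automatic-differentiation framework the rewards would be treated as constants, so gradients flow only through the log-probabilities. When the sampled and greedy captions score the same, the term vanishes and that example contributes no update.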
