Thu Trang Nguyen, Phuc Bao Pham, Son Nguyen, Hieu Dinh Vo

Abstract: Accurately localizing vulnerable statements is critical for ensuring software security.
Multiple vulnerability localization techniques have been proposed and have demonstrated promising
results. However, their effectiveness is often limited by quality issues in the training data, such
as label noise and class imbalance, which are inherent in real-world datasets. To address these
challenges, this paper introduces VL-Refine, a local post-hoc refinement approach designed to enhance
the robustness and reliability of existing vulnerability localization techniques. VL-Refine operates
on top of the initial vulnerability predictions produced by any localization model and applies a local
verification mechanism to validate and refine the vulnerability assessment for each code statement.
To evaluate VL-Refine, we conduct extensive experiments in both function-level
and commit-level settings. Our experimental results show that VL-Refine consistently enhances the
performance of multiple state-of-the-art vulnerability localization methods. Notably, VL-Refine can
improve classification accuracy by up to 46% and enable developers to discover up to 43% more
vulnerabilities under a fixed inspection effort.
Keywords: Vulnerability localization, software security, label noise, class imbalance.