Viet Cuong TA, Thu Uyen Do


In this paper, we study the efficiency of the Graph Transformer Network for noisy label propagation in the task of classifying anomalous actions in video. Given a weakly supervised dataset, our method focuses on improving the quality of the generated labels and uses those labels to train a deep-network video classifier. Within a full-length video, whether a segment is anomalous can be inferred from its relationship with the other segments. We therefore employ a label propagation mechanism based on the Graph Transformer Network. Our network combines feature-based and temporal relationships to project the output features of the anomalous video into a hidden space. By learning in this new space, the video classifier can improve the quality of the noisy generated labels. Our experiments on three benchmark datasets show that our method is more accurate and more stable than the other tested baselines.
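The idea of refining noisy segment labels through a graph that mixes feature similarity with temporal proximity can be illustrated with a minimal sketch. This is not the Graph Transformer Network studied in the paper: it substitutes classical iterative label propagation over a fixed graph, and the function names (`build_graph`, `propagate`), the equal mixing weight, the exponential temporal decay, and the toy features are all assumptions made only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def build_graph(features, temporal_scale=2.0, mix=0.5):
    """Row-normalised graph mixing feature similarity and temporal proximity.

    Hypothetical helper: `mix` and `temporal_scale` are illustrative
    hyperparameters, not values taken from the paper.
    """
    n = len(features)
    graph = []
    for i in range(n):
        row = []
        for j in range(n):
            # Feature-based edge: non-negative cosine similarity.
            feat_sim = max(cosine(features[i], features[j]), 0.0)
            # Temporal edge: decays with the distance between segment indices.
            temp_sim = math.exp(-abs(i - j) / temporal_scale)
            row.append(mix * feat_sim + (1.0 - mix) * temp_sim)
        total = sum(row)
        graph.append([w / total for w in row])
    return graph


def propagate(noisy, graph, alpha=0.5, steps=10):
    """Classical label propagation: blend neighbour scores with the noisy labels."""
    scores = list(noisy)
    for _ in range(steps):
        smoothed = [sum(w * s for w, s in zip(row, scores)) for row in graph]
        scores = [alpha * sm + (1.0 - alpha) * y for sm, y in zip(smoothed, noisy)]
    return scores


# Toy example: four consecutive segments with 2-D features and noisy 0/1 labels.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
noisy = [0.0, 1.0, 1.0, 1.0]
graph = build_graph(features)
refined = propagate(noisy, graph)
```

Because each row of the graph sums to one, every refined score stays between 0 and 1 and can be read as a softened anomaly label for its segment. A learned model such as the one in the paper would replace the fixed graph weights with attention computed from the data.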