LEAF: Learning Endoscopy with Ablation-Aware Features for Colorectal Cancer Classification
Abstract
Colorectal cancer screening pipelines increasingly rely on automatic video analysis to
assist endoscopists, yet existing systems typically specialize in a single task and generalize poorly
outside their training domains. This paper presents LEAF (Learning Endoscopy with Ablation-aware
Features), a backbone-agnostic multi-task learning framework that jointly predicts fine-grained lesion
classes, pathological severity, and anatomical regions from still frames. On the public CRCCD V1
dataset, LEAF achieves 93.86% accuracy and a 93.78% F1 score when fine-tuned from ImageNet-pretrained weights, outperforming the strongest single-task baselines by up to 2.1 absolute points. When trained from scratch,
LEAF reaches 77.44% accuracy on the external HyperKvasir benchmark, surpassing all single-task
counterparts by 1.9–6.4 points. Our contributions include (i) a flexible hard-parameter-sharing architecture with empirically calibrated loss weights that generalizes across backbone choices, (ii) a comprehensive benchmark covering EfficientNet, ResNet, DenseNet, and Swin Transformer baselines,
and (iii) a cross-dataset evaluation protocol demonstrating improved generalization. The framework
delivers consistent accuracy gains over single-task learners while maintaining computational efficiency suitable for clinical deployment.
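The hard-parameter-sharing design described above can be sketched as a shared trunk feeding three task heads whose losses are combined with fixed weights. The sketch below is illustrative only: the trunk, head sizes, class counts, and loss weights are placeholder assumptions, not the paper's actual LEAF configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared trunk: a single linear projection standing in for the backbone
# (EfficientNet, ResNet, DenseNet, or Swin in the paper's benchmark).
W_trunk = rng.normal(scale=0.1, size=(64, 16))

# One linear head per task; the class counts here are hypothetical.
heads = {"lesion": rng.normal(scale=0.1, size=(16, 6)),
         "severity": rng.normal(scale=0.1, size=(16, 3)),
         "region": rng.normal(scale=0.1, size=(16, 4))}
# Placeholder loss weights; the paper calibrates these empirically.
loss_weights = {"lesion": 1.0, "severity": 0.5, "region": 0.5}

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, y):
    return -np.log(probs[np.arange(len(y)), y] + 1e-12).mean()

x = rng.normal(size=(8, 64))          # a batch of 8 pooled frame features
feat = np.maximum(x @ W_trunk, 0.0)   # shared representation (ReLU)
labels = {"lesion": rng.integers(0, 6, 8),
          "severity": rng.integers(0, 3, 8),
          "region": rng.integers(0, 4, 8)}

# Total objective: weighted sum of per-task cross-entropies on the
# shared features -- the essence of hard parameter sharing.
total = sum(loss_weights[t] * cross_entropy(softmax(feat @ heads[t]), labels[t])
            for t in heads)
print(total > 0 and np.isfinite(total))
```

Because all three heads read the same trunk features, gradients from every task update the shared parameters, which is the mechanism credited with the cross-task and cross-dataset gains reported above.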
Keywords: Medical image analysis, endoscopic imaging, colorectal cancer, deep learning,
multi-task learning