Enhancing Suicide Risk Classification: A Multi-Stage Framework with Sentence-Level Waterfall Architecture for Clinical Notes Analysis
Abstract
Suicide remains a leading cause of preventable death worldwide, yet current computational models for suicide risk assessment often struggle to filter clinically irrelevant information from
electronic health records (EHRs) and to provide interpretable outputs that clinicians can act upon with
confidence. We present a multi-stage framework with a novel sentence-level waterfall architecture
that incrementally filters irrelevant and conflicting content while preserving key suicide-related indicators. This design enables direct linkage between predictions and specific textual evidence, offering
transparent, sentence-level reasoning for each classification decision. At the hospital-stay level, the
framework integrates sentence-level outputs through two complementary strategies, cascading inference and a generative language model, providing comprehensive assessments that reflect temporal
changes and multiple clinical perspectives.
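As a rough illustration of the waterfall idea described above, the following minimal sketch passes sentences through successive stages, each of which discards content before the next stage runs, and returns the surviving sentences as evidence for the stay-level label. It is not the framework's implementation: the keyword lists, the function names, the `no_evidence` label, and the "any positive sentence wins" aggregation rule are all assumptions made for illustration, standing in for the learned classifiers and the cascading or generative aggregation strategies.

```python
from typing import List, Tuple

# Hypothetical keyword lists: toy stand-ins for learned sentence classifiers.
RELEVANT_TERMS = ("suicide", "overdose", "self-harm")
NEGATION_TERMS = ("denies", "no evidence of", "no history of")


def is_relevant(sentence: str) -> bool:
    """Stage 1: keep only sentences that mention a suicide-related term."""
    text = sentence.lower()
    return any(term in text for term in RELEVANT_TERMS)


def classify_sentence(sentence: str) -> str:
    """Stage 2: label a relevant sentence as 'positive' or 'negative'."""
    text = sentence.lower()
    if any(term in text for term in NEGATION_TERMS):
        return "negative"
    return "positive"


def predict_stay(sentences: List[str]) -> Tuple[str, List[str]]:
    """Run the stages in order and aggregate to one label per hospital stay.

    Returns the stay-level label together with the sentences that support it,
    so every prediction can be traced back to specific text.
    """
    # Each stage only sees what the previous stage let through.
    relevant = [s for s in sentences if is_relevant(s)]
    labelled = [(s, classify_sentence(s)) for s in relevant]

    # Assumed aggregation rule: any positive sentence makes the stay positive.
    positives = [s for s, label in labelled if label == "positive"]
    if positives:
        return "positive", positives
    negatives = [s for s, label in labelled if label == "negative"]
    if negatives:
        return "negative", negatives
    return "no_evidence", []


if __name__ == "__main__":
    notes = [
        "Patient admitted after intentional overdose.",
        "Denies suicide ideation today.",
        "Diet: regular.",
    ]
    label, evidence = predict_stay(notes)
    print(label, evidence)
```

Returning the evidence sentences alongside the label is what makes the output inspectable: a reviewer can check which sentences survived the filters rather than trusting a single document-level score.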
Evaluation on the benchmark ScAN suicide attempt dataset demonstrates substantial gains over
existing models, achieving a macro F1-score of 0.93 and particularly strong improvements in challenging categories, raising F1-scores for unsure and negative cases from 0.52 to 0.83. Detailed ablation and error analyses confirm that the sentence-level waterfall design is essential for both predictive
accuracy and clinical interpretability, highlighting its robustness against noisy, heterogeneous EHR
data. The source code for this study is available at https://github.com/phuongmt3/ESAP.
Keywords: Depression, Suicide Attempt, Mental Health, Electronic Health Records, Clinical Notes Analysis, Natural Language Processing