🤖 AI Summary
This study addresses the challenge of under-identification and under-reporting of gender-based violence (GBV) in clinical settings, where implicit cues in narrative records often evade capture by structured data. To tackle this issue, the authors introduce FrameNet semantic frames—applied for the first time to GBV-related clinical text—to semantically annotate free-text electronic health records. They develop a detection model using support vector machines (SVMs) that integrates these semantic representations with conventional structured demographic features. Comparative experiments demonstrate that incorporating semantic information substantially improves performance, yielding an F1-score increase of over 0.3 compared to models relying solely on demographic variables. These findings validate the efficacy of semantic modeling of clinical narratives for early GBV detection and offer a novel pathway for enhancing public health interventions.
📝 Abstract
Gender-based violence (GBV) is a major public health issue, with the World Health Organization estimating that one in three women experiences physical or sexual violence by an intimate partner during her lifetime. In Brazil, although healthcare professionals are legally required to report such cases, underreporting remains significant due to difficulties in identifying abuse and limited integration between public information systems. This study investigates whether FrameNet-based semantic annotation of open-text fields in electronic medical records can support the identification of patterns of GBV. We compare the performance of an SVM classifier for GBV cases trained on (1) frame-annotated text, (2) annotated text combined with parameterized data, and (3) parameterized data alone. Quantitative and qualitative analyses show that models incorporating semantic annotation outperform categorical models, achieving over 0.3 improvement in F1 score and demonstrating that domain-specific semantic representations provide meaningful signals beyond structured demographic data. The findings support the hypothesis that semantic analysis of clinical narratives can enhance early identification strategies and support more informed public health interventions.