🤖 AI Summary
Focalization annotation in literary narratives is highly subjective: even trained annotators disagree substantially, and labeled data is scarce. Method: The paper systematically investigates the zero-shot capability of large language models (LLMs) for automatic focalization identification, evaluating five state-of-the-art model families (including GPT-4o) with structured prompting and log-probability analysis, and comparing them against rule- and dictionary-based baselines. Contribution/Results: GPT-4o achieves a mean F1 score of 84.79%, comparable to trained human annotators. The log probabilities output by GPT-family models are further used to quantify annotation uncertainty, flagging excerpts that are difficult to label. Applied to sixteen Stephen King novels, the approach surfaces macro-level patterns in how narrative perspective evolves, demonstrating a scalable and interpretable route to automated narrative analysis.
📝 Abstract
Focalization, the perspective through which a narrative is presented, is encoded via a wide range of lexico-grammatical features and is subject to reader interpretation. Even trained annotators frequently disagree on the correct labels, suggesting the task is both qualitatively and computationally challenging. In this work, we test how well five contemporary large language model (LLM) families and two baselines perform when annotating short literary excerpts for focalization. Despite the challenging nature of the task, we find that LLMs show comparable performance to trained human annotators, with GPT-4o achieving an average F1 of 84.79%. Further, we demonstrate that the log probabilities output by GPT-family models frequently reflect the difficulty of annotating particular excerpts. Finally, we provide a case study analyzing sixteen Stephen King novels, demonstrating the usefulness of this approach for computational literary studies and the insights gleaned from examining focalization at scale.
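The abstract's uncertainty idea can be sketched concretely: given the log probability an API assigns to each candidate label token, renormalizing over the label set yields a distribution whose entropy serves as a per-excerpt uncertainty score. The label names and log-probability values below are hypothetical assumptions for illustration, not the paper's actual label set or data, and the snippet does not call any model API.

```python
import math

# Hypothetical log-probabilities for the label token of one excerpt,
# as an API exposing per-token logprobs might return them.
# The three-way label set (Genette-style) is an assumption.
label_logprobs = {
    "internal": -0.15,
    "external": -2.30,
    "zero": -3.10,
}

def label_confidence(logprobs):
    """Renormalize candidate-label logprobs into a distribution and
    return (top label, its probability, entropy-based uncertainty)."""
    # Softmax restricted to the candidate labels only.
    total = sum(math.exp(lp) for lp in logprobs.values())
    probs = {k: math.exp(lp) / total for k, lp in logprobs.items()}
    # Shannon entropy in nats: 0 = fully certain, log(n) = maximally uncertain.
    entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)
    top = max(probs, key=probs.get)
    return top, probs[top], entropy

label, prob, uncertainty = label_confidence(label_logprobs)
print(label, round(prob, 3), round(uncertainty, 3))
```

Excerpts whose entropy approaches log(3) would be the ones flagged as hard to annotate, mirroring the paper's observation that log probabilities track annotation difficulty.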