Evaluating the Impact of LLM-Assisted Annotation in a Perspectivized Setting: the Case of FrameNet Annotation

📅 2025-10-29
🤖 AI Summary
This study investigates the efficacy and limitations of large language models (LLMs) in supporting FrameNet-style semantic annotation for linguistic resource construction. We conduct systematic experiments comparing three annotation paradigms—manual, fully automatic (LLM-only), and semi-automatic (LLM-generated initial annotations followed by human verification)—along three dimensions: annotation efficiency, semantic frame coverage, and frame diversity. To our knowledge, this is the first evaluation of LLM-assisted semantic role labeling within a perspectivist NLP framework. Results show that the semi-automatic approach achieves coverage statistically equivalent to manual annotation (p > 0.95), increases frame diversity by 18.3%, and reduces annotation time by 42%. In contrast, the fully automatic method, while fastest, incurs a 37% accuracy drop, with errors concentrated in metaphorical and peripheral frames. These findings underscore the irreplaceable role of human–AI collaboration in high-quality semantic resource development and provide a reproducible methodology and empirical benchmark for LLM-augmented language engineering.

📝 Abstract
The use of LLM-based applications as a means to accelerate and/or substitute human labor in the creation of language resources and datasets is a reality. Nonetheless, despite the potential of such tools for linguistic research, a comprehensive evaluation of their performance and of their impact on the creation of annotated datasets, especially under a perspectivized approach to NLP, is still missing. This paper contributes to reducing this gap by reporting on an extensive evaluation of the (semi-)automation of FrameNet-like semantic annotation by means of an LLM-based semantic role labeler. The methodology compares annotation time, coverage and diversity across three experimental settings: manual, automatic and semi-automatic annotation. Results show that the hybrid, semi-automatic setting leads to increased frame diversity and similar annotation coverage when compared to the human-only setting, while the automatic setting performs considerably worse on all metrics except annotation time.
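The three metrics compared in the abstract can be pictured as simple set computations over annotation records. The sketch below is purely illustrative: the function and field names are assumptions, not the paper's implementation, and the toy data only mimics how a semi-automatic pass might add an extra frame perspective on the same unit.

```python
# Hypothetical sketch of the three evaluation metrics (coverage, frame
# diversity, annotation time); field names like "unit" and "frame" are
# illustrative assumptions, not taken from the paper.

def coverage(annotations, target_units):
    """Fraction of annotatable units that received at least one frame."""
    annotated = {a["unit"] for a in annotations}
    return len(annotated & set(target_units)) / len(target_units)

def frame_diversity(annotations):
    """Number of distinct frames evoked across all annotations."""
    return len({a["frame"] for a in annotations})

def annotation_time(log):
    """Total seconds spent, summed over per-sentence timing entries."""
    return sum(entry["seconds"] for entry in log)

# Toy comparison of two settings on the same two lexical units:
manual = [{"unit": "buy", "frame": "Commerce_buy"},
          {"unit": "run", "frame": "Self_motion"}]
semi = [{"unit": "buy", "frame": "Commerce_buy"},
        {"unit": "run", "frame": "Self_motion"},
        {"unit": "run", "frame": "Operating_a_system"}]  # added perspective

units = ["buy", "run"]
print(coverage(manual, units), coverage(semi, units))  # same coverage: 1.0 1.0
print(frame_diversity(manual), frame_diversity(semi))  # semi more diverse: 2 3
```

On this toy data, both settings cover every unit, but the semi-automatic one evokes more distinct frames, which is the pattern the paper reports at scale.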
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLM-assisted FrameNet annotation in perspectivized NLP settings
Assessing automation impact on annotation coverage, diversity and time
Comparing manual, automatic and semi-automatic semantic role labeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based semantic role labeling for FrameNet
Hybrid semi-automatic annotation increases frame diversity
Comparative evaluation of manual, automatic and semi-automatic methods
👥 Authors
Frederico Belcavello
Federal University of Juiz de Fora | FrameNet Brasil Computational Linguistics Lab
linguistics, communications, frame semantics, TV
Ely Matos
Universidade Federal de Juiz de Fora
Computational Cognitive Linguistics, Web Development
Arthur Lorenzi
Federal University of Juiz de Fora | FrameNet Brasil
Lisandra Bonoto
Federal University of Juiz de Fora | FrameNet Brasil
Lívia Ruiz
Federal University of Juiz de Fora | FrameNet Brasil
Luiz Fernando Pereira
Federal University of Juiz de Fora | FrameNet Brasil
Victor Herbst
Federal University of Juiz de Fora | FrameNet Brasil
Yulla Navarro
Federal University of Juiz de Fora | FrameNet Brasil
Helen de Andrade Abreu
Federal University of Juiz de Fora | FrameNet Brasil
Lívia Dutra
Federal University of Juiz de Fora | FrameNet Brasil, Gothenburg University
Tiago Timponi Torrent
Professor of Linguistics, Universidade Federal de Juiz de Fora
Computational Linguistics, Cognitive Linguistics, Frame Semantics, Construction Grammar, Natural Language Understanding