🤖 AI Summary
This study investigates the efficacy and limitations of large language models (LLMs) in supporting FrameNet-style semantic annotation for linguistic resource construction. We conduct systematic experiments comparing three annotation paradigms—manual, fully automatic (LLM-only), and semi-automatic (LLM-generated initial annotations followed by human verification)—along three dimensions: annotation efficiency, semantic frame coverage, and frame diversity. To our knowledge, this is the first evaluation of LLM-assisted semantic role labeling within a perspectivist NLP framework. Results show that the semi-automatic approach achieves coverage statistically equivalent to manual annotation (p > 0.95), increases frame diversity by 18.3%, and reduces annotation time by 42%. In contrast, the fully automatic method, while fastest, incurs a 37% accuracy drop, with errors concentrated in metaphorical and peripheral frames. These findings underscore the irreplaceable role of human–AI collaboration in high-quality semantic resource development and provide a reproducible methodology and empirical benchmark for LLM-augmented language engineering.
📝 Abstract
The use of LLM-based applications as a means to accelerate and/or substitute human labor in the creation of language resources and datasets is a reality. Nonetheless, despite the potential of such tools for linguistic research, a comprehensive evaluation of their performance and of their impact on the creation of annotated datasets, especially under a perspectivized approach to NLP, is still missing. This paper contributes to reducing this gap by reporting on an extensive evaluation of the (semi-)automation of FrameNet-like semantic annotation using an LLM-based semantic role labeler. The methodology compares annotation time, coverage, and diversity across three experimental settings: manual, automatic, and semi-automatic annotation. Results show that the hybrid, semi-automatic setting leads to increased frame diversity and comparable annotation coverage relative to the human-only setting, while the fully automatic setting performs considerably worse on all metrics except annotation time.
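The paper evaluates the settings on coverage and frame diversity but does not spell out the metric definitions here. Below is a minimal, illustrative sketch (not the authors' implementation) assuming simple set-based definitions: coverage as the fraction of candidate targets that received a frame annotation, and diversity as the number of distinct frames evoked. The `Annotation` class, the example targets, and the frame choices are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One frame annotation: the target word and the frame it evokes."""
    target: str
    frame: str


def coverage(annotations: list[Annotation], targets: list[str]) -> float:
    """Fraction of candidate targets that received at least one annotation."""
    annotated = {a.target for a in annotations}
    return len(annotated & set(targets)) / len(targets) if targets else 0.0


def frame_diversity(annotations: list[Annotation]) -> int:
    """Number of distinct frames evoked across the annotation set."""
    return len({a.frame for a in annotations})


# Hypothetical comparison of two annotation settings on the same targets.
targets = ["bought", "house", "market"]
manual = [Annotation("bought", "Commerce_buy"),
          Annotation("house", "Buildings")]
semi_auto = [Annotation("bought", "Commerce_buy"),
             Annotation("house", "Buildings"),
             Annotation("market", "Businesses")]

print(coverage(manual, targets), frame_diversity(manual))        # 0.666..., 2
print(coverage(semi_auto, targets), frame_diversity(semi_auto))  # 1.0, 3
```

Under these assumed definitions, the comparison across settings reduces to computing both scores over each setting's annotation set; annotation time would be logged separately during the experiment.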