Simplifications are Absolutists: How Simplified Language Reduces Word Sense Awareness in LLM-Generated Definitions

📅 2025-07-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates how simplification instructions affect the semantic completeness of large language models’ (LLMs) definitions of homonyms. We identify a critical problem: simplified definitions frequently omit essential senses, which can lead users to misinterpret a word. To measure this, we construct two novel multilingual evaluation datasets designed for assessing homonym definition quality and show empirically that simplification prompts significantly degrade models’ sense coverage. We then apply Direct Preference Optimization (DPO) to the definition generation task and fine-tune Llama 3.1 8B accordingly. The DPO-fine-tuned model substantially improves multi-sense identification and balanced expression across all prompt types, significantly outperforming baselines in both LLM-as-Judge and human evaluations. The approach offers a scalable, preference-driven way to keep lexical definitions complete while adapting them to the target audience.

📝 Abstract

Large Language Models (LLMs) can provide accurate word definitions and explanations for any context. However, the scope of the definition changes for different target groups, like children or language learners. This is especially relevant for homonyms, words with multiple meanings, where oversimplification risks information loss by omitting key senses, potentially misleading users who trust LLM outputs. We investigate how simplification impacts homonym definition quality across three target groups: Normal, Simple, and ELI5 ("Explain Like I'm 5"). Using two novel evaluation datasets spanning multiple languages, we test DeepSeek v3, Llama 4 Maverick, Qwen3-30B A3B, GPT-4o mini, and Llama 3.1 8B via LLM-as-Judge and human annotations. Our results show that simplification drastically degrades definition completeness by neglecting polysemy, increasing the risk of misunderstanding. Fine-tuning Llama 3.1 8B with Direct Preference Optimization substantially improves homonym response quality across all prompt types. These findings highlight the need to balance simplicity and completeness in educational NLP to ensure reliable, context-aware definitions for all learners.
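
To make the evaluation setup concrete, here is a minimal Python sketch of the three prompt conditions and a crude sense-coverage score. Everything in it is an assumption for illustration: the prompt wording, the names `PROMPT_TEMPLATES`, `build_prompts`, and `sense_coverage`, and the example senses for "bat" are hypothetical and not taken from the paper. The keyword matching is only a simple stand-in for the LLM-as-Judge and human annotation the abstract describes.

```python
# Hypothetical prompt wording for the three target groups named in the abstract.
# The paper's actual prompts are not given here.
PROMPT_TEMPLATES = {
    "Normal": "Define the word '{word}'.",
    "Simple": "Define the word '{word}' in simple language.",
    "ELI5": "Explain the word '{word}' like I'm five.",
}


def build_prompts(word):
    """Return one prompt per target group for the given word."""
    return {name: template.format(word=word)
            for name, template in PROMPT_TEMPLATES.items()}


def sense_coverage(definition, sense_cues):
    """Fraction of senses with at least one cue word present in the definition.

    sense_cues maps a sense label to a tuple of cue substrings. This is a
    keyword heuristic, not the judging procedure used in the paper.
    """
    if not sense_cues:
        return 0.0
    text = definition.lower()
    covered = sum(
        1 for cues in sense_cues.values()
        if any(cue.lower() in text for cue in cues)
    )
    return covered / len(sense_cues)


# Illustrative sense inventory for the homonym "bat" (made up for this sketch).
BAT_SENSES = {
    "animal": ("animal", "mammal"),
    "sports equipment": ("baseball", "cricket", "club"),
}

one_sense = "A bat is a small animal that flies at night."
two_senses = ("A bat is a flying mammal. "
              "A bat is also a wooden club used in baseball.")

print(build_prompts("bat")["ELI5"])
print(sense_coverage(one_sense, BAT_SENSES))   # covers 1 of 2 senses
print(sense_coverage(two_senses, BAT_SENSES))  # covers 2 of 2 senses
```

A simplified answer that mentions only the animal scores lower than one that names both senses, which is the kind of completeness loss the study reports under simplification prompts.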

Problem

Research questions and friction points this paper is trying to address.

How simplification reduces word sense awareness in LLM definitions

Impact of oversimplification on homonym definition quality

Balancing simplicity and completeness in educational NLP

Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluates simplification impact on homonym definitions

Uses novel multilingual datasets for testing

Improves quality via Direct Preference Optimization
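
For readers unfamiliar with Direct Preference Optimization, the following Python sketch computes the standard DPO loss for a single preference pair from sequence log-probabilities. It is not the paper's training code. The function name `dpo_loss`, the value `beta=0.1`, and the reading of "chosen" as a definition covering several senses and "rejected" as one covering a single sense are assumptions made for illustration.

```python
import math


def _softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Inputs are summed token log-probabilities of each response under the
    policy being trained and under a frozen reference model. beta=0.1 is an
    assumed value, not one reported in the source.
    """
    # Implicit rewards: how much the policy has shifted from the reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) equals softplus(-margin).
    return _softplus(-margin)


# Policy identical to the reference: margin is 0, loss is log(2).
baseline = dpo_loss(-5.0, -7.0, -5.0, -7.0)

# Policy now favors the chosen response more than the reference does,
# so the loss falls below log(2).
improved = dpo_loss(-4.0, -8.0, -5.0, -7.0)

print(baseline, improved)
```

The loss shrinks as the policy raises the likelihood of the preferred response relative to the reference model and lowers that of the rejected one, which is how preference pairs can steer a model toward more complete definitions.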

🔎 Similar Papers

An In-depth Evaluation of Large Language Models in Sentence Simplification with Error-based Human Assessment