🤖 AI Summary
To address hallucination in multilingual large language models (MLLMs) stemming from uneven knowledge distribution across languages, this paper proposes a causality-based trustworthy refusal mechanism. It pioneers the integration of causal intervention and counterfactual reasoning into multilingual refusal decision-making, explicitly disentangling confounding biases in feedback generation. The approach constructs a causal graph to model interdependencies among language, knowledge, and response quality; quantifies the causal importance of multiple candidate feedback responses via do-calculus; and introduces a dual-path adaptation framework, Causal-native and Causal-multi, to ensure interpretable, cross-lingually consistent "active refusal." Evaluated on bilingual encyclopedic and commonsense QA benchmarks, the method achieves a 12.7% absolute gain in refusal accuracy and attains an 89.4% F1 score on feedback selection, significantly outperforming strong baselines. Crucially, decisions are fully attributable via causal effect estimation, enabling transparent, auditable refusal behavior.
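The selection-and-abstention flow described above can be sketched in miniature. This is an illustrative toy, not the paper's actual estimator: the confidence scores, the simple difference-based effect estimate, and all function names (`causal_effect`, `select_feedback`, `should_abstain`) are assumptions standing in for the do-calculus computation over the language/knowledge/response causal graph.

```python
# Hypothetical sketch of CausalAbstain-style feedback selection.
# The real method estimates interventional effects via do-calculus;
# here the "effect" of a feedback is simplified to the change in the
# model's answer confidence when that feedback is included.

def causal_effect(baseline_conf, conf_with_feedback):
    """Estimated effect of intervening with one feedback response."""
    return conf_with_feedback - baseline_conf

def select_feedback(baseline_conf, candidates):
    """Keep feedbacks with a positive estimated effect, ranked by
    effect size (largest first). `candidates` maps a feedback label
    to the confidence observed when it is included."""
    scored = [(fb, causal_effect(baseline_conf, c)) for fb, c in candidates]
    helpful = [(fb, e) for fb, e in scored if e > 0.0]
    return sorted(helpful, key=lambda x: x[1], reverse=True)

def should_abstain(baseline_conf, candidates, conf_floor=0.5):
    """Abstain ("active refusal") when even the best available
    feedback leaves estimated confidence below a floor."""
    helpful = select_feedback(baseline_conf, candidates)
    best = baseline_conf + (helpful[0][1] if helpful else 0.0)
    return best < conf_floor

# Example: baseline confidence 0.4, three feedbacks in different languages.
candidates = [("en_feedback", 0.7), ("fr_feedback", 0.35), ("de_feedback", 0.55)]
ranked = select_feedback(0.4, candidates)
# fr_feedback lowers confidence and is discarded; en_feedback ranks first.
```

Because each decision reduces to per-feedback effect estimates, the abstention outcome can be traced back to which feedback helped or hurt, mirroring the interpretability claim above.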
📝 Abstract
Large Language Models (LLMs) often exhibit knowledge disparities across languages. Encouraging LLMs to *abstain* when faced with knowledge gaps is a promising strategy to reduce hallucinations in multilingual settings. Current abstention strategies for multilingual scenarios primarily rely on generating feedback in various languages using LLMs and performing self-reflection. However, these methods can be adversely impacted by inaccuracies and biases in the generated feedback. To address this, from a causal perspective, we introduce *CausalAbstain*, a method that helps LLMs determine whether to utilize multiple generated feedback responses and how to identify the most useful ones. Extensive experiments demonstrate that *CausalAbstain* effectively selects helpful feedback and enhances abstention decisions with interpretability in both native-language (Causal-native) and multilingual (Causal-multi) settings, outperforming strong baselines on two benchmark datasets covering encyclopedic and commonsense knowledge QA tasks. Our code and data are open-sourced at https://github.com/peachch/CausalAbstain.