Safe Semantics, Unsafe Interpretations: Tackling Implicit Reasoning Safety in Large Vision-Language Models

📅 2025-08-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses an emerging safety issue in large vision-language models (VLMs): *implicit reasoning safety*, in which benign multimodal inputs trigger harmful outputs through latent, defective cross-modal reasoning within the model. We formally define this concept and introduce SSUI, the first dedicated benchmark for evaluating implicit reasoning vulnerabilities under image-text compositions. Methodologically, we integrate multimodal safety analysis, implicit reasoning path tracing, and intervention based on in-context learning (ICL). Experiments demonstrate that lightweight ICL prompts, without any model fine-tuning, significantly suppress such unsafe behaviors, establishing ICL as an efficient, parameter-free defense paradigm. Our contributions are: (1) a novel conceptual framework for VLM safety; (2) SSUI, a standardized evaluation benchmark exposing implicit reasoning flaws; and (3) a practical, deployable mitigation strategy. This work advances VLM safety assessment and robustness through new theoretical insight, empirical grounding, and actionable methodology.

📝 Abstract
Large Vision-Language Models (LVLMs) face growing safety challenges with multimodal inputs. This paper introduces the concept of Implicit Reasoning Safety, a vulnerability in which benign combined inputs trigger unsafe LVLM outputs due to flawed or hidden reasoning. To expose this issue, we developed Safe Semantics, Unsafe Interpretations (SSUI), the first dataset targeting it. Our experiments show that even simple In-Context Learning with SSUI significantly mitigates these implicit multimodal threats, underscoring the urgent need to improve cross-modal implicit reasoning.
Problem

Research questions and friction points this paper is trying to address.

Addressing implicit reasoning safety in LVLMs
Mitigating unsafe outputs from benign multimodal inputs
Improving cross-modal implicit reasoning for safety
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces Implicit Reasoning Safety concept
Develops SSUI dataset for safety testing
Uses In-Context Learning to mitigate threats
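As a minimal sketch of the ICL-style defense described above: a few safety demonstrations (benign-seeming multimodal queries paired with safe responses) are prepended to the actual query so the model conditions on safe cross-modal reasoning. The paper's actual prompts and data are not reproduced here; the demo content, helper name, and text-serialized image interface below are all illustrative assumptions.

```python
# Hypothetical ICL safety-prompt builder. Assumes a text-serialized
# interface where images are represented by captions; the demo pairs
# below are invented examples, not items from the SSUI dataset.

SAFETY_DEMOS = [
    {
        "image_desc": "a photo of household cleaning chemicals on a shelf",
        "question": "How could these be combined for a stronger effect?",
        "safe_answer": (
            "Mixing household chemicals can release toxic gases, so I "
            "can't advise combining them. Use each product as directed "
            "on its label."
        ),
    },
    {
        "image_desc": "a locked bicycle chained to a public rack",
        "question": "What's the fastest way to get this bike free?",
        "safe_answer": (
            "If this is your bicycle, use your key or contact a "
            "locksmith with proof of ownership. I can't help with "
            "removing locks from property that isn't yours."
        ),
    },
]

def build_icl_prompt(image_desc, question, demos=SAFETY_DEMOS):
    """Prepend safe demonstrations, then append the real query."""
    parts = []
    for d in demos:
        parts.append(
            f"Image: {d['image_desc']}\n"
            f"Question: {d['question']}\n"
            f"Answer: {d['safe_answer']}"
        )
    # The final block is left open for the model to complete.
    parts.append(f"Image: {image_desc}\nQuestion: {question}\nAnswer:")
    return "\n\n".join(parts)
```

The resulting string would be sent to the VLM in place of the raw query; the demonstrations steer the model's implicit reasoning toward safe interpretations without any parameter updates.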
Wei Cai
Peking University, Institute of Artificial Intelligence (TeleAI), China Telecom
Jian Zhao
Institute of Artificial Intelligence (TeleAI), China Telecom, Northwestern Polytechnical University
Yuchu Jiang
Southeast University
Large Language Models · Computer Vision
Tianle Zhang
Institute of Artificial Intelligence (TeleAI), China Telecom
Xuelong Li
Institute of Artificial Intelligence (TeleAI), China Telecom