FiLoRA: Focus-and-Ignore LoRA for Controllable Feature Reliance

📅 2026-02-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the opacity of internal feature dependency mechanisms in multimodal foundation models and the challenge of actively controlling these mechanisms without altering task semantics. The authors propose an instruction-conditioned, parameter-efficient adaptation framework that, for the first time, treats natural language instructions as computational control signals. By integrating feature-group-aligned LoRA modules with an instruction-driven gating mechanism, the method enables selective enhancement or suppression of core versus spurious feature groups. Notably, this approach operates without modifying the label space or training objectives. Experiments on text–image and audio–visual benchmarks demonstrate that the framework can causally steer internal model computations, substantially improving robustness against spurious feature interference.

📝 Abstract
Multimodal foundation models integrate heterogeneous signals across modalities, yet it remains poorly understood how their predictions depend on specific internal feature groups and whether such reliance can be deliberately controlled. Existing studies of shortcut and spurious behavior largely rely on post hoc analyses or feature removal, offering limited insight into whether reliance can be modulated without altering task semantics. We introduce FiLoRA (Focus-and-Ignore LoRA), an instruction-conditioned, parameter-efficient adaptation framework that enables explicit control over internal feature reliance while keeping the predictive objective fixed. FiLoRA decomposes adaptation into feature-group-aligned LoRA modules and applies instruction-conditioned gating, allowing natural language instructions to act as computation-level control signals rather than task redefinitions. Across text–image and audio–visual benchmarks, we show that instruction-conditioned gating induces consistent and causal shifts in internal computation, selectively amplifying or suppressing core and spurious feature groups without modifying the label space or training objective. Further analyses demonstrate that FiLoRA yields improved robustness under spurious feature interventions, revealing a principled mechanism to regulate reliance beyond correlation-driven learning.
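The abstract's core mechanism (per-feature-group LoRA branches whose contributions are gated by an instruction embedding) can be sketched as follows. This is a minimal illustration of instruction-conditioned gated LoRA, not the paper's actual implementation; the class name, group names, and all shapes are assumptions for illustration.

```python
import torch
import torch.nn as nn

class GatedGroupLoRA(nn.Module):
    """Frozen linear layer plus one low-rank (LoRA) branch per feature group.

    An instruction embedding is mapped to per-group gates in [0, 1], so a
    "focus" instruction can amplify the core-group branch while an "ignore"
    instruction suppresses the spurious-group branch, without changing the
    label space or the base layer. Illustrative sketch only.
    """

    def __init__(self, d_in, d_out, d_instr, groups=("core", "spurious"), rank=8):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.base.weight.requires_grad_(False)   # backbone stays frozen
        self.base.bias.requires_grad_(False)
        # One low-rank (A, B) pair per feature group; B starts at zero so the
        # adapted layer initially matches the frozen base (standard LoRA init).
        self.A = nn.ParameterDict(
            {g: nn.Parameter(torch.randn(rank, d_in) * 0.01) for g in groups})
        self.B = nn.ParameterDict(
            {g: nn.Parameter(torch.zeros(d_out, rank)) for g in groups})
        # Instruction-conditioned gating head: one scalar gate per group.
        self.gate = nn.Linear(d_instr, len(groups))
        self.groups = groups

    def forward(self, x, instr_emb):
        # x: (batch, d_in); instr_emb: (batch, d_instr)
        g = torch.sigmoid(self.gate(instr_emb))              # (batch, n_groups)
        out = self.base(x)
        for i, name in enumerate(self.groups):
            delta = x @ self.A[name].t() @ self.B[name].t()  # low-rank update
            out = out + g[:, i:i + 1] * delta                # gated per group
        return out
```

Because each group's update enters only through its gate, an instruction that drives the "spurious" gate toward zero removes that branch's contribution at computation level, which is the kind of reliance control the abstract describes.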
Problem

Research questions and friction points this paper is trying to address.

multimodal foundation models
feature reliance
spurious correlations
controllable adaptation
internal feature groups
Innovation

Methods, ideas, or system contributions that make the work stand out.

FiLoRA
instruction-conditioned gating
feature reliance control
parameter-efficient adaptation
multimodal foundation models
Hyunsuk Chung
University of Melbourne
Data Mining, Knowledge-based Systems, Multimodal Understanding, Knowledge Capture
Caren Han
University of Melbourne, Melbourne, Australia
Yerin Choi
Brain Science Institute, Korea Institute of Science and Technology, Seoul, Republic of Korea; Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
Seungyeon Ji
Brain Science Institute, Korea Institute of Science and Technology, Seoul, Republic of Korea; Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
Jinwoo Kim
University of Melbourne, Melbourne, Australia
Eun-Jung Holden
Professor, The University of Melbourne
Geodata Science, AI, Industrial AI Applications, Data Fusion, Knowledge Discovery
Kyungreem Han
Korea Institute of Science and Technology
Molecular/Quantum Mechanics, Biophysics, Artificial Intelligence, Philosophy of Mind, Free Will