SPICE: Submodular Penalized Information-Conflict Selection for Efficient Large Language Model Training

📅 2026-01-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the rapid decay of information gain in conventional information-theoretic data selection methods for instruction tuning, a problem that stems from gradient conflicts among samples and impedes retention of critical knowledge. To overcome this limitation, the authors propose SPICE, which, for the first time, integrates explicit quantification of gradient conflict into an information-based selection framework. SPICE maximizes Fisher information while penalizing conflicting gradients, uses ε-decomposition theory to analyze how conflict affects submodularity, and introduces a conflict-aware greedy selection strategy. For efficiency, the method further incorporates a proxy model, early stopping, and submodular optimization. Evaluated across eight benchmarks, SPICE matches or surpasses full-data fine-tuning and six state-of-the-art baselines while using only 10% of the data, substantially improving both data efficiency and model performance.

📝 Abstract
Information-based data selection for instruction tuning is compelling: maximizing the log-determinant of the Fisher information yields a monotone submodular objective, enabling greedy algorithms to achieve a $(1-1/e)$ approximation under a cardinality budget. In practice, however, we identify that alleviating gradient conflicts, i.e., misalignment between per-sample gradients, is a key factor in slowing the decay of marginal log-determinant information gains, thereby preventing significant information loss. We formalize this via an $\varepsilon$-decomposition that quantifies the deviation from ideal submodularity as a function of conflict statistics, yielding data-dependent approximation factors that tighten as conflicts diminish. Guided by this analysis, we propose SPICE, a conflict-aware selector that maximizes information while penalizing misalignment, and that supports early stopping and proxy models for efficiency. Empirically, SPICE selects subsets with higher log-determinant information than the original criteria, and these informational gains translate into performance improvements: across 8 benchmarks with LLaMA2-7B and Qwen2-7B, SPICE uses only 10% of the data yet matches or exceeds 6 methods, including full-data tuning, achieving these gains at substantially lower training cost.
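To make the objective concrete, the following is a minimal NumPy sketch of the kind of selection the abstract describes: greedily maximizing the log-determinant of a Gram matrix of per-sample gradients (a standard Fisher-information proxy) while subtracting a penalty on negative cosine similarity with already-selected samples. The function name, the additive penalty form, and the hyperparameters `lam` and `reg` are illustrative assumptions, not the paper's exact formulation (which also uses an $\varepsilon$-decomposition analysis, proxy models, and early stopping).

```python
import numpy as np

def conflict_penalized_greedy(grads, budget, lam=0.1, reg=1e-3):
    """Illustrative sketch (not SPICE's exact algorithm): greedily pick
    `budget` samples maximizing the log-det of the selected gradients'
    Gram matrix, minus `lam` times a gradient-conflict penalty.

    grads: (n, d) array of per-sample gradient features.
    """
    n = grads.shape[0]
    # Unit-normalize rows so the Gram matrix holds cosine similarities.
    G = grads / (np.linalg.norm(grads, axis=1, keepdims=True) + 1e-12)
    sim = G @ G.T

    def logdet(idx):
        # Regularized log-determinant of the selected submatrix.
        K = sim[np.ix_(idx, idx)] + reg * np.eye(len(idx))
        _, val = np.linalg.slogdet(K)
        return val

    selected, remaining = [], set(range(n))
    current = 0.0  # log-det of the empty set is 0
    for _ in range(budget):
        best, best_score = None, -np.inf
        for j in remaining:
            gain = logdet(selected + [j]) - current  # marginal info gain
            # Conflict penalty: negative cosine similarity to chosen set.
            conflict = sum(max(0.0, -sim[j, i]) for i in selected)
            score = gain - lam * conflict
            if score > best_score:
                best_score, best = score, j
        selected.append(best)
        remaining.remove(best)
        current = logdet(selected)
    return selected
```

This naive loop recomputes the log-determinant per candidate, so it is $O(k \cdot n \cdot k^3)$ for budget $k$; practical implementations would use rank-one Cholesky updates, and the abstract's proxy-model and early-stopping tricks address exactly this cost.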
Problem

Research questions and friction points this paper is trying to address.

gradient conflict
information loss
data selection
instruction tuning
submodularity
Innovation

Methods, ideas, or system contributions that make the work stand out.

submodular optimization
gradient conflict
data selection
Fisher information
instruction tuning
Powei Chang
Bilibili Inc.
Jinpeng Zhang
Bilibili Inc.
Bowen Chen
Bilibili Inc.
Chenyu Wang
Bilibili Inc.
Chenlu Guo
Bilibili Inc.
Yukang Gao
Bilibili Inc.
JianXiang Xiang
Bilibili Inc.
Yue Gao
Bilibili Inc.
Chaoqun Sun
Bilibili Inc.
Yiyi Chen
PhD Candidate, Aalborg University
Machine Learning, Deep Learning, Natural Language Processing
Dongying Kong
Bilibili Inc.