Low-Confidence Gold: Refining Low-Confidence Samples for Efficient Instruction Tuning

📅 2025-02-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the scarcity of high-quality training data and the inefficiency of existing filtering methods in instruction tuning, this paper proposes the Low-Confidence Gold (LCG) framework. LCG leverages the model's erroneous predictions on low-confidence samples to reconstruct high-value supervision signals; it combines centroid-based clustering with confidence-guided sampling to preserve the semantic representativeness and diversity of instruction pairs, and employs a lightweight semi-supervised classifier for efficient data refinement. Evaluated on MT-Bench and other multidimensional benchmarks, LCG achieves significant gains (+3.2 points) over state-of-the-art methods using only 6K samples, while reducing training overhead by roughly 60%. Its core contribution is to turn low-confidence samples into high-quality supervision sources, challenging the conventional "high confidence implies high quality" filtering paradigm and establishing a new principle for instruction-data curation.
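
As a rough illustration of the clustering-plus-confidence selection described above, here is a minimal Python sketch. It assumes instruction pairs have already been embedded and scored with a per-sample model confidence; the function name `select_low_confidence_gold` and the cluster/sample counts are hypothetical placeholders (chosen so that 50 × 120 = 6000, matching the 6K subset size), not the paper's released code.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_low_confidence_gold(embeddings, confidence, n_clusters=50, per_cluster=120):
    """Pick a diverse, low-confidence subset of instruction pairs.

    embeddings : (N, d) array of instruction-pair embeddings
    confidence : (N,) array in [0, 1]; lower = harder for the model
    """
    # Centroid-based clustering groups semantically similar instructions,
    # so sampling per cluster preserves diversity across the dataset.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(embeddings)
    selected = []
    for c in range(n_clusters):
        members = np.flatnonzero(km.labels_ == c)
        # Confidence-guided sampling: within each cluster, keep the
        # lowest-confidence samples, which LCG treats as the most
        # informative ("gold") supervision signals.
        order = members[np.argsort(confidence[members])]
        selected.extend(order[:per_cluster].tolist())
    return np.asarray(selected)
```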

📝 Abstract
The effectiveness of instruction fine-tuning for Large Language Models is fundamentally constrained by the quality and efficiency of training datasets. This work introduces Low-Confidence Gold (LCG), a novel filtering framework that employs centroid-based clustering and confidence-guided selection to identify valuable instruction pairs. Through a semi-supervised approach using a lightweight classifier trained on representative samples, LCG curates high-quality subsets while preserving data diversity. Experimental evaluation demonstrates that models fine-tuned on LCG-filtered subsets of 6K samples outperform existing methods, with substantial improvements on MT-Bench and consistent gains across comprehensive evaluation metrics. The framework's efficiency, achieved without sacrificing model performance, establishes a promising direction for instruction tuning.
Problem

Research questions and friction points this paper is trying to address.

Improving the data efficiency of instruction fine-tuning
Inefficient filtering that discards informative low-confidence samples
Maintaining model performance while training on far fewer samples
Innovation

Methods, ideas, or system contributions that make the work stand out.

Centroid-based clustering for data filtering
Confidence-guided selection of instruction pairs
Lightweight semi-supervised classifier for data refinement (see the sketch below)
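
The data-refinement step could look something like the following sketch, which stands in scikit-learn's generic `SelfTrainingClassifier` for the paper's unspecified lightweight classifier; the labeling convention (1 = high-value, 0 = low-value, -1 = unlabeled) and the `keep_ratio` cutoff are assumptions, not details from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

def refine_subset(embeddings, seed_labels, keep_ratio=0.3):
    """Score all samples from a few labeled seeds; keep the top fraction.

    seed_labels : (N,) int array with 1 = high-value, 0 = low-value,
                  and -1 for the unlabeled majority (sklearn's convention
                  for semi-supervised learning).
    """
    # Self-training iteratively pseudo-labels confident unlabeled points,
    # letting a small seed set supervise the whole pool cheaply.
    clf = SelfTrainingClassifier(LogisticRegression(max_iter=1000), threshold=0.9)
    clf.fit(embeddings, seed_labels)
    scores = clf.predict_proba(embeddings)[:, 1]  # P(high-value)
    k = int(keep_ratio * len(embeddings))
    return np.argsort(-scores)[:k]  # indices of the retained samples
```

In a full pipeline the two sketches would compose: `select_low_confidence_gold` proposes diverse low-confidence candidates, and `refine_subset` scores them to produce the final small training subset.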
Hongyi Cai, University of Malaya (Data-centric AI · AI for Efficiency · Computer Vision)
Jie Li, University of Science and Technology Beijing
Wenzhen Dong, The Chinese University of Hong Kong