Low-Confidence Gold: Refining Low-Confidence Samples for Efficient Instruction Tuning

📅 2025-02-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the scarcity of high-quality training data and the inefficiency of existing filtering methods in instruction tuning, this paper proposes the Low-Confidence Gold (LCG) framework. LCG leverages the model's erroneous predictions on low-confidence samples to reconstruct high-value supervision signals; it combines centroid-based clustering with confidence-guided sampling to preserve the semantic representativeness and diversity of instruction pairs, and employs a lightweight semi-supervised classifier for efficient data refinement. Evaluated on MT-Bench and other multidimensional benchmarks, LCG achieves significant gains (+3.2 points) over state-of-the-art methods using only 6K samples, while reducing training overhead by roughly 60%. Its core contribution is to turn low-confidence samples into high-quality supervision sources, challenging the conventional "high confidence implies high quality" filtering paradigm and establishing a new principle for instruction-data curation.
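
As a rough illustration of the clustering-plus-confidence selection described above, here is a minimal Python sketch. It assumes instruction pairs have already been embedded and scored with a per-sample model confidence; the function name `select_low_confidence_gold` and the cluster/sample counts are hypothetical placeholders (chosen so that 50 × 120 = 6000, matching the 6K subset size), not the paper's released code.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_low_confidence_gold(embeddings, confidence, n_clusters=50, per_cluster=120):
    """Pick a diverse, low-confidence subset of instruction pairs.

    embeddings : (N, d) array of instruction-pair embeddings
    confidence : (N,) array in [0, 1]; lower = harder for the model
    """
    # Centroid-based clustering groups semantically similar instructions,
    # so sampling per cluster preserves diversity across the dataset.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(embeddings)
    selected = []
    for c in range(n_clusters):
        members = np.flatnonzero(km.labels_ == c)
        # Confidence-guided sampling: within each cluster, keep the
        # lowest-confidence samples, which LCG treats as the most
        # informative ("gold") supervision signals.
        order = members[np.argsort(confidence[members])]
        selected.extend(order[:per_cluster].tolist())
    return np.asarray(selected)
```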

📝 Abstract
The effectiveness of instruction fine-tuning for Large Language Models is fundamentally constrained by the quality and efficiency of training datasets. This work introduces Low-Confidence Gold (LCG), a novel filtering framework that employs centroid-based clustering and confidence-guided selection to identify valuable instruction pairs. Through a semi-supervised approach using a lightweight classifier trained on representative samples, LCG curates high-quality subsets while preserving data diversity. Experimental evaluation demonstrates that models fine-tuned on LCG-filtered subsets of 6K samples outperform existing methods, with substantial improvements on MT-Bench and consistent gains across comprehensive evaluation metrics. The framework's efficiency, achieved without sacrificing model performance, establishes a promising direction for instruction tuning.
Problem

Research questions and friction points this paper is trying to address.

Improving the data efficiency of instruction fine-tuning
Inefficient filtering that discards informative low-confidence samples
Maintaining model performance while training on far fewer samples
Innovation

Methods, ideas, or system contributions that make the work stand out.

Centroid-based clustering for data filtering
Confidence-guided selection of instruction pairs
Lightweight semi-supervised classifier for data refinement (see the sketch below)
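
The data-refinement step could look something like the following sketch, which stands in scikit-learn's generic `SelfTrainingClassifier` for the paper's unspecified lightweight classifier; the labeling convention (1 = high-value, 0 = low-value, -1 = unlabeled) and the `keep_ratio` cutoff are assumptions, not details from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

def refine_subset(embeddings, seed_labels, keep_ratio=0.3):
    """Score all samples from a few labeled seeds; keep the top fraction.

    seed_labels : (N,) int array with 1 = high-value, 0 = low-value,
                  and -1 for the unlabeled majority (sklearn's convention
                  for semi-supervised learning).
    """
    # Self-training iteratively pseudo-labels confident unlabeled points,
    # letting a small seed set supervise the whole pool cheaply.
    clf = SelfTrainingClassifier(LogisticRegression(max_iter=1000), threshold=0.9)
    clf.fit(embeddings, seed_labels)
    scores = clf.predict_proba(embeddings)[:, 1]  # P(high-value)
    k = int(keep_ratio * len(embeddings))
    return np.argsort(-scores)[:k]  # indices of the retained samples
```

In a full pipeline the two sketches would compose: `select_low_confidence_gold` proposes diverse low-confidence candidates, and `refine_subset` scores them to produce the final small training subset.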
Hongyi Cai, University of Malaya (Data-centric AI · AI for Efficiency · Computer Vision)
Jie Li, University of Science and Technology Beijing
Wenzhen Dong, The Chinese University of Hong Kong