Problem
Research questions and friction points the paper addresses.
Improving instruction fine-tuning efficiency
Filtering low-confidence samples out of training datasets
Enhancing model performance with LCG
Innovation
Methods, ideas, or system contributions that make the work stand out.
Centroid-based clustering for data filtering
Confidence-guided selection of instruction pairs
Lightweight classifier for semi-supervised learning
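The three ideas above can be illustrated together with a small sketch. This is not the paper's implementation: it assumes each instruction pair already has an embedding vector, that a handful of pairs are hand-labeled as "keep" or "drop" seeds, and that a nearest-centroid rule stands in for the lightweight classifier. The function names, the distance-based confidence score, and the 0.7 threshold are all hypothetical choices made for illustration.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def keep_confidence(x, keep_c, drop_c):
    """Confidence that x belongs to the 'keep' cluster.

    Assumption: confidence is a softmax over negative Euclidean distances
    to the two centroids, which reduces to a sigmoid of the distance gap.
    """
    gap = math.dist(x, keep_c) - math.dist(x, drop_c)
    # Clamp the exponent so very distant points cannot overflow math.exp.
    return 1.0 / (1.0 + math.exp(min(gap, 700.0)))


def filter_pairs(embeddings, seed_keep, seed_drop, threshold=0.7):
    """Return indices of instruction pairs whose keep-confidence passes the threshold.

    seed_keep / seed_drop are indices of the few labeled examples
    (the semi-supervised part); every other pair is scored by the centroids.
    """
    keep_c = centroid([embeddings[i] for i in seed_keep])
    drop_c = centroid([embeddings[i] for i in seed_drop])
    return [
        i for i, x in enumerate(embeddings)
        if keep_confidence(x, keep_c, drop_c) >= threshold
    ]


# Toy 2-D embeddings: three near the origin, two near (3, 3), one in between.
toy = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9], [1.5, 1.5]]
selected = filter_pairs(toy, seed_keep=[0, 1], seed_drop=[3, 4])
print(selected)  # [0, 1, 2]
```

In this toy run the ambiguous point at (1.5, 1.5) scores close to 0.5 and is filtered out along with the two points near the "drop" centroid, which is the behavior a confidence threshold is meant to produce.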