LLM Data Selection and Utilization via Dynamic Bi-level Optimization

📅 2025-07-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the inefficiency and high computational cost of static data selection in large language model (LLM) training, this paper proposes the Data Weighting Model (DWM), which jointly optimizes data weights and model parameters during training. DWM uses a bi-level optimization mechanism to adaptively update the weights of selected data on a per-batch basis, capturing how the trained model's data preferences evolve across training stages. Experiments show that DWM outperforms training on randomly selected data, and that the learned weighting model transfers to other data selection methods and to models of different sizes, making it a plug-and-play enhancement for existing selection pipelines. The accompanying analysis of evolving data preferences offers new empirical insight into dynamic data scheduling for LLM training.
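The per-batch bi-level update at the heart of this approach is easier to see in code. Below is a minimal PyTorch-style sketch, assuming a per-example loss (e.g., nn.CrossEntropyLoss(reduction="none")) and a small weighting network; WeightingNet, bilevel_step, and lr_inner are illustrative names, not the paper's released implementation.

```python
# Minimal sketch of per-batch bi-level data weighting (illustrative only;
# not the paper's released code).
import torch
import torch.nn as nn
from torch.func import functional_call

class WeightingNet(nn.Module):
    """Maps a per-example signal (here, the example's loss) to a batch weight."""
    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, per_example_loss):
        scores = self.net(per_example_loss.unsqueeze(-1)).squeeze(-1)
        return torch.softmax(scores, dim=0)  # positive weights summing to 1

def bilevel_step(model, weighter, loss_fn, train_batch, val_batch,
                 opt_model, opt_weighter, lr_inner=1e-3):
    x, y = train_batch
    xv, yv = val_batch

    # Inner objective: weighted training loss under the current weighter.
    per_ex = loss_fn(model(x), y)                    # per-example losses, shape [B]
    weights = weighter(per_ex.detach())
    inner_loss = (weights * per_ex).sum()

    # Virtual one-step model update; create_graph=True keeps the dependence
    # of the updated parameters on the weighting net.
    names, params = zip(*model.named_parameters())
    grads = torch.autograd.grad(inner_loss, params, create_graph=True)
    updated = {n: p - lr_inner * g for n, p, g in zip(names, params, grads)}

    # Outer objective: held-out loss of the virtually updated model,
    # backpropagated into the weighting net.
    val_loss = loss_fn(functional_call(model, updated, (xv,)), yv).mean()
    opt_weighter.zero_grad()
    val_loss.backward()
    opt_weighter.step()

    # Real model update, using the refreshed weights as constants.
    per_ex = loss_fn(model(x), y)
    with torch.no_grad():
        weights = weighter(per_ex)
    opt_model.zero_grad()
    (weights * per_ex).sum().backward()
    opt_model.step()
```

The sketch captures only the bi-level structure: an inner weighted update of the model, simulated so that the outer held-out loss can be backpropagated into the weighting model, followed by the real model step that treats the refreshed weights as constants.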

📝 Abstract
While large-scale training data is fundamental for developing capable large language models (LLMs), strategically selecting high-quality data has emerged as a critical approach to enhance training efficiency and reduce computational costs. Current data selection methodologies predominantly rely on static, training-agnostic criteria, failing to account for the dynamic interaction between model training and data. In this paper, we propose a new Data Weighting Model (DWM) to adjust the weight of selected data within each batch, achieving dynamic data utilization during LLM training. Specifically, to better capture the trained model's dynamic data preferences, a bi-level optimization framework is implemented to update the weighting model. Our experiments demonstrate that DWM enhances the performance of models trained with randomly selected data, and that the learned weighting model can be transferred to enhance other data selection methods and models of different sizes. Moreover, we analyze how a model's data preferences evolve throughout training, providing new insights into dynamic data utilization during training.
Problem

Research questions and friction points this paper is trying to address.

Dynamic data selection for efficient LLM training
Bi-level optimization for adaptive data weighting
Analyzing evolving data preferences in model training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic bi-level optimization for data weighting
Adjusts data weights per batch dynamically
Transfers learned weights across models (see the sketch after this list)
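As a usage-level illustration of that transfer claim, a trained weighting net (WeightingNet from the sketch above) could be frozen and reused purely as a batch reweighter for a different, e.g. larger, model. This is a hypothetical continuation of the earlier sketch, not the paper's procedure:

```python
# Hypothetical transfer: freeze a trained WeightingNet and wrap it as a
# reusable weighted-loss function for another model.
def make_weighted_loss(weighter, loss_fn):
    weighter.eval()
    for p in weighter.parameters():
        p.requires_grad_(False)  # weighter is no longer trained

    def weighted_loss(model, x, y):
        per_ex = loss_fn(model(x), y)                 # per-example losses
        return (weighter(per_ex.detach()) * per_ex).sum()

    return weighted_loss
```

The returned closure can then replace the plain batch loss in any standard training loop, which is what makes the learned weighting plug-and-play.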
👥 Authors

Yang Yu
School of Artificial Intelligence, University of Chinese Academy of Sciences

Kai Han
Huawei Noah's Ark Lab

Hang Zhou
Huawei Noah's Ark Lab; College of Intelligence and Computing, Tianjin University

Yehui Tang
Shanghai Jiao Tong University
Machine Learning, Quantum AI & AI4Science

Kaiqi Huang
School of Artificial Intelligence, University of Chinese Academy of Sciences; The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences

Yunhe Wang
Noah's Ark Lab, Huawei Technologies
Deep Learning, Language Model, Machine Learning, Computer Vision

Dacheng Tao
Nanyang Technological University
artificial intelligence, machine learning, computer vision, image processing, data mining