Governance-Aware Hybrid Fine-Tuning for Multilingual Large Language Models

📅 2025-12-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limitations of parameter-efficient fine-tuning (PEFT) methods in low-resource multilingual settings—particularly their suboptimal trade-offs among accuracy, probabilistic calibration, and cross-lingual generalization—this paper proposes a computationally efficient hybrid fine-tuning framework. Methodologically, it introduces: (1) a layer-wise hybrid parameter update mechanism integrating gradient-aligned low-rank adaptation with structured orthogonal transformations; (2) sub-layer unitary constraints to stabilize deep-model optimization; and (3) a lightweight, label-free data curation pipeline incorporating language identification, near-duplicate removal, and unsupervised quality filtering. Evaluated on the XNLI and FLORES benchmarks, the approach consistently outperforms mainstream PEFT baselines, achieving significant improvements in accuracy and probability calibration as well as robustness to orthographic variation, while reducing training overhead by over 30%. The framework thus delivers a superior cost-quality trade-off for low-resource multilingual adaptation.
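The layer-wise hybrid update described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: it assumes the low-rank branch is a standard additive LoRA-style delta `W + A @ B`, that the orthogonal branch rotates the frozen weight via a Cayley-transformed skew-symmetric parameter, and that the two branches are blended with a scalar `mix` coefficient (all of these choices are assumptions made for the sketch).

```python
import numpy as np

def cayley_orthogonal(S: np.ndarray) -> np.ndarray:
    """Map a skew-symmetric matrix S to an orthogonal matrix via the
    Cayley transform Q = (I - S)(I + S)^{-1}. For skew-symmetric S,
    (I + S) is always invertible and Q is exactly orthogonal."""
    I = np.eye(S.shape[0])
    return (I - S) @ np.linalg.inv(I + S)

def hybrid_update(W: np.ndarray, A: np.ndarray, B: np.ndarray,
                  S: np.ndarray, mix: float = 0.5) -> np.ndarray:
    """Illustrative layer-wise hybrid update: blend an additive
    low-rank delta (W + A @ B) with an orthogonally transformed
    copy of the frozen weight (Q @ W)."""
    Q = cayley_orthogonal(S)
    return mix * (W + A @ B) + (1.0 - mix) * (Q @ W)
```

In a real fine-tuning loop only `A`, `B`, and the skew-symmetric generator `S` would be trainable, so the parameter count stays far below full fine-tuning, which is the usual PEFT motivation.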

📝 Abstract
We present a governance-aware hybrid fine-tuning framework for multilingual, low-resource adaptation of large language models. The core algorithm combines gradient-aligned low-rank updates with structured orthogonal transformations through layer-wise mixing and introduces unitary constraints in selected sub-layers to stabilize deep optimization. In tandem with lightweight, label-free data governance steps, including language identification, near-duplicate removal, and quality filtering, the framework targets accuracy, calibration, and cross-language parity under tight compute budgets. Across XNLI and FLORES, the hybrid approach delivers consistent gains over strong PEFT baselines while maintaining directional balance and improving probability calibration, as shown in Tables II and III. It is more resilient to lightweight orthographic variants, as shown in Table IV, and benefits additively from simple governance steps, as shown in Table V. Training footprint measurements indicate modest overhead and a favorable cost-quality frontier, as shown in Table VI and Figure 2. Together, these results show that hybrid and unitary PEFT provide a stable and accessible path to resource-efficient multilingual adaptation when paired with practical data governance.
Problem

Research questions and friction points this paper is trying to address.

Governance-aware hybrid fine-tuning for multilingual LLMs
Combines gradient-aligned low-rank updates with orthogonal transformations
Targets accuracy, calibration, cross-language parity under compute constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid fine-tuning combines gradient-aligned low-rank updates with orthogonal transformations
Introduces unitary constraints in selected sub-layers to stabilize deep optimization
Uses lightweight data governance steps like language identification and quality filtering
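The governance steps listed above (language identification, near-duplicate removal, quality filtering) can be sketched as a simple label-free pipeline. Everything here is illustrative: the dedup key is a crude hash of leading words (real pipelines use MinHash/LSH), the quality filter is a length-and-symbol heuristic, and `lang_id` is a caller-supplied detector (e.g. a fastText language-ID model), not part of the paper.

```python
import hashlib
import re

def near_dedup_key(text: str, k: int = 5) -> str:
    """Crude near-duplicate key: hash of the normalized first k words.
    Illustrative stand-in for MinHash/LSH-style dedup."""
    norm = re.sub(r"\W+", " ", text.lower()).strip()
    return hashlib.md5(" ".join(norm.split()[:k]).encode()).hexdigest()

def quality_ok(text: str, min_chars: int = 20,
               max_symbol_ratio: float = 0.3) -> bool:
    """Label-free quality filter: minimum length and a cap on the
    fraction of non-alphanumeric, non-space characters."""
    if len(text) < min_chars:
        return False
    symbols = sum(1 for c in text if not (c.isalnum() or c.isspace()))
    return symbols / len(text) <= max_symbol_ratio

def curate(examples, lang_id, target_lang="sw"):
    """Pipeline: language ID -> near-dup removal -> quality filter.
    `lang_id` maps text to a language code; supplied by the caller."""
    seen, kept = set(), []
    for text in examples:
        if lang_id(text) != target_lang:
            continue
        key = near_dedup_key(text)
        if key in seen or not quality_ok(text):
            continue
        seen.add(key)
        kept.append(text)
    return kept
```

Because every step is unsupervised, the pipeline needs no labeled data, which matches the low-resource setting the paper targets.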
Haomin Qi
University of California, San Diego
Generative AI · Deep Learning · Natural Language Processing
Chengbo Huang
Columbia University, New York City, NY, USA
Zihan Dai
University of Copenhagen, Copenhagen, Denmark
Yunkai Gao
Duke University, Durham, NC, USA