Governance-Aware Hybrid Fine-Tuning for Multilingual Large Language Models

📅 2025-12-19

📈 Citations: 0

✨ Influential: 0

career value

201K/year

🤖 AI Summary

To address the limitations of parameter-efficient fine-tuning (PEFT) methods in low-resource multilingual settings—particularly their suboptimal trade-offs among accuracy, probabilistic calibration, and cross-lingual generalization—this paper proposes a computationally efficient hybrid fine-tuning framework. Methodologically, it introduces: (1) a layer-wise hybrid parameter update mechanism integrating gradient-aligned low-rank adaptation with structured orthogonal transformations; (2) sub-layer unitary constraints to stabilize deep-model optimization; and (3) a lightweight, label-free data curation pipeline incorporating language identification, near-duplicate removal, and unsupervised quality filtering. Evaluated on XNLI and FLORES benchmarks, our approach consistently outperforms mainstream PEFT baselines, achieving significant improvements in calibration accuracy and robustness to orthographic variations, while reducing training overhead by over 30%. The framework thus delivers superior cost-quality trade-offs for low-resource multilingual adaptation.

Technology Category

Application Category

📝 Abstract

We present a governance-aware hybrid fine-tuning framework for multilingual, low-resource adaptation of large language models. The core algorithm combines gradient-aligned low-rank updates with structured orthogonal transformations through layer-wise mixing and introduces unitary constraints in selected sub-layers to stabilize deep optimization. In tandem with lightweight, label-free data governance steps, including language identification, near-duplicate removal, and quality filtering, the framework targets accuracy, calibration, and cross-language parity under tight compute budgets. Across XNLI and FLORES, the hybrid approach delivers consistent gains over strong PEFT baselines while maintaining directional balance and improving probability calibration, as shown in Tables II and III. It is more resilient to lightweight orthographic variants, as shown in Table IV, and benefits additively from simple governance steps, as shown in Table V. Training footprint measurements indicate modest overhead and a favorable cost-quality frontier, as shown in Table VI and Figure 2. Together, these results show that hybrid and unitary PEFT provide a stable and accessible path to resource-efficient multilingual adaptation when paired with practical data governance.

Problem

Research questions and friction points this paper is trying to address.

Governance-aware hybrid fine-tuning for multilingual LLMs

Combines gradient-aligned low-rank updates with orthogonal transformations

Targets accuracy, calibration, cross-language parity under compute constraints

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid fine-tuning combines gradient-aligned low-rank updates with orthogonal transformations

Introduces unitary constraints in selected sub-layers to stabilize deep optimization

Uses lightweight data governance steps like language identification and quality filtering

🔎 Similar Papers

A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers