🤖 AI Summary
To address the limitations of parameter-efficient fine-tuning (PEFT) methods in low-resource multilingual settings—particularly their suboptimal trade-offs among accuracy, probabilistic calibration, and cross-lingual generalization—this paper proposes a computationally efficient hybrid fine-tuning framework. Methodologically, it introduces: (1) a layer-wise hybrid parameter update mechanism integrating gradient-aligned low-rank adaptation with structured orthogonal transformations; (2) sub-layer unitary constraints to stabilize deep-model optimization; and (3) a lightweight, label-free data curation pipeline incorporating language identification, near-duplicate removal, and unsupervised quality filtering. Evaluated on XNLI and FLORES benchmarks, our approach consistently outperforms mainstream PEFT baselines, achieving significant improvements in calibration accuracy and robustness to orthographic variations, while reducing training overhead by over 30%. The framework thus delivers superior cost-quality trade-offs for low-resource multilingual adaptation.
📝 Abstract
We present a governance-aware hybrid fine-tuning framework for multilingual, low-resource adaptation of large language models. The core algorithm combines gradient-aligned low-rank updates with structured orthogonal transformations through layer-wise mixing and introduces unitary constraints in selected sub-layers to stabilize deep optimization. In tandem with lightweight, label-free data governance steps, including language identification, near-duplicate removal, and quality filtering, the framework targets accuracy, calibration, and cross-language parity under tight compute budgets. Across XNLI and FLORES, the hybrid approach delivers consistent gains over strong PEFT baselines while maintaining directional balance and improving probability calibration, as shown in Tables II and III. It is more resilient to lightweight orthographic variants, as shown in Table IV, and benefits additively from simple governance steps, as shown in Table V. Training footprint measurements indicate modest overhead and a favorable cost-quality frontier, as shown in Table VI and Figure 2. Together, these results show that hybrid and unitary PEFT provide a stable and accessible path to resource-efficient multilingual adaptation when paired with practical data governance.