MSLEF: Multi-Segment LLM Ensemble Finetuning in Recruitment

📅 2025-09-07
🤖 AI Summary
To address low parsing accuracy caused by heterogeneous resume formats, this paper proposes MSLEF, a multi-segment large language model (LLM) ensemble fine-tuning framework. The method uses a segment-aware architecture that assigns field-specific weights to different resume sections and combines the outputs of fine-tuned models (Gemma 9B, LLaMA 3.1 8B, and Phi-4 14B) through weighted voting, with Gemini-2.5-Flash acting as a high-level aggregator. Compared to the best-performing single-model baseline, the framework reports improvements on all evaluation metrics: Exact Match (EM), F1, BLEU, ROUGE, and Recruitment Similarity (RS), with RS improving by up to 7%. According to the authors, this indicates better generalization across resume structures and supports accurate candidate representation in real-world recruitment scenarios.

📝 Abstract
This paper presents MSLEF, a multi-segment ensemble framework that employs LLM fine-tuning to enhance resume parsing in recruitment automation. It integrates fine-tuned Large Language Models (LLMs) using weighted voting, with each model specializing in a specific resume segment to boost accuracy. Building on MLAR, MSLEF introduces a segment-aware architecture that applies field-specific weighting tailored to each resume part, overcoming the limitations of single-model systems by adapting to diverse formats and structures. The framework uses fine-tuned Gemma 9B, LLaMA 3.1 8B, and Phi-4 14B models, with Gemini-2.5-Flash serving as a high-level aggregator for complex sections. MSLEF achieves improvements in Exact Match (EM), F1 score, BLEU, ROUGE, and Recruitment Similarity (RS) metrics, outperforming the best single model by up to 7% in RS. Its segment-aware design enhances generalization across varied resume layouts, making it adaptable to real-world hiring scenarios while providing precise and reliable candidate representation.
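For readers unfamiliar with the first two metrics, the sketch below shows one common way to compute Exact Match and token-level F1 for a single extracted field. The normalization (lowercasing, whitespace splitting) and the function names `exact_match` and `token_f1` are assumptions made for illustration; the abstract does not specify how the paper computes these scores.

```python
from collections import Counter


def exact_match(pred: str, gold: str) -> float:
    """1.0 if the normalized prediction equals the normalized reference, else 0.0."""
    return float(pred.strip().lower() == gold.strip().lower())


def token_f1(pred: str, gold: str) -> float:
    """Harmonic mean of token precision and recall (assumed whitespace tokenization)."""
    pred_tokens = pred.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, a predicted job title of "senior data engineer" against a reference of "data engineer" gives an Exact Match of 0 but a token F1 of about 0.8, which is why the two metrics are usually reported together.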
Problem

Research questions and friction points this paper is trying to address.

Enhancing resume parsing accuracy in recruitment automation
Overcoming single-model limitations with a multi-segment ensemble framework
Adapting to diverse resume formats and structures effectively
Innovation

Methods, ideas, or system contributions that make the work stand out.

Segment-aware ensemble with weighted voting per resume part
Integration of specialized LLMs for different resume segments
High-level aggregator LLM for handling complex sections
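
The segment-aware weighted voting idea listed above can be illustrated with a minimal sketch. Everything concrete here is hypothetical: the weight values, the segment names, the model keys, and the `weighted_vote` helper are placeholders chosen for illustration, not the weights or the procedure used in the paper.

```python
from collections import defaultdict

# Hypothetical per-segment weights: how much each model's vote counts
# for a given resume segment. The paper's actual values are not given here.
SEGMENT_WEIGHTS = {
    "education": {"gemma": 0.40, "llama": 0.35, "phi": 0.25},
    "skills": {"gemma": 0.20, "llama": 0.20, "phi": 0.60},
}


def weighted_vote(segment, predictions, weights=SEGMENT_WEIGHTS):
    """Return the normalized candidate value with the highest total weight."""
    scores = defaultdict(float)
    for model, value in predictions.items():
        # Normalize so that trivially different spellings vote together.
        key = value.strip().lower()
        scores[key] += weights[segment].get(model, 0.0)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    education_votes = {"gemma": "MIT", "llama": "Stanford", "phi": "stanford"}
    print(weighted_vote("education", education_votes))
```

In this toy run the two lower-weighted models agree, so their combined weight outvotes the single highest-weighted model. A fuller system in the spirit of the paper would hand complex or low-agreement segments to the aggregator LLM instead of relying on the vote alone; that step is omitted here.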