NOWJ@COLIEE 2025: A Multi-stage Framework Integrating Embedding Models and Large Language Models for Legal Retrieval and Entailment

📅 2025-09-09
🤖 AI Summary
This work addresses the weak performance of existing systems on case retrieval and entailment identification in legal information processing. We propose a multi-stage framework that combines embedding models with large language models (LLMs). It comprises two stages: (1) a lexical-semantic pre-ranking stage using BM25 and several embedding models (e.g., BERT, BGE-m3, LLM2Vec); and (2) a context-aware re-ranking and entailment inference stage using LLMs, including Qwen-2, QwQ-32B, and DeepSeek-V3. To our knowledge, this is the first systematic integration of traditional retrieval and generative modeling for the COLIEE 2025 tasks, with semantic filtering and inference working together in a single pipeline. In the competition, our approach took first place in the legal case entailment task (Task 2, F1 = 0.3195) and was competitive in the four other tasks (legal case retrieval, statute law retrieval, legal textual entailment, and legal judgment prediction), supporting the efficacy of the hybrid paradigm.
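To make the reported F1 figure concrete, the following is a minimal sketch of a micro-averaged F1 computation over retrieved paragraphs. It assumes the metric is micro-averaged across all queries; this page does not define the metric, and the function name `micro_f1` and the toy IDs are purely illustrative.

```python
def micro_f1(predictions, gold):
    """Micro-averaged F1 over all queries.

    predictions, gold: dict mapping a query id to a set of paragraph ids.
    (Hypothetical data layout, chosen only for this illustration.)
    """
    # Count correctly returned paragraphs across every query.
    true_pos = sum(len(predictions.get(q, set()) & rel) for q, rel in gold.items())
    n_pred = sum(len(p) for p in predictions.values())
    n_gold = sum(len(r) for r in gold.values())

    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: 2 of 3 predictions are correct, 2 of 3 gold paragraphs are found.
GOLD = {"q1": {"p1"}, "q2": {"p3", "p4"}}
PRED = {"q1": {"p1", "p2"}, "q2": {"p3"}}
score = micro_f1(PRED, GOLD)
```

Because precision and recall are pooled over all queries, a system that returns many paragraphs per query is penalized on precision even when it finds every relevant one.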

📝 Abstract
This paper presents the methodologies and results of the NOWJ team's participation across all five tasks at the COLIEE 2025 competition, emphasizing advancements in the Legal Case Entailment task (Task 2). Our comprehensive approach systematically integrates pre-ranking models (BM25, BERT, monoT5), embedding-based semantic representations (BGE-m3, LLM2Vec), and advanced Large Language Models (Qwen-2, QwQ-32B, DeepSeek-V3) for summarization, relevance scoring, and contextual re-ranking. Specifically, in Task 2, our two-stage retrieval system combined lexical-semantic filtering with contextualized LLM analysis, achieving first place with an F1 score of 0.3195. Additionally, in other tasks--including Legal Case Retrieval, Statute Law Retrieval, Legal Textual Entailment, and Legal Judgment Prediction--we demonstrated robust performance through carefully engineered ensembles and effective prompt-based reasoning strategies. Our findings highlight the potential of hybrid models integrating traditional IR techniques with contemporary generative models, providing a valuable reference for future advancements in legal information processing.
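The two-stage idea described in the abstract can be sketched as follows. This is an illustrative toy under stated assumptions, not the team's implementation: the bag-of-words `embed` function stands in for a dense encoder such as BGE-m3 or LLM2Vec, `toy_llm_score` stands in for an LLM relevance call, and the fusion weight `alpha`, the min-max normalization, and all function names are assumptions introduced here.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def embed(text):
    """Stand-in for a dense encoder: a bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def min_max(xs):
    lo, hi = min(xs), max(xs)
    return [0.0] * len(xs) if hi == lo else [(x - lo) / (hi - lo) for x in xs]


def pre_rank(query, docs, alpha=0.5, top_k=2):
    """Stage 1: fuse normalized lexical and semantic scores, keep top_k indices."""
    lexical = min_max(bm25_scores(query, docs))
    q_vec = embed(query)
    semantic = min_max([cosine(q_vec, embed(d)) for d in docs])
    fused = [alpha * l + (1 - alpha) * s for l, s in zip(lexical, semantic)]
    order = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
    return order[:top_k]


def toy_llm_score(query, paragraph):
    """Placeholder for an LLM relevance judgment (here: shared-term count)."""
    return len(set(tokenize(query)) & set(tokenize(paragraph)))


def rerank(query, docs, candidates, score_fn):
    """Stage 2: re-order the stage-1 candidates with a stronger scorer."""
    return sorted(candidates, key=lambda i: score_fn(query, docs[i]), reverse=True)


DOCS = [
    "the court held that the contract was void for mistake",
    "the appeal concerns a parking ticket issued in winter",
    "a contract induced by mistake may be rescinded by the court",
]
QUERY = "contract void for mistake"

candidates = pre_rank(QUERY, DOCS)
final = rerank(QUERY, DOCS, candidates, toy_llm_score)
```

The point of the split is cost: the cheap first stage prunes the candidate pool so that the expensive scorer only sees a handful of paragraphs per query.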
Problem

Research questions and friction points this paper is trying to address.

Integrating embedding models and LLMs for legal retrieval
Advancing legal case entailment with multi-stage frameworks
Combining IR techniques with generative models for law
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-stage framework integrating embedding models
Combined lexical-semantic filtering with LLM analysis
Hybrid models merging traditional IR with generative AI
Hoang-Trung Nguyen
VNU University of Engineering and Technology, Hanoi, Vietnam
Tan-Minh Nguyen
Japan Advanced Institute of Science and Technology, Ishikawa, Japan
Xuan-Bach Le
VNU University of Engineering and Technology, Hanoi, Vietnam
Tuan-Kiet Le
VNU University of Engineering and Technology, Hanoi, Vietnam
Khanh-Huyen Nguyen
VNU University of Engineering and Technology, Hanoi, Vietnam
Ha-Thanh Nguyen
Center for Juris-Informatics, ROIS-DS Research and Development Center for Large Language Models, NII, Tokyo, Japan
Thi-Hai-Yen Vuong
VNU University of Engineering and Technology, Vietnam National University, Hanoi
Data mining, NLP, Legal NLP, Symbolic AI
Le-Minh Nguyen
Japan Advanced Institute of Science and Technology, Ishikawa, Japan