🤖 AI Summary
A systematic evaluation of domain-specific large language models (LLMs) for legal text understanding, particularly contract classification, has so far been lacking. Method: This work presents the first comprehensive benchmark of ten legal-domain LLMs against seven general-purpose LLMs on three English contract understanding tasks, using a multi-task evaluation framework that emphasizes text classification and semantic understanding. Contribution/Results: Legal-specialized models significantly outperform general-purpose models, especially on tasks requiring fine-grained legal reasoning. Legal-BERT and Contracts-BERT achieve new state-of-the-art (SOTA) results on two of the three tasks despite their relatively small parameter counts, and CaseLaw-BERT and LexLM provide strong additional baselines. Collectively, this study establishes a much-needed benchmark and offers empirically grounded guidance for model selection in contract understanding systems, advancing the development of precise, task-adapted legal AI.
📝 Abstract
Despite advances in legal NLP, no comprehensive evaluation of multiple legal-specific LLMs currently exists for contract classification tasks within contract understanding. To address this gap, we evaluate 10 legal-specific LLMs on three English-language contract understanding tasks and compare them with 7 general-purpose LLMs. The results show that legal-specific LLMs consistently outperform general-purpose models, especially on tasks requiring nuanced legal understanding. Legal-BERT and Contracts-BERT establish new state-of-the-art (SOTA) results on two of the three tasks, despite having 69% fewer parameters than the best-performing general-purpose LLM. We also identify CaseLaw-BERT and LexLM as strong additional baselines for contract understanding. Our results provide a holistic evaluation of legal-specific LLMs and will facilitate the development of more accurate contract understanding systems.