A Report on Financial Regulations Challenge at COLING 2025

📅 2024-12-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing financial large language models (FinLLMs) lack rigorous evaluation frameworks for complex regulatory rule comprehension and compliance reasoning. Method: We introduce the COLING 2025 Financial Regulations Challenge, the first multi-task benchmark dedicated to financial regulatory understanding. It comprises nine novel tasks covering core compliance scenarios such as clause provenance tracing and conflict detection, uses zero-shot and few-shot evaluation across mainstream models (e.g., Llama, Qwen, Phi), and ensures data quality via expert validation. Contribution/Results: Aggregated results from 23 participating teams reveal that state-of-the-art FinLLMs achieve an average accuracy of less than 62% on critical tasks, exposing severe limitations in regulatory semantic understanding and compliance reasoning. This work provides the first systematic characterization of FinLLMs’ capability boundaries in regulatory interpretation, along with a foundational benchmark and actionable insights for developing trustworthy financial AI systems.

📝 Abstract
Financial large language models (FinLLMs) have been applied to various tasks in business, finance, accounting, and auditing. Financial services depend on complex regulations and standards, and LLMs used in this setting must comply with them. However, FinLLMs' performance in understanding and interpreting financial regulations has rarely been studied. Therefore, we organize the Regulations Challenge, a shared task at COLING 2025. It encourages the academic community to explore the strengths and limitations of popular LLMs. We create 9 novel tasks and corresponding question sets. In this paper, we provide an overview of these tasks and summarize participants' approaches and results. We aim to raise awareness of FinLLMs' professional capability in financial regulations.

Problem

Research questions and friction points this paper is trying to address.

Financial Language Models
Complex Financial Rules
Capabilities and Limitations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Financial Large Language Models
Regulatory Challenge
Real-world Applicability

Authors

Keyi Wang
Columbia University

Jaisal Patel
Rensselaer Polytechnic Institute

Charlie Shen
Columbia University

Daniel Kim
Rensselaer Polytechnic Institute

Andy Zhu
Rensselaer Polytechnic Institute

Alex Lin
Rensselaer Polytechnic Institute

Luca Borella
FINOS, Linux Foundation

Cailean Osborne
University of Oxford
open source, artificial intelligence, AI governance

Matt White
PyTorch Foundation; GM of AI, Linux Foundation

Steve Yang
Associate Professor of Fintech, Stevens Institute of Technology
Operations Research, Portfolio Theory, Algorithmic Trading, Financial Econometrics

Kairong Xiao
Columbia Business School
Financial Intermediation, Industrial Organization, Monetary Economics, Political Economy

Xiao-Yang Liu Yanglet
Columbia University, Rensselaer Polytechnic Institute