MoCA-Agent: A Market-of-Claims Code Agent for Financial and Numerical Reasoning

📅 2026-06-09
🤖 AI Summary
This work addresses subtle factual, formulaic, and numerical errors in financial and tabular question answering, where a single misread cell or incorrect operation can yield a plausible but wrong answer. The authors propose MoCA-Agent, which verifies reasoning through a market of atomic claims: each question is decomposed into typed atomic claims, and specialist trader agents buy or sell those claims to express confidence. The cleared orders become confidence-weighted accept/reject decisions, and an executable Python program is synthesized from the market-supported evidence. A code-aware verifier then checks the program for execution, structural consistency, and common financial reasoning errors, with at most one market-aware repair round. By replacing free-form multi-agent debate with claim-level verification, the method improves robustness in high-stakes numerical reasoning. With a fixed Qwen3.6-27B backbone, it reports strong performance across ten public benchmarks, including FinQA (78.3%), FinanceMath (76.0%), MultiHiertt (71.2%), ESGenius (86.9%), and FinChart-Bench (85.6% average).
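The clearing step described above can be illustrated with a minimal sketch. It assumes one simple rule, netting confidence-weighted buy orders against sell orders per claim; the summary does not specify the actual clearing rule, and the names `Order`, `clear_market`, the trader labels, and the zero threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Order:
    trader: str        # hypothetical specialist name
    claim_id: str      # identifier of the atomic claim being traded
    side: str          # "buy" supports the claim, "sell" disputes it
    confidence: float  # trader confidence in [0, 1]


def clear_market(orders, threshold=0.0):
    """Net confidence-weighted buys against sells for each claim.

    Assumed rule (not taken from the paper): a claim is accepted when
    its net signed confidence is strictly above `threshold`.
    """
    net = {}
    for order in orders:
        if order.side not in ("buy", "sell"):
            raise ValueError(f"unknown side: {order.side!r}")
        signed = order.confidence if order.side == "buy" else -order.confidence
        net[order.claim_id] = net.get(order.claim_id, 0.0) + signed
    return {
        claim_id: ("accept" if score > threshold else "reject", score)
        for claim_id, score in net.items()
    }


# Toy order book: claim c1 is supported, claim c2 is disputed.
orders = [
    Order("table_reader", "c1", "buy", 0.9),
    Order("unit_checker", "c1", "buy", 0.6),
    Order("formula_checker", "c2", "buy", 0.3),
    Order("sign_checker", "c2", "sell", 0.8),
]
decisions = clear_market(orders)
print(decisions)
```

In this toy order book, c1 ends with positive net confidence and is accepted, while c2 ends negative and is rejected, so only c1 would feed into program synthesis.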
📝 Abstract
Financial and tabular question answering requires more than fluent reasoning: answers must be grounded in the exact facts, formulas, units, signs, and scales that support them. A single misread cell or incorrect operation can silently produce a plausible but wrong result. We introduce \textsc{MOCA-Agent}, a market-of-claims code agent that replaces free-form multi-agent debate with claim-level verification. The system decomposes each question into typed atomic claims, asks specialist trader agents to buy or sell those claims, clears their orders into confidence-weighted accept/reject decisions, and synthesizes an executable Python program from market-supported evidence. A code-aware verifier then checks the program for execution, structural consistency, and common financial reasoning errors, with at most one market-aware repair round. Across ten public benchmarks spanning financial numerical reasoning, general tabular reasoning, ESG question answering, and multimodal chart reasoning, \textsc{MOCA-Agent} achieves strong performance using a fixed Qwen3.6-27B backbone, including $78.3\%$ on FinQA, $76.0\%$ on FinanceMath, $71.2\%$ on MultiHiertt, $86.9\%$ on ESGenius, and $85.6\%$ average on FinChart-Bench. These results show that aggregating evidence at the level of atomic claims, rather than whole answers, improves robustness in high-stakes numerical reasoning.\footnote{The code and data are available at https://github.com/UBC-NLP/MoCA-Agent.}
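The verify-then-repair control flow mentioned in the abstract might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' verifier: it covers only an execution check and a basic structural check, it omits the financial-error checks and the market feedback entirely, and `verify_program`, `run_with_repair`, `toy_repair`, and the convention that a program stores its result in `answer` are all hypothetical.

```python
import math


def verify_program(source):
    """Return a list of issues for a generated program (empty means pass)."""
    namespace = {}
    try:
        # Execution check. Real use would need a sandbox; plain exec is unsafe.
        exec(source, namespace)
    except Exception as exc:
        return [f"execution error: {exc!r}"]
    # Structural check (assumed convention): the result lives in `answer`.
    if "answer" not in namespace:
        return ["structural: program does not define `answer`"]
    answer = namespace["answer"]
    if isinstance(answer, bool) or not isinstance(answer, (int, float)):
        return ["structural: `answer` is not numeric"]
    if not math.isfinite(answer):
        return ["numeric: `answer` is not finite"]
    return []


def run_with_repair(source, repair, max_repairs=1):
    """Verify, then allow at most `max_repairs` repair attempts."""
    issues = verify_program(source)
    for _ in range(max_repairs):
        if not issues:
            break
        source = repair(source, issues)
        issues = verify_program(source)
    return source, issues


# Toy program with a wrong variable name in the denominator.
buggy = (
    "revenue_2020 = 120.0\n"
    "revenue_2019 = 100.0\n"
    "answer = 100 * (revenue_2020 - revenue_2019) / revenue_2018\n"
)


def toy_repair(source, issues):
    # Stand-in for a model-driven repair step (hypothetical).
    return source.replace("/ revenue_2018", "/ revenue_2019")


fixed, issues = run_with_repair(buggy, toy_repair)
result = {}
exec(fixed, result)
print(issues, result["answer"])
```

Capping the loop at one repair keeps the cost of verification bounded; a program that still fails after that round is returned together with its outstanding issues rather than retried indefinitely.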
Problem

Research questions and friction points this paper is trying to address.

financial reasoning
tabular question answering
numerical reasoning
fact grounding
claim verification
Innovation

Methods, ideas, or system contributions that make the work stand out.

market-of-claims
atomic claims
code agent
financial reasoning
claim-level verification