MoCA-Agent: A Market-of-Claims Code Agent for Financial and Numerical Reasoning

📅 2026-06-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses subtle factual, formulaic, and numerical errors in financial and tabular question answering, which often yield plausible but incorrect answers. The authors propose a verification mechanism built on a market of atomic claims: each question is decomposed into typed atomic claims, and specialist trader agents buy or sell those claims to express confidence. The orders are cleared into confidence-weighted accept/reject decisions, and an executable Python program is synthesized from the market-supported evidence. A code-aware verifier then checks the program, with at most one market-aware repair round. By replacing free-form multi-agent debate with claim-level verification, the method improves robustness in high-stakes numerical reasoning. Using a fixed Qwen3.6-27B backbone, it reports strong performance across ten public benchmarks, including FinQA (78.3%), FinanceMath (76.0%), MultiHiertt (71.2%), ESGenius (86.9%), and FinChart-Bench (85.6% average).
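The summary does not state how orders are cleared, so the following is only a minimal sketch of one plausible rule: each claim's buy confidences are added, its sell confidences are subtracted, and the claim is accepted when the net score exceeds a threshold. The `Order` class, the `clear_market` function, the trader names, the claim identifiers, and the confidence values are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """One trader's position on one atomic claim (hypothetical structure)."""
    claim_id: str
    trader: str
    side: str          # "buy" supports the claim, "sell" disputes it
    confidence: float  # stake in [0, 1]


def clear_market(orders, threshold=0.0):
    """Net the confidence-weighted orders per claim.

    Assumed rule (not from the paper): score = sum(buy) - sum(sell),
    and a claim is accepted when its score is above `threshold`.
    """
    scores = {}
    for order in orders:
        if order.side not in ("buy", "sell"):
            raise ValueError(f"unknown side: {order.side!r}")
        sign = 1.0 if order.side == "buy" else -1.0
        scores[order.claim_id] = scores.get(order.claim_id, 0.0) + sign * order.confidence
    return {
        claim_id: ("accept" if score > threshold else "reject", score)
        for claim_id, score in scores.items()
    }


# Illustrative orders: c1 is a fact claim, c2 is a formula claim.
orders = [
    Order("c1", "table_reader", "buy", 0.75),
    Order("c1", "unit_checker", "buy", 0.5),
    Order("c2", "formula_trader", "sell", 0.75),
    Order("c2", "table_reader", "buy", 0.25),
]

decisions = clear_market(orders)
for claim_id, (decision, score) in decisions.items():
    print(claim_id, decision, score)
```

In this toy run, claim `c1` clears with net support and `c2` is rejected, so only `c1` would be passed on as evidence for program synthesis. Aggregating per claim rather than per answer means a single disputed formula can be rejected without discarding the facts that were read correctly.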

📝 Abstract

Financial and tabular question answering requires more than fluent reasoning: answers must be grounded in the exact facts, formulas, units, signs, and scales that support them. A single misread cell or incorrect operation can silently produce a plausible but wrong result. We introduce MoCA-Agent, a market-of-claims code agent that replaces free-form multi-agent debate with claim-level verification. The system decomposes each question into typed atomic claims, asks specialist trader agents to buy or sell those claims, clears their orders into confidence-weighted accept/reject decisions, and synthesizes an executable Python program from market-supported evidence. A code-aware verifier then checks the program for execution, structural consistency, and common financial reasoning errors, with at most one market-aware repair round. Across ten public benchmarks spanning financial numerical reasoning, general tabular reasoning, ESG question answering, and multimodal chart reasoning, MoCA-Agent achieves strong performance using a fixed Qwen3.6-27B backbone, including 78.3% on FinQA, 76.0% on FinanceMath, 71.2% on MultiHiertt, 86.9% on ESGenius, and 85.6% average on FinChart-Bench. These results show that aggregating evidence at the level of atomic claims, rather than whole answers, improves robustness in high-stakes numerical reasoning. The code and data are available at https://github.com/UBC-NLP/MoCA-Agent.
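The verify-then-repair step can be illustrated with a small sketch. The abstract does not specify the verifier's checks or the repair interface, so everything here is an assumption: `verify` only tests that a program executes and sets a finite numeric `answer`, `repair` is a stand-in for whatever model call would rewrite the program, and the program strings and numbers are invented for illustration. Running `exec` with empty builtins is not a security sandbox.

```python
import math


def verify(program):
    """Run a generated program and return (ok, answer_or_error_message).

    Hypothetical checks only: the program must execute and must bind a
    finite int or float to the name `answer`.
    """
    namespace = {}
    try:
        exec(program, {"__builtins__": {}}, namespace)
    except Exception as exc:
        return False, f"execution error: {exc!r}"
    answer = namespace.get("answer")
    if isinstance(answer, bool) or not isinstance(answer, (int, float)):
        return False, "program did not set a numeric `answer`"
    if not math.isfinite(answer):
        return False, "answer is not finite"
    return True, answer


def solve_with_one_repair(program, repair):
    """Verify the program, allowing at most one repair round."""
    ok, result = verify(program)
    if ok:
        return result
    ok, result = verify(repair(program, result))
    if not ok:
        raise ValueError(f"still failing after one repair: {result}")
    return result


repair_calls = []


def repair(program, error_message):
    """Stand-in for a model-driven repair; returns a fixed corrected program."""
    repair_calls.append(error_message)
    return "old = 100.0\nnew = 120.0\nanswer = (new - old) * 100 / old"


# The draft forgets to define `old`, so execution raises NameError.
draft = "new = 120.0\nanswer = (new - old) * 100 / old"
result = solve_with_one_repair(draft, repair)
print(result)
```

Capping repairs at one round keeps the cost bounded and makes a persistent failure visible as an error instead of hiding it behind repeated retries.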

Problem

Research questions and friction points this paper is trying to address.

financial reasoning

tabular question answering

numerical reasoning

fact grounding

claim verification

Innovation

Methods, ideas, or system contributions that make the work stand out.

market-of-claims

atomic claims

code agent