Geoint-R1: Formalizing Multimodal Geometric Reasoning with Dynamic Auxiliary Constructions

📅 2025-08-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing geometric reasoning models struggle to dynamically construct and verify auxiliary elements, such as auxiliary lines or points, which limits their capability for formal proof generation. To address this, we propose the first unified multimodal framework integrating auxiliary construction, Lean4-based formal reasoning, and interactive visualization. Our method parses multimodal (text-and-diagram) inputs using a multimodal large language model, models geometric entities via dynamic graph structures, and automatically introduces auxiliary elements through cross-modal alignment, culminating in Lean4-verifiable proofs. For systematic evaluation, we introduce Geoint, the first multimodal geometric reasoning benchmark of its kind, comprising 1,885 problems. Experiments demonstrate that our approach significantly outperforms prior models on complex problems requiring explicit auxiliary constructions, achieving new state-of-the-art verification accuracy and reasoning robustness.
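
The idea of modeling geometric entities as a dynamic graph that grows when an auxiliary element is introduced can be illustrated with a small sketch. This is not the paper's implementation: the `GeometryGraph` class, its method names, and the example triangle are hypothetical and chosen only for illustration. Points are nodes, segments are edges, and an auxiliary midpoint and median are added after the initial figure is built.

```python
import math
from dataclasses import dataclass, field


@dataclass
class GeometryGraph:
    """Toy dynamic graph (hypothetical): points are nodes, segments are edges."""
    points: dict = field(default_factory=dict)     # name -> (x, y)
    segments: set = field(default_factory=set)     # frozenset({name_a, name_b})
    auxiliary: set = field(default_factory=set)    # names of constructed points

    def add_point(self, name, x, y, auxiliary=False):
        self.points[name] = (float(x), float(y))
        if auxiliary:
            self.auxiliary.add(name)

    def add_segment(self, a, b):
        # Both endpoints must already exist in the graph.
        for name in (a, b):
            if name not in self.points:
                raise KeyError(f"unknown point: {name}")
        self.segments.add(frozenset((a, b)))

    def add_midpoint(self, name, a, b):
        """Auxiliary construction: insert the midpoint of segment ab."""
        (ax, ay), (bx, by) = self.points[a], self.points[b]
        self.add_point(name, (ax + bx) / 2, (ay + by) / 2, auxiliary=True)
        self.add_segment(a, name)
        self.add_segment(name, b)

    def length(self, a, b):
        (ax, ay), (bx, by) = self.points[a], self.points[b]
        return math.hypot(ax - bx, ay - by)


def build_example():
    """Right triangle ABC (right angle at A) plus an auxiliary median AM."""
    g = GeometryGraph()
    g.add_point("A", 0, 0)
    g.add_point("B", 4, 0)
    g.add_point("C", 0, 3)
    for a, b in (("A", "B"), ("B", "C"), ("C", "A")):
        g.add_segment(a, b)
    g.add_midpoint("M", "B", "C")   # auxiliary point
    g.add_segment("A", "M")         # auxiliary line: the median from A
    return g
```

In this toy figure the auxiliary median exposes the fact that the median to the hypotenuse of a right triangle is half the hypotenuse, which can be checked numerically with `g.length("A", "M")` and `g.length("B", "C")`. A formal system such as the one described would state and verify that fact in Lean4 rather than rely on floating-point checks.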

📝 Abstract
Mathematical geometric reasoning is essential for scientific discovery and educational development, requiring precise logic and rigorous formal verification. While recent advances in Multimodal Large Language Models (MLLMs) have improved performance on reasoning tasks, existing models typically struggle with formal geometric reasoning, particularly when dynamically constructing and verifying auxiliary geometric elements. To address these challenges, we introduce Geoint-R1, a multimodal reasoning framework designed to generate formally verifiable geometric solutions from textual descriptions and visual diagrams. Geoint-R1 uniquely integrates auxiliary element construction, formal reasoning expressed in Lean4, and interactive visualization. To systematically evaluate and advance formal geometric reasoning, we propose the Geoint benchmark, comprising 1,885 rigorously annotated geometry problems across diverse topics such as plane, spatial, and solid geometry. Each problem includes structured textual annotations, precise Lean4 code for auxiliary constructions, and detailed solution steps verified by experts. Extensive experiments demonstrate that Geoint-R1 significantly surpasses existing multimodal and math-specific reasoning models, particularly on challenging problems requiring explicit auxiliary element constructions.
Problem

Research questions and friction points this paper is trying to address.

Formalizing multimodal geometric reasoning with dynamic auxiliary constructions
Addressing limitations in formal geometric reasoning by MLLMs
Creating a benchmark for verifiable geometric problem-solving
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates auxiliary elements construction
Uses Lean4 for formal reasoning
Includes interactive visualization
Authors

Jingxuan Wei
Shenyang Institute of Computing Technology, Chinese Academy of Sciences
Caijun Jia
Shenyang Institute of Computing Technology, Chinese Academy of Sciences
Qi Chen
Shenyang Institute of Computing Technology, Chinese Academy of Sciences
Honghao He
Shenyang Institute of Computing Technology, Chinese Academy of Sciences
Linzhuang Sun
University of Chinese Academy of Sciences
Multimodal Reasoning
Conghui He
Shanghai AI Laboratory
Data-centric AI, LLM, Document Intelligence
Lijun Wu
Shanghai AI Laboratory
ML, LLM, AI4Science
Bihui Yu
Shenyang Institute of Computing Technology, Chinese Academy of Sciences
Cheng Tan
Shanghai AI Laboratory