🤖 AI Summary
Existing geometric reasoning models struggle to dynamically construct and verify auxiliary elements (such as auxiliary lines or points), which limits their capability for formal proof generation. To address this, we propose Geoint-R1, a unified multimodal framework that integrates auxiliary construction, Lean4-based formal reasoning, and interactive visualization. Our method parses multimodal (text-and-diagram) inputs with a multimodal large language model, models geometric entities as dynamic graph structures, and introduces auxiliary elements automatically through cross-modal alignment, producing Lean4-verifiable proofs. For systematic evaluation, we introduce Geoint, a multimodal geometric reasoning benchmark comprising 1,885 problems. Experiments demonstrate that our approach significantly outperforms prior models on complex problems requiring explicit auxiliary constructions, achieving new state-of-the-art verification accuracy and reasoning robustness.
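To make the idea of a dynamic graph of geometric entities more concrete, the following is a minimal sketch, not the framework's actual data structure. The class `GeometryGraph` and its methods are hypothetical names chosen for illustration. It assumes that points are nodes, segments are edges, and that an auxiliary construction (here, a midpoint and a median) is added to the graph after the initial diagram has been parsed.

```python
from dataclasses import dataclass, field


@dataclass
class GeometryGraph:
    """Hypothetical dynamic graph: nodes are points, edges are segments."""
    points: dict = field(default_factory=dict)    # name -> (x, y)
    segments: set = field(default_factory=set)    # frozenset({name, name})
    auxiliary: set = field(default_factory=set)   # names of constructed points

    def add_point(self, name, x, y, auxiliary=False):
        self.points[name] = (x, y)
        if auxiliary:
            self.auxiliary.add(name)

    def add_segment(self, a, b):
        self.segments.add(frozenset((a, b)))

    def add_midpoint(self, name, a, b):
        """Auxiliary construction: insert the midpoint of segment ab."""
        (ax, ay), (bx, by) = self.points[a], self.points[b]
        self.add_point(name, (ax + bx) / 2, (ay + by) / 2, auxiliary=True)
        # The new point splits the original segment into two edges.
        self.segments.discard(frozenset((a, b)))
        self.add_segment(a, name)
        self.add_segment(name, b)


# Triangle ABC as it might be parsed from a problem statement.
g = GeometryGraph()
g.add_point("A", 0.0, 0.0)
g.add_point("B", 4.0, 2.0)
g.add_point("C", 1.0, 3.0)
for a, b in [("A", "B"), ("B", "C"), ("C", "A")]:
    g.add_segment(a, b)

# Auxiliary elements: midpoint M of AB, then the median CM.
g.add_midpoint("M", "A", "B")
g.add_segment("C", "M")
```

Tracking auxiliary points in a separate set keeps constructed elements distinguishable from given ones, which is one plausible way to support later verification or visualization.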
📝 Abstract
Mathematical geometric reasoning is essential for scientific discovery and educational development, requiring precise logic and rigorous formal verification. While recent advances in Multimodal Large Language Models (MLLMs) have improved performance on reasoning tasks, existing models typically struggle with formal geometric reasoning, particularly when dynamically constructing and verifying auxiliary geometric elements. To address these challenges, we introduce Geoint-R1, a multimodal reasoning framework designed to generate formally verifiable geometric solutions from textual descriptions and visual diagrams. Geoint-R1 uniquely integrates auxiliary element construction, formal reasoning represented in Lean4, and interactive visualization. To systematically evaluate and advance formal geometric reasoning, we propose the Geoint benchmark, comprising 1,885 rigorously annotated geometry problems across diverse topics such as plane, spatial, and solid geometry. Each problem includes structured textual annotations, precise Lean4 code for auxiliary constructions, and detailed solution steps verified by experts. Extensive experiments demonstrate that Geoint-R1 significantly surpasses existing multimodal and math-specific reasoning models, particularly on challenging problems requiring explicit auxiliary element constructions.
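As a rough illustration of what Lean4 code for an auxiliary construction could look like, the sketch below introduces an auxiliary point and checks one property of it. It is not taken from the Geoint benchmark; the names `Point`, `sqDist`, `A`, `B`, and `M` are assumptions made for this example, and integer coordinates are used so that the file needs no external library.

```lean
-- Hypothetical, self-contained sketch (no Mathlib): points with integer coordinates.
structure Point where
  x : Int
  y : Int

-- Squared Euclidean distance, which avoids square roots.
def sqDist (p q : Point) : Int :=
  (p.x - q.x) * (p.x - q.x) + (p.y - q.y) * (p.y - q.y)

-- Given points of the problem.
def A : Point := ⟨0, 0⟩
def B : Point := ⟨4, 2⟩

-- Auxiliary construction: M is introduced as the midpoint of AB.
def M : Point := ⟨2, 1⟩

-- M satisfies the defining property of a midpoint, coordinate by coordinate.
theorem M_is_midpoint : 2 * M.x = A.x + B.x ∧ 2 * M.y = A.y + B.y := by decide

-- A fact a proof could then use: M is equidistant from A and B.
theorem M_equidistant : sqDist A M = sqDist M B := by decide
```

Because the statements involve only concrete integers, `decide` can check them by evaluation; a general statement over arbitrary points would need a different proof strategy.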