MGP-KAD: Multimodal Geometric Priors and Kolmogorov-Arnold Decoder for Single-View 3D Reconstruction in Complex Scenes

📅 2025-09-14

🏛️ International Conference on Information Photonics

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Single-view 3D reconstruction in complex real-world scenes is often hindered by noise, object diversity, and data scarcity, leading to insufficient geometric accuracy and detail recovery. To address these challenges, this work proposes the MGP-KAD framework, which first generates category-level geometric priors through clustering and dynamically fuses RGB images with multimodal geometric cues. Furthermore, it introduces a hybrid decoder based on Kolmogorov–Arnold Networks (KANs), overcoming the representational limitations of conventional linear decoders when handling complex multimodal inputs. This approach marks the first application of KANs to 3D reconstruction and achieves state-of-the-art performance on Pix3D, significantly enhancing geometric completeness, surface smoothness, and fine-detail preservation.
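As a rough illustration of what a KAN-style layer computes, the sketch below gives every input-to-output edge its own learnable univariate function and sums those functions at each output. It is not the paper's hybrid decoder, whose architecture this summary does not describe. The class name `KANLayer`, the use of Gaussian radial basis functions in place of the B-splines used in the original KAN formulation, and all sizes are assumptions chosen for brevity.

```python
import numpy as np

class KANLayer:
    """Hypothetical forward-only KAN-style layer (illustrative, not the paper's decoder).

    Each edge (input i -> output j) carries a univariate function
    phi_ij(x) = sum_g coef[i, j, g] * basis_g(x), and output j is sum_i phi_ij(x_i).
    Gaussian RBFs stand in for B-splines here (an assumption).
    """

    def __init__(self, in_dim, out_dim, num_basis=8, x_min=-1.0, x_max=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed grid of basis centers shared by all edges; requires num_basis >= 2.
        self.centers = np.linspace(x_min, x_max, num_basis)
        self.width = (x_max - x_min) / (num_basis - 1)
        # One coefficient vector per edge: shape (in_dim, out_dim, num_basis).
        self.coef = rng.normal(scale=0.1, size=(in_dim, out_dim, num_basis))

    def __call__(self, x):
        # x: (batch, in_dim) -> basis activations: (batch, in_dim, num_basis)
        basis = np.exp(-(((x[..., None] - self.centers) / self.width) ** 2))
        # Evaluate every edge function and sum over inputs: (batch, out_dim)
        return np.einsum("big,iog->bo", basis, self.coef)


layer = KANLayer(in_dim=4, out_dim=3)
features = np.random.default_rng(1).uniform(-1.0, 1.0, size=(5, 4))
out = layer(features)  # shape (5, 3)
```

Unlike a linear layer, where each edge is a single scalar weight, each edge here is a small learnable curve, which is the property the summary credits for handling complex multimodal inputs.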

📝 Abstract

Single-view 3D reconstruction in complex real-world scenes is challenging due to noise, object diversity, and limited dataset availability. To address these challenges, we propose MGP-KAD, a novel multimodal feature fusion framework that integrates RGB features with geometric priors to enhance reconstruction accuracy. The geometric priors are generated by sampling and clustering ground-truth object data, producing class-level features that are dynamically adjusted during training to improve geometric understanding. Additionally, we introduce a hybrid decoder based on Kolmogorov-Arnold Networks (KANs) to overcome the limitations of traditional linear decoders in processing complex multimodal inputs. Extensive experiments on the Pix3D dataset demonstrate that MGP-KAD achieves state-of-the-art (SOTA) performance, significantly improving geometric integrity, smoothness, and detail preservation. Our work provides a robust and effective solution for advancing single-view 3D reconstruction in complex scenes.
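The "sampling and clustering" step can be pictured with the minimal sketch below, which pools points sampled from the ground-truth shapes of one category and clusters them into a fixed-size set of prototype points. The abstract does not name the clustering algorithm or the shape representation, so k-means over 3D surface points, the function names `kmeans` and `class_level_prior`, and all sizes are assumptions. The dynamic adjustment of the prior during training is not modeled here.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed algorithm); returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every point to every centroid: (n, k)
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def class_level_prior(shapes, samples_per_shape=256, k=32, seed=0):
    """Hypothetical helper: shapes is a list of (n_i, 3) ground-truth point
    arrays for one category; returns a (k, 3) array of prototype points."""
    rng = np.random.default_rng(seed)
    sampled = []
    for s in shapes:
        # Sample with replacement only when a shape has too few points.
        idx = rng.choice(len(s), size=samples_per_shape,
                         replace=len(s) < samples_per_shape)
        sampled.append(s[idx])
    pooled = np.concatenate(sampled, axis=0).astype(float)
    return kmeans(pooled, k, seed=seed)


# Toy stand-in for ground-truth shapes of one category (random points, not Pix3D data).
toy_rng = np.random.default_rng(42)
toy_shapes = [toy_rng.uniform(-1.0, 1.0, size=(500, 3)) for _ in range(4)]
prior = class_level_prior(toy_shapes, samples_per_shape=128, k=16)  # shape (16, 3)
```

In a full pipeline, a prior like this would be encoded and fused with RGB features before decoding; that fusion step is outside the scope of this sketch.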

Problem

Research questions and friction points this paper is trying to address.

single-view 3D reconstruction

complex scenes

geometric prior

multimodal fusion

object diversity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal fusion

Geometric priors

Kolmogorov-Arnold Networks

Single-view 3D reconstruction

Dynamic feature adjustment
