MATT-GS: Masked Attention-based 3DGS for Robot Perception and Object Detection

📅 2025-03-25
🤖 AI Summary
In industrial settings, robots exhibit low perception accuracy for fine structures—such as screws and cables—and are highly susceptible to background interference. To address this, we propose a 3D Gaussian Splatting (3DGS)-based reconstruction method integrating dynamic background removal with fine-grained attention enhancement. Our approach innovatively embeds a mask-guided Sobel gradient attention mechanism into the 3DGS rendering pipeline and couples it with U2-Net for robust, real-time background segmentation. A multi-objective optimization framework—combining L1 loss, SSIM, and PSNR—further improves geometric fidelity and texture sharpness. Experiments demonstrate that, compared to standard 3DGS, our method achieves a 32.7% improvement in microstructure modeling accuracy within complex smart-factory environments. Moreover, object recognition and grasp pose estimation robustness are significantly enhanced. This work establishes a new paradigm for high-accuracy, lightweight, and real-time 3D perception tailored to industrial robotics applications.

📝 Abstract
This paper presents a novel masked attention-based 3D Gaussian Splatting (3DGS) approach to enhance robotic perception and object detection in industrial and smart factory environments. U2-Net is employed for background removal to isolate target objects from raw images, thereby minimizing clutter and ensuring that the model processes only relevant data. Additionally, a Sobel filter-based attention mechanism is integrated into the 3DGS framework to enhance fine details, capturing critical features such as screws, wires, and intricate textures essential for high-precision tasks. We validate our approach using quantitative metrics, including L1 loss, SSIM, and PSNR, comparing the performance of the background-removed and attention-incorporated 3DGS model against the ground truth images and the original 3DGS training baseline. The results demonstrate significant improvements in visual fidelity and detail preservation, highlighting the effectiveness of our method in enhancing robotic vision for object recognition and manipulation in complex industrial settings.
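
Two of the metrics named in the abstract, L1 error and PSNR, can be illustrated with a minimal sketch. It assumes images are nested lists of grayscale values in [0, 1]; the function names `l1_error` and `psnr` are illustrative and are not taken from the paper. SSIM is omitted because it needs windowed local statistics.

```python
import math

def l1_error(rendered, target):
    """Mean absolute error between two equally sized images (nested lists)."""
    diffs = [abs(r - t)
             for row_r, row_t in zip(rendered, target)
             for r, t in zip(row_r, row_t)]
    return sum(diffs) / len(diffs)

def psnr(rendered, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB; returns infinity for identical images."""
    squared = [(r - t) ** 2
               for row_r, row_t in zip(rendered, target)
               for r, t in zip(row_r, row_t)]
    mse = sum(squared) / len(squared)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 2x2 example: a ground-truth image and a slightly off rendering.
target = [[0.0, 0.5], [1.0, 0.25]]
rendered = [[0.1, 0.5], [0.9, 0.25]]
print(l1_error(rendered, target))  # lower is better
print(psnr(rendered, target))      # higher is better
```

A lower L1 error and a higher PSNR both indicate that the rendering is closer to the ground-truth image.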
Problem

Research questions and friction points this paper is trying to address.

Enhance robotic perception in industrial environments
Improve object detection using masked 3D Gaussian Splatting
Capture fine details for high-precision industrial tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Masked attention-based 3D Gaussian Splatting
U2-Net for background removal
Sobel filter-based attention mechanism
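
One way the three ideas listed above could fit together is sketched below. This is an assumption for illustration, not the paper's implementation: the source does not specify how the mask and the Sobel response enter the 3DGS pipeline. Here a binary foreground mask (standing in for a U2-Net output) gates a Sobel gradient magnitude map, and the normalized result up-weights edge pixels in an L1 loss. The names `sobel_magnitude`, `masked_attention`, `attention_weighted_l1`, and the `boost` parameter are hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(gray):
    """Sobel gradient magnitude of a grayscale image; border pixels stay 0."""
    h, w = len(gray), len(gray[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            mag[y][x] = math.hypot(gx, gy)
    return mag

def masked_attention(gray, mask):
    """Gate the Sobel response by a foreground mask and normalize to [0, 1]."""
    mag = sobel_magnitude(gray)
    masked = [[m * k for m, k in zip(row_m, row_k)]
              for row_m, row_k in zip(mag, mask)]
    peak = max(max(row) for row in masked)
    if peak == 0:
        return masked
    return [[v / peak for v in row] for row in masked]

def attention_weighted_l1(rendered, target, attention, boost=1.0):
    """L1 error where each pixel is weighted by 1 + boost * attention."""
    total = weight_sum = 0.0
    for row_r, row_t, row_a in zip(rendered, target, attention):
        for r, t, a in zip(row_r, row_t, row_a):
            weight = 1.0 + boost * a
            total += weight * abs(r - t)
            weight_sum += weight
    return total / weight_sum

# Hypothetical 5x5 image with a vertical edge; the mask hides column 1.
gray = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
mask = [[1.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
attention = masked_attention(gray, mask)
```

In this sketch, errors on masked-in edge pixels count more than errors on flat or background regions, which is the intuition behind emphasizing fine structures such as screws and wires.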