MLP Splatting: Object-Centric Neural Fields

📅 2026-06-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing 3D representations struggle to deliver high-quality novel-view synthesis and object-level editability at the same time. This work proposes MLP-Splatting, which uses compact, independent MLPs with localized spatial support as neural primitives that predict radiance and opacity, and renders them through efficient sparse volumetric compositing over ray-primitive interactions. Trained with RGB supervision alone and without segmentation masks, the primitives come to represent local scene regions that often correspond to objects or object parts, so a scene can be edited interactively by selecting a handful of primitives. With optional semantic feature distillation, the method also supports open-vocabulary scene interaction and open-set instant segmentation. Compared to semantic 3D Gaussian Splatting methods, the authors report roughly 1/15 of the memory usage and about 3× faster rendering, while preserving photorealistic synthesis.
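The rendering idea summarized above (small local MLPs that predict radiance and opacity, composited along each ray) can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the class `LocalMLPPrimitive`, the spherical support, the random untrained weights, the density-weighted color mixing, and the uniform ray sampling are all assumptions chosen to keep the example short.

```python
import numpy as np


class LocalMLPPrimitive:
    """Hypothetical primitive: a tiny MLP that is active only inside a sphere."""

    def __init__(self, center, radius, rng, hidden=16):
        self.center = np.asarray(center, dtype=np.float64)
        self.radius = float(radius)
        # Random, untrained weights; input is local position (3) + view direction (3).
        self.w1 = rng.normal(0.0, 0.5, size=(6, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.5, size=(hidden, 4))
        self.b2 = np.zeros(4)

    def hit_by(self, origin, direction):
        """Ray-sphere test used to skip primitives the ray never touches."""
        oc = self.center - origin
        t_closest = oc @ direction
        dist_sq = oc @ oc - t_closest ** 2
        if dist_sq > self.radius ** 2:
            return False
        return t_closest + np.sqrt(self.radius ** 2 - dist_sq) >= 0.0

    def query(self, points, direction):
        """Return (rgb, sigma) per point; both are zero outside the support."""
        local = (points - self.center) / self.radius
        inside = np.linalg.norm(local, axis=1) <= 1.0
        rgb = np.zeros((len(points), 3))
        sigma = np.zeros(len(points))
        if inside.any():
            n_inside = int(inside.sum())
            dirs = np.broadcast_to(direction, (n_inside, 3))
            x = np.concatenate([local[inside], dirs], axis=1)
            hidden = np.maximum(x @ self.w1 + self.b1, 0.0)  # ReLU
            out = hidden @ self.w2 + self.b2
            rgb[inside] = 1.0 / (1.0 + np.exp(-out[:, :3]))  # sigmoid
            sigma[inside] = np.logaddexp(0.0, out[:, 3])     # softplus
        return rgb, sigma


def render_ray(origin, direction, primitives, near=0.0, far=4.0, n_samples=64):
    """Composite the primitives hit by one ray; returns (color, opacity)."""
    origin = np.asarray(origin, dtype=np.float64)
    direction = np.asarray(direction, dtype=np.float64)
    direction = direction / np.linalg.norm(direction)
    t = np.linspace(near, far, n_samples)
    delta = (far - near) / (n_samples - 1)
    points = origin + t[:, None] * direction

    sigma_total = np.zeros(n_samples)
    rgb_weighted = np.zeros((n_samples, 3))
    for prim in primitives:
        if not prim.hit_by(origin, direction):
            continue  # sparse: untouched primitives cost nothing
        rgb, sigma = prim.query(points, direction)
        sigma_total += sigma
        rgb_weighted += sigma[:, None] * rgb

    # Density-weighted color where primitives overlap (an assumption).
    rgb_mix = rgb_weighted / np.maximum(sigma_total, 1e-8)[:, None]
    alpha = 1.0 - np.exp(-sigma_total * delta)
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = transmittance * alpha
    return (weights[:, None] * rgb_mix).sum(axis=0), weights.sum()


rng = np.random.default_rng(0)
prims = [
    LocalMLPPrimitive([0.0, 0.0, 2.0], 0.5, rng),
    LocalMLPPrimitive([1.5, 0.0, 2.0], 0.3, rng),
]
color, opacity = render_ray(np.zeros(3), [0.0, 0.0, 1.0], prims)
print(color, opacity)
```

Because each primitive is a separate object with its own weights and support, removing or moving an entry in `prims` changes only the region that primitive covers, which is the property that makes primitive-level editing plausible in this kind of representation.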

📝 Abstract

3D representations are fundamental to scene rendering, understanding, and interaction. Recent approaches, such as 3D Gaussian Splatting and Neural Radiance Fields, achieve impressive photorealistic novel-view synthesis, but cannot easily decompose a scene into a few primitives, so object-level manipulation requires additional segmentation or grouping. We present MLP-Splatting, a method that enables scene decomposition via a few expressive light-field primitives while providing photorealistic novel-view synthesis. MLP-Splatting models each primitive as an independent compact MLP with localized spatial support that predicts radiance and opacity. In contrast to low-level Gaussian primitives or a single global radiance field, our neural primitives provide greater expressive capacity while remaining spatially localized. Rendering is performed through efficient sparse volumetric compositing over ray-primitive interactions. Our primitives are trained with RGB supervision alone, which yields primitives that represent local scene regions often corresponding to objects or object parts, enabling interactive object-level editing without segmentation masks by selecting a handful of primitives. Augmented with optional semantic feature distillation, our method enables open-vocabulary scene interaction and open-set instant segmentation. Compared to state-of-the-art semantic 3DGS methods, our experiments show substantially lower memory usage (1/15×) and faster rendering (3×). Project Page: https://shinjeongkim.com/mlp-splatting
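The abstract does not spell out how open-vocabulary interaction picks primitives. One common pattern, shown here purely as an illustrative assumption, is to store one distilled feature vector per primitive and select the primitives whose feature has high cosine similarity to a query embedding. The function `select_primitives`, the threshold, and the hand-written 3-D vectors are hypothetical; in practice the query embedding would come from some text encoder, which is not modeled here.

```python
import numpy as np


def select_primitives(primitive_features, query_embedding, threshold=0.8):
    """Return indices of primitives whose feature matches the query.

    primitive_features: (P, D) array, one assumed distilled feature per primitive.
    query_embedding:    (D,) array, assumed to live in the same feature space.
    """
    feats = np.asarray(primitive_features, dtype=np.float64)
    query = np.asarray(query_embedding, dtype=np.float64)
    # Normalize so the dot product is a cosine similarity.
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    query = query / np.linalg.norm(query)
    scores = feats @ query
    return np.flatnonzero(scores >= threshold), scores


# Toy features for three primitives; the first two point in similar directions.
features = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
selected, scores = select_primitives(features, [1.0, 0.0, 0.0])
print(selected, scores)
```

With only a few primitives per scene, a selection like this is a single small matrix-vector product, which is consistent with the abstract's framing of editing as "selecting a handful of primitives".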

Problem

Research questions and friction points this paper is trying to address.

scene decomposition

object-centric representation

3D scene representation

novel-view synthesis

interactive editing

Innovation

Methods, ideas, or system contributions that make the work stand out.

MLP-Splatting

object-centric representation

neural primitives