MP-Mat: A 3D-and-Instance-Aware Human Matting and Editing Framework with Multiplane Representation

📅 2025-04-20
🤖 AI Summary
This work addresses multi-instance human matting in complex scenes, where fine structures such as hair, thin boundaries, and occlusion between people at different depths make pixel ownership ambiguous. It proposes MP-Mat, a framework that estimates an alpha matte per instance and reuses the same representation for instance-level image editing. The method builds multiplane representations from two perspectives. At the scene-geometry level, feature-level planes split the scene by depth, making the representation 3D-aware and helping separate instances at different 3D positions. At the instance level, each instance is represented with both matte and color, and the background is treated as a special instance, which the authors note existing methods often overlook. The authors report a clear advantage on matting, and state that zero-shot editing with MP-Mat outperforms trained, specialized image editing techniques by large margins.

📝 Abstract
Human instance matting aims to estimate an alpha matte for each human instance in an image. It is challenging because it easily fails in complex cases that require disentangling mingled pixels belonging to multiple instances along hairy and thin boundary structures. In this work, we address this by introducing MP-Mat, a novel 3D-and-instance-aware matting framework with multiplane representation, where the multiplane concept is designed from two different perspectives: the scene geometry level and the instance level. Specifically, we first build feature-level multiplane representations to split the scene into multiple planes based on depth differences. This approach makes the scene representation 3D-aware and can serve as an effective cue for splitting instances at different 3D positions, thereby improving interpretability and boundary handling, especially in occlusion areas. Then, we introduce another multiplane representation that splits the scene from an instance-level perspective and represents each instance with both matte and color. We also treat the background as a special instance, which is often overlooked by existing methods. Such an instance-level representation facilitates both foreground and background content awareness, and is useful for other downstream tasks like image editing. Once built, the representation can be reused to realize controllable instance-level image editing with high efficiency. Extensive experiments validate the clear advantage of MP-Mat on the matting task. We also demonstrate its superiority in image editing tasks, an area under-explored by existing matting-focused methods, where our approach under zero-shot inference even outperforms trained, specialized image editing techniques by large margins. Code is open-sourced at https://github.com/JiaoSiyi/MPMat.git.
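To make the instance-level idea concrete, here is a minimal sketch of how a set of per-instance matte and color layers, with the background held as one more instance, could be composited and then reused for simple edits. It is not the MP-Mat implementation: the function names (`composite`, `remove_instance`, `replace_background`), the convention that layer 0 is the background, and the assumption that mattes sum to 1 at every pixel are all hypothetical choices made for illustration.

```python
import numpy as np

def composite(mattes, colors):
    """Blend instance layers as sum_i alpha_i * color_i.

    mattes: (N, H, W) alpha values in [0, 1]; assumed to sum to 1 per pixel.
    colors: (N, H, W, 3) RGB color for each instance layer.
    """
    return (mattes[..., None] * colors).sum(axis=0)

def remove_instance(mattes, index, background_index=0):
    """Delete one instance by handing its alpha to the background layer."""
    edited = mattes.copy()
    edited[background_index] += edited[index]
    edited[index] = 0.0
    return edited

def replace_background(colors, new_background, background_index=0):
    """Swap the background color layer while keeping all mattes untouched."""
    edited = colors.copy()
    edited[background_index] = new_background
    return edited

# Toy 1x2 scene: layer 0 is the background, layer 1 is one person.
mattes = np.array([
    [[0.2, 1.0]],  # background alpha
    [[0.8, 0.0]],  # person alpha (soft boundary at the first pixel)
])
colors = np.zeros((2, 1, 2, 3))
colors[0] = [0.0, 0.0, 1.0]  # blue background
colors[1] = [1.0, 0.0, 0.0]  # red person

image = composite(mattes, colors)
```

Because every edit only touches the stored layers and then calls `composite` again, no new inference pass is needed in this toy setting, which is the intuition behind reusing the representation once it has been built.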
Problem

Research questions and friction points this paper is trying to address.

Estimating alpha matte for human instances in images
Handling complex cases with multiple instances and occlusions
Enabling instance-level image editing with multiplane representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

3D-aware multiplane representation for scene geometry
Instance-level multiplane representation for matting
Reusable representation for efficient image editing
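The scene-geometry idea can be sketched as follows: given a depth map, assign each pixel to one of several depth planes and mask a feature map accordingly. This is only an illustration under stated assumptions. The abstract does not say how planes are chosen, so the uniform depth bins, the hard one-hot assignment, and the names `depth_plane_masks` and `split_features` are hypothetical.

```python
import numpy as np

def depth_plane_masks(depth, num_planes):
    """Assign each pixel to one of num_planes uniform depth bins.

    depth: (H, W) depth map. Returns one-hot masks of shape (K, H, W).
    Uniform binning is an assumption, not the paper's stated rule.
    """
    edges = np.linspace(depth.min(), depth.max(), num_planes + 1)
    # Using only the interior edges yields bin indices 0 .. num_planes - 1.
    plane_index = np.digitize(depth, edges[1:-1])
    masks = np.stack([plane_index == k for k in range(num_planes)])
    return masks.astype(np.float32)

def split_features(features, masks):
    """Mask a (H, W, C) feature map per plane, giving (K, H, W, C)."""
    return masks[..., None] * features[None]

# Toy 2x2 scene: the top row is near the camera, the bottom row is far.
depth = np.array([[0.0, 1.0],
                  [2.0, 4.0]])
features = np.ones((2, 2, 3), dtype=np.float32)

masks = depth_plane_masks(depth, num_planes=2)
planes = split_features(features, masks)
```

With hard masks, each pixel's features land in exactly one plane, so summing the planes recovers the original feature map; instances at different depths end up on different planes, which is what makes them easier to separate.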
Authors
Siyi Jiao
Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, China
Wenzheng Zeng
National University of Singapore
Computer Vision
Yerong Li
Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, China
Huayu Zhang
Senior Engineer, Huawei Technologies Co., Ltd
Distributed System, Network Science, Machine Learning, Optimization, Graph Theory
Changxin Gao
Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, China
Nong Sang
Huazhong University of Science and Technology
Computer Vision and Pattern Recognition
Mike Zheng Shou
Show Lab, National University of Singapore