PlaneCycle: Training-Free 2D-to-3D Lifting of Foundation Models Without Adapters

📅 2026-03-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of extending pretrained 2D foundation models to 3D volumetric tasks without retraining, architectural modifications, or additional parameters. The authors propose PlaneCycle, an operator that cycles spatial aggregation through the three orthogonal planes (HW, DW, and DH) across network depth, so that successive layers process 2D slices taken along different axes and 3D features are fused progressively. The approach is training-free, parameter-free, and architecture-agnostic, and it preserves the pretrained model's inductive biases. In experiments with DINOv3 on six 3D classification and three 3D segmentation benchmarks, PlaneCycle under linear probing outperforms both slice-wise 2D baselines and strong 3D models and approaches the performance of fully trained models; with full fine-tuning, it matches standard 3D architectures.

📝 Abstract

Large-scale 2D foundation models exhibit strong transferable representations, yet extending them to 3D volumetric data typically requires retraining, adapters, or architectural redesign. We introduce PlaneCycle, a training-free, adapter-free operator for architecture-agnostic 2D-to-3D lifting of foundation models. PlaneCycle reuses the original pretrained 2D backbone by cyclically distributing spatial aggregation across orthogonal HW, DW, and DH planes throughout network depth, enabling progressive 3D fusion while preserving pretrained inductive biases. The method introduces no additional parameters and is applicable to arbitrary 2D networks. Using pretrained DINOv3 models, we evaluate PlaneCycle on six 3D classification and three 3D segmentation benchmarks. Without any training, the lifted models exhibit intrinsic 3D fusion capability and, under linear probing, outperform slice-wise 2D baselines and strong 3D counterparts, approaching the performance of fully trained models. With full fine-tuning, PlaneCycle matches standard 3D architectures, highlighting its potential as a seamless and practical 2D-to-3D lifting operator. These results demonstrate that 3D capability can be unlocked from pretrained 2D foundation models without structural modification or retraining. Code is available at https://github.com/HINTLab/PlaneCycle.
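
The plane-cycling idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that); it only shows one plausible way to route a volume through 2D blocks plane by plane. The names `apply_on_plane`, `plane_cycle_forward`, and `mean_mix` are hypothetical, the `(C, D, H, W)` layout is an assumption, and `mean_mix` is a parameter-free stand-in for a pretrained 2D block, not a DINOv3 layer.

```python
import numpy as np

# Assumed volume layout: (C, D, H, W).
# For each plane, the axis that indexes its slices is folded into the batch:
# HW slices are indexed by D, DW slices by H, DH slices by W.
PLANE_TO_SLICE_AXIS = {"HW": 1, "DW": 2, "DH": 3}
PLANE_CYCLE = ("HW", "DW", "DH")


def apply_on_plane(block, volume, plane):
    """Apply a 2D block to every slice of `volume` that lies in `plane`.

    `block` maps a batch of 2D feature maps (N, C, A, B) to an array of the
    same shape; it never sees the third spatial axis.
    """
    axis = PLANE_TO_SLICE_AXIS[plane]
    slices = np.moveaxis(volume, axis, 0)   # (N, C, A, B), N = number of slices
    out = block(slices)
    return np.moveaxis(out, 0, axis)        # back to (C, D, H, W)


def plane_cycle_forward(blocks, volume):
    """Run the 2D blocks in order, cycling HW -> DW -> DH with depth."""
    for depth, block in enumerate(blocks):
        plane = PLANE_CYCLE[depth % len(PLANE_CYCLE)]
        volume = apply_on_plane(block, volume, plane)
    return volume


def mean_mix(x):
    """Hypothetical stand-in for a pretrained 2D block.

    Blends each position with its in-plane mean, i.e. a crude form of
    global spatial aggregation restricted to the two in-plane axes.
    """
    return 0.5 * x + 0.5 * x.mean(axis=(2, 3), keepdims=True)


if __name__ == "__main__":
    vol = np.zeros((1, 4, 4, 4))
    vol[0, 1, 2, 3] = 1.0                    # single impulse at d=1, h=2, w=3
    one_layer = plane_cycle_forward([mean_mix], vol)
    full_cycle = plane_cycle_forward([mean_mix] * 3, vol)
    print("voxels reached after 1 block:", int((one_layer > 0).sum()))
    print("voxels reached after 3 blocks:", int((full_cycle > 0).sum()))
```

In this toy setting, a single block spreads the impulse only within its own HW slice, while a full HW, DW, DH cycle lets it reach every voxel. That is the sense in which purely 2D operations can produce progressive 3D fusion without adding parameters. A real backbone would also need to handle details this sketch omits, such as patch tokenization and positional embeddings.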

Problem

Research questions and friction points this paper is trying to address.

2D-to-3D lifting

foundation models

training-free

3D representation

architecture-agnostic

Innovation

Methods, ideas, or system contributions that make the work stand out.

training-free

2D-to-3D lifting

foundation models