Object Agnostic 3D Lifting in Space and Time

📅 2024-12-02
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses category-agnostic 2D-to-3D pose lifting: reconstructing high-accuracy, temporally consistent 3D poses solely from time-series 2D keypoint sequences across diverse animal categories, without category-specific priors. We propose the first unified framework jointly modeling category invariance and spatiotemporal dependencies—leveraging cross-category representation transfer for generalization across anatomically similar animals, and enforcing local temporal window consistency alongside a spatiotemporal graph neural network to capture long-range dynamics. Training is fully synthetic; we introduce and publicly release the first large-scale synthetic dataset featuring realistic 3D skeletal articulations and motion sequences for multiple animal species (e.g., cats, dogs, horses). Our method achieves state-of-the-art performance across all evaluated animal categories, simultaneously attaining superior single-frame accuracy and video-level temporal smoothness.

Technology Category

Application Category

📝 Abstract
We present a spatio-temporal perspective on category-agnostic 3D lifting of 2D keypoints over a temporal sequence. Our approach differs from existing state-of-the-art methods that are either: (i) object-agnostic, but can only operate on individual frames, or (ii) can model space-time dependencies, but are only designed to work with a single object category. Our approach is grounded in two core principles. First, general information about similar objects can be leveraged to achieve better performance when there is little object-specific training data. Second, a temporally-proximate context window is advantageous for achieving consistency throughout a sequence. These two principles allow us to outperform current state-of-the-art methods on per-frame and per-sequence metrics for a variety of animal categories. Lastly, we release a new synthetic dataset containing 3D skeletons and motion sequences for a variety of animal categories.
Problem

Research questions and friction points this paper is trying to address.

3D lifting of 2D keypoints
temporal sequence analysis
object-agnostic approach
Innovation

Methods, ideas, or system contributions that make the work stand out.

Category-agnostic 3D lifting
Temporal context window
Synthetic dataset release
🔎 Similar Papers
No similar papers found.