PhD Thesis: 'An Empirical Study of Autoregressive Pre-training from Videos' (May 2025)
'Gaussian Masked Autoencoders': Introduced 3D Gaussians as intermediate representations for zero-shot segmentation and related tasks
'Scaling Properties of Diffusion Models for Perceptual Tasks' (CVPR 2025): Studied scaling of diffusion models for depth estimation, optical flow, and amodal segmentation
'Humanoid Locomotion as Next Token Prediction' (NeurIPS 2024): Framed real-world humanoid control as next-token prediction
'Humans in 4D: Reconstructing and Tracking Humans with Transformers' (ICCV 2023): Fully transformer-based human mesh recovery system
'On the Benefits of 3D Pose and Tracking for Human Action Recognition': Investigated the role of 3D pose in action recognition