- Orthogonal Projection Subspace to Aggregate Online Prior-knowledge for Continual Test-time Adaptation
- Feature Compression for Vision-Language Models: Query-Driven Encoding in Bandwidth-Constrained Environments
- When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization
Research Experience
Worked as a computer vision scientist at MeiTuan, collaborating closely with Dr. Lin Ma and Dr. Zequn Jie, focusing on 2D/3D/4D Label-Efficient Detection and Segmentation.
Education
Obtained a dual B.Sc. (primary major: Applied Physics) and an M.Sc. (Computer Science) from Shenzhen University. Currently pursuing a PhD in the Multimedia and Human Understanding Group (MHUG) at the University of Trento, supervised by Prof. Nicu Sebe.
Background
Research interests include Computer Vision and Deep Learning, particularly in Label-Efficient Learning, Multi-Modal Perception and Reasoning, Spatial Foundation Models, and Generative Models. Currently a PhD student at the Multimedia and Human Understanding Group (MHUG), supervised by Prof. Nicu Sebe.
Miscellany
Personal motto: Be a long-termist; take on the tough yet right things.