Scholar
Chaoyou Fu
Google Scholar ID: 4A1xYQwAAAAJ
Nanjing University
Multimodal LLM
LLM
Biometrics
Citations & Impact
All-time
Citations
4,869
H-index
27
i10-index
35
Publications
20
Co-authors
5
list available
Publications
35 items
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding
2026
Cited
0
Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?
2026
Cited
0
Benchmarking PhD-Level Coding in 3D Geometric Computer Vision
2026
Cited
0
VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding
2026
Cited
0
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion
2026
Cited
0
MAC: A Conversion Rate Prediction Benchmark Featuring Labels Under Multiple Attribution Mechanisms
2026
Cited
0
BABE: Biology Arena BEnchmark
2026
Cited
0
MME-CC: A Challenging Multi-Modal Evaluation Benchmark of Cognitive Capacity
2025
Cited
0
Resume (English only)
Academic Achievements
Pioneered the VITA series of multimodal LLMs (VITA-1.0/-1.5, Long-VITA, VITA-Audio, VITA-VLA, VITA-E)
Developed the MME benchmark series (MME, Video-MME, MME-RealWorld) for multimodal LLM evaluation
Founded the Awesome-MLLM community
Serves as Associate Editor for Pattern Recognition and Area Chair for ICLR
Member of CSIG Youth Committee and Executive Committee of CCF-AI & CCF-CV
Awards: CAS President’s Special Award; IEEE Biometrics Council Best Doctoral Dissertation Award; WAIC Yunfan Award; Xiaomi Young Scholar - Technology Innovation Award; Beijing Outstanding PhD Dissertation; CAS Outstanding PhD Dissertation; CVPR 2023 Outstanding Reviewer
Published in top venues including NeurIPS, CVPR, TPAMI, and National Science Review
Open-source projects are widely recognized (e.g., VITA-1.5 with 2k+ GitHub Stars, MLLM Survey with 10k+ GitHub Stars)
Background
Researcher, Assistant Professor, and PhD Supervisor at the School of Intelligent Science and Technology, Nanjing University
Selected for the China Association for Science and Technology's 'Young Talent Support Program'
Leading the Multimodal Intelligence Group (NJU-MiG) at Nanjing University
Research focuses on Multimodal Large Language Models (Multimodal LLMs) and Large Language Models (LLMs)
Over 5,600 citations on Google Scholar, with a single first-author paper exceeding 1,000 citations
Open-source projects have accumulated over 20,000 GitHub Stars
Co-authors
5 total
Xing Sun
Tencent Youtu Lab
Co-author 2
Caifeng Shan
Philips Research
Tieniu Tan
Institute of Automation, Chinese Academy of Sciences
Liang Wang
National Lab of Pattern Recognition