Yi Liu (刘熠)

Google Scholar ID: gGPehK4AAAAJ
Honor Device Co., Ltd
Deep Learning · Video Understanding
Citations & Impact
All-time
  • Citations: 1,380
  • h-index: 7
  • i10-index: 6
  • Publications: 14
  • Co-authors: 9
Academic Achievements
  • Publications:
    - MagicVL-2B: Empowering Vision-Language Models on Mobile Devices with Lightweight Visual Encoders via Curriculum Learning, arXiv, 2025 (AAAI 2026 under review, 1st author)
    - MagicGen: A Universal Multimodal Data Synthesis Agent for Domain-Specific Vision-Language Model Tuning, arXiv, 2025 (in progress, 1st corresponding author)
    - E-VRAG: Enhancing Long Video Understanding with Resource-Efficient Retrieval Augmented Generation, arXiv, 2025 (AAAI 2026 under review, 1st corresponding author)
    - VideoCap-R1: Enhancing MLLMs for Video Captioning via Structured Thinking, arXiv, 2025 (NeurIPS 2025 under review, 2nd corresponding author)
    - LvBench: A Benchmark for Long-form Video Understanding with Versatile Multi-modal Question Answering, International Journal of Computer Vision (IJCV), 2025 (CAS Tier 1, IF=9.3, co-first author, 3rd position)
    - MLLM-TA: Leveraging Multimodal Large Language Models for Precise Temporal Video Grounding, IEEE Signal Processing Letters (SPL), 2024 (CAS Tier 2, IF=3.9, 1st author)
    - MVBench: A Comprehensive Multi-modal Video Understanding Benchmark, Computer Vision and Pattern Recognition (CVPR), 2024 (CCF-A, 6th author)
    - F2S-Net: Learning Frame-To-Segment Prediction for Online Action Detection, Journal of Real-Time Image Processing (JRTIP), 2024 (CAS Tier 3, IF=3.0, 1st author)
    - Dual masked modeling for weakly-supervised temporal boundary discovery, IEEE Transactions on Multimedia (TMM), 2023 (CAS Tier 1, IF=9.7, co-first author, 2nd position)
    - Learning Discriminative Feature Representation for Open Set Action Recognition, ACM International Conference on Multimedia (ACM MM), 2023 (CCF-A, co-first author, 2nd position)
    - InternVideo: General Video Foundation Models via Generative and Discriminative Learning, arXiv, 2022 (SCIS under review, 9th author)
    - FineAction: A Fine-Grained Video Dataset for Temporal Action Localization, IEEE Transactions on Image Processing (TIP), 2022 (CAS Tier 1, IF=13.7, 1st author)
    - VideoPipe 2022 Challenge: Real-World Video Understanding for Urban Pipe Inspection, International Conference on Pattern Recognition (ICPR), 2022 (CCF-C, 1st author)
    - Short video scene online start detection task and method research, Integrated Technology, 2021 (co-first author, 2nd position)
  • Awards:
    - 1st Prize, ECCV 2022 Ego4D Episodic Memory Challenge, Moments Queries Track
    - 1st Prize, ECCV 2022 Ego4D Episodic Memory Challenge, Looking At Me Track
Research Experience
  • Research intern, Shanghai AI Laboratory, 2022–2023
Education
  • Ph.D.: University of Chinese Academy of Sciences (UCAS), MMLab@SIAT, supervised by Prof. Yu Qiao and Prof. Yali Wang, 2024
  • B.Eng.: Huazhong University of Science and Technology (HUST), Wuhan, China, 2019
Background
  • Research Interests: Vision-Language Models, Video Understanding
  • About Me: I currently work at Honor Device Co., Ltd as the project leader (PL) of the On-device VLM Group, focusing on vision-language models and video understanding.
Miscellany
  • Workshop organizer:
    - ECCV 2022 DeeperAction Challenge, Track 1: Temporal Action Localization
    - ICPR 2022 VideoPipe Challenge, Track 2: Temporal Defect Localization
    - ICCV 2021 DeeperAction Challenge, Track 1: Temporal Action Localization