Resume (English only)
Academic Achievements
- 2025: 'Efficient and Workload-Aware LLM Serving via Runtime Layer Swapping and KV Cache Resizing', Authors: Zhaoyuan Su, Tingfeng Lan, Zirui Wang, Juncheng Yang, Yue Cheng
- 2025: 'Towards Efficient LLM Storage Reduction via Tensor Deduplication and Delta Compression', Authors: Zirui Wang, Tingfeng Lan, Zhaoyuan Su, Juncheng Yang, Yue Cheng
- 2024: 'Everything You Always Wanted to Know About Storage Compressibility of Pre-Trained ML Models but Were Afraid to Ask', Authors: Zhaoyuan Su, Ammar Ahmed, Zirui Wang, Ali Anwar, Yue Cheng
- 2021: 'Temporal Cue Guided Video Highlight Detection with Low-Rank Audio-Visual Fusion', Authors: Qinghao Ye*, Xiyue Shen*, Yuan Gao*, Zirui Wang*, Qi Bi, Ping Li, Guang Yang
Research Experience
- Member of the DS2 Lab, focusing on serverless computing, serverless AI, and storage systems for AI.
Education
- 2024 - Present: PhD in Computer Science at the University of Virginia, supervised by Prof. Yue Cheng
- 2022 - 2024: MS in Computer Science at Boston University
- 2018 - 2022: BS in Computer Science at Hangzhou Dianzi University
Background
- Research interests: machine learning systems, cache systems, and distributed systems
- Field: Computer Science
- Brief introduction: dedicated to building efficient and scalable systems for next-generation data-intensive applications, with a focus on challenges in real-world storage systems.