Browse publications on Google Scholar ↗
Resume (English only)
Academic Achievements
1. Efficient LLM Serving on Hybrid Real-time and Best-effort Requests (2025).
2. OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training (2025).
3. CDMPP: A Device-Model Agnostic Framework for Latency Prediction of Tensor Programs (2024).
4. LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization (2024).
5. POSTER: LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization (2024).
Research Experience
Open to full-time roles, especially in LLM inference optimization (kernel development, custom chip design, quantization, distributed inference). Also interested in game development and blockchain, with hands-on experience in VR/AR and game projects, and publications on blockchain.
Education
PhD in Machine Learning Systems, University of Hong Kong; BSc in Computer Science and Technology, The Chinese University of Hong Kong (Shenzhen)
Background
Research Interests: Machine Learning Systems, Games, Blockchain. Professional Field: Efficient inference and training of large foundation models, including vision-language models and large language models, as well as quantization techniques. Bio: PhD candidate at the University of Hong Kong.