Scholar
Xinyu Duan
Google Scholar ID: Z1XYinwAAAAJ
Huawei Cloud
LLM
Inference Optimization
Citations & Impact
All-time
Citations: 417
H-index: 12
i10-index: 13
Publications: 20
Co-authors: 0
Contact
No contact links provided.
Publications
10 items
Cross-Resolution Distribution Matching for Diffusion Distillation
2026 · Cited: 0
$A^3$: Attention-Aware Accurate KV Cache Fusion for Fast Large Language Model Serving
2025 · Cited: 0
CaliDrop: KV Cache Compression with Calibration
2025 · Cited: 0
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching
2025 · Cited: 0
Alignment-Augmented Speculative Decoding with Alignment Sampling and Conditional Verification
2025 · Cited: 0
Accurate KV Cache Quantization with Outlier Tokens Tracing
2025 · Cited: 0
Taming the Titans: A Survey of Efficient LLM Inference Serving
2025 · Cited: 0
Beware of Calibration Data for Pruning Large Language Models
International Conference on Learning Representations · 2024 · Cited: 2