Published papers include 'PodEval: A Multimodal Evaluation Framework for Podcast Audio Generation', 'PodAgent: A Comprehensive Framework for Podcast Generation', and others, some of which were accepted at conferences such as ACL 2025 Findings, ACM Multimedia 2024, and INTERSPEECH 2023.
Research Experience
2018.05 - 2022.07: Applied Scientist at Microsoft (TTS Algorithm Team); 2016.08 - 2018.04: Research Intern at Microsoft Research Asia (Speech Group & IEG).
Education
PhD: The Chinese University of Hong Kong (2022-present), Supervisor: Prof. Tan Lee; M.S. and B.S.: South China University of Technology.
Background
Currently a fourth-year PhD student in the DSP & Speech Technology Laboratory (DSP-STL) at The Chinese University of Hong Kong (CUHK), under the supervision of Prof. Tan Lee. Research focuses on long-form audio and speech generation as well as multimodal agents. Previously worked as an applied scientist at Microsoft.
Miscellany
Planning to graduate in 2026 and actively seeking research positions in academia or industry.