Published over 10 first-author papers at top international AI conferences such as NeurIPS, ACL, and AAAI. Notable works include: "ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting"; "ASAudio: A Survey of Advanced Spatial Audio Research"; "MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations"; "VersBand: Versatile Framework for Song Generation with Prompt-based Control"; "TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis"; and "GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks".
Research Experience
Currently a Research Scientist at ByteDance. Previously a visiting scholar at the University of Rochester and the University of Massachusetts Amherst.
Education
Ph.D. in Computer Science and Technology from Zhejiang University, supervised by Prof. Zhou Zhao. Dual bachelor's degrees in Computer Science and Automation from Chu Kochen Honors College, Zhejiang University. Visiting scholar at the University of Rochester, working with Prof. Zhiyao Duan, and at the University of Massachusetts Amherst, working with Prof. Przemyslaw Grabowicz.
Background
Research Interests: Multi-Modal Generative AI (spatial audio, music, singing, and speech). Currently a Research Scientist at ByteDance.
Miscellany
Email: aaron9834@icloud.com. LinkedIn, DBLP, GitHub, Google Scholar, and ORCID profiles are available.