Yu Zhang
Scholar


Google Scholar ID: kA9A6LsAAAAJ
ByteDance
Spatial Audio, Singing Voice Synthesis, Music Generation, Speech Synthesis
Citations & Impact
All-time
Citations: 128
H-index: 6
i10-index: 6
Publications: 15
Co-authors: 8 (list available)
Resume (English only)
Academic Achievements
  • Published over 10 first-author papers at top international AI conferences such as NeurIPS, ACL, and AAAI. Notable works include:
      • ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting
      • ASAudio: A Survey of Advanced Spatial Audio Research
      • MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations
      • VersBand: Versatile Framework for Song Generation with Prompt-based Control
      • TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis
      • GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks
Research Experience
  • Currently a Research Scientist at ByteDance. Previously a visiting scholar at the University of Rochester and the University of Massachusetts Amherst.
Education
  • Ph.D. in Computer Science and Technology from Zhejiang University, supervised by Prof. Zhou Zhao. Dual Bachelor's Degrees in Computer Science and Automation from Chu Kochen Honors College, Zhejiang University. Visiting Scholar at the University of Rochester, working with Prof. Zhiyao Duan, and at the University of Massachusetts Amherst, working with Prof. Przemyslaw Grabowicz.
Background
  • Research interests: multimodal generative AI (spatial audio, music, singing, and speech). Currently a Research Scientist at ByteDance.
Miscellany
  • Email: aaron9834@icloud.com. LinkedIn, DBLP, GitHub, Google Scholar, and ORCID profiles are available.