Zirui Wang
Google Scholar ID: xfnk-58AAAAJ
Ph.D. Student, UC Berkeley
Machine Learning, Computer Vision, Natural Language Processing
Citations & Impact (all-time)
  • Citations: 427
  • H-index: 7
  • i10-index: 7
  • Publications: 8
  • Co-authors: 0
Resume
Academic Achievements
  • Publications:
      - YOLO-Count: Differentiable Object Counting for Text-to-Image Generation, ICCV 2025
      - CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs, NeurIPS 2024, ECCV Workshop on Emergent Visual Abilities and Limits of Foundation Models
      - Improving Language Understanding from Screenshots, Preprint 2024
      - Language Models as Science Tutors, ICML 2024
      - TokenCompose: Grounding Diffusion with Token-level Supervision, CVPR 2024
      - OmniControlNet: Dual-stage Integration for Conditional Image Generation, CVPR 2024, Workshop in Generative Models for Computer Vision
  • Awards: Top Reviewer in NeurIPS 2024
  • Nominations: Nominated for Siebel Scholar 2025
Research Experience
  • Worked on multimodal pre-training, reasoning, and evaluation while at Princeton University; joined UC Berkeley as a Ph.D. student on August 26, 2025.
Education
  • Ph.D. student in EECS, UC Berkeley, advised by Prof. Joseph Gonzalez, Prof. Trevor Darrell, and Prof. Ion Stoica
  • M.S.E. in Computer Science, Princeton University, advised by Prof. Danqi Chen
  • B.S. in Data Science from the Halicioglu Data Science Institute (HDSI) and B.A. in Cognitive Science from the CogSci Department, University of California, San Diego (UCSD), advised by Prof. Zhuowen Tu and Prof. Zhiting Hu
Background
  • Research Interests: grounded decision making, reasoning, and planning in large multimodal models. Field: Electrical Engineering and Computer Sciences (EECS).
Co-authors
  • 0 total (list not available)