Resume (English only)
Academic Achievements
Publications:
- Qwen-Image Technical Report
- Manager: Aggregating Insights from Unimodal Experts in Two-Tower VLMs and MLLMs
- Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models
- ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning
- BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning
Research Experience
Work Experience: Member of the Language Analysis Group at the Research Center for Social Computing and Interactive Robotics (HIT-SCIR).
Education
Degree: Ph.D. (fifth year); University: Harbin Institute of Technology; Advisor: Prof. Wanxiang Che.
Background
Research Interests: Task-oriented Dialogue Systems, Natural Language Processing (2020-2021); Vision-Language Learning (2022-2023); Multimodal Generation Models and Multimodal Large Language Models (2023-Present).
Xiao Xu is a fifth-year Ph.D. student at Harbin Institute of Technology, advised by Prof. Wanxiang Che. He is a member of the Language Analysis Group at the Research Center for Social Computing and Interactive Robotics (HIT-SCIR).
Miscellany
Personal Interests: Music, singing, animation, and all good things in life.