- Preprint: Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation, 2025
- Paper accepted by NeurIPS 2025: Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning
- Paper accepted by ICLR 2025: Multi-modal Agent Tuning (MAT): A framework for auto-generating multimodal tool-usage trajectories
- Dataset FIRE accepted by NeurIPS 2024
Research Experience
- 2025/7-Present: Researcher at TikTok AI Innovation Center, working on supervised fine-tuning (SFT) and reinforcement learning (RL) for GUI agents
- 2023/3-2025/7: Research Engineer at Beijing Institute for General Artificial Intelligence (BIGAI), worked on vision-language models and post-training for agentic tasks
- 2020/6-2023/3: Software Engineer at ByteDance, worked on TikTok and Lark projects
Education
- Master's degree, New York University, 2018/9-2020/6, Data Science
- Bachelor's degree, The Ohio State University, 2013/8-2018/5, Biomedical Engineering
Background
Research interests include GUI agents and multimodal reasoning, among others. Academic training in data science and biomedical engineering.