Scholar
Xiaohan Wang
Google Scholar ID: H0kocJUAAAAJ
Meituan
Deep reinforcement learning
Optimization
Causal inference
Scheduling
Citations & Impact
All-time
Citations
613
H-index
13
i10-index
16
Publications
20
Co-authors
6
list available
Contact
No contact links provided.
Publications
16 items
TAPO: Tool-Aware Policy Optimization via Credit Transfer for Multimodal Search Agents
2026
Cited
0
VistaHop: Benchmarking Multi-hop Visual Reasoning for Visual DeepSearch
2026
Cited
0
Are Full Rollouts Necessary for On-Policy Distillation?
2026
Cited
0
ZipRL: Adaptive Multi-Turn Context Compression with Hindsight Response Replay
2026
Cited
0
Joint Training of Multi-Token Prediction in Reinforcement Learning via Optimal Coefficient Calibration
2026
Cited
0
When Self-Belief Misleads: Active Label Acquisition for Reinforcement Learning with Verifiable Rewards
2026
Cited
0
Implicit Hierarchical GRPO: Decoupling Tool Invocation from Execution for Tool-Integrated Mathematical Reasoning
2026
Cited
0
AMR-SD: Asymmetric Meta-Reflective Self-Distillation for Token-Level Credit Assignment
2026
Cited
0
Co-authors
6 total
Lin Zhang
Professor, Ph.D., Director, Cloud Manufacturing Research Center, Beihang University
Yuanjun Laili
Beihang University
Lihui Wang
Chair Professor of Sustainable Manufacturing, KTH
Xi Vincent Wang
KTH Royal Institute of Technology
Co-author 5
Co-author 6