- d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning (ICML 2025)
- Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning (ICML 2025)
- Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback (RLC 2025)
- Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces (ICLR 2025)
- Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping (COLM 2024)
- Diffusion World Model: Future Modeling Beyond Step-by-Step Rollout for Offline Reinforcement Learning (ICLR 2024 Generative Models for Decision Making Workshop)
- Guided Flows for Generative Modeling and Decision Making
- Dual RL: Unification and New Methods for Reinforcement and Imitation Learning (ICLR 2024, Spotlight)
- Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories (ICML 2023)
- ConserWeightive Behavioral Cloning for Reliable Offline Reinforcement Learning (NeurIPS 2022 Foundation Models for Decision Making Workshop)
- Latent State Marginalization as a Low-cost Approach for Improving Exploration (ICLR 2023)
- Online Decision Transformer (ICML 2022, Long Oral Presentation)
- Near-Optimal Confidence Sequences for Bounded Random Variables (ICML 2021, Spotlight)
- A Theorem of the Alternative for Personalized Federated Learning (Submitted)
Research Experience
- 2017-2019: Research Scientist at Facebook, building distributed training systems for Ads recommendation models
- Postdoctoral researcher in the Statistics Department, Wharton School, University of Pennsylvania, working on differential privacy and statistical inference
Education
- Ph.D. in Computer Science from the University of Chicago (2017), advised by Prof. John Lafferty
- Master's degree from the Max Planck Institute for Informatics
- Bachelor's degree from Zhejiang University
Background
- Research Interests: Generative models and reinforcement learning, with a recent focus on LLM reasoning
- Professional Fields: Machine learning, optimization, and statistics
- Brief Introduction: Currently a Research Scientist at Meta, developing novel, computationally efficient methods with theoretical guarantees, particularly for nonconvex problems.