Preprint 'Ten Principles of AI Agent Economics' (arXiv 2025): proposes a framework for understanding AI agent decision-making and economic participation
Introduced the Just-in-time Information Recommendation (JIR) task and released JIR-Arena, the first benchmark dataset for it (arXiv 2025)
Published 'TinyHelen's First Curriculum' on training tiny language models in simplified linguistic environments (arXiv 2024)
ICLR 2025 paper 'AgentOccam': a simple yet strong baseline for LLM-based web agents
NeurIPS 2024 D&B Track paper 'Bias and Volatility': a statistical framework for evaluating LLM stereotypes and generation inconsistency (co-first author)
ICLR Workshop 2024 survey 'If LLM Is the Wizard, Then Code Is the Wand': explores how code empowers LLMs as intelligent agents (co-first author)