Publications
Published several papers, including 'Safe Learning Under Irreversible Dynamics via Asking for Help' and 'Check Yourself Before You Wreck Yourself: Selectively Quitting Improves LLM Agent Safety.' His work has been presented at conferences such as NeurIPS and ICML.
Research Experience
Postdoctoral research fellow at the Center for Human-Compatible AI (CHAI) at UC Berkeley, mentored by Stuart Russell. His research focuses on AI safety, especially generalization: training models to recognize unfamiliar situations and behave cautiously. Previously spent two years doing science and product work at Lyft.
Education
Received his PhD from the Computer Science Department at Stanford University in 2021, advised by Ashish Goel, and supported by an NSF Graduate Research Fellowship.
Background
Research interests: AI safety, particularly how models handle unfamiliar inputs. He is concerned about a wide range of risks from AI, including serious LLM errors, critical infrastructure failures, societal-scale catastrophe, exacerbation of societal inequalities, and economic disruption. He aims to use his career to do good in the world.