Publications
Published several papers, including 'Safe Learning Under Irreversible Dynamics via Asking for Help' and 'Check Yourself Before You Wreck Yourself: Selectively Quitting Improves LLM Agent Safety.' His work has been presented at conferences such as NeurIPS and ICML.
Research Experience
Postdoctoral research fellow at the Center for Human-Compatible AI (CHAI) at UC Berkeley, mentored by Stuart Russell. His research focuses on AI safety, especially generalization: training models to recognize unfamiliar situations and behave cautiously. Previously spent two years doing science and product work at Lyft.
Education
Received his PhD from the Computer Science Department at Stanford University in 2021, advised by Ashish Goel, and supported by an NSF Graduate Research Fellowship.
Background
Research interests: AI safety, particularly how models handle unfamiliar inputs. He is concerned about a wide range of risks from AI, including serious LLM errors, critical infrastructure failures, societal-scale catastrophe, exacerbation of societal inequalities, and economic disruption. He aims to use his career to do good in the world.