Scholar

Abhay Sheshadri

Google Scholar ID: dujRau4AAAAJ

Undergraduate, Georgia Institute of Technology

AI Safety

Citations & Impact

All-time

Citations: 153
H-index: 5
i10-index: 4
Publications: 7
Co-authors: 18

Publications

5 items

Introspection Adapters: Training LLMs to Report Their Learned Behaviors
2026 · Cited: 0

AuditBench: Evaluating Alignment Auditing Techniques on Models with Hidden Behaviors
2026 · Cited: 0

Why Do Some Language Models Fake Alignment While Others Don't?
2025 · Cited: 0

Obfuscated Activations Bypass LLM Latent-Space Defenses
arXiv.org · 2024 · Cited: 0

Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs
2024 · Cited: 72

Co-authors

18 total

Maths Undergrad @ University of Bristol

University College London

Research Manager, Anthropic Fellows Program, Program Manager, Constellation

PhD student, MIT

Dylan Hadfield-Menell

Massachusetts Institute of Technology

Asa Cooper Stickland

Research Scientist, UK AI Security Institute