Thilo Hagendorff
Scholar

Google Scholar ID: GlhI8lQAAAAJ
Research Group Leader, University of Stuttgart
AI Safety, AI Ethics, Machine Psychology, Large Language Models
Citations & Impact (all-time)
Citations: 5,324
H-index: 30
i10-index: 43
Publications: 20
Co-authors: 0
Publications (8 items)
"Dark Triad" Model Organisms of Misalignment: Narrow Fine-Tuning Mirrors Human Antisocial Behavior (2026). Cited: 0
Emergently Misaligned Language Models Show Behavioral Self-Awareness That Shifts With Subsequent Realignment (2026). Cited: 0
Speciesism in AI: Evaluating Discrimination Against Animals in Large Language Models (2025). Cited: 0
Large Reasoning Models Are Autonomous Jailbreak Agents (2025). Cited: 0
On the Inevitability of Left-Leaning Political Bias in Aligned Language Models (2025). Cited: 0
PRIDE -- Parameter-Efficient Reduction of Identity Discrimination for Equality in LLMs (2025). Cited: 0
Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models (2025). Cited: 0
Compromising Honesty and Harmlessness in Language Models via Deception Attacks (2025). Cited: 0