Pierre Clavier

Google Scholar ID: -KnIaGsAAAAJ
Research Scientist at Cohere
Reinforcement Learning, Machine Learning, Statistics
Citations & Impact (all-time)
Citations: 247
H-index: 6
i10-index: 4
Publications: 12
Co-authors: 14
Resume (English only)
Academic Achievements
  • 2025: Paper "ShiQ" accepted to NeurIPS 2025.
  • 2024: Two papers accepted to NeurIPS 2024 — TC-MDP (a new algorithm for Robust RL) and a theoretical paper on optimal sample complexity of Robust MDPs (co-authored with Laixi).
  • 2024: One paper accepted at ICML 2024 on Bandits with Variational Inference; one oral presentation at UAI 2024 on Robust MDP theory.
  • May 2024: Released three preprints — RRLS (a new benchmark for Robust RL), and two new algorithms: ExpectRL and TC-MDP.
  • January 2025: Contributed to Cohere’s AI model "Command A", fine-tuned using theoretically grounded RLHF algorithms CoPG and SRPO.
  • November 2024: Successfully defended PhD thesis.