Scholar
Soichiro Nishimori
Google Scholar ID: swJkeuUAAAAJ
The University of Tokyo
Reinforcement Learning
Statistical Reinforcement Learning
Game AI
Citations & Impact
Citations (all-time): 68
H-index: 3
i10-index: 1
Publications: 9
Co-authors: 0
Contact
No contact links provided.
Publications
11 items
Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests (2026). Cited by: 0
Retry Policy Gradients in Continuous Action Spaces (2026). Cited by: 0
On Advantage Estimates for Max@K Policy Gradients (2026). Cited by: 0
OrderGrad: Optimizing Beyond the Mean with Order-Statistic Policy Gradient Estimation (2026). Cited by: 0
Emergence of Exploration in Policy Gradient Reinforcement Learning via Retrying (2026). Cited by: 0
Finite-Time Regret Analysis of Retry-Aware Bandits (2026). Cited by: 0
Mahjax: A GPU-Accelerated Mahjong Simulator for Reinforcement Learning in JAX (2026). Cited by: 0
Mitigating Reward Hacking in RLHF via Advantage Sign Robustness (2026). Cited by: 0