Yogesh Kulkarni
Google Scholar ID: _GCLk8UAAAAJ
CS PhD Student @ Arizona State University
Multimodal LLMs · Reinforcement Learning · Video Understanding
Citations & Impact (All-time)
  • Citations: 31
  • h-index: 4
  • i10-index: 1
  • Publications: 7
  • Co-authors: 4
Resume available in English
Academic Achievements
  • ‘AVATAR: Reinforcement Learning to See, Hear, and Reason Over Video’ (arXiv, 2025): Introduces the AVATAR framework, which pairs an off-policy training architecture with Temporal Advantage Shaping (TAS) to improve multimodal reasoning on audio-visual benchmarks
  • ‘VideoPASTA: 7K Preference Pairs That Matter for Video-LLM Alignment’ (EMNLP 2025 Main Track): Uses adversarial preference pairs targeting spatial, temporal, and cross-frame errors, achieving strong alignment with only 7K samples
  • ‘ReGATE: Learning Faster and Better with Fewer Tokens in MLLMs’ (arXiv, 2025): Proposes Reference-Guided Adaptive Token Elision to accelerate MLLM training, achieving a 2× speedup on MVBench
  • ‘VideoSAVi: Self-Aligned Video Language Models without Human Supervision’ (COLM 2025): Enables self-supervised alignment via model self-critique to generate preference data
  • ‘EnsembleNTLDetect: An Intelligent Framework for Electricity Theft Detection in Smart Grid’ (ICDM Workshop, 2021): A robust ensemble framework for detecting electricity theft from smart-grid consumption data