Yogesh Kulkarni
Google Scholar ID: _GCLk8UAAAAJ
CS PhD Student @ Arizona State University
Multimodal LLMs · Reinforcement Learning · Video Understanding
Citations & Impact (All-time)
  • Citations: 31
  • h-index: 4
  • i10-index: 1
  • Publications: 7
  • Co-authors: 4
Resume available in English
Academic Achievements
  • ‘AVATAR: Reinforcement Learning to See, Hear, and Reason Over Video’ (arXiv, 2025): Introduces the AVATAR framework, which pairs an off-policy training architecture with Temporal Advantage Shaping (TAS) to improve multimodal reasoning on audio-visual benchmarks
  • ‘VideoPASTA: 7K Preference Pairs That Matter for Video-LLM Alignment’ (EMNLP 2025 Main Track): Uses adversarial preference pairs targeting spatial, temporal, and cross-frame errors, achieving strong alignment with only 7K samples
  • ‘ReGATE: Learning Faster and Better with Fewer Tokens in MLLMs’ (arXiv, 2025): Proposes Reference-Guided Adaptive Token Elision to accelerate MLLM training, achieving a 2× speedup on MVBench
  • ‘VideoSAVi: Self-Aligned Video Language Models without Human Supervision’ (COLM 2025): Enables self-supervised alignment via model self-critique to generate preference data
  • ‘EnsembleNTLDetect: An Intelligent Framework for Electricity Theft Detection in Smart Grid’ (ICDM Workshop, 2021): A robust ensemble framework for detecting electricity theft from smart-grid consumption data